TITLE ^

Synopsis 2: Bits and Pieces

AUTHORS ^

    Larry Wall <larry@wall.org>

VERSION ^

    Created: 10 Aug 2004

    Last Modified: 19 Nov 2010
    Version: 230

This document summarizes Apocalypse 2, which covers small-scale lexical items and typological issues. (These Synopses also contain updates to reflect the evolving design of Perl 6 over time, unlike the Apocalypses, which are frozen in time as "historical documents". These updates are not marked--if a Synopsis disagrees with its Apocalypse, assume the Synopsis is correct.)

One-pass parsing ^

To the extent allowed by sublanguages' parsers, Perl is parsed using a one-pass, predictive parser. That is, lookahead of more than one "longest token" is discouraged. The currently known exceptions to this are where the parser must:

Lexical Conventions ^

Whitespace and Comments ^

Built-In Data Types ^

Native types

Values with these types autobox to their uppercase counterparts when you treat them as objects:

    bit         single native bit
    int         native signed integer
    uint        native unsigned integer (autoboxes to Int)
    buf         native buffer (finite seq of native ints or uints, no Unicode)
    rat         native rational
    num         native floating point
    complex     native complex number
    bool        native boolean

Since native types cannot represent Perl's concept of undefined values, in the absence of explicit initialization, native floating-point types default to NaN, while integer types (including bit) default to 0. The complex type defaults to NaN + NaN\i. A buf type of known size defaults to a sequence of 0 values. If any native type is explicitly initialized to * (the Whatever type), no initialization is attempted and you'll get whatever was already there when the memory was allocated.

If a buf type is initialized with a Unicode string value, the string is decomposed into Unicode codepoints, and each codepoint shoved into an integer element. If the size of the buf type is not specified, it takes its length from the initializing string. If the size is specified, the initializing string is truncated or 0-padded as necessary. If a codepoint doesn't fit into a buf's integer type, a parse error is issued if this can be detected at compile time; otherwise a warning is issued at run time and the overflowed buffer element is filled with an appropriate replacement character, either U+FFFD (REPLACEMENT CHARACTER) if the element's integer type is at least 16 bits, or U+007f (DELETE) if the larger value would not fit. If any other conversion is desired, it must be specified explicitly. In particular, no conversion to UTF-8 or UTF-16 is attempted; that must be specified explicitly. (As it happens, conversion to a buf type based on 32-bit integers produces valid UTF-32 in the native endianness.)
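
As a rough illustration of the truncation, padding, and replacement rules above, here is a small model written in Python rather than Perl 6. The function name init_buf and its bits and size parameters are invented for this sketch and are not part of any Perl 6 API. The run-time warning is omitted, and element widths below 8 bits are not considered.

```python
REPLACEMENT_CHARACTER = 0xFFFD  # U+FFFD, used when elements are at least 16 bits wide
DELETE = 0x7F                   # U+007F, used when U+FFFD itself would not fit


def init_buf(text, bits, size=None):
    """Model initializing a buf of unsigned `bits`-wide elements from a string.

    Hypothetical helper: each codepoint becomes one element, with no
    UTF-8 or UTF-16 encoding step.
    """
    limit = 1 << bits
    fallback = REPLACEMENT_CHARACTER if bits >= 16 else DELETE
    # One element per codepoint; codepoints that overflow get the fallback.
    elems = [cp if cp < limit else fallback for cp in map(ord, text)]
    if size is None:
        return elems                    # length comes from the string
    return (elems + [0] * size)[:size]  # 0-pad, then truncate to the declared size
```

With 32-bit elements nothing overflows, which is why the result is effectively UTF-32 in native order: init_buf("\U0001F600", 32) yields the single element 0x1F600.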

The Mu type

Among other things, Mu is named after the eastern concept of "Mu" or 無 (see http://en.wikipedia.org/wiki/MU, especially the "Mu_(negative)" entry), so in Perl 6 it stands in for Perl 5's concept of "undef" when that is used as a noun. However, Mu is also the "nothing" from which everything else is derived via the undefined type objects, so it stands in for the concept of "Object" as used in languages like Java. Or think of it as a "micro" or µ-object that is the basis for all other objects, something atomic like a Muon. Or if acronyms make you happy, there are a variety to pick from:

    Most Universal
    More Undefined
    Modern Undef
    Master Union
    Meta Ur
    Mega Up
    ...

Or just think of it as a sound a cow makes, which simultaneously means everything and nothing.

Undefined types

Perl 6 does not have a single value representing undefinedness. Instead, objects of various types can carry type information while nevertheless remaining undefined themselves. Whether an object is defined is determined by whether .defined returns true or not. These typed objects typically represent uninitialized values. Failure objects are also officially undefined despite carrying exception information; these may be created using the fail function, or by direct construction of an exception object of some sort. (See S04 for how failures are handled.)

    Mu          Most Undefined
    Failure     Failure (lazy exceptions, thrown if not handled properly)

Whenever you declare any kind of type, class, module, or package, you're automatically declaring an undefined prototype value with the same name, known as the type object. The name itself returns that type object:

    Mu          Perl 6 object (default block parameter type, Any, Junction, or Each)
    Any         Perl 6 object (default routine parameter type, excludes junction)
    Cool        Perl 6 Convenient OO Loopbacks
    Whatever    Wildcard (like Any, but subject to do-what-I-mean via MMD)
    Int         Any Int object
    Widget      Any Widget object

Type objects stringify to their name with empty parens concatenated. Note that type objects are not classes, but may be used to name classes:

    Widget.new()        # create a new Widget

Whenever a Failure value is put into a typed container, it takes on the type specified by the container but continues to carry the Failure role. Use fail to return specific failures. Use Mu for the most generic non-failure undefined value. The Any type, derived from Mu, is also undefined, but excludes Junction and Each types so that autothreading may be dispatched using normal multiple dispatch rules. All user-defined classes derive from the Any class by default. The Whatever type is derived from Any but nothing else is derived from it.

Immutable types

Objects with these types behave like values, i.e. $x === $y is true if and only if their types and contents are identical (that is, if $x.WHICH eqv $y.WHICH).

    Str         Perl string (finite sequence of Unicode characters)
    Bit         Perl single bit (allows traits, aliasing, undefinedness, etc.)
    Int         Perl integer (allows Inf/NaN, arbitrary precision, etc.)
    Num         Perl number (approximate Real, generally via floating point)
    Rat         Perl rational (exact Real, limited denominator)
    FatRat      Perl rational (unlimited precision in both parts)
    Complex     Perl complex number
    Bool        Perl boolean
    Exception   Perl exception
    Block       Executable objects that have lexical scopes
    Seq         A list of values (can be generated lazily)
    Range       A pair of Ordered endpoints
    Set         Unordered collection of values that allows no duplicates
    Bag         Unordered collection of values that allows duplicates
    Enum        An immutable Pair
    EnumMap     A mapping of Enums with no duplicate keys
    Signature   Function parameters (left-hand side of a binding)
    Parcel      Arguments in a comma list
    LoL         Arguments in a semicolon list
    Capture     Function call arguments (right-hand side of a binding)
    Blob        An undifferentiated mass of ints, an immutable Buf
    Instant     A point on the continuous atomic timeline
    Duration    The difference between two Instants
    HardRoutine A routine that is committed to not changing

Set values may be composed with the set listop or method. Bag values may be composed with the bag listop or method.

Instants and Durations are measured in atomic seconds with fractions. Notionally they are real numbers which may be implemented in any Real type of sufficient precision, preferably a Rat or FatRat. (Implementations that make fixed-point assumptions about the available subsecond precision are discouraged; the user interface must act like real numbers in any case.) Interfaces that take Duration arguments, such as sleep(), may also take Real arguments, but Instant arguments must be explicitly created via any of various culturally aware time specification APIs. A small number of Instant values that represent common epoch instant values are also available.

In numeric context a Duration happily returns a Rat or FatRat representing the number of seconds. Instant values, on the other hand, are largely opaque, numerically speaking, and in particular are epoch agnostic. (Any epoch is just a particular Instant, and all times related to that epoch are really Instant ± Duration, which returns a new Instant.) In order to facilitate the writing of culturally aware time modules, the Instant type provides Instant values corresponding to various commonly used epochs, such as the 1958 TAI epoch, the POSIX epoch, the Mac epoch, and perhaps the year 2000 epoch as UTC thinks of it. There's no reason to exclude any useful epoch that is well characterized in atomic seconds. All normal times can be calculated from those epoch instants using addition and subtraction of Duration values. Note that the Duration values are still just atomic time without any cultural deformations; in particular, the Duration formed by subtracting Instant::Epoch::POSIX from the current instant will contain more seconds than the current POSIX time() due to POSIX's abysmal ignorance of leap seconds. This is not the fault of the universe, which is not fooled (neglecting relativistic considerations). Instants and Durations are always linear atomic seconds. Systems which cannot officially provide a steady time base, such as POSIX systems, will simply have to make their best guess as to the correct atomic time when asked to interconvert between cultural time and atomic time. Alternately, they may use some other less-official time mechanism to achieve steady clock behavior. Most Unix systems can count clock ticks, even if POSIX time types get confused.

Although the conceptual type of an Instant resembles FatRat, with arbitrarily large size in either numerator or denominator, the internal form may of course be optimized for "nearby" times, so that, if we know the year as an integer, the instant within the year can just be a Rat representing the offset from the beginning of the year. Calculations that fall within the same year can then be done in Rat rather than FatRat, or a table of yearly offsets can find the difference in integer seconds between two years, since (so far) nobody has had the nerve to propose fractional leap seconds. Or whatever. Instant is opaque, so we can swap implementations in and out without user-visible consequences.

The term now returns the current time as an Instant. As with the rand and self terms, it is not a function, so don't put parens after it. It also never looks for arguments, so the next token should be an operator or terminator.

    now + 300   # the instant five minutes from now

Basic math operations are defined for instants and durations such that the sum of an instant and a duration is always an instant, while the difference of two instants is always a duration. Math on instants may only be done with durations (or numbers that will be taken as durations, as above); you may not add two instants.

    $instant + $instant      # WRONG
    $instant - $instant      # ok, returns a duration
    $instant + $duration     # ok, returns an instant
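
The closure rules above can be modeled in a few lines. The sketch below is Python, not Perl 6, and its Instant and Duration classes are hypothetical stand-ins that store exact Fraction seconds. A bare number is taken as a duration, as in now + 300.

```python
from fractions import Fraction
from numbers import Real


def _seconds(value):
    """Read a Duration or a plain real number as exact seconds."""
    if isinstance(value, Duration):
        return value.seconds
    if isinstance(value, Real):
        return Fraction(value)
    raise TypeError("expected a duration or a real number")


class Duration:
    def __init__(self, seconds):
        self.seconds = Fraction(seconds)

    def __eq__(self, other):
        return isinstance(other, Duration) and self.seconds == other.seconds


class Instant:
    def __init__(self, offset):
        # Offset from an arbitrary origin; kept private because an
        # instant is meant to be opaque and epoch agnostic.
        self._offset = Fraction(offset)

    def __add__(self, other):
        if isinstance(other, Instant):
            raise TypeError("cannot add two instants")
        return Instant(self._offset + _seconds(other))    # instant + duration

    def __sub__(self, other):
        if isinstance(other, Instant):
            return Duration(self._offset - other._offset)  # instant - instant
        return Instant(self._offset - _seconds(other))     # instant - duration
```

Under these assumptions, Instant(1000) + 300 is an Instant, subtracting the original instant from it gives Duration(300), and adding two instants raises TypeError.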

Numeric operations on durations return Duration where that makes sense (addition, subtraction, modulus). The type returned for other numeric operations is unspecified; they may return normal numeric types or they may return other dimensional types that attempt to assist in dimensional analysis. (The latter approach should likely require explicit declaration for now, until we can demonstrate that it does not adversely impact the average programmer, and that it plays well with the concept of gradual typing.)

The Blob type is like an immutable buffer, and therefore responds both to array and (some) stringy operations. Note that, like a Buf, its size is measured in whatever the base unit is, which is not always bytes. If you have a my Blob[bit] $blob, then $blob.elems returns the number of bits in it. As with buffers, various native types are automatically derived from native unsigned int types:

    blob1       Blob[bit], a bit string
    blob2       Blob[uint2], a DNA sequence?
    blob3       Blob[uint[3]], an octal string
    blob4       Blob[uint4], a hex string
    blob8       Blob[uint8], a byte string
    blob16      Blob[uint16]
    blob32      Blob[uint32]
    blob64      Blob[uint64]

These types do (at least) the following roles:

    Class       Roles
    =====       =====
    Str         Stringy
    Bit         Numeric Boolean Integral
    Int         Numeric Integral
    Num         Numeric Real
    Rat         Numeric Real Rational
    FatRat      Numeric Real Rational
    Complex     Numeric
    Bool        Boolean
    Exception   Failure
    Block       Callable
    Seq         Iterable
    Range       Iterable
    Set         Associative[Bool] Iterable
    Bag         Associative[UInt] Iterable
    Enum        Associative
    EnumMap     Associative Positional Iterable
    Signature   
    Parcel      Positional
    Capture     Positional Associative
    Blob        Stringy Positional
    Instant     Real
    Duration    Real
    HardRoutine Routine

[Conjecture: Stringy may best be split into 2 roles where both Str and Blob compose the more general one and just Str composes a less general one. The more general of those would apply to what is common to any dense sequence ("string") that Str and Blob both are (either of characters or bits or integers etc), and the string operators like catenation (~) and replication (x, xx) would be part of the more general role. The more specific role would apply to Str but not Blob and includes any specific operators that are specific to characters and don't apply to bits or integers etc. The other alternative is to more clearly distance character strings from bit strings, keeping ~/etc for character strings only and adding an analogy for bit strings.]

The Iterable role indicates not that you can iterate the type directly, but that you can request the type to return an iterator. Iterable types may have multiple iterators (lists) running across them simultaneously, but an iterator/list itself has only one thread of consumption. Every time you do get on an iterator, a value disappears from its list.
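
Python's iteration protocol draws the same line between an iterable and its iterators, so it serves as a rough analogy. This is not Perl 6 code, and next() merely stands in for .get here.

```python
letters = ["a", "b", "c"]   # iterable: it can hand out iterators on request

it1 = iter(letters)         # two independent iterators over the same data
it2 = iter(letters)

first = next(it1)           # consumes "a" from it1 only
second = next(it1)          # consumes "b" from it1 only
other = next(it2)           # it2 still starts at "a"

remaining = list(it1)       # whatever it1 has left: just "c"
```

The list itself is unchanged throughout. Only each iterator's private position advances, which is the sense in which an iterator has a single thread of consumption.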

Note that Set and Bag iterators return only keys, not values. You must explicitly use .pairs to get key/value pairs.

Mutable types

Objects with these types have distinct .WHICH values that do not change even if the object's contents change. (Routines are considered mutable because they can be wrapped in place.)

    Iterator    Perl list
    SeqIter     Iterator over a Seq
    RangeIter   Iterator over a Range
    Scalar      Perl scalar
    Array       Perl array
    Hash        Perl hash
    KeySet      KeyHash of Bool (does Set in list/array context)
    KeyBag      KeyHash of UInt (does Bag in list/array context)
    Pair        A single key-to-value association
    PairSeq     A Seq of Pairs
    Buf         Perl buffer (array of integers with some stringy features)
    IO          Perl filehandle
    Routine     Base class for all wrappable executable objects
    Sub         Perl subroutine
    Method      Perl method
    Submethod   Perl subroutine acting like a method
    Macro       Perl compile-time subroutine
    Regex       Perl pattern
    Match       Perl match, usually produced by applying a pattern
    Stash       A symbol table hash (package, module, class, lexpad, etc)
    SoftRoutine A routine that is committed to staying mutable

The KeyHash role differs from a normal Associative hash in how it handles default values. If the value of a KeyHash element is set to the default value for the KeyHash, the element is deleted. If undeclared, the default default for a KeyHash is 0 for numeric types, False for boolean types, and the null string for string and buffer types. A KeyHash of an object type defaults to the undefined prototype for that type. More generally, the default default is whatever defined value a Nil would convert to for that value type. A KeyHash of Scalar deletes elements that go to either 0 or the null string. A KeyHash also autodeletes keys for normal undefined values (that is, those undefined values that do not contain an unthrown exception).

A KeySet is a KeyHash of booleans with a default of False. If you use the Hash interface and increment an element of a KeySet its value becomes true (creating the element if it doesn't exist already). If you decrement the element it becomes false and is automatically deleted. Decrementing a non-existing value results in a False value. Incrementing an existing value results in True. When not used as a Hash (that is, when used as an Array or list or Set object) a KeySet behaves as a Set of its keys. (Since the only possible value of a KeySet is the True value, it need not be represented in the actual implementation with any bits at all.)

A KeyBag is a KeyHash of UInt with default of 0. If you use the Hash interface and increment an element of a KeyBag its value is increased by one (creating the element if it doesn't exist already). If you decrement the element the value is decreased by one; if the value goes to 0 the element is automatically deleted. An attempt to decrement a non-existing value results in a Failure value. When not used as a Hash (that is, when used as an Array or list or Bag object) a KeyBag behaves as a Bag of its keys, with each key replicated the number of times specified by its corresponding value. (Use .kv or .pairs to suppress this behavior in list context.)
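
The autodeletion rule is easier to see in a toy model. The following Python sketch is not Perl 6, and every name in it is hypothetical. It models only the default-value deletion and the bag view. In particular it raises KeyError where the text above calls for a Failure value, which is a simplification.

```python
class KeyHash:
    """Mapping whose elements vanish when set to the default value."""

    def __init__(self, default):
        self.default = default
        self._data = {}

    def __getitem__(self, key):
        # A missing key reads as the default instead of raising.
        return self._data.get(key, self.default)

    def __setitem__(self, key, value):
        if value == self.default:
            self._data.pop(key, None)   # autodelete on reaching the default
        else:
            self._data[key] = value

    def __contains__(self, key):
        return key in self._data


class KeyBag(KeyHash):
    """KeyHash of counts with a default of 0."""

    def __init__(self):
        super().__init__(0)

    def increment(self, key):
        self[key] = self[key] + 1       # creates the element if needed

    def decrement(self, key):
        if key not in self:
            raise KeyError(key)         # stand-in for the Failure value
        self[key] = self[key] - 1       # deleted automatically at 0

    def as_bag(self):
        # Bag view: each key replicated by its count.
        return [key for key, count in self._data.items() for _ in range(count)]
```

A KeySet could be modeled the same way with a default of False, at which point the stored values carry no information beyond the presence of the key.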

As with Hash types, Pair and PairSeq are mutable in their values but not in their keys. (A key can be a reference to a mutable object, but cannot change its .WHICH identity. In contrast, the value may be rebound to a different object, just as a hash element may.)

The following roles are supported:

    Iterator    List
    Scalar      
    Array       Positional Iterable
    Hash        Associative
    KeySet      KeyHash[Bool]
    KeyBag      KeyHash[UInt]
    KeyHash     Associative
    Pair        Associative
    PairSeq     Associative Positional Iterable
    Buf         Stringy
    IO          
    Routine     Callable
    Sub         Callable
    Method      Callable
    Submethod   Callable
    Macro       Callable
    Regex       Callable
    Match       Positional Associative
    Stash       Associative
    SoftRoutine Routine

Types that do the List role are generally hidden from casual view, since iteration is typically triggered by context rather than by explicit call to the iterator's .get method. Filehandles are a notable exception.

See "Wrapping" in S06 for a discussion of soft vs. hard routines.

Value types

Explicit types are optional. Perl variables have two associated types: their "value type" and their "implementation type". (More generally, any container has an implementation type, including subroutines and modules.) The value type is stored as its of property, while the implementation type of the container is just the object type of the container itself. The word returns is allowed as an alias for of.

The value type specifies what kinds of values may be stored in the variable. A value type is given as a prefix or with the of keyword:

    my Dog $spot;
    my $spot of Dog;

In either case this sets the of property of the container to Dog.

Subroutines have a variant of the of property, as, that sets the as property instead. The as property specifies a constraint (or perhaps coercion) to be enforced on the return value (either by explicit call to return or by implicit fall-off-the-end return). This constraint, unlike the of property, is not advertised as the type of the routine. You can think of it as the implicit type signature of the (possibly implicit) return statement. It's therefore available for type inferencing within the routine but not outside it. If no as type is declared, it is assumed to be the same as the of type, if declared.

    sub get_pet() of Animal {...}       # of type, obviously
    sub get_pet() returns Animal {...}  # of type
    our Animal sub get_pet() {...}      # of type
    sub get_pet() as Animal {...}       # as type

A value type on an array or hash specifies the type stored by each element:

    my Dog @pound;  # each element of the array stores a Dog

    my Rat %ship;   # the value of each entry stores a Rat

The key type of a hash may be specified as a shape trait--see S09.

Implementation types

The implementation type specifies how the variable itself is implemented. It is given as a trait of the variable:

    my $spot is Scalar;             # this is the default
    my $spot is PersistentScalar;
    my $spot is DataBase;

Defining an implementation type is the Perl 6 equivalent to tying a variable in Perl 5. But Perl 6 variables are tied directly at declaration time, and for performance reasons may not be tied with a run-time tie statement unless the variable is explicitly declared with an implementation type that does the Tieable role.

However, package variables are always considered Tieable by default. As a consequence, all named packages are also Tieable by default. Classes and modules may be viewed as differently tied packages. Looking at it from the other direction, classes and modules that wish to be bound to a global package name must be able to do the Package role.

Hierarchical types

A non-scalar type may be qualified, in order to specify what type of value each of its elements stores:

    my Egg $cup;                       # the value is an Egg
    my Egg @carton;                    # each elem is an Egg
    my Array of Egg @box;              # each elem is an array of Eggs
    my Array of Array of Egg @crate;   # each elem is an array of arrays of Eggs
    my Hash of Array of Recipe %book;  # each value is a hash of arrays of Recipes

Each successive of makes the type on its right a parameter of the type on its left. Parametric types are named using square brackets, so:

    my Hash of Array of Recipe %book;

actually means:

    my Hash:of(Array:of(Recipe)) %book;

Because the actual variable can be hard to find when complex types are specified, there is a postfix form as well:

    my Hash of Array of Recipe %book;           # HoHoAoRecipe
    my %book of Hash of Array of Recipe;        # same thing

The as form may be used in subroutines:

    my sub get_book ($key) as Hash of Array of Recipe {...}

Alternately, the return type may be specified within the signature:

    my sub get_book ($key --> Hash of Array of Recipe) {...}

There is a slight difference, insofar as the type inferencer will ignore an as but pay attention to --> or prefix type declarations, also known as the of type. Only the inside of the subroutine pays attention to as, and essentially coerces the return value to the indicated type, just as if you'd coerced each return expression.

You may also specify the of type as the of trait (with returns allowed as a synonym):

    my Hash of Array of Recipe sub get_book ($key) {...}
    my sub get_book ($key) of Hash of Array of Recipe {...}
    my sub get_book ($key) returns Hash of Array of Recipe {...}

Polymorphic types

Anywhere you can use a single type you can use a set of types, for convenience specifiable as if it were an "or" junction:

    my Int|Str $error = $val;              # can assign if $val~~Int or $val~~Str

Fancier type constraints may be expressed through a subtype:

    subset Shinola of Any where {.does(DessertWax) and .does(FloorTopping)};
    if $shimmer ~~ Shinola {...}  # $shimmer must do both interfaces
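
One way to picture both forms is as a membership test: a set of types accepts a value matching any member, and a subset adds a predicate on top of a base type. The Python sketch below is only an analogy. The subset helper, int_or_str, and nybble are invented names, and nybble borrows its range from the subset Nybble example listed under Grammatical Categories.

```python
def subset(base, where=lambda value: True):
    """Build a matcher: the value must be of `base` and satisfy `where`.

    `base` may be a single type or a tuple of types; a tuple behaves
    like an "or" of its members.
    """
    def matches(value):
        return isinstance(value, base) and bool(where(value))
    return matches


# Like Int|Str: either type is acceptable.
int_or_str = subset((int, str))

# Like a subset of Int restricted to the range 0 through 15.
nybble = subset(int, lambda value: 0 <= value < 16)
```

Here nybble(7) holds while nybble(16) and nybble("7") do not. The base-type check runs first, so the predicate never sees a value of the wrong type.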

Since the terms in a parameter could be viewed as a set of constraints that are implicitly "anded" together (the variable itself supplies type constraints, and where clauses or tree matching just add more constraints), we relax this to allow juxtaposition of types to act like an "and" junction:

    # Anything assigned to the variable $mitsy must conform
    # to the type Fish and either the Squirrel or Dog type...
    my Squirrel|Dog Fish $mitsy = new Fish but { Bool.pick ?? .does Squirrel
                                                           !! .does Dog };

[Note: the above is a slight lie, insofar as parameters are currently restricted for 6.0.0 to having only a single main type for the formal variable until we understand MMD a bit better.]

Parameter types

Parameters may be given types, just like any other variable:

    sub max (int @array is rw) {...}
    sub max (@array of int is rw) {...}

Generic types

Within a declaration, a class variable (either by itself or following an existing type name) declares a new type name and takes its parametric value from the actual type of the parameter it is associated with. It declares the new type name in the same scope as the associated declaration.

    sub max (Num ::X @array) {
        push @array, X.new();
    }

The new type name is introduced immediately, so two such types in the same signature must unify compatibly if they have the same name:

    sub compare (Any ::T $x, T $y) {
        return $x eqv $y;
    }

Return types

On a scoped subroutine, a return type can be specified before or after the name. We call all return types "return types", but distinguish two kinds of return types, the as type and the of type, because the of type is normally an "official" named type and declares the official interface to the routine, while the as type is merely a constraint on what may be returned by the routine from the routine's point of view.

    our sub lay as Egg {...}            # as type
    our Egg sub lay {...}               # of type
    our sub lay of Egg {...}            # of type
    our sub lay (--> Egg) {...}         # of type

    my sub hat as Rabbit {...}          # as type
    my Rabbit sub hat {...}             # of type
    my sub hat of Rabbit {...}          # of type
    my sub hat (--> Rabbit) {...}       # of type

If a subroutine is not explicitly scoped, it defaults to my scoping. Any return type must go after the name:

    sub lay as Egg {...}                # as type
    sub lay of Egg {...}                # of type
    sub lay (--> Egg) {...}             # of type

On an anonymous subroutine, any return type can only go after the sub keyword:

    $lay = sub as Egg {...};            # as type
    $lay = sub of Egg {...};            # of type
    $lay = sub (--> Egg) {...};         # of type

but you can use the anon scope declarator to introduce an of prefix type:

    $lay = anon Egg sub {...};            # of type
    $hat = anon Rabbit sub {...};         # of type

The return type may also be specified after a --> token within the signature. This doesn't mean exactly the same thing as as. The of type is the "official" return type, and may therefore be used to do type inferencing outside the sub. The as type only makes the return type available to the internals of the sub so that the return statement can know its context, but outside the sub we don't know anything about the return value, as if no return type had been declared. The prefix form specifies the of type rather than the as type, so the return type of

    my Fish sub wanda ($x) { ... }

is known to return an object of type Fish, as if you'd said:

    my sub wanda ($x --> Fish) { ... }

not as if you'd said

    my sub wanda ($x) as Fish { ... }

It is possible for the of type to disagree with the as type:

    my Squid sub wanda ($x) as Fish { ... }

or equivalently,

    my sub wanda ($x --> Squid) as Fish { ... }

This is not lying to yourself--it's lying to the world. Having a different inner type is useful if you wish to hold your routine to a stricter standard than you let on to the outside world, for instance.

The Cool class (and package) ^

The Cool type is derived from Any, and contains all the methods that are "cool" (as in, "I'm cool with an argument of that type.").

More specifically, these are the methods that are culturally universal, insofar as the typical user will expect the name of the method to imply conversion to a particular built-in type that understands the method in question. For instance, $x.abs implies conversion to an appropriate numeric type if $x is "cool" but doesn't already support a method of that name. Conversely, $x.substr implies conversion to a string or buffer type.

The Cool module also contains all multisubs of last resort; these are automatically searched if normal multiple dispatch does not find a viable candidate. Note that the Cool package is mutable, and both single and multiple dispatch must take into account changes there for the purposes of run-time monkey patching. However, since the multiple dispatcher uses the Cool package only as a failover, compile-time analysis of such dispatches is largely unaffected for any arguments with an exact or close match. Likewise any single dispatch to a method that is more specific than the Cool class is not affected by the mutability of Cool. User-defined classes don't derive from Cool by default, so such classes are also unaffected by changes to Cool.

Names and Variables ^

Names ^

Literals ^

Context ^

Lists ^

Files ^

Properties ^

Grammatical Categories ^

Lexing in Perl 6 is controlled by a system of grammatical categories. At each point in the parse, the lexer knows which subset of the grammatical categories are possible at that point, and follows the longest-token rule across all the active alternatives, including those representing any grammatical categories that are ready to match. See S05 for a detailed description of this process.

To get a list of the current categories, grep 'token category:' from STD.pm6.

Category names are used as the short name of both various operators and the rules that parse them, though the latter include an extra "sym":

    infix:<cmp>           # the infix cmp operator
    infix:sym<cmp>        # the rule that parses cmp

As you can see, the extension of the name uses colon pair notation. The :sym typically takes an argument giving the string name of the operator; some of the "circumfix" categories require two arguments for the opening and closing strings. Since there are so many match rules whose symbol is an identifier, we allow a shorthand:

    infix:cmp             # same as infix:sym<cmp> (not infix:<cmp>)

Conjecturally, we might also have other kinds of rules, such as tree rewrite rules:

    infix:match<cmp>      # rewrite a match node after reducing its arguments
    infix:ast<cmp>        # rewrite an ast node after reducing its arguments

Within a grammar, matching the proto subrule <infix> will match all visible rules in the infix category as parallel alternatives, as if they were separated by '|'.
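
A toy model of that behavior, in Python rather than Perl 6: every rule in the category is tried at the current position and the longest match wins. The function name match_category and the symbol list are assumptions made for this sketch. Real longest-token matching, as described in S05, also handles regex-valued rules and tie-breaking, which are left out here.

```python
INFIX_SYMBOLS = ["<", "<=", "<=>", "+", "cmp"]   # illustrative subset only


def match_category(symbols, text, pos=0):
    """Return the longest symbol that matches at text[pos:], or None.

    All symbols are tried as parallel alternatives; the order in which
    they are listed does not matter.
    """
    candidates = [sym for sym in symbols if text.startswith(sym, pos)]
    return max(candidates, key=len, default=None)
```

Given the input "<=> 3", all of "<", "<=", and "<=>" match at position 0, and the sketch picks "<=>" because it is the longest.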

Here are some of the names of parse rules in STD:

    category:sym<prefix>                           prefix:<+>
    circumfix:sym<[ ]>                             [ @x ]
    dotty:sym<.=>                                  $obj.=method
    infix_circumfix_meta_operator:sym['»','«']     @a »+« @b
    infix_postfix_meta_operator:sym<=>             $x += 2;
    infix_prefix_meta_operator:sym<!>              $x !~~ 2;
    infix:sym<+>                                   $x + $y
    package_declarator:sym<role>                   role Foo;
    postcircumfix:sym<[ ]>                         $x[$y] or $x.[$y]
    postfix_prefix_meta_operator:sym('»')          @array »++
    postfix:sym<++>                                $x++
    prefix_circumfix_meta_operator:sym<[ ]>        [*]
    prefix_postfix_meta_operator:sym('«')          -« @magnitudes
    prefix:sym<!>                                  !$x (and $x.'!')
    quote:sym<qq>                                  qq/foo/
    routine_declarator:sym<sub>                    sub foo {...}
    scope_declarator:sym<has>                      has $.x;
    sigil:sym<%>                                   %hash
    special_variable:sym<$!>                       $!
    statement_control:sym<if>                      if $condition { 1 } else { 2 }
    statement_mod_cond:sym<if>                     .say if $condition
    statement_mod_loop:sym<for>                    .say for 1..10
    statement_prefix:sym<gather>                   gather for @foo { .take }
    term:sym<!!!>                                  $x = { !!! }
    trait_mod:sym<does>                            my $x does Freezable
    twigil:sym<?>                                  $?LINE
    type_declarator:sym<subset>                    subset Nybble of Int where ^16

Note that some of these produce correspondingly named operators, but not all of them. When they do correspond (such as in the cmp example above), this is by convention, not by enforcement. (However, matching <sym> within one of these rules instead of the literal operator makes it easier to set up this correspondence in subsequent processing.)

The STD::Regex grammar also adds these:

    assertion:sym<!>                         /<!before \h>/
    backslash:sym<w>                         /\w/ and /\W/
    metachar:sym<.>                          /.*/
    mod_internal:sym<P5>                     m:/ ... :P5 ... /
    quantifier:sym<*>                        /.*/