TITLE

Synopsis 11: Modules

AUTHORS

    Larry Wall <larry@wall.org>

VERSION

    Created: 27 Oct 2004

    Last Modified: 25 Oct 2010
    Version: 34

Overview

This synopsis discusses those portions of Apocalypse 12 that ought to have been in Apocalypse 11.

Modules

As in Perl 5, a module is just a kind of package. Unlike in Perl 5, modules and classes are declared with separate keywords, but they're still just packages with extra behaviors.

A module is declared with the module keyword. There are two basic declaration syntaxes:

    module Foo; # rest of scope is in module Foo
    ...

    module Bar {...}    # block is in module Bar

A named module declaration can occur as part of an expression, just like named subroutine declarations.

Since there are no barewords in Perl 6, module names must be predeclared, or use the sigil-like ::ModuleName syntax. The :: prefix does not imply globalness as it does in Perl 5. (Use GLOBAL:: for that.)

A bare (unscoped) module declarator declares a nested our module name within the current package. However, at the start of the file, the current package is GLOBAL, so the first such declaration in the file is automatically global.

You can use our module to explicitly declare a module in the current package (or module, or class). To declare a lexically scoped module, use my module. Module names are always searched for from innermost scopes to outermost. As with an initial ::, the presence of a :: within the name does not imply globalness (unlike in Perl 5).

The default namespace for the main program is GLOBAL. (Putting module GLOBAL; at the top of your program is redundant, except insofar as it tells Perl that the code is Perl 6 code and not Perl 5 code. But it's better to say "use v6" for that.)

Module traits are set using is:

    module Foo is bar {...}

An anonymous module may be created with either of:

    module {...}
    module :: {...}

The second form is useful if you need to apply a trait:

    module :: is bar {...}

Exportation

Exportation is now done by trait declaration on the exportable item:

    module Foo;                                # Tagset...
    sub foo is export                   {...}  #  :DEFAULT, :ALL
    sub bar is export(:DEFAULT :others) {...}  #  :DEFAULT, :ALL, :others
    sub baz is export(:MANDATORY)       {...}  #  (always exported)
    sub bop is export(:ALL)             {...}  #  :ALL
    sub qux is export(:others)          {...}  #  :ALL, :others

Declarations marked as is export are bound into the EXPORT inner modules, with their tagsets as inner module names within it. For example, the sub bar above will bind as &Foo::EXPORT::DEFAULT::bar, &Foo::EXPORT::ALL::bar, and &Foo::EXPORT::others::bar.
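The binding rule can be modelled outside Perl. The following is a minimal Python sketch, not Perl 6 code; the function name export_bindings is hypothetical, and the rule it encodes (no explicit tagset means :DEFAULT, and every tagged item also lands in :ALL) is read off the table above rather than stated by the text. The :MANDATORY case is left out because the table does not say which inner modules it binds into.

```python
def export_bindings(module, name, tags=()):
    """Return the qualified names an exported sub would be bound under.

    Assumption: an item with no explicit tagset goes into DEFAULT,
    and every exported item is also added to ALL.
    """
    tagsets = list(tags) if tags else ["DEFAULT"]
    if "ALL" not in tagsets:
        tagsets.append("ALL")
    return [f"&{module}::EXPORT::{tag}::{name}" for tag in tagsets]


# The sub "bar" from the example, declared "is export(:DEFAULT :others)".
for binding in export_bindings("Foo", "bar", ("DEFAULT", "others")):
    print(binding)
```

For bar this yields the same three names as the prose above, though the order in which they are listed carries no meaning.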

Tagset names consisting entirely of capitals are reserved for Perl.

Inner modules automatically add their export list to modules in all their outer scopes:

    module Foo {
        sub foo is export {...}
        module Bar {
            sub bar is export {...}
            module Baz {
                sub baz is export {...}
            }
        }
    }

The Foo module will export &foo, &bar and &baz by default; calling Foo::Bar::.EXPORTALL will export &bar and &baz at runtime to the caller's package.

Any proto declaration that is not declared my is exported by default. Any multi that depends on an exported proto is also automatically exported. Any autogenerated proto is assumed to be exported by default.

Dynamic exportation

The default EXPORTALL handles symbol exports by removing recognized export items and tagsets from the argument list and then calling the EXPORT subroutine in that module (if there is one), passing in the remaining arguments.

If the exporting module is actually a class, EXPORTALL will invoke its EXPORT method with the class itself as the invocant.
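As a rough illustration of that division of labour, here is a Python sketch (again a model, not Perl 6) that partitions an argument list into the part the default EXPORTALL would consume and the part it would hand to EXPORT. The function name split_export_args, the leading-colon spelling of tagsets, and the sample argument "verbose" are all assumptions made for the example.

```python
def split_export_args(args, known_items, known_tagsets):
    """Partition an import list as the default EXPORTALL is described.

    Recognized items and tagsets (spelled here with a leading ':') are
    consumed; everything else is returned for the module's own EXPORT.
    """
    consumed, remaining = [], []
    for arg in args:
        is_tagset = arg.startswith(":") and arg[1:] in known_tagsets
        if is_tagset or arg in known_items:
            consumed.append(arg)
        else:
            remaining.append(arg)
    return consumed, remaining


consumed, remaining = split_export_args(
    ["common", ":others", "verbose"],
    known_items={"common", "@horse"},
    known_tagsets={"DEFAULT", "ALL", "others"},
)
print(consumed)   # handled by EXPORTALL itself
print(remaining)  # would be passed on to EXPORT
```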

Compile-time Importation

[Note: the :MY forms are being rethought currently.]

Importing via use binds into the current lexical scope by default (rather than the current package, as in Perl 5).

    use Sense <common @horse>;

You can be explicit about the desired namespace:

    use Sense :MY<common> :OUR<@horse>;

That's pretty much equivalent to:

    use Sense;
    my &common ::= Sense::<&common>;
    our @horse ::= Sense::<@horse>;
    $*sensitive ::= Sense::<$sensitive>

It is also possible to re-export the imported symbols:

    use Sense :EXPORT;                  # import and re-export the defaults
    use Sense <common> :EXPORT;         # import "common" and re-export it
    use Sense <common> :EXPORT<@horse>; # import "common" but export "@horse"

In the absence of a specific scoping specified by the caller, the module may also specify a different scoping default by use of :MY or :OUR tags as arguments to is export. (Of course, mixing incompatible scoping in different scopes is likely to lead to confusion.)

The use declaration is actually a composite of two other declarations, need and import. Saying

    use Sense <common @horse>;

breaks down into:

    need Sense;
    import Sense <common @horse>;

These further break down into:

    BEGIN {
      my $target ::= OUTER;
      for <Sense> {
        my $scope = load_module(find_module_defining($_));
        # install the name of the type
        $target.install_alias($_, $scope{$_}) if $scope.exists{$_};
        # get the package declared by the name in that scope,
        my $package_name = $_ ~ '::';
        # if there isn't any, then there's just the type...
        my $loaded_package = $scope{$package_name} or next;
        # get a copy of the package, to avoid action-at-a-distance
        # install it in the target scope
        $target{$package_name} := $loaded_package.copy;
        # finally give the chance for the module to install
        # the selected symbols
        $loaded_package.EXPORTALL($target, <common @horse>);
      }
    }

Loading without importing

The need declarator takes a list of modules and loads them (at compile time) without importing any symbols. It's good for loading class modules that have nothing to export (or nothing that you want to import):

    need ACME::Rocket;
    my $r = ACME::Rocket.new;

This declaration is equivalent to Perl 5's:

    use ACME::Rocket ();

Saying

    need A,B,C;

is equivalent to:

    BEGIN {
      my $target ::= OUTER;
      for <A B C> {
        my $scope = load_module(find_module_defining($_));
        # install the name of the type
        $target.install_alias($_, $scope{$_}) if $scope.exists{$_};
        # get the package declared by the name in that scope,
        my $package_name = $_ ~ '::';
        # if there isn't any, then there's just the type...
        my $loaded_package = $scope{$package_name} or next;
        # get a copy of the package, to avoid action-at-a-distance
        # install it in the target scope
        $target{$package_name} := $loaded_package.copy;
      }
    }

Importing without loading

The importation into your lexical scope may also be a separate declaration from loading. This is primarily useful for modules declared inline, which do not automatically get imported into their surrounding scope:

    my module Factorial {
        sub fact (Int $n) is export { [*] 1..$n }
    }
    ...
    import Factorial 'fact';   # imports the sub

The last declaration is syntactic sugar for:

    BEGIN Factorial.WHO.EXPORTALL(MY, 'fact');

This form functions as a compile-time declarator, so that these notations can be combined by putting a declarator in parentheses:

    import (role Silly {
        enum Ness is export <Dilly String Putty>;
    }) <Ness>;

This really means:

    BEGIN (role Silly {
        enum Ness is export <Dilly String Putty>;
    }).WHO.EXPORTALL(MY, <Ness>)

Without an import list, import imports the :DEFAULT imports.

Runtime Importation

Importing via require also installs names into the current lexical scope by default, but delays the actual binding till runtime:

    require Sense <common @horse>;

This means something like:

    BEGIN MY.declare_stub_symbols('Sense', <common @horse>);
    # run time!
    MY.import_realias(:from(load_module(find_module_defining('Sense'))), 'Sense');
    MY.import_realias(:from(Sense), <common @horse>);

(The .import_realias requires that the symbols to be imported already exist; this differs from .import_alias, which requires that the imported symbols not already exist in the target scope.)
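The difference between the two operations can be shown with a toy model of a lexical pad. This Python sketch borrows the method names from the breakdown above, but the Pad class, its STUB marker and the dictionary standing in for the loaded module are assumptions for illustration only.

```python
class Pad:
    """Toy lexical pad: a mapping from symbol names to bound values."""

    STUB = object()  # placeholder for a declared but not yet bound symbol

    def __init__(self):
        self.symbols = {}

    def declare_stub_symbols(self, names):
        for name in names:
            self.symbols[name] = Pad.STUB

    def import_alias(self, source, names):
        # Fresh import: the names must not exist in the target yet.
        for name in names:
            if name in self.symbols:
                raise KeyError(f"{name} already exists")
            self.symbols[name] = source[name]

    def import_realias(self, source, names):
        # Rebinding: the names must already exist (for example as stubs).
        for name in names:
            if name not in self.symbols:
                raise KeyError(f"{name} was never declared")
            self.symbols[name] = source[name]


sense = {"common": "sub common", "@horse": ["Ed"]}  # hypothetical module
pad = Pad()
pad.declare_stub_symbols(["common", "@horse"])      # compile time
pad.import_realias(sense, ["common", "@horse"])     # run time
```

Because the stubs are created at compile time, the set of names in the pad never changes at run time; only their bindings do.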

Alternately, a filename may be mentioned directly, which installs a package that is effectively anonymous to the current lexical scope, and may only be accessed by whatever global names the module installs:

    require "/home/non/Sense.pm" <common @horse>;

which breaks down to:

    BEGIN MY.declare_stub_symbols(<common @horse>);
    MY.import_realias(:from(load_module("/home/non/Sense.pm")), <common @horse>);

Only explicitly mentioned names may be so imported. In order to protect the run-time sanctity of the lexical pad, it may not be modified by require. Tagsets are assumed to be unknown at compile time, hence tagsets are not allowed in the default import list to :MY, but you can explicitly request to put names into the :OUR scope, since that is modifiable at run time:

    require Sense <:ALL>    # does not work
    require Sense :MY<ALL>  # this doesn't work either
    require Sense :OUR<ALL> # but this works

If the import list is omitted, then nothing is imported. Since you may not modify the lexical pad, calling an importation routine at runtime cannot import into the lexical scope, and defaults to importation to the package scope instead:

    require Sense;
    Sense.EXPORTALL;   # goes to the OUR scope by default, not MY

(Such a routine may rebind existing lexicals, however.)

Importing from a pseudo-package

You may also import symbols from the various pseudo-packages listed in S02. They behave as if all their symbols are in the :ALL export list:

    import PROCESS <$IN $OUT $ERR>;
    import CALLER <$x $y>;

    # Same as:
    #     my ($IN, $OUT, $ERR) := PROCESS::<$IN $OUT $ERR>
    #     my ($x, $y) := ($CALLER::x, $CALLER::y)

[Conjecture: this section may go away, since the aliasing forms are not all that terrible, and it's not clear that we want the overhead of emulating export lists.]

Versioning

When at the top of a file you say something like

    module Squirrel;

or

    class Dog;

you're really only giving one part of the name of the module. The full name of the module or class includes other metadata, in particular, the author, and the version.

Modules posted to CPAN or entered into any standard Perl 6 library are required to declare their full name so that installations can know where to keep them, such that multiple versions by different authors can coexist, all of them available to any installed version of Perl. (When we say "modules" here we don't mean only modules declared with the module declarator, but also classes, roles, grammars, etc.)

Such modules are also required to specify exactly which version (or versions) of Perl they are expecting to run under, so that future versions of Perl can emulate older versions of Perl (or give a cogent explanation of why they cannot). This will allow the language to evolve without breaking existing widely used modules. (Perl 5 library policy is notably lacking here; it would induce massive breakage even to change Perl 5 to make strictness the default.) If a CPAN module breaks because it declares that it supports future versions of Perl when it doesn't, then it must be construed to be the module's fault, not Perl's. If Perl evolves in a way that does not support emulation of an older version (at least, back to 6.0.0), then it's Perl's fault (unless the change is required for security, in which case it's the fault of the insensitive clod who broke security :).

The internal API for package names is always case-sensitive, even if the library system is hosted on a system that is not case-sensitive. Likewise internal names are Unicode-aware, even if the filesystem isn't. This implies either some sort of name mangling capability or storage of intermediate products into a database of some sort. In any event, the actual storage location must be encapsulated in the library system such that it is hidden from all language level naming constructs. (Provision must be made for interrogating the library system for the actual location of a module, of course, but this falls into the category of introspection.) Note also that distributions need to be distributed in a way that they can be installed on case-insensitive systems without loss of information. That's fine, but the language-level abstraction must not leak details of this mechanism without the user asking for the details to be leaked.

The syntax of a versioned module or class declaration has multiple parts in which the non-identifier parts are specified in adverbial pair notation without intervening spaces. Internally these are stored in a canonical string form which you should ignore. You may write the various parts in any order, except that the bare identifier must come first. The required parts for library insertion are the short name of the class/module, a URI identifying the author (or authorizing authority, so we call it "auth" to be intentionally ambiguous), and its version number. For example:

    class Dog:auth<cpan:JRANDOM>:ver<1.2.1>;
    class Dog:auth<http://www.some.com/~jrandom>:ver<1.2.1>;
    class Dog:auth<mailto:jrandom@some.com>:ver<1.2.1>;

Since these are somewhat unwieldy to look at, we allow a shorthand in which a bare subscripty adverb interprets its elements according to their form:

    class Dog:<cpan:JRANDOM 1.2.1>

The pieces are interpreted as follows:

  • Anything matching [<ident> '::']* <ident> is treated as a package name

  • Anything matching <alpha>+ \: \S+ is treated as an author(ity)

  • Anything matching v? [\d+ '.']* \d+ is treated as a version number
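The three patterns can be approximated with ordinary regular expressions. The Python sketch below is a model, not the Perl 6 grammar: it assumes that the patterns are tried in the order listed and that the first match wins, which matters because a piece such as Acme::Bleach satisfies both the package and the author(ity) pattern. The character classes only approximate <ident> and <alpha>, and the function names are hypothetical.

```python
import re

IDENT = r"[^\W\d]\w*"  # approximation of Perl 6 <ident>

# Tried in the order the rules are listed; first match wins (assumption).
PATTERNS = [
    ("name", re.compile(rf"(?:{IDENT}::)*{IDENT}")),
    ("auth", re.compile(r"[^\W\d]+:\S+")),
    ("ver", re.compile(r"v?(?:\d+\.)*\d+")),
]


def classify(piece):
    """Classify one whitespace-separated piece of the shorthand."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(piece):
            return kind
    raise ValueError(f"unrecognized piece: {piece!r}")


def parse_shorthand(text):
    """Split a shorthand adverb body into its named parts."""
    return {classify(piece): piece for piece in text.split()}


print(parse_shorthand("Dog cpan:JRANDOM 1.2.1"))
```

Under this ordering a piece such as v6 would be read as a package name, since it is also a valid identifier; the text does not say how that overlap is meant to be resolved.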

These declarations automatically alias the full name of the class (or module) to the short name. So for the rest of the lexical scope, Dog refers to the longer name. The real library name can be specified separately as another adverb, in which case the identifier indicates only the alias within the current lexical scope:

    class Pooch:name<Dog>:auth<cpan:JRANDOM>:ver<1.2.1>

or

    class Pooch:<Dog cpan:JRANDOM 1.2.1>

for short.

Here the real name of the module starts with Dog, but we refer to it as Pooch for the rest of this file. Aliasing is handy if you need to interface to more than one module named Dog.

If there are extra classes or modules or packages declared within the same file, they implicitly have a long name including the file's version and author, but you needn't declare them again.

Since these long names are the actual names of the classes as far as the library system is concerned, when you say:

    use Dog;

you're really wildcarding the unspecified bits:

    use Dog:auth(Any):ver(Any);

And when you say:

    use Dog:<1.2.1>;

you're really asking for:

    use Dog:auth(Any):ver<1.2.1>;

Saying 1.2.1 specifies an exact match on that part of the version number, not a minimum match. To match more than one version, put a range operator as a selector in parens:

    use Dog:ver(v1.2.1..v1.2.3);
    use Dog:ver(v1.2.1..^v1.3);
    use Dog:ver(v1.2.1..*);

When specifying the version of your own module, 1.2 is equivalent to 1.2.0, 1.2.0.0, and so on. However use searches for modules matching a version prefix, so the subversions are wildcarded, and in this context :ver<1.2> really means :ver<1.2.*>. If you say:

    use v6;

which is short for:

    use Perl:ver<6.*>;

you're asking for any version of Perl 6. You need to say something like

    use Perl:<6.0>;
    use Perl:<6.0.0>;
    use Perl:<6.2.7.1>;

if you want to lock in a particular set of semantics at some greater degree of specificity. And if some large company ever forks Perl, you can say something like:

    use Perl:auth<cpan:TPF>

to guarantee that you get the unembraced Perl. :-)
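The two readings of a version number described above (padding with zeros when a module declares its own version, prefix matching when use searches for one) can be contrasted in a short Python sketch. This is a model under those two stated rules only; the function names are hypothetical and wildcards and ranges are not handled here.

```python
def parse_version(text):
    """Turn '1.2.1' or 'v1.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def same_declared_version(a, b):
    """Declared versions ignore trailing zeros: 1.2 == 1.2.0 == 1.2.0.0."""
    def strip(text):
        parts = list(parse_version(text))
        while len(parts) > 1 and parts[-1] == 0:
            parts.pop()
        return parts
    return strip(a) == strip(b)


def use_matches(requested, installed):
    """A use request acts as a prefix: :ver<1.2> behaves like :ver<1.2.*>."""
    want, have = parse_version(requested), parse_version(installed)
    have = have + (0,) * (len(want) - len(have))  # 1.2 counts as 1.2.0
    return have[: len(want)] == want


print(same_declared_version("1.2", "1.2.0.0"))
print(use_matches("1.2", "1.2.7"))
print(use_matches("1.2.1", "1.2.3"))
```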

When it happens that the same module is available from more than one authority, and the desired authority is not specified by the use, the version lineage that was created first wins, unless overridden by local policy or by official abandonment by the original authority (as determined either by the author or by community consensus in case the author is no longer available or widely regarded as uncooperative). An officially abandoned lineage will be selected only if it is the only available lineage of locally installed modules.

Once the authority is selected, then and only then is any version selection done; the version specification is ignored until the authority is selected. This implies that all official modules record permanently when they were first installed in the official library, and this creation date is considered immutable.
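The selection order can be sketched as a small function. This Python model covers only the default rule: the earliest-created lineage wins, and an abandoned lineage is used only when no other is installed. Local policy overrides are not modelled, and the Lineage record and its field names are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Lineage:
    auth: str                # e.g. "cpan:JRANDOM"
    created: int             # immutable first-install date (any sortable value)
    abandoned: bool = False  # officially abandoned by its authority


def select_authority(lineages):
    """Pick the authority before any version selection takes place."""
    active = [lineage for lineage in lineages if not lineage.abandoned]
    # An abandoned lineage is eligible only when nothing else is installed.
    candidates = active or list(lineages)
    if not candidates:
        return None
    return min(candidates, key=lambda lineage: lineage.created).auth


installed = [
    Lineage("cpan:JRANDOM", created=2005),
    Lineage("cpan:DCONWAY", created=2003, abandoned=True),
    Lineage("cpan:TPF", created=2008),
]
print(select_authority(installed))
```

Version matching would then run only against the modules of the authority returned here.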

For wildcards any valid smartmatch selector works:

    use Dog:auth(/:i jrandom/):ver(v1.2.1 | v1.3.4);
    use Dog:auth({ .substr(0,5) eq 'cpan:'}):ver(Any);

In any event, however you select the module, its full name is automatically aliased to the short name for the rest of your lexical scope. So you can just say

    my Dog $spot .= new("woof");

and it knows (even if you don't) that you mean

    my Dog:<cpan:JRANDOM 1.3.4> $spot .= new("woof");

The use statement allows an external language to be specified in addition to (or instead of) an authority, so that you can use modules from other languages. The from adverb also parses any additional parts as short-form arguments. For instance:

    use Whiteness:from<perl5>:name<Acme::Bleach>:auth<cpan:DCONWAY>:ver<1.12>;
    use Whiteness:from<perl5 Acme::Bleach cpan:DCONWAY 1.12>;  # same thing

The string form of a version recognizes the * wildcard in place of any position. It also recognizes a trailing +, so

    :ver<6.2.3+>

is short for

    :ver(v6.2.3 .. v6.2.*)

And saying

    :ver<6.2.0+>

specifically rules out any prereleases.
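One way to read the string form is sketched below in Python. It is a model, not Perl 6: it treats a trailing + as "same leading parts, final part at least this large", which is how the :ver<6.2.3+> expansion above is interpreted here, and it treats * as matching any single position. Prerelease versions are not modelled, and the function name ver_matches is hypothetical.

```python
def ver_matches(spec, installed):
    """Match an installed version against a spec using '*' or a trailing '+'."""
    have = [int(part) for part in installed.split(".")]
    if spec.endswith("+"):
        base = [int(part) for part in spec[:-1].split(".")]
        prefix, floor = base[:-1], base[-1:]
        # 6.2.3+ is read as v6.2.3 .. v6.2.*: same prefix, tail >= 3.
        return have[: len(prefix)] == prefix and have[len(prefix):] >= floor
    parts = spec.split(".")
    # Pad short installed versions with zeros, then compare position by position.
    have = have + [0] * (len(parts) - len(have))
    return all(p == "*" or int(p) == h for p, h in zip(parts, have))


print(ver_matches("6.2.3+", "6.2.5.1"))
print(ver_matches("6.2.3+", "6.3.0"))
print(ver_matches("6.*", "6.0.0"))
```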

If two different modules in your program require two different versions of the same module, Perl will simply load both versions at the same time. For modules that do not manage exclusive resources, the only penalty for this is memory, and the disk space in the library to hold both the old and new versions. For modules that do manage an exclusive resource, such as a database handle, there are two approaches short of requiring the user to upgrade. The first is simply to refactor the module into a stable supplier of the exclusive resource that doesn't change version often, and then the outer wrappers of that resource can both be loaded and use the same supplier of the resource.

The other approach is for the module to keep the management of its exclusive resource, but offer to emulate older versions of the API. Then if there is a conflict over which version to use, the new one is used by both users, but each gets a view that is consistent with the version it thinks it is using. Of course, this depends crucially on how well the new version actually emulates the old version.

To declare that a module emulates an older version, declare it like this:

    class Dog:<cpan:JRANDOM 1.2.1> emulates :<1.2.0>;

Or to simply exclude use of the older module and (presumably) force the user to upgrade:

    class Dog:<cpan:JRANDOM 1.2.1> excludes :<1.2.0>;

The name is parsed like a use wildcard, and you can have more than one, so you can say things like:

    class Dog:<cpan:JRANDOM 1.2.1>
        emulates Dog:auth(DCONWAY|JCONWAY|TCONWAY):ver<1.0+>
        excludes Fox:<http://oreillymedia.com 3.14159>
        emulates Wolf:from<C# 0.8..^1.0>;

Forcing Perl 6

To get Perl 6 parsing rather than the default Perl 5 parsing, we said you could force Perl 6 mode in your main program with:

    use v6;

Actually, if you're running a parser that is aware of Perl 6, you can just start your main program with any of:

    use v6;
    module;
    class;

Those all specify the latest Perl 6 semantics, and are equivalent to

    use Perl:auth(Any):ver(v6..*);

To lock the semantics to 6.0.0, say one of:

    use Perl:ver<6.0.0>;
    use :<6.0.0>;
    use v6.0.0;

In any of those cases, strictures and warnings are the default in your main program. But if you start your program with a bare version number or other literal:

    v6.0.0;
    v6;
    6;
    "Coolness, dude!";

it runs Perl 6 in "lax" mode, without strictures or warnings, since obviously a bare literal in a sink (void) context ought to have produced a "Useless use of..." warning. (Invoking perl with -e '6;' has the same effect.)

In the other direction, to inline Perl 5 code inside a Perl 6 program, put use v5 at the beginning of a lexical block. Such blocks can nest arbitrarily deeply to switch between Perl versions:

    use v6;
    # ...some Perl 6 code...
    {
        use v5;
        # ...some Perl 5 code...
        {
            use v6;
            # ...more Perl 6 code...
        }
    }

It's not necessary to force Perl 6 if the interpreter or command specified already implies it, such as use of a "#!/usr/bin/perl6" shebang line. Nor is it necessary to force Perl 6 in any file that begins with the "class" or "module" keywords.

Tool use vs language changes

In order that language processing tools know exactly what language they are parsing, it is necessary for the tool to know exactly which variant of Perl 6 is being parsed in any given scope. All Perl 6 compilation units that are complete files start out at the top of the file in the Standard Dialect (which itself has versions that correspond to the same version of the official Perl test suite). Eval strings, on the other hand, start out in the language variant in use at the point of the eval call, so that you don't suddenly lose your macro definitions inside eval.

All language tweaks from the start of the compilation unit must be tracked. Tweaks can be specified either directly in your code as macros and such, or such definitions may be imported from a module. As the compiler progresses through the compilation unit, other grammars may be substituted in an inner lexical scope for an outer grammar, and parsing continues under the new grammar (which may or may not be a derivative of the standard Perl grammar).

Language tweaks are considered part of the interface of any module you import. Version numbers are assumed to represent a combination of interface and patch level. We will use the term "interface version" to represent that part of the version number that represents the interface. For typical version number schemes, this is the first two numbers (where the third number usually represents patch level within a constant interface). Other schemes are possible though. (It is recommended that branches be reflected by differences in authority rather than differences in version, whenever that makes sense. To make it make sense more often, some hierarchical authority-naming scheme may be devised so that authorities can have temporary subauthorities to hold branches without relinquishing overall naming authority.)

So anyway, the basic rule is this: you may import language tweaks from your own private (user-library) code as you like; however, all imports of language tweaks from the official library must specify the exact interface version of the module.
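For the typical scheme in which the first two numbers form the interface version, the rule might be checked as in the Python sketch below. Both function names are hypothetical, and the reading that an "exact" specification means at least two plain numeric parts with no wildcard or range is an assumption; other versioning schemes would need a different test.

```python
def interface_version(version, parts=2):
    """Return the leading components that identify the interface."""
    return ".".join(version.split(".")[:parts])


def tweak_import_allowed(spec, from_official_library):
    """Private code may import tweaks freely; official imports must
    pin an exact interface version (assumed: two or more numeric parts)."""
    if not from_official_library:
        return True
    pieces = spec.split(".") if spec else []
    return len(pieces) >= 2 and all(piece.isdigit() for piece in pieces)


print(interface_version("1.2.7"))
print(tweak_import_allowed("1.*", from_official_library=True))
```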

Such officially installed interface versions must be considered immutable on the language level, so that once any language-tweaking module is in circulation, it may be presumed to represent a fixed language change. By examination of these interface versions a language processing tool can know whether it has sufficient information to know the current language.

In the absence of that information, the tool can choose either to download and use the module directly, or the tool can proceed in ignorance. As an intermediate position, if the tool does not actually care about running the code, the tool need not actually have the complete module in question; many language tweaks could be stored in a database of interface versions, so if the tool merely knows the nature of the language tweak on the basis of the interface version it may well be able to proceed with perfect knowledge. A module that uses a well-behaved macro or two could be fairly easily emulated based on the version info alone.

But more realistically, in the absence of such a hypothetical database, most systems already come with a kind of database for modules that have already been installed. So perhaps the most common case is that you have downloaded an older version of the same module, in which case the tool can know from the interface version whether that older module represents the language tweak sufficiently well that your tool can use the interface definition from that module without bothering to download the latest patch.

Note that most class modules do no language tweaking, and in any case cannot perform language tweaks unless these are explicitly exported.

Modules that export multis are technically language tweaks on the semantic level, but as long as those new definitions modify semantics within the existing grammar (by avoiding the definition of new macros or operators), they do not fall into the language tweak category. Modules that export new operators or macros are always considered language tweaks. (Unexported macros or operators intended only for internal use of the module itself do not count as language tweaks.)

The requirement for immutable interfaces extends transitively to any modules imported by a language tweak module. There can be no indeterminacy in the language definition either directly or indirectly.

It must be possible for any official module to be separately compiled without knowledge of the lexical or dynamic context in which it will be embedded, and this separate compilation must be able to produce a deterministic profile of the interface. It must be possible to extract out the language tweaking part of this profile for use in tools that wish to know how to parse the current language variant deterministically.