=encoding utf8

=head1 TITLE

Synopsis 12: Objects

=head1 VERSION

    Created: 27 Oct 2004
    Last Modified: 26 Oct 2014
    Version: 134

=head1 Overview

This synopsis summarizes Apocalypse 12, which discusses object-oriented programming.

=head1 Classes

A class is a module declared with the C<class> keyword. As with modules, the public storage, interface, and name of the class are represented by a package and its name, which is usually (but not necessarily) a global name. A class is a module and thus can export stuff, but a class adds even more behavior to support Perl 6's standard class-based OO.

Taken as a type object, a class name represents all of the possible values of its type, and the type object can thus be used as a proxy for any "real" object of that type in calculating what a generic object of that type can do. The class object is an object, but it is not a Class, because there is no mandatory Class class in Perl 6, and because type objects in Perl 6 are considered undefined. We wish to support both class-based and prototype-based OO programming. So all metaprogramming is done through the current object's C<HOW> object, which can delegate metaprogramming to any metamodel it likes. However, by default, objects derived from C<Mu> support a fairly standard class-based model.

There are two basic class declaration syntaxes:

    unit class Foo;             # rest of file is class definition
    has $.foo;

    class Bar { has $.bar }     # block is class definition

The first form is allowed only as the first declaration in a compilation unit (that is, file or C<EVAL> string).

If the class body begins with a statement whose main operator is a single C<< prefix:<...> >> (yada) listop, the class name is introduced without a definition, and a second declaration of that class in the same scope does not complain about redefinition. (Statement modifiers are allowed on such a C<...> operator.)
Thus you may forward-declare your classes:

    class A {...}     # introduce A as a class name without definition
    class B {...}     # introduce B as a class name without definition

    my A $root .= new(:a(B));

    class A {
        has B $.a;
    }

    class B {
        has A $.b;
    }

As this example demonstrates, this allows for mutually recursive class definitions (though, of course, it can't allow recursive inheritance).

It is also possible to extend classes via the C<augment> declarator, but that is considered somewhat antisocial and should not be used for forward declarations.

[Conjecture: we may also allow the C<proto> and C<multi> declarator modifiers on class definitions to explicitly declare classes with multiple bodies participating in a single definition intentionally.]

A named class declaration can occur as part of an expression, just like named subroutine declarations.

Classes are primarily for instance management, not code reuse. Consider using roles when you simply want to factor out common code.

Perl 6 supports multiple inheritance, anonymous classes, and autoboxing.

All public method calls are "virtual" in the C++ sense.

You may derive from any built-in type, but the derivation of a low-level type like C<int> may only add behaviors, not change the representation. Use composition and/or delegation to change the representation.

Since there are no barewords in Perl 6, bare class names must be predeclared. You can predeclare a stub class and fill it in later just as you would a subroutine.

You can force interpretation of a name as a class or type name using the C<::> prefix. In an rvalue context the C<::> prefix is a no-op, but in a declarational context, it binds a new type name within the declaration's scope along with anything else being declared by the declaration.

Without a C<my> or other scoping declarator, a bare C<class> declarator behaves like an C<our> declaration, that is, it declares a name within the current package.
Since class files begin parsing in the C<GLOBAL> package, the first class declaration in the file installs itself as a global name, and subsequent declarations then install themselves into the current class rather than the global package.

Hence, to declare an inner class in the current package (or module, or class), use C<our class> or just C<class>. To declare a lexically scoped class, use C<my class>. Class names are always searched for from innermost scopes to outermost. As with an initial C<::>, the presence of a C<::> within the name does not imply globalness (unlike in Perl 5). So the outward search can look in children of the searched namespaces.

An inner class or role in a generic context must be lexically scoped if it depends on any generic parameter or type; and such an inner class or role is also a generic type. [Conjecture: it is erroneous to assume that any generic type is uniquely associated with a package.]

=head2 Class traits

Class traits are set using C<is>:

    class MyStruct is rw {...}

=head3 Single inheritance

An "isa" is just a trait that happens to be another class:

    class Dog is Mammal {...}

=head3 Multiple inheritance

Multiple inheritance is specified with multiple C<is> modifiers:

    class Dog is Mammal is Pet {...}

=head3 Composition

Roles use C<does> instead of C<is>:

    class Dog is Mammal does Pet {...}

=head3 The C<also> declarator

You may put these inside as well by use of the C<also> declarator:

    class Dog {
        also is Mammal;
        also does Pet;
        ...
    }

(However, the C<also> declarator is primarily intended for use in roles, to distinguish class traits that might not be properly understood as generic when placed in the role header, which tends to communicate the false impression that the trait in question is to be applied directly to the role rather than to the composed class.)

=head2 Metaclasses

Every object (including any class-based object) delegates to an instance of its metaclass. You can get at the metaclass of any object via the C<HOW> method, which returns an instance of the metaclass.
A "class" object is just considered an "empty" instance in Perl 6, more properly called a "prototype" or "generic" object, or just "type object". Perl 6 doesn't really have any classes named C<Class>. Types of all kinds are instead named via these undefined type objects, which are considered to have exactly the same type as an instantiated version of themselves. But such type objects are inert, and do not manage the state of class instances.

The actual object that manages instances is the metaclass object pointed to by the C<HOW> syntax. So when you say "C<Dog>", you're referring to both a package and a type object, the latter of which points to the object representing the class via C<HOW>. The type object differs from an instance object not by having a different type but rather in the extent to which it is defined. Some objects may tell you that they are defined, while others may tell you that they are undefined. That's up to the object, and depends on how the metaclass chooses to dispatch the C<.defined> method.

=head2 Closed classes

Classes are open and non-final by default, but may easily be closed or finalized not by themselves but by the entire application, provided nobody issued an explicit compile-time request that the class stay open or non-final. (Or a site policy could close any applications that use the policy.) Platforms that do dynamic loading of sub-applications probably don't want to close or finalize classes wholesale, however.

Roles take on some of the compile-time function of closed classes, so you should probably use those instead anyway.

=head2 Private classes

A private class can be declared using C<my>; most privacy issues are handled with lexical scoping in Perl 6. The fact that importation is lexical by default also means that any names your class imports are also private by default.

In grammars, one cannot use grammar attributes, because a grammar rule may be called from an unrelated grammar.
One can emulate that behavior with lexically scoped grammars created within a closure. The lexical variables captured by the closure can then be used where grammar attributes would be.

=head2 Class composition

Class declarations (in particular, role composition) are strictly compile time statements. In particular, if a class declaration appears inside a nested scope, the class declaration is constrained to compose in exactly the same way on any possible execution. All named roles and superclasses must be bound as non-rebindable readonly values; any parameters to traits will be evaluated only in a non-cloning context. Names bound by the class declaration are made non-rebindable and read only so they may be used as superclasses.

=head2 Anonymous class declaration

In an anonymous class declaration, C<::> by itself may represent the anonymous class name if desired:

    class {...}                       # ok
    class is Mammal {...}             # WRONG
    class :: is Mammal {...}          # ok
    class { also is Mammal; ...}      # also ok

=head1 Methods

Methods are routines declared in a class with the C<method> keyword:

    method doit ($a, $b, $c) { ... }
    method doit ($self: $a, $b, $c) { ... }
    method doit (MyName $self: $a, $b, $c) { ... }
    method doit (::?CLASS $self: $a, $b, $c) { ... }

=head2 Invocants

Declaration of the invocant is optional. You may always access the current invocant using the keyword C<self>. You need not declare the invocant's type, since the lexical class of the invocant is known in any event because methods must be declared in the class of the invocant, though of course the actual (virtual) type may be a derived type of the lexical type. You could declare a more restrictive type, but that would probably be a bad thing for proper polymorphism. You may explicitly type the invocant with the lexical type, but any check for that will be optimized away. (The current lexically-determined class may always be named as C<::?CLASS> even in anonymous classes or roles.)
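
The invocant rules can be tried out with a small sketch. The class C<Counter> and its methods C<describe> and C<plus> are hypothetical names invented for illustration, and the expected output is what a current Rakudo should print:

```raku
class Counter {
    has $.count = 0;

    # No declared invocant: the object is reached through `self`.
    method describe() { "count is " ~ self.count }

    # Explicit invocant named $me, typed with the lexical class.
    method plus(::?CLASS $me: $n) { $me.count + $n }
}

my $c = Counter.new(count => 3);
say $c.describe;   # count is 3
say $c.plus(4);    # 7
```

Both styles dispatch identically; naming the invocant only gives the method body a variable to use in place of C<self>.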
To mark an explicit invocant, just put a colon after it:

    method doit ($x: $a, $b, $c) { ... }

If you declare an explicit invocant for an Array type using an array variable, you may use that directly in list context to produce its elements:

    method push3 (@x: $a, $b, $c) { ... any(@x) ... }

Note that the C<self> term refers directly to the object the method was invoked on, and therefore:

    class A is Array {
        method m() { .say for self }
    }

    A.new(1, 2, 3).m;

will produce 3 lines of output.

=head2 Private methods

Private methods are declared using C<!>:

    method !think (Brain $self: $thought)

(Such methods are completely invisible to ordinary method calls, and are in fact called with a different syntax that uses C<!> in place of the C<.> character. See below.)

=head2 Method scoping

Unlike with most other declarations, C<method> declarations do not default to C<our> semantics, or even C<my> semantics, but rather C<has> semantics. So instead of installing a symbol into a lexical or package symbol table, they merely install a public or private method in the current class or role via calls to its metaobject. (Likewise for C<submethod> declarations--see L</Submethods> below.)

Use of an explicit C<has> declarator has no effect on the declaration. You may install additional aliases to the method in the lexical scope using C<my> or in the current package using C<our>. These aliases are named with C<&foo> notation and return a C<Routine> object that may be called as a subroutine, in which case you must supply the expected invocant as the first argument.
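
A minimal sketch of private methods and lexically scoped method aliases follows. The names C<Brain>, C<ponder>, and C<shout> are hypothetical. Whether a C<my method> is also installed in the class's public interface may vary between implementations, so the example relies only on the lexical C<&shout> name:

```raku
class Brain {
    # Private method: not reachable through ordinary `.` dispatch.
    method !think($thought) { "thinking about $thought" }

    # `my method` installs a lexical &shout visible inside this block.
    my method shout($text) { $text.uc ~ '!' }

    # Called as a subroutine, so the invocant is passed as the first argument.
    method ponder($thought) { shout(self, self!think($thought)) }
}

my $b = Brain.new;
say $b.ponder('lunch');    # THINKING ABOUT LUNCH!
```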
=head2 Method calls

To call an ordinary method with ordinary method-dispatch semantics, use either the dot notation or indirect object notation:

    $obj.doit(1,2,3)
    doit $obj: 1,2,3

Indirect object notation now requires a colon after the invocant, even if there are no arguments after the colon:

    $handle.close;
    close $handle:;

To reject method calls and consider only subs, simply omit the colon from the invocation line:

    close($handle);
    close $handle;

However, here the built-in B<IO> class defines C<method close () is export>, which puts a C<multi sub close (IO)> in scope by default. Thus if the C<$handle> evaluates to an C<IO> object, then the two subroutine calls above are still translated into method calls.

Dot notation can omit the invocant if it's in C<$_>:

    .doit(1,2,3)

Method calls use the C3 method resolution order.

=head3 Fancy method calls

Note that there is no corresponding notation for private methods.

    !doit(1,2,3)        # WRONG, would be parsed as not(doit(1,2,3))
    self!doit(1,2,3)    # okay

There are several forms of indirection for the method name. You can replace the identifier with a quoted string, and it will be evaluated as a quote and then the result of that is used as the method name.

    $obj."$methodname"(1,2,3)   # use contents of $methodname as method name
    $obj.'$methodname'(1,2,3)   # no interpolation; call method with $ in name!

    $obj!"$methodname"()        # indirect call to private method name

As an aid to catching Perl 5 brainos, this quoted form always requires a parenthesized argument list to distinguish it from code that looks like a Perl 5 concatenation.

Within an interpolation, the double-quoted form may not contain whitespace.
This does what the user expects in the common case of a quoted string ending with a period:

    say "Foo = $foo.";

If you really want to call a method with whitespace, you may work around this restriction with a closure interpolation:

    say "Foo = {$foo."a method"()}";  # OK

[Note: to help catch the mistaken use of C<< infix:<.> >> as a string concatenation operator, Perl 6 will warn you about "useless use of quotes" at compile time if the string inside quotes is an identifier. (It does not warn about non-identifier strings, but such strings are likely to produce missing method errors at run time in any case.) Also, if there is whitespace around an intended C<.> concatenation, it cannot be parsed as a method call at all; instead it fails at compile time because standard Perl 6 has a pseudo C<< infix:<.> >> operator that always fails at compile time.]

For situations where you already have a method located, you can use a simple scalar variable in place of the method name:

    $methodobj = $foo ?? &bar !! &baz;
    $obj.$methodobj(1,2,3)

or more succinctly but less readably:

    $obj.$($foo ?? &bar !! &baz)(1,2,3)

The variable must contain a C<Callable> object (usually of type C<Code>), that is, a closure of some sort. Regardless of whether the closure was defined as a method or a sub or a block, the closure is called directly without any class dispatch; from the closure's point of view, however, it is always called as a method, with the object as its first argument, and the rest of the arguments second, third, and so on. For instance, such a closure may be used to abstract a "navigational" path through a data structure without specifying the root of the path till later:

    $locator = -> $root, $x, $y { $root.<foo>[$x]<bar>{$y}[3] }
    $obj.$locator(42,"baz")  # $obj<foo>[42]<bar><baz>[3]

    $locator = { .<here> }
    $obj.$locator            # $obj<here>

As a convenient form of documentation, such a closure may also be written in the form of an anonymous method:

    $locator = method ($root: $x, $y) { $root.<foo>[$x]<bar>{$y}[3] }
    $obj.$locator(42,"baz")  # $obj<foo>[42]<bar><baz>[3]

    $locator = method { self.<here> }
    $obj.$locator            # $obj<here>

Note however that, like any anonymous closure, an anonymous method can only be dispatched to directly, like a sub. You may, of course, bind an anonymous method to the name of a method in a class's public interface, in which case it is no longer anonymous, and may be dispatched to normally via the class. (And in fact, when the normal method dispatcher is calling individual candidates in its candidate list, it calls each candidate as a sub, not as a method, or you'd end up with recursive dispatchers.) But fundamentally, there's no such thing as a method closure. The C<method> declarator on an anonymous method has the primary effect of making the declaration of the invocant optional. (It also makes it an official C<Routine> that can be returned from, just as if you'd used C<sub> to declare it.)

Instead of a scalar variable, an array variable may also be used:

    $obj.@candidates(1,2,3)

As with the scalar variant, string method names are not allowed, only C<Callable> objects. The list is treated as a list of candidates to call. After the first successful call the rest of the candidates are discarded. Failure of the current candidate is indicated by calling C<nextwith> or C<nextsame> (see L</Calling sets of methods> below).

Note also that the

    $obj.$candidates(1,2,3)

form may dispatch to a list of candidates if C<$candidates> is either a list or a special C<Code> object representing a partial dispatch to a list of candidates. If C<$candidates> (or any element of C<@candidates>) is an iterable object it is expanded out recursively until C<Callable> candidates are found. The call fails if it hits a candidate that is not C<Callable>, C<Iterable>, or C<List>.
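
The scalar forms of indirect method call can be sketched as follows. The class C<Point> and the variables C<$name>, C<$scaled>, and C<$sum> are hypothetical names chosen for illustration; the array form C<$obj.@candidates> is not exercised here, since it may not be available in a given implementation:

```raku
class Point {
    has $.x;
    has $.y;
    method norm1() { $!x.abs + $!y.abs }
}

my $p = Point.new(x => 3, y => -4);

# Method name supplied by a quoted string; the parentheses are mandatory.
my $name = 'norm1';
say $p."$name"();        # 7

# A pointy block called "as a method": $p arrives as its first argument.
my $scaled = -> $obj, $k { $obj.x * $k };
say $p.$scaled(10);      # 30

# The same idea written as an anonymous method, using self.
my $sum = method () { self.x + self.y };
say $p.$sum();           # -1
```

Neither closure is looked up through the class: both are called directly, with the object supplied as the first argument.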
Another form of indirection relies on the fact that operators are named using a variant on pair notation, which gives you these forms:

    $x.infix:[$op]($y)
    $x.prefix:[$op]
    $x.postfix:[$op]

Generally you see these with the literal angle bracket form of subscript:

    $a.infix:<*>($b)      # equivalent to $a * $b
    $a.prefix:<++>        # equivalent to ++$a
    $a.postfix:<++>       # equivalent to $a++

If you omit the syntactic category, the call will be dispatched according to the number of arguments either as "prefix" or as "infix":

    $a.:<+>($b)           # equivalent to $a + $b
    $a.:<++>              # equivalent to ++$a
    $a.:<!>               # equivalent to !$a
    @a.:<[*]>             # equivalent to [*] @a

But it's probably better to spell out the syntactic category when the actual operator is not obvious:

    $x.infix:[$op]($y)
    $x.prefix:[$op]

You must use a special syntax to call a private method:

    $mybrain!think($pinky)
    self!think($pinky)

For a call on your own private method, you may also use the attribute-ish form:

    $!think($pinky)     # short for $(self!think($pinky))

Parentheses (or a colon) are required on the dot/bang notations if there are any arguments (not counting adverbial arguments). There may be no space between the method name and the left parenthesis unless you make use of "unspace":

    .doit       # okay, no arguments
    .doit()     # okay, no arguments
    .doit ()    # ILLEGAL (two terms in a row)
    .doit\ ()   # okay, no arguments, same as .doit() (unspace form)

Note that the named method call forms are special and do not use the dot form of postfix.
If you attempt to use the postfix operator form, it will assume you want to call the method with no arguments and then call the result of I<that>:

    .doit.()    # okay, no arguments *twice*, same as .doit().()
    .doit\ .()  # okay, no arguments *twice*, same as .doit.().() (unspace form)

However, you can turn any of the named forms above into a list operator by appending a colon:

    .doit: 1,2,3        # okay, three arguments
    .doit(1): 2,3       # okay, one argument plus list
    .doit (): 1,2,3     # ILLEGAL (two terms in a row)

In particular, this allows us to pass a final closure in addition to the "normal" arguments:

    .doit: { $^a <=> $^b }              # okay
    .doit(): { $^a <=> $^b }            # okay
    .doit(1,2,3): { $^a <=> $^b }       # okay

Normally a space is required after the colon to disambiguate what follows from a pair that extends the previous name. However, names may not be extended with the C<:{}> pair notation, and therefore it is allowed to drop the space after the colon if the first argument to the method is a closure. Hence, any of the above may be written without the space after the colon:

    .doit:{ $^a <=> $^b }               # okay
    .doit():{ $^a <=> $^b }             # okay
    .doit(1,2,3):{ $^a <=> $^b }        # okay

These are parsed as if there were a space there, so the argument list may continue if the closure is followed by a comma.

In case of ambiguity between indirect object notation and dot form, the nearest thing wins:

    dothis $obj.dothat: 1,2,3

means

    dothis ($obj.dothat(1,2,3))

and you must say

    dothis ($obj.dothat): 1,2,3

or

    $obj.dothat.dothis: 1,2,3

if you mean the other thing.

Also note that if any term in a list is a bare closure or pointy block, it will be considered to be the final argument of its list if the closure's right curly is followed by a newline.
If instead the closure's right curly is followed by a method call, the closure is the invocant:

    @list.map:{ "'$^x $^y'".say }.assuming: 'got:'

To call the method of the result of the former method call, add parens:

    @list.map({ "got: $^x" }).say

Even when the colon of a method call does not require whitespace when followed by a block, it will look odd, so it may be clearer to add the space to make the method calls on the right look more like they attach to the block itself instead of the term on the left:

    @list.map: { "'$^x $^y'".say }.assuming: 'got:'

This will also visually distinguish between a method call introducing colon and an object pair constructor.

=head2 Lvalue methods

Methods (and subs) may be declared as lvalues with C<is rw>. You can use an argumentless C<rw> method anywhere you can use a variable, including in C<temp> and C<let> statements. (In fact, you can use an C<rw> method with arguments as a variable as long as the arguments are used only to identify the actual value to change, and don't otherwise have strange side effects that differ between rvalue and lvalue usage. Setter methods that expect the new value as an argument do not fall into the well-behaved category, however.)

=head2 Scalar container indirection

Method calls on mutable scalars always go to the object contained in the scalar (autoboxing value types as necessary):

    $result = $object.doit();
    $length = "mystring".codes;

Method calls on non-scalar variables just call the C<Array>, C<Hash> or C<Code> object bound to the variable:

    $elems = @array.elems;
    @keys  = %hash.keys;
    $sig   = &sub.signature;

Use the prefix C<VAR> macro on a scalar variable to get at its underlying C<Scalar> object:

    if VAR($scalar).readonly {...}

C<VAR> is a no-op on non-scalar variables and values:

    VAR(1);     # 1
    VAR(@x);    # @x

There's also a corresponding C<< postfix:<.VAR> >> macro that can be used as if it were a method:

    if $scalar.VAR.readonly {...}

(But since it's a macro, C<VAR> is not dispatched as a real method.
To dispatch to a real C<.VAR> method, use the indirect C<$obj."VAR"> form.)

You can also get at the container through the appropriate symbol table:

    if MY::<$scalar>.readonly {...}

=head2 FALLBACK methods

If your class defines a method with the special name C<FALLBACK>, that method will be called if all other attempts to locate a method fail, including normal method dispatch as well as delegation (see below). The first argument to the method will be the method name (as a string) that was unsuccessfully searched for. The original call's arguments are passed in as the rest of the C<FALLBACK> method's argument list.

It is legal for the C<FALLBACK> method to be a proto method that dispatches to multi methods.

=head1 Class methods

Other OO languages give you the ability to declare "class" methods that either don't need or actively prohibit calls on instances. Perl 6 gives you a choice. If you declare an ordinary method, it can function as a "class" method when you pass it a type object such as "C<Dog>" regardless of how defined the prototype object is, as long as the method body doesn't try to access any information that is undefined in the current instance.

Alternatively, a method can use the C<:U> type modifier on the invocant:

    method oh_so_static(::?CLASS:U:) { }

This will cause the method to actively refuse invocations on instances, and only permit invocation through the type object.

=head1 Submethods

Submethods are for declaring infrastructural methods that shouldn't be inherited by subclasses, such as initializers:

    submethod BUILD (:$arg) {
        $!attr = $arg;
    }

Apart from the keyword, submethod declaration and call syntax is identical to method syntax. You may mix methods and submethods of the same name within the class hierarchy, but only the methods are visible to derived classes via inheritance. A submethod is called only when a method call is dispatched directly to the current class.
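
A small sketch can contrast ordinary methods, submethods, and a C<:U>-constrained class method. The classes C<Animal> and C<Dog> and the methods C<speak>, C<secret>, and C<registry> are hypothetical names used only for illustration:

```raku
class Animal {
    method speak()               { 'generic noise' }             # inherited
    submethod secret()           { 'Animal only' }               # not inherited
    method registry(::?CLASS:U:) { 'called on the type object' } # type object only
}
class Dog is Animal { }

say Dog.new.speak;                          # generic noise
say Animal.new.secret;                      # Animal only
say Dog.new.?secret // 'no such method';    # no such method

say Animal.registry;                        # called on the type object
say (try Animal.new.registry) // 'refused on an instance';
```

The C<.?> call form is used so that the missing submethod yields an undefined value instead of throwing, and C<try> plays the same role for the refused instance call.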
Conjecture: in order to catch spelling errors it is a compile-time warning to define a submethod in any class that does not inherit the corresponding method name from some base class. More importantly, this would help safeguard Liskov substitutability. (But note that the standard C<Mu> class already supplies a default C<BUILD> and C<new>.)

=head1 Attributes

Attributes are stored in an opaque datatype, not in a hash. Not even the class has to care how they're stored, since they're declared much like ordinary variables. Instead of C<my>, use C<has>:

    class Dog is Mammal {
        has $.name = "fido";
        has $.tail is rw;
        has @.legs;
        has $!brain;
        ...
    }

Public attributes have a secondary sigil of "dot", indicating the automatic generation of an accessor method of the same name (unless the class declares an explicit method of that name before the closing bracket). Private attributes use an exclamation to indicate that no public accessor is generated.

    has $!brain;

The "true name" of the private variable always has the exclamation, but much like with C<our> variables, you may declare a lexically scoped alias to the private variable by saying:

    has $brain;     # also declares $!brain;

As with the C<!> declaration, no accessor is generated.

And any later references to the private variable within the same block may either use or omit the exclamation, as you wish to emphasize or ignore the privacy of the variable. Outside the block, you must use the C<!> form. If you declare with the C<!> form, you must use that form consistently everywhere. If you declare with the C<.> form, you also get the private C<!> form as a non-virtual name for the actual storage location, and you may use either C<!> or C<.> form anywhere within the class, even if the class is reopened. Outside the class you must use the public C<.> form, or rely on a method call (which can be a private method call, but only for trusted classes).

For public attributes, some traits are copied to the accessor method.
The C<rw> trait causes the generated accessor to be declared C<rw>, making it an lvalue method. The default is a read-only accessor.

If you declare the class as C<rw>, then all the class's attributes default to C<rw>, much like a C struct.

You may write your own accessors to override any or all of the autogenerated ones.

The attribute variables may be used within instance methods to refer directly to the attribute values. Outside the instance methods, the only access to attributes is through the accessors since an object has to be specified. The dot form of attribute variables may be used in derived classes because the dot form always implies a virtual accessor call. Every I<dot> declaration also declares a corresponding private I<exclamation> storage location, and the exclamation form may be used only in the actual class, not in derived classes. Reference to the internal storage location via C<$!foo> should generally be restricted to submethods. Ordinary methods should stick to the C<$.foo> form.

In fact, within submethods, use of the C<$.foo> form on attributes that are available as C<$!foo> (that is, that are declared directly by this class) is illegal and produces a dire compile-time warning (which may be suppressed). Within a submethod the C<$.foo> form may only be used on attributes from parent classes, because only the parent classes' part of the object is guaranteed to be in a consistent state (because C<BUILDALL> calls the C<BUILD> routines of the parent classes before the child classes). If you attempt to get around this by declaring C<BUILD> as a method rather than a submethod, that will also be flagged as a dire (but suppressible) compile-time warning. (It is I<possible> to define an inheritable C<BUILD> routine if you have access to all the metadata for the current class, but it's not easy, and it certainly doesn't happen by accident just because you change C<submethod> to C<method>.)
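
The attribute forms described above can be sketched as follows. The class C<Dog> here is a standalone illustration (it does not inherit from C<Mammal>), and C<describe> is a hypothetical method added to show attribute access from inside the class:

```raku
class Dog {
    has $.name = 'fido';      # public, read-only accessor
    has $.tail is rw;         # public, lvalue accessor
    has $!brain = 'small';    # private, no accessor generated

    # Inside the class both the dot form and the exclamation form work.
    method describe() { "$.name has a $!brain brain" }
}

my $d = Dog.new(tail => 'long');
say $d.name;          # fido
$d.tail = 'short';    # allowed, because the accessor is rw
say $d.tail;          # short
say $d.describe;      # fido has a small brain

# Assigning through the read-only accessor throws; try turns that into Nil.
my $attempt = (try { $d.name = 'rex' }) // 'name is read-only';
say $attempt;         # name is read-only
```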
Because C<$.foo>, C<@.foo>, C<%.foo>, C<&.foo> are just shorthands of C<self.foo> with different contexts, the class does not need to declare any of those as an attribute -- a C<method foo> declaration can work just as well.

As with the normal method call forms, only dotless parentheses may contain arguments. If you use the C<.()> form it will perform an extra level of indirection after the method call:

    self.foo(1,2,3);    # a regular method call
    self.foo.(1,2,3);   # self.foo().(1,2,3), call .() on closure returned by .foo
    $.foo(1,2,3);       # calls self.foo under $ context
    $.foo.(1,2,3);      # $.foo().(1,2,3), call .() on closure returned by .foo
    &.foo(1,2,3);       # calls self.foo under & context
    &.foo.(1,2,3);      # &.foo().(1,2,3), call .() on closure returned by .foo

=head2 Attribute default values

Pseudo-assignment to an attribute declaration specifies the default value. The value on the right is treated as an implicit closure and evaluated at object build time, that is, when the object is being constructed, not when the class is being composed. To refer to a value computed at compilation or composition time, you can either use a temporary or a temporal block of some sort:

    has $.r = rand;     # each object gets different random value

    constant $random = rand;
    has $.r = $random;  # every object gets same value

    has $.r = BEGIN { rand };
    has $.r = INIT { rand };
    has $.r = ENTER { rand };
    has $.r = FIRST { rand };
    has $.r = constant $myrand = rand;

When it is called at C<BUILD> time, the topic of the implicit closure will be the attribute being initialized, while "self" refers to the entire object being initialized. The closure will be called at the end of the C<BUILD> only if the attribute is not otherwise initialized in either the signature or the body of the C<BUILD>. The closure actually defines the body of an anonymous method, so C<self> is available with whatever attributes are constructed by that point in time (including all parent attributes).
The initializers are run in order of declaration within the class, so a given initializer may refer back to an attribute defined in a preceding C<has> declaration.

=head2 Class attributes

Class attributes are declared with either C<my> or C<our>. The only difference from ordinary C<my> or C<our> variables is that an accessor is generated according to the secondary sigil:

    our $.count;        # generates a public read-only .count accessor
    our %!cache is rw;  # generates no public accessor
    our @items;         # generates no public accessor

    my  $.count;        # generates a public read-only .count accessor
    my  %!cache is rw;  # generates no public accessor
    my  @items;         # generates no public accessor

Unlike attributes declared with C<has>, class attributes are shared between the undefined type, all instances of the class, and all subclasses.

=head1 Construction and Initialization

All classes inherit a default C<new> constructor from C<Mu>. It expects all arguments to be named parameters initializing attributes of the same name. You may write your own C<new> to override the default, or write constructors with any other name you like.

As in Perl 5, a constructor is any routine that calls C<bless>. Unlike in Perl 5, you call it as a method on the class object (though any object may be used as a class object), passing the arguments to be used in building the object. The representation of the class determines how to create the object, so it's no longer necessary for you to supply a candidate to C<bless>. For example, a P5Hash object would give you an object representation that uses hashes just like P5 does. The default C<P6opaque> representation doesn't tell you what it's going to use for its representation, since that's why it's called "opaque", after all.

The C<bless> method allows one or more positional arguments representing autovivifying type objects. Such an object looks like a type name followed by a hash subscript (see "Autovivifying objects" below). These are used to initialize superclasses.
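
A custom constructor of the kind discussed in this section might look like the following sketch. The class C<Point> and its attributes are hypothetical; only named arguments are passed to C<bless>, and the autovivifying type-object arguments are not used here:

```raku
class Point {
    has $.x;
    has $.y;
    has $.label = 'pt';   # default applies when no :label is supplied

    # Accept positional arguments and hand them to bless as named ones.
    method new($x, $y) { self.bless(:$x, :$y) }
}

my $p = Point.new(3, 4);
say $p.x;        # 3
say $p.y;        # 4
say $p.label;    # pt
```

Because this C<new> replaces the inherited one, callers can no longer pass C<:x> and C<:y> by name unless the custom constructor is written to allow it.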
Other than a list of autovivifying type objects, all arguments to C<bless> must be named arguments, not positional. Hence, the main purpose of custom constructors is to turn positional arguments into named arguments for C<bless>. The C<bless> method allows an object to be used for its class invocant. (Your constructor need not allow this). In any case, the object is not used as a prototype. Use C<.clone> instead of C<.bless> if that's what you mean.

=head2 Semantics of C<bless>

Any named arguments to C<bless> are automatically passed to the C<BUILD> routines.

For normal user classes, C<P6opaque> is the default representation. Other possibilities are C<P6hash>, C<P5hash>, C<P5array>, C<PyDict>, C<Cstruct>, etc. If you wish to pass special options to the representation layer for creating the object, that's between you and the representation. (A representation might look for additional class traits, for instance, telling it bit sizes and such.)

The C<bless> method automatically calls all appropriate C<BUILD> routines for the current class, which initializes the object in least-derived to most-derived order. (C<DESTROY> submethods work the same way, only in reverse.)

The default C<BUILD> semantics are inherited from C<Mu>, so you need to write initialization routines only if you wish to modify the default behavior. The C<bless> method automatically passes the appropriate argument list to the C<BUILD> of its various parent classes. If the type of the parent class corresponds to one of the type objects passed to bless, that type object's argument list is used. Otherwise all the arguments to bless are passed to the parent class's C<BUILD>. For the final C<BUILD> of the current object, all the arguments to C<bless> are passed to the C<BUILD>, so it can deal with any type objects that need special handling. (It is allowed to pass type objects that don't correspond to any parent class.)

    class Dog is Animal {...}
    my $pet = Dog.new( :name<Fido>, Animal{ :blood<warm>, :legs(4) } );

Here we are using an autovivifying C<Animal> type object to specify what the arguments to C<Animal>'s C<BUILD> routine should look like.
(It does not actually autovivify an C apart from the one being created.) You can write your own C submethod to control initialization. If you name an attribute as a parameter, that attribute is initialized directly, so submethod BUILD (:$!tail, :$!legs) {} is equivalent to submethod BUILD (:$tail is copy, :$legs is copy) { $!tail := $tail; $!legs := $legs; } Whether you write your own C or not, at the end of the C, any default attribute values are implicitly copied into any attributes that haven't otherwise been initialized. Note that the default C will only initialize public attributes; you must write your own C (as above) in order to present private attributes as part of your initialization API. =head2 Cloning You can clone an object, changing some of the attributes: $newdog = $olddog.clone(:trick); =head1 Mutating methods You can call an in-place mutator method like this: @array .= sort; One handy place for an in-place mutator is to call a constructor on a variable of a known type: my Dog $spot .= new(:tail, :legs); =head1 Calling sets of methods For any method name, there may be some number of candidate methods that could handle the request: typically, inherited methods or multi variants. The ordinary "dot" operator dispatches to a method in the standard fashion. There are also "dot" variants that call some number of methods with the same name: $object.meth(@args) # calls one method or dies $object.?meth(@args) # calls method if there is one, otherwise Nil $object.*meth(@args) # calls all methods (0 or more, () if none) $object.+meth(@args) # calls all methods (1 or more, die if none) The method name may be quoted when disambiguation is needed: $object."+meth"(@args) $object.'VAR'(@args) As with ordinary calls, the identifier supplying the literal method name may be replaced with an interpolated quote to specify the method name indirectly. 
It may also be replaced with an array to specify the exact list of candidates to be considered:

    my @candidates := $object.WALK(:name, :breadth, :omit($?CLASS));
    $object.*@candidates(@args);

The C<WALK> method takes these arguments:

    :canonical      # canonical dispatch order
    :ascendant      # most-derived first, like destruction order
    :descendant     # least-derived first, like construction order
    :preorder       # like Perl 5 dispatch
    :breadth        # like multi dispatch

    :super              # only immediate parent classes
    :name               # only classes containing named method declaration
    :omit(Selector)     # only classes that don't match selector
    :include(Selector)  # only classes that match selector

Any method can defer to the next candidate method in the list by the special functions C<callsame>, C<callwith>, C<nextsame>, and C<nextwith>. The "same" variants reuse the original argument list passed to the current method, whereas the "with" variants allow a new argument list to be substituted for the rest of the candidates. The "call" variants dispatch to the rest of the candidates and return their values to the current method for subsequent processing, whereas the "next" variants don't return, but merely defer to the rest of the candidate list:

    callsame;           # call with the original arguments (return here)
    callwith();         # call with no arguments (return here)
    callwith(1,2,3);    # call with a new set of arguments (return here)
    nextsame;           # redispatch with the original arguments (no return)
    nextwith();         # redispatch with no arguments (no return)
    nextwith(1,2,3);    # redispatch with a new set of arguments (no return)
    samewith(1,2,3);    # same dispatcher with new arguments (no return)

For dispatches using C<.> and C<.?>, the return value is the C returned by the first method completed without deferring. (Such a return value may in fact be failure, but it still counts as a successful call from the standpoint of the dispatcher.) Likewise the return value of C<.*> and C<.+> is a list of C returned by those methods that ran to completion without deferring to the next method.
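The difference between the "call" and "next" variants can be sketched with a small inheritance chain. The classes C<Animal>, C<Dog> and C<Puppy> and the method C<speak> are invented for this illustration, and the behavior shown assumes an implementation such as Rakudo:

```raku
class Animal {
    method speak ($sound) { "Animal makes $sound" }
}

class Dog is Animal {
    method speak ($sound) {
        # callsame: run the next candidate with the original
        # arguments, then continue here with its result.
        my $base = callsame;
        "$base (via Dog)";
    }
}

class Puppy is Dog {
    method speak ($sound) {
        # nextwith: hand over to the next candidate with new
        # arguments; control does not come back to this method.
        nextwith("a squeaky $sound");
    }
}

say Dog.new.speak('woof');     # Animal makes woof (via Dog)
say Puppy.new.speak('woof');   # Animal makes a squeaky woof (via Dog)
```

Note that C<callsame> inside C<Dog> reuses the arguments that C<Dog>'s own method received, which in the second call are the ones substituted by C<nextwith>.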
It is also possible to trim the candidate list so that the current call is considered the final candidate. (This is implicitly the case already for the dispatch variants that want a single successful call.) For the multiple call variants, C will cause the dispatcher to throw away the rest of the candidate list, and the subsequent return from the current method will produce the final C in the returned list. (If you were already on the last call of the candidate list, no candidates are thrown away, only the list. So you can't accidentally throw away the wrong list by running off the end, since the candidate list is ordinarily not thrown away by the dispatcher until after the last call.) Since it's possible to be dispatching within more than one candidate list at a time, these control flow calls are defined to apply only to the dynamically innermost dispatcher. If, for instance, you have a single dispatch to a C method that then calls into a multiple dispatch on the C methods within a class, C within one of those Cs would go to the next best C method within the class, not the next method candidate in the original single dispatch. This is not a bad limitation, since dispatch loops are dynamically scoped; to get to the outermost lists you can "pop" unwanted candidate lists using C: lastcall; nextsame; # call next in grandparent dispatcher loop [Conjecture: if necessary, C could have an argument or invocant to specify which kind of a dispatch loop we think we're throwing away, in case we're not sure about our context. This confusion could arise since we use C semantics at least three different ways: single dispatch, multiple dispatch, and routine wrapper dispatch.] The C redispatches the method call using the current dispatcher: this is mainly intended if you have one "worker" method, and several "frontend" methods in the same class to avoid code duplication. The frontend methods then mangle the parameters before sending them off to the worker method with C. 
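The worker and frontend arrangement described for C<samewith> might look like the following minimal sketch. The class C<Distance> and its C<describe> methods are hypothetical, and C<samewith> is deliberately the last statement of each frontend so that the example does not depend on whether an implementation lets it return to the caller:

```raku
class Distance {
    # Worker: does the real job, always in metres.
    multi method describe (Numeric $metres) {
        "$metres m";
    }

    # Frontend: convert a string, then redispatch to the worker.
    multi method describe (Str $text) {
        samewith($text.Numeric);
    }

    # Frontend: combine kilometres and metres, then redispatch.
    multi method describe (Int $km, Int $m) {
        samewith($km * 1000 + $m);
    }
}

my $d = Distance.new;
say $d.describe(1500);      # 1500 m
say $d.describe('1500');    # 1500 m
say $d.describe(2, 350);    # 2350 m
```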
=head1 Parallel dispatch Any of the method call forms may be turned into a hyperoperator by treating the method call as a postfix: @object».meth(@args) # calls one method on each @object».?meth(@args) # calls method if there is one on each @object».*meth(@args) # calls all methods (0 or more) on each @object».+meth(@args) # calls all methods (1 or more) on each @object».=meth(@args) # calls mutator method on each @object»!meth(@args) # calls private method on each The return value is a list with exactly the same number of elements as C<@object>. Each such return value is a C or C of C as specified above for the non-hyper "dot" variants. Hyperoperators treat a junction as a scalar value, so saying: $junction».meth(@args); is just like: $junction.meth(@args); As with other forms of method call, the "meth" above may be replaced with a quoted string or variable to do various forms of indirection. Note that, as with any hyper operator, the methods may be evaluated in any order (although the method results are always returned in the same order as the list of invocants). Use an explicit loop if you want to do something with ordered side effects, such as I/O. =head1 Multisubs and Multimethods The "long name" of a subroutine or method includes the type signature of its invocant arguments. The "short name" doesn't. =head2 C Declarations If you put C in front of any sub declaration, it allows multiple long names to share a short name, provided all of them are declared C, or there is a single prior or outer C in the same file that causes all unmarked subs to default to multi in that lexical scope. If a sub is not marked with C and it is not governed within that same file by a C of the same short name, it is considered unique, an I sub. (An imported C can function as such a governing declaration.) For method declarations, the C, C, and C declarations work similarly but not identically. 
The explicit declarations work the same, except that calculation of governance and candidate sets proceeds via the inheritance tree rather than via lexical scoping. The other difference is that a proto method of a given short name forces all unmarked method declarations to assume multi in all subclasses regardless of which file they are declared in, unless explicitly overridden via C<only>.

=head2 C<only> Declarations

An C<only> sub (or method) doesn't share with anything outside of it or declared prior to it. Only one such sub (or method) can inhabit a given namespace (lexical scope or class), and it hides any outer subs (or less-derived methods) of the same short name. It is illegal for a C<multi> or C<proto> declaration to share the same scope with an C<only> declaration of the same short name.

Since they come from a different file, the default C<proto> declarations provided by Perl from the setting scope do I<not> automatically set the defaults in the user's scope unless explicitly imported, so a C<sub> declaration there that happens to be the same as a setting C<proto> is considered C<only> unless explicitly marked C<multi>. (This allows us to add new C<proto> declarations in the setting without breaking the user's old code.) In the absence of such an explicit C<sub> declaration, however, the C<proto> from the innermost outer lexical scope is used by the compiler in the analysis of any calls to that short name. (Since only list operators may be post-declared, as soon as the compiler sees a non-listop operator it is free to apply the setting's C<proto> since any user-defined C<only> version of it must of necessity be declared or imported earlier in the user's file or not at all.)

=head2 C<proto> Declarations

A C<proto> always functions as a dispatcher around any C<multi>s declared after it in the same scope. More specifically, it is the generic prototype of a dispatcher, which must be instantiated anew in each scope that has a different candidate list. (This works much like type punning from roles to classes.
Or you can think of this dispatcher as a priming of the proto's code with the candidate list appropriate to the scope.) For the sake of discussion, let us say that there is a declarator equivalent to C that is instead spelled C. Generally a user never writes a C sub (it might not even be allowed); a C is always instantiated from the governing C. A new C sub or method is autogenerated in any scope that needs one, that is, in any scope that can see a different set of multi declarations than its parent scope (or scopes, in the case of multiple inheritance). More precisely, for any given C and a point of call, there is a candidate set of routines (functions or methods) that is the intersection of two sets: the set of routines governed downward by the C and the set of routines visible upward from the point of a call. It is allowed to reuse the C of a parent scope if and only if it would result in the same candidate list in the current scope (the scope at the point of call). Since C is nearly identical to C, saying C<&foo> always refers to the innermost visible C or C sub, never to a C or C. Likewise, C<$obj.can('foo')> will return the most-derived C or C method. =head2 C Signatures Within its scope, the signature of a C also nails down the presumed order and naming of positional parameters, so that any call to that short name with named arguments in that scope can presume to rearrange those arguments into positional parameters based on that information. (Unrecognized names remain named arguments.) Any other type information or traits attached to the C may also be passed along to the routines within its scope, so a C definition can be used to factor out common traits. This is particularly useful for establishing grammatical categories in a grammar by declaring a C C or C C. (Perl 6's grammar does this, for instance.) =head2 C Variables You can have multiple C variables of the same name in the same scope, and they all share the same storage location and type. 
These are declared by one C declaration at the top, in which case you may leave the C implicit on the rest of the declarations in the same scope. You might do this when you suspect you'll have multiple declarations of the same variable name (such code might be produced by a macro or by a code generator, for instance) and you wish to suppress any possible warnings about redefinition. =head2 C Routines In contrast, C routines can have only one instance of the long name in any namespace, and that instance hides any outer (or less-derived) routines with the same long name. It does not hide any routines with the same short name but a different long name. In other words, Cs with the same short name can come from several different namespaces provided their long names differ and their short names aren't hidden by an C or C declaration in some intermediate scope. =head2 Multisub Resolution When you call a routine with a particular short name, if there are multiple visible long names, they are all considered candidates. They are sorted into an order according to how close the run-time types of the arguments match up with the declared types of the parameters of each candidate. The best candidate is called, unless there's a tie, in which case the tied candidates are redispatched using any additional tiebreaker strategies (see below). For the purpose of this nominal typing, no constrained type is considered to be a type name; instead the constrained type is unwound into its base type plus constraint. Only the base type upon which the constrained type is based is considered for the nominal type match (along with the fact that it is constrained). That is, if you have a parameter: subset Odd of Int where { $_ % 2 } proto foo {*} multi foo (Odd $i) {...} it is treated as if you'd instead said: multi foo (Int $i where { $_ % 2 }) {...} Any constrained type is considered to have a base type that is "epsilon" narrower than the corresponding unconstrained type. 
The compile-time topological sort takes into account the presence of at least one constraint, but nothing about the number or nature of any additional constraints. If we think of Int' as any constrained version of Int, then Int' is always tighter nominally than Int. (Int' is a meta-notation, not Perl 6 syntax.) The order in which candidates are considered is defined by a topological sort based on the "type narrowness" of each candidate's long name, where that in turn depends on the narrowness of each parameter that is participating. Identical types are considered tied. Parameters whose types are not comparable are also considered tied. A candidate is considered narrower than another candidate if at least one of its parameters is narrower and all the rest of its parameters are either narrower or tied. Also, if the signature has any additional required parameters not participating in the long name, the signature as a whole is considered epsilon tighter than any signature without extra parameters. In essence, the remaining arguments are added to the longname as if the user had declared a capture parameter to bind the rest of the arguments, and that capture parameter has a constraint that it must bind successfully to the additional required parameters. All such signatures within a given rank are considered equivalent, and subject to tiebreaker B below. This defines the partial ordering of all the candidates. If the topological sort detects a circularity in the partial ordering, all candidates in the circle are considered tied. A warning will be issued at C time if this is detected and there is no suitable tiebreaker that could break the tie. =head3 Candidate Tiebreaking There are three tiebreaking modes, in increasing order of desperation: A) inner or derived scope B) run-time constraint processing C) use of a candidate marked with "is default" Tiebreaker A simply prefers candidates in an inner or more derived scope over candidates in an outer or less derived scope. 
For candidates in the same scope, we proceed to tiebreaker B. In the absence of any constraints, ties in tiebreaker A immediately failover to tiebreaker C; if not resolved by C, they warn at compile time about an ambiguous dispatch. If there are any tied candidates with constraints, it follows from our definitions above that all of them are considered to be constrained. In the presence of longname parameters with constraints, or the implied constraint of extra required arguments, tiebreaker B is applied. Candidates which are tied nominally but have constraints are considered to be a completely different situation, insofar as it is assumed the user knows exactly why each candidate has the extra constraints it has. Thus, constrained signatures are considered to be much more like a switch defined by the user. So for tiebreaker B the candidates are simply called in the order they were declared, and the first one that successfully binds (and completes without calling nextsame or nextwith) is considered the winner, and all the other tied candidates are ignored. If all the constrained candidates fail, we throw out the rank of constrained variants and proceed to the next tighter rank, which may consist of the unconstrained variants without extra arguments. For ranks that are not decided by constraint (tiebreaker B), tiebreaker C is used: only candidates marked with the C trait are considered, and the best matching default routine is used. If there are no default routines, or if two or more of the defaults are tied for best, the dispatch fails. =head3 Parameter Constraint Exclusion Ordinarily all the parameters of a multi sub are considered for dispatch. Here's a declaration for an integer range operator with two parameters in its long name: multi sub infix:<..>(Int $min, Int $max) {...} Sometimes you want to have parameters that aren't counted as part of the long name. 
For instance, if you want to allow an optional "step" parameter to your range operator, but not consider it for multi dispatch, then put a double semicolon instead of a comma before it: multi sub infix:<..>(Int $min, Int $max;; Int $by = 1) {...} The double semicolon, if any, determines the complete long name of a C. (In the absence of that, a double semicolon is assumed after the last declared argument, but before any return signature.) Note that a call to the routine must still be compatible with subsequent arguments. Note that the C<$by> is not a required parameter, so doesn't impose the kind of constraint that allows tiebreaker B. If the default were omitted, it would be a required parameter, and subject to tiebreaker B. Likewise an ordinary named parameter does not participate as a tiebreaker, but you can mark named parameters as required to effectively make a switch based on named binding: multi foo (Int $a;; :$x!) {...} # constrained multi foo (Int $a;; :$y!) {...} # constrained multi foo (Int $a;; :$z!) {...} # constrained multi foo (Int $a;; *%_) {...} # unconstrained The first three are dispatched under tiebreaker B as a constrained rank. If none of them can match, the final one is dispatched as an unconstrained rank, since C<*%_> is not considered a required parameter. =head3 Constrained Type Candidates Likewise, constrained types sort before unconstrained: multi bar (Even $a) {...} # constrained multi bar (Odd $a) {...} # constrained multi bar (Int $a) {...} # unconstrained And values used as subset types also sort first, and are dispatched on a first-to-match basis: multi baz (0) {...} # constrained multi baz (1) {...} # constrained multi baz (Int $x) {...} # unconstrained If some of the constrained candidates come by import from other modules, they are all considered to be declared at the point of importation for purposes of tiebreaking; subsequent tiebreaking is provided by the original order in the used module. 
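The ranking of constrained candidates ahead of unconstrained ones can be sketched as follows. The routine name C<classify> is invented for this illustration, and the two constrained candidates are chosen so that they never both match, which keeps the example independent of their declaration order:

```raku
subset Odd of Int where * % 2;

multi classify (0)      { 'zero' }               # literal value: constrained
multi classify (Odd $n) { 'odd' }                # subset type: constrained
multi classify (Int $n) { 'even and non-zero' }  # unconstrained fallback

say classify(0);   # zero
say classify(7);   # odd
say classify(4);   # even and non-zero
```

The plain C<Int> candidate would accept every one of these arguments, but it is only reached when none of the constrained candidates binds.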
[Conjecture: However, a given C may advertise multiple long names, some of which are shorter than the complete long name. This is done by putting a semicolon after each advertised long name (replacing the comma, if present). A semicolon has the effect of inserting two candidates into the list. One of them is inserted with exactly the same types, as if the semicolon were a comma. The other is inserted as if all the types after the semicolon were of type C, which puts it later in the list than the narrower actual candidate. This merely determines its sort order; the candidate uses its real type signature if the dispatcher gets to it after rejecting all earlier entries on the candidate list. If that set of delayed candidates also contains ties, then additional semicolons have the same effect within that sublist of ties. Note, however, that semicolon is a no-op if the types after it are all C. (As a limiting case, putting a semicolon after every parameter produces dispatch semantics much like Common Lisp. And putting a semicolon after only the first argument is much like ordinary single-dispatch methods.) Note: This single-semicolon syntax is merely to be considered reserved until we understand the semantics of it, and more importantly, the pragmatics of it (that is, whether it has any valid use case). Until then only the double-semicolon form will be implemented in the standard language.] A C or C doesn't ordinarily participate in any subroutine-dispatch process. However, they can be made to do so if prefixed with a C or C declarator. =head3 Multi Submethods et Cetera Multi submethods work just like multi methods except they are constrained to an exact type match on the invocant, just as ordinary submethods are. Perl 6.0.0 is not required to support multiple dispatch on named parameters, only on positional parameters. 
Note however that any dispatcher derived from C<proto> will map named arguments to known declared positional parameters and call the C<multi> candidates with positionals for those arguments rather than named arguments.

Within a multiple dispatch, C<nextsame> means to try the next best match, or next best default in case of tie.

The C<sub> keyword is optional immediately after a C<proto>, C<multi>, or C<only> keyword, but the C<method> keyword is not.

A C<proto> declaration may not occur after a C<multi> declaration in the same scope.

=head2 Method call vs. Subroutine call

The caller indicates whether to make a method call or subroutine call by the call syntax. The "dot" form and the indirect object form default to method calls. All other prefix calls default to subroutine calls. This applies to prefix unary operators as well:

    !$obj;  # same as $obj.prefix:<!>

A method call considers only methods (including multi-methods and submethods) from the class hierarchy of its invocant, and fails if none is found. The object in question is in charge of interpreting the meaning of the method name, so if the object is a foreign object, the name will be interpreted by that foreign runtime.

A subroutine call considers only visible subroutines (including submethods) of that name. The object itself has no say in the dispatch; the subroutine dispatcher considers only the types of the arguments involved, along with the name. Hence foreign objects passed to subroutines are forced to follow Perl semantics (to the extent foreign types can be coerced into Perl types, otherwise they fail).

There is no fail-over either from subroutine to method dispatch or vice versa. However, you may use C<is export> on a method definition to make it available also as a C<multi> sub. As with indirect object syntax, the first argument is still always the invocant, but the export allows you to use a comma after the invocant instead of a colon, or to omit the colon entirely in the case of a method with no arguments other than the invocant.
Many standard methods (such as C and C) are automatically exported to the C namespace by default. For other exported methods, you will not see the C sub definition unless you C the class in your scope, which will import the C (and associated C subs) lexically, after which you can call it using normal subroutine call syntax. In the absence of an explicit type on the method's invocant, the exported C sub's first argument is implicitly constrained to match the class in which it was defined or composed, so for instance the C version of C requires its first argument to be of type C or one of its subclasses. If the invocant is explicitly typed, that will govern the type coverage of the corresponding C's first argument, whether that is more specific or more general than the class's invocant would naturally be. (But be aware that if it's more specific than C<::?CLASS>, the binding may reject an otherwise valid single dispatch as well as a multi dispatch.) In any case, it does no good to overgeneralize the invocant if the routine itself cannot handle the broader type. In such a situation you must write a wrapper to coerce to the narrower type. =head1 Trusts Attributes are tied to a particular class definition, so a method can only directly access the attributes of a class it's defined within when the invocant is the "self" of that attribute. However, it may call the private attribute accessors from a different class if that other class has indicated that it trusts the class the multi method is defined in: class MyClass { trusts YourClass; ... } The trust really only applies to C, not to possible subclasses thereof. The syntax for calling back to C is C<$obj!MyClass::meth()>. Note that private attribute accessors are always invoked directly, never via a dispatcher, since there is never any question about which object is being referred to. 
Hence, the private accessor notation may be aggressively inlined for simple attributes, and no simpler notation is needed for accessing another object's private attributes.

=head1 Delegation

Delegation lets you pretend that some other object's methods are your own. Delegation is specified by a C<handles> trait verb with an argument specifying one or more method names that the current object and the delegated object will have in common:

    has $tail handles 'wag';

Since the method name (but nothing else) is known at class construction time, the following C<.wag> method is autogenerated for you:

    method wag (|args) { $!tail.wag(|args) }

You can specify multiple method names:

    has $.legs handles ;

It's illegal to call the outer method unless the attribute has been initialized to an object of a type supporting the method, such as by:

    has Tail $.tail handles 'wag' .= new(|%_);

Note that putting a C<Tail> type on the attribute does not necessarily mean that the method is always delegated to the C<Tail> class. The dispatch is still based on the I<actual> type of the object, not the declared type.

Any other kind of argument to C<handles> is considered to be a smartmatch selector for method names. All such selectors establish a failover to be used only if normal method dispatch fails, so it cannot be used to override any method in the normal ancestry of the object. So you can say:

    has $.fur is rw handles /^get_/;

If you say

    has $.fur is rw handles Groomable;

then you get only those methods available via the C<Groomable> role or class. To delegate everything, use the C<Whatever> matcher:

    has $the_real_me handles *;

Wildcard matches are evaluated only after it has been determined that there's no exact match to the method name anywhere in this object or in any of its parents. When you have multiple wildcard delegations to different objects, it's possible to have a conflict of method names. Wildcard method matches are evaluated in order, so the earliest one wins. (Non-wildcard method conflicts can be caught at class composition time.)
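For the simple, non-wildcard forms of delegation, a minimal runnable sketch might look like this. The classes C<Tail>, C<Legs> and C<Dog> and all method names are made up for the example, and the attributes are initialized through the default constructor so that calling the delegated methods is legal:

```raku
class Tail {
    method wag () { 'wag wag' }
}

class Legs {
    method walk () { 'walking' }
    method run  () { 'running' }
}

class Dog {
    has $.tail handles 'wag';         # one delegated method
    has $.legs handles <walk run>;    # several delegated methods
}

my $spot = Dog.new(tail => Tail.new, legs => Legs.new);
say $spot.wag;    # wag wag
say $spot.run;    # running
```

Here C<Dog> declares no C<wag>, C<walk> or C<run> method of its own; each call is forwarded to whatever object the corresponding attribute holds.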
If the wildcards for this class find nothing, then wildcards are checked for each of the ancestral classes in standard method resolution order.

The form with C<*> checks only for existing methods in the delegate's class (or its parents). It will not call any kind of a fallback via the delegate. (This allows you to call a C routine of your own if the delegation would fail, since your own C always runs after delegation, even wildcard delegation.) If instead you want to delegate completely and utterly, including a search of the delegate for its own fallback methods, with abject failure if the delegate can't handle it, then use the "HyperWhatever" instead:

    has $the_real_me handles **;

If, where you would ordinarily specify a string, you put a pair, then the pair maps the method name in this class to the method name in the other class. If you put a hash, each key/value pair is treated as such a mapping. Such mappings are not considered wildcards.

    has $.fur handles { :shakefur, :scratch };

You I<can> do a wildcard renaming, but not with pairs. Instead do smartmatch with a substitution:

    has $.fur handles (s/^furget_/get_/);

Ordinarily delegation is based on an attribute holding an object, but it can also be based on the return value of a method:

    method select_tail handles {...}

=head1 Types and Subtypes

The type system of Perl consists of roles, classes, and subtypes. You can declare a subtype like this:

    my subset Str_not2b of Str where /^[isnt|arent|amnot|aint]$/;

or this:

    my Str subset Str_not2b where /^[isnt|arent|amnot|aint]$/;

An anonymous subtype looks like this:

    Str where /^[isnt|arent|amnot|aint]$/;

A C<where> clause implies future smartmatching of some kind: the as-yet unspecified object of the type on the left must match the selector on the right. Our example is roughly equivalent to this closure:

    { $_.does(Str) and $_ ~~ /^[isnt|arent|amnot|aint]$/; }

except that a subtype knows when to call itself.

A subtype is not a subclass.
Subclasses add capabilities, whereas a subtype adds constraints (takes away capabilities). A subtype is primarily a handy way of sneaking smartmatching into multiple dispatch. Just as a role allows you to specify something more general than a class, a subtype allows you to specify something more specific than a class. A subtype specifies a subset of the values that the original type specified, which is why we use the C keyword for it. While subtypes are primarily intended for restricting parameter types for multiple dispatch, they also let you impose preconditions on assignment. If you declare any container with a subtype, Perl will check the constraint against any value you might try to bind or assign to the container. subset Str_not2b of Str where /^[isnt|arent|amnot|aint]$/; subset EvenNum of Num where { $^n % 2 == 0 } my Str_not2b $hamlet; $hamlet = 'isnt'; # Okay because 'isnt' ~~ /^[isnt|arent|amnot|aint]$/ $hamlet = 'amnt'; # Bzzzzzzzt! 'amnt' !~~ /^[isnt|arent|amnot|aint]$/ my EvenNum $n; $n = 2; # Okay $n = -2; # Okay $n = 0; # Okay $n = 3; # Bzzzzzzzt It's legal to base one subtype on another; it just adds an additional constraint. That is, it's a subset of a subset. You can use an anonymous subtype in a signature: sub check_even (Num where { $^n % 2 == 0 } $even) {...} That's a bit unwieldy, but by the normal type declaration rules you can turn it around to get the variable out front: sub check_even ($even of Num where { $^n % 2 == 0 }) {...} and just for convenience we also let you write it: sub check_even (Num $even where { $^n % 2 == 0 }) {...} since all the type constraints in a signature parameter are just anded together anyway. 
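The container check and the signature form can be tried together in a short sketch. The names C<EvenInt> and C<halve> are invented here, and the subset is based on C<Int> rather than C<Num> on the assumption (true of Rakudo) that integer literals are not C<Num> values:

```raku
subset EvenInt of Int where * %% 2;

my EvenInt $n;
my @rejected;

for 2, -2, 0, 3 -> $candidate {
    $n = $candidate;    # the constraint is checked on every assignment
    CATCH { default { @rejected.push($candidate) } }
}

say $n;           # 0
say @rejected;    # [3]

# The same constraint written as an anonymous subtype in a signature.
sub halve (Int $x where * %% 2) { $x div 2 }
say halve(10);    # 5
```

The failed assignment leaves the container holding the last value that passed the constraint.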
You can leave out the block when matching against a literal value of some kind: proto sub fib (Int $) {*} multi sub fib (Int $n where 0|1) { return $n } multi sub fib (Int $n) { return fib($n-1) + fib($n-2) } In fact, you can leave out the C declaration altogether: multi sub fib (0) { return 0 } multi sub fib (1) { return 1 } multi sub fib (Int $n) { return fib($n-1) + fib($n-2) } Subtype constraints are used as tiebreakers in multiple dispatch: use Rules::Common :profanity; multi sub mesg ($mesg of Str where // is copy) { $mesg ~~ s:g//[expletive deleted]/; print $MESG_LOG: $mesg; } multi sub mesg ($mesg of Str) { print $MESG_LOG: $mesg; } For multi dispatch, a long name with a matching constraint is preferred over an equivalent one with no constraint. So the first C above is preferred if the constraint matches; otherwise the second is preferred. To export a subset type, put the export trait just before the C: subset Positive of Int is export where * > 0; Note that the declaration of the C type for a subset doesn't really mean what an C type means elsewhere in the language. It is merely declaring the universe of input values for a boolean function. In fact, when used as a coercion, a subset type returns a C based on its condition, so it can be used directly as a predicate without the overhead of smartmatching: if Even($x) { ... } .say if .Even for 1..10; =head2 Abstract vs Concrete types For any named type, certain other subset types may automatically be derived from it by appending an appropriate adverbial to its name: Int:_ Allow either defined or undefined Int values Int:D Allow only defined (concrete) Int values Int:U Allow only undefined (abstract) Int values That is, these mean something like: Int:D Int:_ where DEFINITE($_) Int:U Int:_ where not(DEFINITE($_)) where C is a boolean macro that says whether the object in question has a valid concrete representation (see L below). 
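A small sketch shows the C<:D> and C<:U> adverbs at work in multi dispatch and in smartmatching. The routine name C<kind> is made up for the example, and the behavior assumes an implementation such as Rakudo:

```raku
multi kind (Int:D $n) { "concrete Int $n" }
multi kind (Int:U $t) { 'Int type object' }

say kind(42);     # concrete Int 42
say kind(Int);    # Int type object

# The same adverbs work as smartmatch targets.
say 42  ~~ Int:D;    # True
say Int ~~ Int:D;    # False
say Int ~~ Int:U;    # True
```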
In standard Perl 6, C<Int> is generally assumed to mean C<Int:_>, except
for invocants, where the default is C<Int:D>. (The default C<new> method
has a prototype whose invocant is C<:U> instead, so all C<new> methods
default to allowing type objects.)

These defaults may be changed within a lexical scope by various pragmas.
In particular,

    use parameters :D;

will cause non-invocant parameters to default to C<:D>. Conjecturally,

    use variables :D;

would do the same for types used in variable declarations. In such
lexical scopes you may use the C<:_> form to get back to the standard
behavior. In particular, since invocants default to defined,

    use invocant :_;

will make invocants allow any sort of defined or undefined invocant.

=head2 Multiple constraints

[Conjecture: This entire section is considered a guess at our
post-6.0.0 direction. For 6.0.0 we will allow only a single constraint
before the variable, and post constraints will all be considered
"epsilon" narrower than the single type on the left. The single
constraint on the left may, however, be a value like 0 or a named subset
type. Such a named subset type may be predeclared with an arbitrarily
complex C<where> clause; for 6.0.0 any structure type information
inferable from the C<where> clause will be ignored, and the declared
subset type will simply be considered nominally derived from the C<of>
type mentioned in the same declaration.]

More generally, a parameter can have a set of constraints, and the set
of constraints defines the formal type of the parameter, as visible to
the signature. (No one constraint is privileged as the storage type of
the actual argument, unless it is a native type.) All constraints are
considered in type narrowness. That is, these are equivalently narrow:

    Foo Bar @x
    Bar Foo @x

The constraint implied by the sigil also counts as part of the official
type.
The sigil is actually a constraint on the container, so the actual type
of the parameter above is something like:

    Positional[subset :: of Any where Foo & Bar]

Static C<where> clauses also count as part of the official type.
A C<where> clause is considered static if it can be applied to the types
to the left of it at compile time to produce a known finite set of
values. For instance, a subset of an enum type is a static set of
values. Hence

    Day $d where 'Mon'..'Fri'

is considered equivalent to

    subset Weekday of Day where 'Mon'..'Fri';
    Weekday $d

Types mentioned in a dynamic C<where> clause are not considered part of
the official type, except insofar as the type includes the notion: "is
also constrained by a dynamic C<where> clause", which narrows it by
epsilon over the equivalent type without a C<where> clause.

    Foo Bar @x              # type is Foo & Bar & Positional
    Foo Bar @x where Baz    # slightly tighter than Foo Bar Positional

The set of constraints for a parameter creates a subset type that
implies some set of allowed values for the parameter. The set of allowed
values may or may not be determinable at compile time. When the set of
allowed values is determinable at compile time, we call it a static
subtype.

Type constraints that resolve to a static subtype (that is, with a fixed
set of elements knowable (if not known) at compile time) are considered
to be narrower than type constraints that involve run-time calculation,
or are otherwise intractable at compile time. Note that all values such
as 0 or "foo" are considered singleton static subtypes. Singleton values
are considered narrower than a subtype with multiple values, even if the
subtype contains the value in question. This is because, for enumerable
types, type narrowness is defined by doing set theory on the set of
enumerated values.

So assuming:

    my enum Day ['Sun','Mon','Tue','Wed','Thu','Fri','Sat'];

    subset Weekday of Day where 'Mon' ..
        'Fri'; # considered static
    subset Today of Day where *.today;

we have the following pecking order:

    Parameter                   # Set of possible values
    =========                   ========================
    Int $n                      # Int

    Int $n where Today          # Int plus dynamic where
    Int $n where 1 <= * <= 5    # Int plus dynamic where

    Day $n                      # 0..6

    Day $n where Today          # 0..6 plus dynamic where

    Day $n where 1 <= * <= 5    # 1..5
    Int $n where Weekday        # 1..5
    Day $n where Weekday        # 1..5
    Weekday $n                  # 1..5

    Tue                         # 2

Note the difference between:

    Int $n where 1 <= * <= 5    # Int plus dynamic where
    Day $n where 1 <= * <= 5    # 1..5

The first C<where> is considered dynamic not because of the nature of
the comparisons but because C<Int> is not finitely enumerable. Our
C<Weekday> subset type can calculate the set membership at compile time
because it is based on the C<Day> enum, and hence is considered static
despite the use of a C<where>. Had we based C<Weekday> on C<Int> it
would have been considered dynamic. Note, however, that with "anded"
constraints, any enum type governs looser types, so

    Int Day $n where 1 <= * <= 5

is considered static, since C<Day> is an enum, and cuts down the search
space.

The basic principle we're trying to get at is this: in comparing two
parameter types, the narrowness is determined by the subset
relationships on the sets of possible values, not on the names of
constraints, or the method by which those constraints are specified. For
practical reasons, we limit our subset knowledge to what can be easily
known at compile time, and consider the presence of one or more dynamic
constraints to be epsilon narrower than the same set of possible values
without a dynamic constraint.

As a first approximation for 6.0.0, subsets of enums are static, and
other subsets are dynamic. We may refine this in subsequent versions of
Perl.

=head1 Enumerations

An enumeration is a type that facilitates the use of a set of symbols to
represent a set of constant values. Its most obvious use is the
translation of those symbols to their corresponding values.
Each enumeration association is a constant pair known as an I<enum>,
which is of type C<Enum>. Each enum associates an I<enum key> with an
I<enum value>. Semantically therefore, an enumeration operates like
a constant hash, but since it uses a package C<Stash> to hold the
entries, it presents itself to the user's namespace as a typename
package containing a set of constant declarations. That is,

    enum E <a b c>;

is largely syntactic sugar for:

    package E {
        constant a = 0;
        constant b = 1;
        constant c = 2;
    }

(However, the C<enum> declaration supplies extra semantics.)

Such constant declarations allow the use of the declared names to stand
in for the values where a value is desired. In addition, since
a constant declaration introduces a name that behaves as a subtype
matching a single value, the enum key can function as a typename in
certain capacities where a typename is required. The name of the
enumeration as a whole is also considered a typename, and may be used to
represent the set of values. (Note that when we wish to verbally
distinguish the enumeration as a whole from each individual enum pair,
we use the long term "enumeration" for the former, despite the fact that
it is declared using the C<enum> keyword.)

In the C<enum> declaration, the keys are specified as a parenthesized
list, or an equivalent angle bracket list:

    my enum Day ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');
    my enum Day <Sun Mon Tue Wed Thu Fri Sat>;

=head2 Value Generation

The values are generated implicitly by default, but may also be
specified explicitly. If the first value is unspecified, it defaults
to 0. To specify the first value, use pair notation (see below).

If the declared enumeration typename begins with an uppercase letter,
the enum values will be derived from C<Int> or C<Str> as appropriate.
If the enumeration typename is lowercase, the enumeration is assumed to
be representing a set of native values, so the default value type is
C<int> or C<buf>.
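The package-of-constants view and the implicit numbering can be sketched as below. The sub C<is-weekend> is a hypothetical helper, and the sketch covers only the uppercase, C<Int>-derived case with implicitly generated values.

```raku
enum Day <Sun Mon Tue Wed Thu Fri Sat>;

# Each key is visible both bare and package-qualified, like a constant.
say Day::Mon === Mon;

# With no explicit first value, numbering starts at 0 and counts up.
say Sun.value;
say Sat.value;

# The enumeration name itself serves as a type constraint.
sub is-weekend(Day $d) { $d === Sat || $d === Sun }
say is-weekend(Sat);
say is-weekend(Tue);
```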
The base type can be specified if desired:

    my bit enum maybe ;
    my Int enum day ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');
    our enum day of uint4 ;

The declared base type automatically distributes itself to the
individual constant values. For non-native types, the enum objects are
guaranteed only to be derived from and convertible to the specified
type. The actual type of the enum object returned by using the symbol is
the enumeration type itself.

    Fri.WHAT        # Day, not Int.
    +Fri            # 5
    Fri.Numeric     # 5
    Fri ~~ Int      # True, because derived from Int
    Fri.perl        # 'Day::Fri'
    ~Fri            # 'Fri' (only for numeric enums)
    Fri.Stringy     # 'Fri' (only for numeric enums)
    Fri.Str         # 'Fri' (only for numeric enums)
    Fri.gist        # 'Fri' (used by say)
    Fri.key         # 'Fri'
    Fri.value       # 5
    Fri.pair        # :Fri(5)
    Fri.kv          # 'Fri', 5
    Fri.defined     # True

Other than that, number valued enums act just like numbers, while string
valued enums act just like strings. C<Fri.so> is true because its value
is 5 rather than 0. C<Sun.so> is false.

Enums based on native types may be used only for their value, since
a native value doesn't know its own type.
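For a non-native enumeration, a few of the accessors listed above can be tried directly. This is a small sketch assuming an C<Int>-based C<Day> enumeration with implicit values; it does not cover native enums.

```raku
enum Day <Sun Mon Tue Wed Thu Fri Sat>;

say Fri.key;      # Fri
say Fri.value;    # 5
say +Fri;         # numeric coercion yields the value
say Fri ~~ Int;   # the enum object is derived from Int
say Fri.pair;     # key and value together as a pair

# Truth follows the underlying value: 5 is true, 0 is false.
say ?Fri;
say ?Sun;
```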
Since methods on native types delegate to their container's type,
a variable typed with a native type will know which method to call:

    my day $d = 3;
    $d.key     # returns "Wed"

Such declarational forms are not always convenient; to translate native
enum values back to their names operationally, you can pull out the enum
type's C<EnumMap> and invert it:

    constant %dayname := day.enums.invert;
    %dayname{3} # Wed

=head2 The Enumeration Type

The enumeration type itself is an undefined type object, but supplies
convenient methods:

    Day.defined     # False
    3 ~~ Day($_)    # True, in range
    8 ~~ Day($_)    # False, *not* in range
    Day.enums       # map of key/value pairs

The C<.enums> method returns an C<EnumMap> that may be used either as
a constant hash value or as a list of pairs:

    my enum CoinFace <Heads Tails>;
    CoinFace.enums.keys       # ('Heads', 'Tails')
    CoinFace.enums.values     # (0, 1)
    CoinFace.enums.kv         # ('Heads', 0, 'Tails', 1)
    CoinFace.enums.invert     # (0 => 'Heads', 1 => 'Tails')
    CoinFace.enums.[1]        # Tails => 1

The enumeration typename itself may be used as a coercion operator from
either the key name or a value. First the argument is looked up as
a key; if that is found, the enum object is returned. If the key name
lookup fails, the value is looked up using an inverted mapping table
(which might have duplicates if the mapping is not one-to-one):

    Day('Tue')           # Tue constant, found as key
    Day::('Tue')         # (same thing)

    Day(3)               # Wed constant, found as value
    Day.enums.invert{3}  # (same thing)

=head2 Anonymous Enumerations

An anonymous C<enum> just makes sure each string turns into a pair with
sequentially increasing values, so:

    %e = enum < ook! ook. ook? >;

is equivalent to:

    %e = ();
    %e<ook!> = 0;
    %e<ook.> = 1;
    %e<ook?> = 2;

The return value of an anonymous enumeration is an C<EnumMap>. The
C<enum> keyword is still a declarator here, so the list is evaluated at
compile time. Use a coercion to C<EnumMap> to get a run-time map.

=head2 Composition from Pairs

The enumeration composer inspects list values for pairs, where the value
of the pair sets the next value explicitly.
Non-pairs C<++> the previous value. (Str and buf types increment like
Perl 5 strings.) Since the C<«...»> quoter automatically recognizes pair
syntax along with interpolations, we can simply say:

    my enum DayOfWeek «:Sun(1) Mon Tue Wed Thu Fri Sat»;

    our Str enum Phonetic «:Alpha<A> Bravo Charlie Delta
                            Echo Foxtrot Golf Hotel India Juliet
                            Kilo Lima Mike November Oscar Papa
                            Quebec Romeo Sierra Tango Uniform
                            Victor Whiskey X-ray Yankee Zulu»;

    enum roman (i => 1,   v => 5,
                x => 10,  l => 50,
                c => 100, d => 500,
                m => 1000);

    my Item enum hex «:zero(0) one two three four five six seven eight nine
                      :ten<a> eleven twelve thirteen fourteen fifteen»;

Note that enumeration declaration evaluates its list at compile time, so
any interpolation into such a list may not depend on run-time values.
Otherwise enums wouldn't be constants. (If this isn't what you want, try
initializing an ordinary declaration using C<::=> to make a scoped
readonly value.)

You may import enum types; only non-colliding symbols are imported.
Colliding enum keys are hidden and must be disambiguated with the type
name. Any attempt to use the ambiguous name will result in a fatal
compilation error. (All colliding values are hidden, not just the new
one, or the old one.) Any explicit sub or type definition hides all
imported enum keys of the same name but will produce a warning unless
C<is redefined> is included.

=head2 Anonymous Mixin Roles using C<but> or C<does>

Since non-native C<Enum> values know their enumeration type, they may be
used to name a desired property on the right side of a C<but> or
C<does>. So these:

    $x = "Today" but Tue;
    $y does True;

expand to:

    $x = "Today" but Day::Tue;
    $y does Bool::True;

The C<but> and C<does> operators expect a role on their right side. An
enum type is not in itself a role type; however, the C<but> and C<does>
operators know that when a user supplies an enum type, it implies the
generation of an anonymous mixin role that creates an appropriate
accessor, read-write if an attribute is being created, and read-only
otherwise.
It depends on whether you mix in a specific enum or the whole
enumeration:

    $x = "Today" but Tue;    # $x.Day is read-only
    $x = "Today" but Day;    # $x.Day is read-write

Mixing in a specific enum object implies only the readonly accessor.

    $x = "Today" but Tue;

really means something like:

    $x = "Today".clone;
    $x does anon role { method Day { Day::Tue } };

The fully qualified form does the same thing, and is useful in case of
enum collision:

    $x = "Today" but Day::Tue;

Note that the method name is still C<.Day>, however. If you wish to mix
in colliding method names, you'll have to mixin your own anonymous role
with different method names.

Since an enumeration supplies the type name as a coercion, you can also
say:

    $x = "Today" but Day(Tue);
    $x = "Today" but Day(2);

After any of those

    $x.Day

returns C<Day::Tue> (that is, the constant object representing 2), and
both the general and specific names function as typenames in normal
constraint and coercion uses. Hence,

    $x ~~ Day
    $x ~~ Tue
    $x.Day == Tue
    Day($x) == Tue
    $x.Tue

all return true, and

    $x.Wed
    $x.Day == Wed
    8 ~~ Day($_)

all return false.

Mixing in the full enumeration type produces a read-write attribute:

    $x = "Today" but Day;    # read-write .Day

really means something like:

    $x = "Today".clone;
    $x does anon role { has Day $.Day is rw }

except that nothing happens if there is already a C<rw> attribute of
that name.

Note that the attribute is not initialized.
If that is desired you can supply a C<WHENCE> closure:

    $x = "Today" but Day{ :Day(Tue) }
    $x = "Today" but Day{ Tue }  # conjecturally, for "simple" roles

=head2 Adding Traits

To add traits to an enumeration declaration, place them after the
declared name but before the list:

    enum Size is silly ;

=head2 Exporting

To export an enumeration, place the export trait just before the list:

    enum Maybe is export ;

=head2 Implying a Role

To declare that an enumeration implies a particular role, supply
a C<does> in the same location:

    enum Maybe does TristateLogic ;

=head2 Built-in Enumerations

Two built-in enumerations are:

    our enum Bool does Boolean <False True>;
    our enum Taint does Tainting ;

Note that C<Bool> and C<Taint> are not role names themselves but imply
roles, and the enum values are really subset types of C<Int>, though the
constant objects themselves know that they are of type C<Bool> or
C<Taint>, and can therefore be used correctly in multimethod dispatch.

You can call the low-level C<.Bool> coercion on any built-in type,
because all built-in types do the C<Boolean> role, which requires
a C<.Bool> method. Hence, there is a great difference between saying

    $x does Boolean;    # a no-op, since $x already does Boolean
    $x does Bool;       # create a $.Bool attribute, also does Boolean

Conditionals evaluate the truth of a boolean expression by testing the
return value of C<.Bool>; how they do this is a mystery, except that
they must do something mysterious and platform dependent to avoid
calling C<.Bool> recursively on the results of C<.Bool>.

Never compare a value to "C<True>". Just use it in a boolean context.
Well, almost never...

If you wish to be explicit about a boolean context, use the high-level
C<so> function or C<?> prefix operator, which are based on the
underlying C<.Bool> method. Since C<.Bool> always collapses junctions,
so do these functions. (Hence if you really need to autothread a bunch
of boolean values, you'll have to convert them to some other type such
as C<Int> that can be used as a boolean value later.
Generally it makes no sense to autothread booleans, so we have a policy
of collapsing them sooner rather than later.)

=head2 Miscellaneous Rules

Like other type names and constant names, enum keynames are parsed as
standalone tokens representing scalar values, and don't look for any
arguments. Unlike type names but like constant names, enum keynames
return defined values. Also unlike types and unlike the enum type as
a whole, individual keynames do not respond to C<.()> unless you mix in
C<Callable> somehow. (That is, it makes no sense to coerce Wednesday to
Tuesday by saying C<Tue($wed)>.) Enumerations may not be post-declared.

    our enum Maybe ;
    sub OK is redefined {...}
    $x = OK;   # certainly the enum value
    $x = OK()  # certainly the function

Since there is an enum C<OK>, the function C<OK> may only be called
using parentheses, never in list operator form. (If there is a collision
on two enum values that cancels them both, the function still may only
be called with parentheses, since the enum key is "poisoned".)

=head2 The C<.pick> Method

Enumeration types (and perhaps certain other finite, enumerable types
such as finite ranges) define a C<.pick> method on the type object of
that type. Hence:

    my enum CoinFace <Heads Tails>;
    CoinFace.pick

returns C<Heads> or C<Tails> with equal probability, and

    Month.pick(*)

will return the months in random order. Presumably

    StandardPlayingCards.pick(5)

might return a Royal Flush, but a Full House is much more likely. It can
never return Five Aces, since the pick is done without replacement. (If
it I<does> return Five Aces, it's time to walk away. Or maybe run.)

To pick from the list of keynames or values, derive them via the
C<.enums> method described above.

=head1 Open vs Closed Classes

By default, all classes in Perl are non-final, which means you can
potentially derive from them. They are also open, which means you can
add more methods to them, though you have to be explicit that that is
what you're doing:

    augment class Mu {
        method wow () { say "Wow, I'm in the Cosmic All."
        }
    }

Otherwise you'll get a class redefinition error. (Also, to completely
replace a definition, use "C<supersede>" instead of "C<augment>"...but
don't do that, since the compiler may have already committed to
optimizations based on the old definition.)

In order to discourage casual misuse of these declarators, they are not
allowed on global classes unless you put a special declaration at the
top:

    use MONKEY-TYPING;

For optimization purposes, Perl 6 gives the top-level application the
right to close and finalize classes by the use of C<oo>, a pragma for
selecting global semantics of the underlying object-oriented engine:

    use oo :closed :final;

This forces the optimizer to consider the current file to represent the
top-level application; however, the optimizer is also allowed to assume
these semantics when it can determine that it is linking an entire
application, such as when the current file is being run from the command
line or from a mouse click.

These pragmatics (whether explicit or assumed) merely change the
application's default to closed and final, which means that at the end
of the main compilation (C<CHECK> time) the optimizer is allowed to look
for candidate classes to close or finalize. But anyone (including the
main application) can request that any class stay open or nonfinal, and
the class closer/finalizer must honor that.

    use class :open<Mammal Insect> :nonfinal<Str>

These properties may also be specified on the class definition:

    class Mammal is open {...}
    class Insect is open {...}
    class Str is nonfinal {...}

or by lexically scoped pragma around the class definition:

    {
        use class :open;
        class Mammal {...}
        class Insect {...}
    }
    {
        use class :nonfinal;
        class Str {...}
    }

There is I<no> syntax for declaring individual classes closed or final.
The application may only request that the optimizer close and finalize
unmarked classes.

=head1 Representations

By default Perl 6 assumes that all objects have a representation of
C<P6opaque>.
This may be overridden with a trait:

    class Mammal is repr(P6Hash) {...}

Whether implicit or explicit, the representation is considered to be
fixed for the class after declaration, and the optimizer is free to
optimize based on this guarantee. It is illegal to create an object of
the same type with any other representation. If you wish to allow
objects to be created with run-time specified representations, you must
specifically pessimize the class:

    class Mammal is repr(*) {...}

An C<augment> is allowed to do this as long as it is before the main
C<CHECK> time, at which point the compiler commits to its optimization
strategies. Compilers are not required to support run-time
pessimizations (though they may). Compilers may also generate both
optimal and pessimal code paths and choose which to run based on
run-time information, as long as correct semantics are maintained.

All non-native representations are required to support undefined type
objects that may contain unthrown exceptions (C<Failure> objects); while
this can be implemented using an alternate representation, Perl 6
doesn't think of it that way. All normal objects in Perl 6 may be used
as a specific object (proper noun) if they are defined, or as a generic
object (common noun) whether or not they are defined. You get this
representation polymorphism for free independently of the restriction
above.

=head1 Interface Consistency

By default, all methods and submethods that do not declare an explicit
C<*%> parameter will get an implicit C<*%_> parameter declared for them
whether they like it or not. In other words, all methods allow
unexpected named arguments, so that C<nextsame> semantics work
consistently.

If you mark a class "C