The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
=for rcs $Id$

=head1 NAME

gencls - class interface compiler for Prima core modules

=head1 SYNOPSIS

  gencls --h --inc --tml -O -I<name> --depend --sayparent filename.cls


=head1 DESCRIPTION

Creates headers with C macros and structures for Prima
core module object definitions.

=head1 ARGUMENTS

gencls accepts the following arguments:

=over 4

=item --h

Generates .h file ( with declarations to be included in one or more files )

=item --inc

Generates .inc file ( with declarations to be included in only file )

=item -O

Turns optimizing algorithm for .inc files on. Algorithm is
based on an assumption, that some functions are declared identically,
therefore the code piece that handles the parameter and result conversion
can be shared. With C<-O>
flag on, a thunk body is replaced to a call to
a function, which name is made up from all method parameters plus result.
Actual function is not written in .inc file, but in .tml file.
All duplicate declarations from a set of .tml files can be removed
and the reminder written to one file by L<tmlink> utility.

=item --tml

Generates .tml file. Turns C<-O> automatically on.

=item -Idirname

Adds a directory to a search path, where the utility searches for
.cls files. Can be specified several times.

=item --depend

Prints out dependencies for a given file.

=item --sayparent

Prints out the immediate parent of a class inside given file.

=back

=head1 SYNTAX

In short, the syntax of a .cls file can be described by the following scheme:

  [ zero or more type declarations ]
  [ zero or one class declaration ]

Gencls produces .h, .inc or .tml files, with a base name of
the .cls file, if no object or package name given, or
with a name of the object or the package otherwise.

=head2 Basic scalar data types

Gencls has several built-in scalar data types, that it knows how to deal
with. To 'deal' means that it can generate a code that transfers
data of these types between C and perl, using XS ( see L<perlguts> )
library interface.

The types are:

   int
   Bool
   Handle
   double
   SV*
   HV*
   char *
   string ( C declaration is char[256] )

There are also some derived built-in types, which are

   long 
   short 
   char  
   Color 
   U8    

that are mapped to int. The data undergo no conversion to int in transfer
process, but it is stored instead to perl scalar using newSViv() function,
which, in turn, may lose bits or a sign.

=head2 Derived data types

The syntax for a new data types definition is as follows:

   <scope> <prefix> <id> <definition>

A scope can be one of two pragmas, C<global> or C<local>.
They hint the usage of a new data type, whether the type
will be used only for one or more objects. Usage of
C<local> is somewhat resembles C pragma static.
Currently the only difference is that a function 
using a complex local type in the parameter list or
as the result is not a subject for C<-O> optimization.

=head2 Scalar types

New scalar types may only be aliased to the existing ones,
primarily for C coding convenience.
A scalar type can be defined in two ways:

=over

=item Direct aliasing

Syntax:

  <scope> $id => <basic_scalar_type>;

Example:

  global $Handle => int;

The new type id will not be visible in C files, but the type
will be substituted over all .cls files that include
this definition.

=item C macro 

Syntax:

  <scope> id1 id2

Example:

  global API_HANDLE UV

Such code creates a C macro definition in
.h header file in form

  #define id1 id2

C macros with parameters are not allowed. id1 and id2 are
not required to be present in .cls name space, and
no substitution during .cls file processing is made.
This pragma usage is very limited.

=back

=head2 Complex types

Complex data types can be arrays, structs and hashes.
They can be a combination or a vector of scalar ( but
not complex) data types.

Gencls allows several combinations of complex data types
that C language does not recognize. These will be described
below.

Complex data types do not get imported into perl code.
A perl programmer must conform to the data type used
when passing parameters to a function.

=over

=item Arrays

Syntax:

  <scope> @id <basic_scalar_type>[dimension]; 

Example:

  global @FillPattern U8[8]; 

Example of functions using arrays:

  Array * func( Array a1, Array * a2);

Perl code:

  @ret = func( @array1, @array2); 

Note that array references are not used, and
the number of items in all array parameters must
be exactly as the dimensions of the arrays.

Note: the following declaration will not compile
with C compiler, as C cannot return arrays. However 
it is not treated as an error by gencls:

  Array func();

=item Structs

Syntax:

  <scope> @id {
     <basic_scalar_type> <id>;
     ...
     <basic_scalar_type> <id>;
  };

Example:

  global @Struc {
     int    number;
     string id;
  }


Example of functions using structs:

  Struc * func1( Struc a1, Struc * a2);
  Struc   func2( Struc a1, Struc * a2);

Perl code:

  @ret = func1( @struc1, @struc2); 
  @ret = func2( @struc1, @struc2); 

Note that array references are not used, and
both number and order of items in all array parameters 
must be set exactly as dimensions and order 
of the structs. Struct field names are not used
in perl code as well.

=item Hashes

Syntax:

  <scope> %id {
     <basic_scalar_type> <id>;
     ...
     <basic_scalar_type> <id>;
  };

Example:

  global %Hash {
     int    number;
     string id;
  }


Example of functions using hashes:

  Hash * func1( Hash a1, Hash * a2);
  Hash   func2( Hash a1, Hash * a2);

Perl code:

  %ret = %{func1( \%hash1, \%hash2)};
  %ret = %{func2( \%hash1, \%hash2)};

Note that only hash references are used and returned.
When a hash is passed from perl code it might have
some or all fields unset. The C structure is filled
and passed to a C function, and the fields that were
unset are assigned to a corresponding C_TYPE_UNDEF
value, where TYPE is one of NUMERIC, STRING and POINTER
literals.

Back conversion does not count on these values
and always returns all hash keys with a corresponding
pair.

=back

=head2 Namespace section

Syntax:

   <namespace> <ID> {
      <declaration>
      ...
      <declaration>
   }

A .cls file can have zero or one namespace sections,
filled with function descriptions. Functions described here
will be exported to the given ID during initialization
code. A namespace can be either C<object> or C<package>.

The package namespace syntax allows only declaration
of functions inside a C<package> block.

    package <Package ID> {
        <function description>
        ...
    }

The object namespace syntax includes variables and properties
as well as functions ( called methods in the object syntax ).
The general object namespace syntax is

    object <Class ID> [(Parent class ID)] {
       <variables>
       <methods>
       <properties>
    }


Within an object namespace the inheritance syntax
can be used:

    object <Class ID> ( <Parent class ID>)  { ... }

or a bare root object description ( with no ancestor )

    object <Class ID> { ... } 

for the object class declaration.

=head2 Functions

Syntax:

    [<prefix>] <type> <function_name> (<parameter list>) [ => <alias>];

Examples:

        int   package_func1( int a, int b = 1) => c_func_2; 
        Point package_func2( Struc * x, ...);
 method void  object_func3( HV * profile);

A prefix is used with object functions ( methods ) only.
More on the prefix in L<Methods> section.

A function can return nothing ( void ), a scalar ( int, string, etc )
or a complex ( array, hash ) type. It can as well accept
scalar and complex parameters, with type conversion that 
corresponds to the rules described above in L<Basic scalar data types>
section.

If a function has parameters and/or result of a type that
cannot be converted automatically between C and perl,
it gets declared but not exposed to perl namespace.
The corresponding warning is issued.
It is not possible using gencls syntax to declare 
a function with custom parameters or result data.
For such a purpose the explicit C declaration 
of code along with C<newXS> call must be made.

Example: ellipsis (...) cannot be converted by gencls,
however it is a legal C construction.

  Point package_func2( Struc * x, ...); 


The function syntax has several convenience additions:

=over

=item Default parameter values

Example:

  void func( int a = 15);

A function declared in such way can be called both
with 0 or 1 parameters. If it is called with 0 parameters,
an integer value of 15 will be automatically used. 
The syntax allows default parameters for types int,
pointer and string and their scalar aliases.

Default parameters can be as many as possible, but
they have to be in the end of the function parameter list.
Declaration C<func( int a = 1, int b)> is incorrect.

=item Aliasing

In the generated C code, a C function has to be called
after the parameters have been parsed. Gencls expects
a conformant function to be present in C code, with
fixed name and parameter list. However, if the task of
such function is a wrapper to an identical function
published under another name, aliasing can be preformed
to save both code and speed.

Example:

   package Package {
      void func( int x) => internal;
   }

A function declared in that way will not
call Package_func() C function, but internal()
function instead. The only request is that internal()
function must have identical parameter and result declaration
to a func().

=item Inline hash

A handy way to call a function with a hash
as a parameter from perl was devised. If
a function is declared with the last parameter
or type C<HV*>, then parameter translation
from perl to C is performed as if all the parameters passed
were a hash. This hash is passed to a C function
and it's content returned then back to perl as a hash again.
The hash content can be modified inside the C function.

This declaration is used heavily in constructors,
which perl code is typical

   sub init 
   {
      my %ret = shift-> SUPER::init( @_);
      ...
      return %ret; 
   }

and C code is usually

   void Obj_init ( HV * profile) {
       inherited init( profile);
       ... [ modify profile content ] ...
   }

=back

=head2 Methods

Methods are functions called in a context of an object.
Virtually all methods need to have an access to an object
they are dealing with. Prima objects are visible in C
as Handle data type. Such Handle is actually a pointer
to an object instance, which in turn contains a pointer
to the object virtual methods table ( VMT ).
To facilitate an OO-like syntax, this Handle parameter
is almost never mentioned in all methods of an object description
in a cls file, although being implicit counted, so every
cls method declaration

   method void a( int x)

for an object class Object is reflected in C as 

   void Object_a( Handle self, int x)

function declaration. Contrary to package functions, that gencls
is unable to publish if it is unable to deal with the 
unsupported on unconvertible parameters, there is a way
to issue such a declaration with a method. The primary use for that
is the method name gets reserved in the object's VMT.

Methods are accessible in C code by the direct name
dereferencing of a C<Handle self> as a corresponding
structure:

    ((( PSampleObject) self)-> self)-> sample_method( self, ...);

A method can have one of six prefixes that govern C code
generation: 

=over

=item method

This is the first and the most basic method type.
It's prefix name, C<method> is therefore was chosen as the most
descriptive name. Methods are expected to be coded in C,
the object handle is implicit and is not included into a .cls description.

   method void a()

results in

   void Object_a( Handle self)

C declaration. A published method automatically
converts its parameters and a result between C and perl.

=item public

When the methods that have parameters and/or result that 
cannot be automatically converted between C and perl need to be declared,
or the function declaration does not fit into C syntax,
a C<public> prefix is used. The methods declared with C<public>
is expected to communicate with perl by means of XS ( see L<perlxs> 
) interface. It is also expected that a C<public> method creates both
REDEFINED and FROMPERL functions ( see L<Prima::internals>  for
details). Examples are many throughout Prima source, and
will not be shown here. C<public> methods usually have
void result and no parameters, but that does not matter much,
since gencls produces no conversion for such methods.

=item import

For the methods that are unreasonable to code in C but in
perl instead, gencls can be told to produce the corresponding
wrappers using C<import> prefix. This kind of a method can
be seen as C<method> inside-out. C<import> function does not
need a C counterpart, except the auto-generated code.


=item static

If a method has to be able to work both with and without
an object instance, it needs to be prepended with C<static> prefix.
C<static> methods are all alike C<method> ones, except that
C<Handle self> first parameter is not implicitly declared.
If a C<static> method is called without an object ( but with
a class ), like

   Class::Object-> static_method();

its first parameter is not a object but a "Class::Object" string.
If a method never deals with an object, it is enough to use
its declaration as

   static a( char * className = "");

but is if does, a

   static a( SV * class_or_object = nil);

declaration is needed. In latter case C code itself has to determine
what exactly has been passed, if ever. Note the default parameter
here: a C<static> method is usually legible to call as

  Class::Object::static_method();

where no parameters are passed to it. Without the default parameter
such a call generates an 'insufficient parameters passed' runtime error.

=item weird

We couldn't find a better name for it. C<weird> prefix
denotes a method that combined properties both from C<static>
and C<public>. In other words, gencls generates no conversion
code and expects no C<Handle self> as a first parameter for
such a method. As an example Prima::Image::load can be depicted,
which can be called using a wide spectrum of calling semantics
( see L<Prima::image-load> for details).

=item c_only

As its name states, C<c_only> is a method that is present on a VMT
but is not accessible from perl. It can be overloaded from
C only. Moreover, it is allowed to register a perl function with a name
of a C<c_only> method, and still these entities will be wholly
independent from each other - the overloading will not take place.

NB: methods that have result and/or parameters data types that
can not be converted automatically, change their prefix to C<c_only>.
Probably this is the wrong behavior, and such condition have to signal 
an error.

=back

=head2 Properties

Prima toolkit introduces an entity named property,
that is expected to replace method pairs whose function
is to acquire and assign some internal object variable,
for example, an object name, color etc. Instead of 
having pair of methods like Object::set_color and Object::get_color,
a property Object::color is devised. A property is
a method with the special considerations, in particular,
when it is called without parameters, a 'get' mode
is implied. In contrary, if it is called with one parameter,
a 'set' mode is triggered. Note that on both 'set' and 'get'
invocations C<Handle self> first implicit parameter is
always present.

Properties can operate with different, but fixed amount
of parameters, and perform a 'set' and 'get' functions
only for one. By default the only parameter is the implicit
C<Handle self>:

   property char * name

has C counterpart

   char * Object_name( Handle self, Bool set, char * name)

Depending on a mode, C<Bool set> is either C<true> or C<false>.
In 'set' mode a C code result is discarded, in 'get' mode
the parameter value is undefined.

The syntax for multi-parameter property is

   property long pixel( int x, int y);

and C code

   long Object_pixel( Handle self, Bool set, int x, int y, long pixel)

Note that in the multi-parameter case the parameters declared
after property name are always initialized, in both 'set' and 'get' modes.

=head2 Instance variables

Every object is characterized by its unique internal state.
Gencls syntax allows a variable declaration, for variables that are
allocated for every object instance. Although data type
validation is not performed for variables, and their declarations
just get copied 'as is', complex C declarations involving
array, struct and function pointers are not recognized.
As a workaround, pointers to typedef'd entities are used.
Example:

   object SampleObject {
      int x;
      List list;
      struct { int x } s; # illegal declaration
   }

Variables are accessible in C code by direct name
dereferencing of a C<Handle self> as a corresponding
structure:

    (( PSampleObject) self)-> x;


=head1 AUTHORS

Dmitry Karasik, E<lt>dmitry@karasik.eu.orgE<gt>.
Anton Berezin, E<lt>tobez@tobez.orgE<gt>.

=head1 SEE ALSO

L<Prima::internals>, L<tmlink>

=head1 COPYRIGHT

This program is distributed under the BSD License.