Paweł Murias > SMOP > Base




Base - SMOP basic structures



SMOP__Object

In SMOP, every single value must be binary-compatible with the SMOP__Object struct. This even includes core level constructs such as the interpreter and the native types. This idea comes directly from how perl5 works, with the SV struct.

Unlike p5, however, the SMOP__Object struct is absolutely minimalist: it defines no type, no flags, and no introspection information. It defines only that every SMOP__Object has a "responder interface" (.RI), so the structure is merely:

  struct SMOP__Object {
    SMOP__ResponderInterface* RI;
    /* Maybe there is something here, maybe there is nothing here.
     * Only the responder interface knows.
     */
  };

The value in the .RI member is not unique to the object. For all but singleton classes, one responder interface will be used by multiple object structs. As such, the object is identified only by the memory address at which the struct SMOP__Object is stored.

This means that you can't really do anything to the object yourself, you can only talk to its responder interface. The object serves as both a way to find the correct responder interface, and a way to tell the responder interface which instance data to operate on -- and that is all.

There may be additional data below the .RI member, but if so, only the responder interface knows how to use it. The data for the object instance may, in fact, not be stored in the structure at all -- it could be looked up using the object's address in a completely separate data store.

As such, it is incorrect to attempt to copy or move a SMOP__Object struct using a simple memory copy like C's memcpy(). Even if you were lucky enough to capture all the data in the object, you would have changed its address, and it would not be the same object anymore. This point is especially important to note if an object may exist in multiple address spaces -- only one address will be valid without special handling.
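The identity rule can be illustrated with a small sketch. Everything prefixed with demo_ is hypothetical and exists only for this example, and the responder interface is reduced to a stand-in without its function hooks. The sketch shows two things: a byte-for-byte copy keeps the same responder interface, yet it is a different object because it lives at a different address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct SMOP__ResponderInterface SMOP__ResponderInterface;

typedef struct SMOP__Object {
    SMOP__ResponderInterface* RI;
} SMOP__Object;

/* Reduced stand-in: the function hooks are left out of this sketch. */
struct SMOP__ResponderInterface {
    SMOP__ResponderInterface* RI;
    char* id;
};

/* Hypothetical object type. Its first member is the RI pointer, so a
 * pointer to it can be handed around as a SMOP__Object pointer. */
typedef struct demo_int {
    SMOP__ResponderInterface* RI;
    int value;
} demo_int;

static char demo_int_id[] = "demo int";
static SMOP__ResponderInterface demo_int_ri = { NULL, demo_int_id };

/* Identity is the address of the struct and nothing else. */
static int demo_same_object(const SMOP__Object* a, const SMOP__Object* b) {
    return a == b;
}

/* Many objects may share one responder interface. */
static int demo_same_responder(const SMOP__Object* a, const SMOP__Object* b) {
    assert(a != NULL && b != NULL);
    return a->RI == b->RI;
}

/* What NOT to do: the bytes are duplicated, but the result lives at a
 * new address and is therefore a different object. */
static demo_int demo_bad_copy(const demo_int* original) {
    demo_int copy;
    memcpy(&copy, original, sizeof copy);
    return copy;
}
```

Given a demo_int a and b = demo_bad_copy(&a), demo_same_responder reports true for the pair while demo_same_object reports false, which is exactly the trap described above.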

SMOP__ResponderInterface

The responder interface (which, of course, is also binary-compatible with SMOP__Object) implements the low-level part of the meta object protocol. It is through the responder interface that you can perform actions on the object.

Using the responder interface, arbitrary methods may be invoked on the object. It's important to realize that this method invocation happens at the same level that any high-level language might call. This means that there's no distinction between native operators and high-level operators, nor between native values and high-level values.

The structure of a responder interface is as follows:

  struct SMOP__ResponderInterface {
    SMOP__ResponderInterface* RI;
    SMOP__Object* (*MESSAGE)  (SMOP__Object* interpreter,
                               SMOP__ResponderInterface* self,
                               SMOP__Object* identifier,
                               SMOP__Object* capture);
    SMOP__Object* (*REFERENCE)(SMOP__Object* interpreter,
                               SMOP__ResponderInterface* self,
                               SMOP__Object* object);
    SMOP__Object* (*RELEASE)  (SMOP__Object* interpreter,
                               SMOP__ResponderInterface* self,
                               SMOP__Object* object);
    SMOP__Object* (*WEAKREF)  (SMOP__Object* interpreter,
                               SMOP__ResponderInterface* self,
                               SMOP__Object* object);
    char* id;
    /* Maybe there is something here, maybe there is nothing here.
     * Only the responder interface in member .RI knows.
     */
  };

The SMOP base also defines a few macros that should be used when interacting with SMOP objects. Although the use of those macros is optional in theory, you are strongly advised to stick with them, since that makes transitions to newer versions easier.

As such, each of the function hooks defined in the above structure will be described along with the macros which should be used to access them.

    SMOP_DISPATCH(interpreter, object, identifier, capture)

This macro (and all its parameters) corresponds to the MESSAGE function hook member. This is the function that handles method invocation for the objects which this responder interface oversees:

  SMOP__Object* (*MESSAGE)  (
      SMOP__Object* interpreter,      /* gets interpreter */
      SMOP__ResponderInterface* self, /* gets (responder) object */
      SMOP__Object* identifier,       /* gets identifier */
      SMOP__Object* capture           /* gets capture (instance object inside) */
  );

As you might have noticed, it receives objects as arguments and returns, of course, an object.

SMOP_DISPATCH uses the .MESSAGE function in the responder found at object to invoke a method with the name found at identifier. It invokes that method in the context of the interpreter found at interpreter, using the capture found at capture to pass data to the method's parameters.

Each of these macro arguments is expanded upon in other documentation. You may notice, however, that something appears to be missing. Methods usually have an "invocant" -- which would be the SMOP__Object that was used to find the responder that is being pointed to in object above. If there is one, it is tucked away inside the capture.
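A dispatch can be sketched as follows. The macro expansion shown here is an assumption based on the description above, not a copy of the real header. The demo_capture type, which simply stores the invocant, and the demo_ functions are hypothetical; real captures are described in other documentation.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SMOP__Object SMOP__Object;
typedef struct SMOP__ResponderInterface SMOP__ResponderInterface;

struct SMOP__Object { SMOP__ResponderInterface* RI; };

struct SMOP__ResponderInterface {
    SMOP__ResponderInterface* RI;
    SMOP__Object* (*MESSAGE)  (SMOP__Object* interpreter,
                               SMOP__ResponderInterface* self,
                               SMOP__Object* identifier,
                               SMOP__Object* capture);
    SMOP__Object* (*REFERENCE)(SMOP__Object*, SMOP__ResponderInterface*, SMOP__Object*);
    SMOP__Object* (*RELEASE)  (SMOP__Object*, SMOP__ResponderInterface*, SMOP__Object*);
    SMOP__Object* (*WEAKREF)  (SMOP__Object*, SMOP__ResponderInterface*, SMOP__Object*);
    char* id;
};

/* Assumed expansion: "object" is the responder interface itself. */
#define SMOP_DISPATCH(interpreter, object, identifier, capture) \
    (((SMOP__ResponderInterface*)(object))->MESSAGE( \
        (interpreter), (SMOP__ResponderInterface*)(object), \
        (identifier), (capture)))

/* Hypothetical capture that carries nothing but the invocant. */
typedef struct demo_capture {
    SMOP__ResponderInterface* RI;
    SMOP__Object* invocant;
} demo_capture;

/* Hypothetical MESSAGE hook: every method just returns the invocant
 * that it finds tucked away inside the capture. */
static SMOP__Object* demo_message(SMOP__Object* interpreter,
                                  SMOP__ResponderInterface* self,
                                  SMOP__Object* identifier,
                                  SMOP__Object* capture) {
    (void)interpreter; (void)self; (void)identifier;
    assert(capture != NULL);
    return ((demo_capture*)capture)->invocant;
}

static char demo_id[] = "demo";
static SMOP__ResponderInterface demo_ri =
    { NULL, demo_message, NULL, NULL, NULL, demo_id };

/* Caller side: the responder is looked up through the invocant's .RI
 * member, and the invocant itself travels inside the capture. */
static SMOP__Object* demo_call(SMOP__Object* invocant, SMOP__Object* identifier) {
    demo_capture capture = { NULL, invocant };
    return SMOP_DISPATCH(NULL, invocant->RI, identifier, (SMOP__Object*)&capture);
}
```

Note that demo_call never passes the invocant as a macro argument. Only the responder, the identifier and the capture are handed over.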

    SMOP_REFERENCE(interpreter, object)
    SMOP_RELEASE(interpreter, object)

SMOP_REFERENCE and SMOP_RELEASE call, respectively, the .REFERENCE and .RELEASE functions in a responder interface. The responder interface used is the one that is pointed to by the .RI member of the object structure pointed to by object. The object pointer itself is also passed to the REFERENCE or RELEASE function:

  SMOP__Object* (*REFERENCE)(
      SMOP__Object* interpreter,       /* gets interpreter */
      SMOP__ResponderInterface* self,  /* gets the RI member found at object */
      SMOP__Object* object             /* gets object itself */
  );

These functions increment or decrement the reference count of object in the context of interpreter. The reference count is used to handle automatic cleanup of objects when they are no longer needed -- more on this subject later.

The macros both return the same value that was passed into object, so you can use the macro in most places where you would use an object pointer, much like you would use i++ to postincrement an integer in-place. This is handy for keeping code terse, but take care: never write anything like SMOP_RELEASE(interp,current++) or SMOP_RELEASE(interp,current)++ when working with arrays of objects.
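The following sketch shows the terse style and a safe way to walk an array. The macro expansions are assumptions. Under these assumed expansions the object argument is evaluated more than once, which would be one reason why an argument with side effects such as current++ is dangerous. The demo_counted type and the demo_ functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SMOP__Object SMOP__Object;
typedef struct SMOP__ResponderInterface SMOP__ResponderInterface;

struct SMOP__Object { SMOP__ResponderInterface* RI; };

struct SMOP__ResponderInterface {
    SMOP__ResponderInterface* RI;
    SMOP__Object* (*MESSAGE)  (SMOP__Object*, SMOP__ResponderInterface*,
                               SMOP__Object*, SMOP__Object*);
    SMOP__Object* (*REFERENCE)(SMOP__Object*, SMOP__ResponderInterface*, SMOP__Object*);
    SMOP__Object* (*RELEASE)  (SMOP__Object*, SMOP__ResponderInterface*, SMOP__Object*);
    SMOP__Object* (*WEAKREF)  (SMOP__Object*, SMOP__ResponderInterface*, SMOP__Object*);
    char* id;
};

/* Assumed expansions. Note that "object" appears three times. */
#define SMOP_REFERENCE(interpreter, object) \
    (((SMOP__Object*)(object))->RI->REFERENCE((interpreter), \
        ((SMOP__Object*)(object))->RI, (SMOP__Object*)(object)))
#define SMOP_RELEASE(interpreter, object) \
    (((SMOP__Object*)(object))->RI->RELEASE((interpreter), \
        ((SMOP__Object*)(object))->RI, (SMOP__Object*)(object)))

/* Hypothetical object type with a plain counter. A real responder
 * interface would destroy the object when the count reaches zero. */
typedef struct demo_counted {
    SMOP__ResponderInterface* RI;
    int refs;
} demo_counted;

static SMOP__Object* demo_reference(SMOP__Object* interpreter,
                                    SMOP__ResponderInterface* self,
                                    SMOP__Object* object) {
    (void)interpreter; (void)self;
    ((demo_counted*)object)->refs++;
    return object;
}

static SMOP__Object* demo_release(SMOP__Object* interpreter,
                                  SMOP__ResponderInterface* self,
                                  SMOP__Object* object) {
    (void)interpreter; (void)self;
    ((demo_counted*)object)->refs--;
    return object;
}

static char demo_counted_id[] = "demo counted";
static SMOP__ResponderInterface demo_counted_ri =
    { NULL, NULL, demo_reference, demo_release, NULL, demo_counted_id };

/* Terse style: store the pointer and take a reference in one expression. */
static void demo_store(SMOP__Object* interpreter, SMOP__Object** slot,
                       SMOP__Object* object) {
    assert(slot != NULL);
    *slot = SMOP_REFERENCE(interpreter, object);
}

/* Safe array walk: the index is advanced outside the macro argument. */
static void demo_release_all(SMOP__Object* interpreter, SMOP__Object** items,
                             size_t count) {
    size_t i;
    for (i = 0; i < count; i++) {
        SMOP_RELEASE(interpreter, items[i]);
    }
}
```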

    SMOP_WEAKREF(interpreter, object)

SMOP_WEAKREF calls the .WEAKREF function in a responder interface. It works much the same way as the SMOP_REFERENCE macro, above.

  SMOP__Object* (*WEAKREF)  (
      SMOP__Object* interpreter,       /* gets interpreter */
      SMOP__ResponderInterface* self,  /* gets the RI member found at object */
      SMOP__Object* object             /* gets object itself */
  );

SMOP_WEAKREF can be used wherever you would normally use SMOP_REFERENCE to obtain a "weak reference" instead. This call is allowed to return you a different object than the one you point to with object, and you are supposed to use that as a proxy. Weak references do not count as a reference against the original object for the purposes of garbage collection.

This means that the original object may be freed before the weak reference itself is destroyed. If this happens, the weak reference will start to refer to some appropriate constant (like False) instead of the now-dead object.

The implementation of the weak reference is private to each responder interface's implementation, so the exact behavior may vary depending on the kind of objects you are working with. In particular, note that if an object does not actually need to be reference counted, a weak reference may end up returning the original object, so you are not allowed to assume the macro will always return a different pointer than the one passed via object.

Note that a weak reference is itself an object. So you do still need to call SMOP_RELEASE on it when you are done with it. (It isn't provided just to help us be lazy.) However, all SMOP_REFERENCE and SMOP_RELEASE calls on the weak reference object count references to the proxy object, not the original object.

That makes weak references a handy way to break circular dependencies between objects and code.
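One possible shape for a weak reference is sketched below. Since the implementation is private to each responder interface, this is only one design among many. The macro expansions are assumptions, and all demo_ names, including the demo_false constant and the demo_weak_get accessor, are hypothetical. The proxy is embedded in the target purely to keep the sketch free of memory allocation. A real implementation would need the proxy to outlive the target's storage.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SMOP__Object SMOP__Object;
typedef struct SMOP__ResponderInterface SMOP__ResponderInterface;
typedef SMOP__Object* (*demo_hook)(SMOP__Object* interpreter,
                                   SMOP__ResponderInterface* self,
                                   SMOP__Object* object);

struct SMOP__Object { SMOP__ResponderInterface* RI; };
struct SMOP__ResponderInterface {
    SMOP__ResponderInterface* RI;
    SMOP__Object* (*MESSAGE)(SMOP__Object*, SMOP__ResponderInterface*,
                             SMOP__Object*, SMOP__Object*);
    demo_hook REFERENCE;
    demo_hook RELEASE;
    demo_hook WEAKREF;
    char* id;
};

/* Assumed expansions; the real macros may differ. */
#define DEMO_CALL(hook, interpreter, object) \
    (((SMOP__Object*)(object))->RI->hook((interpreter), \
        ((SMOP__Object*)(object))->RI, (SMOP__Object*)(object)))
#define SMOP_REFERENCE(i, o) DEMO_CALL(REFERENCE, i, o)
#define SMOP_RELEASE(i, o)   DEMO_CALL(RELEASE, i, o)
#define SMOP_WEAKREF(i, o)   DEMO_CALL(WEAKREF, i, o)

/* Stand-in for an appropriate constant such as False. */
static SMOP__Object demo_false = { NULL };

typedef struct demo_weak {
    SMOP__ResponderInterface* RI;
    int refs;               /* references to the proxy itself */
    SMOP__Object* target;   /* the original, or &demo_false once it is dead */
} demo_weak;

typedef struct demo_target {
    SMOP__ResponderInterface* RI;
    int refs;
    demo_weak proxy;        /* embedded only to avoid malloc in this sketch */
} demo_target;

static SMOP__Object* weak_reference(SMOP__Object* in, SMOP__ResponderInterface* self,
                                    SMOP__Object* object) {
    (void)in; (void)self;
    ((demo_weak*)object)->refs++;
    return object;
}
static SMOP__Object* weak_release(SMOP__Object* in, SMOP__ResponderInterface* self,
                                  SMOP__Object* object) {
    (void)in; (void)self;
    ((demo_weak*)object)->refs--;
    return object;
}
static SMOP__Object* target_reference(SMOP__Object* in, SMOP__ResponderInterface* self,
                                      SMOP__Object* object) {
    (void)in; (void)self;
    ((demo_target*)object)->refs++;
    return object;
}
static SMOP__Object* target_release(SMOP__Object* in, SMOP__ResponderInterface* self,
                                    SMOP__Object* object) {
    demo_target* t = (demo_target*)object;
    (void)in; (void)self;
    if (--t->refs == 0) t->proxy.target = &demo_false;  /* target is now dead */
    return object;
}
static SMOP__Object* target_weakref(SMOP__Object* in, SMOP__ResponderInterface* self,
                                    SMOP__Object* object) {
    demo_target* t = (demo_target*)object;
    (void)in; (void)self;
    t->proxy.refs++;        /* counts the proxy, not the target */
    return (SMOP__Object*)&t->proxy;
}

static char weak_id[] = "demo weak";
static char target_id[] = "demo target";
static SMOP__ResponderInterface demo_weak_ri =
    { NULL, NULL, weak_reference, weak_release, weak_reference, weak_id };
static SMOP__ResponderInterface demo_target_ri =
    { NULL, NULL, target_reference, target_release, target_weakref, target_id };

/* The creator of the target starts out holding one reference. */
static void demo_target_init(demo_target* t) {
    t->RI = &demo_target_ri;
    t->refs = 1;
    t->proxy.RI = &demo_weak_ri;
    t->proxy.refs = 0;
    t->proxy.target = (SMOP__Object*)t;
}

/* Hypothetical accessor: what the weak reference currently refers to. */
static SMOP__Object* demo_weak_get(SMOP__Object* weak) {
    return ((demo_weak*)weak)->target;
}
```

Taking the weak reference leaves the target's own count untouched. Once the last strong reference is released, demo_weak_get starts returning the constant, and the proxy still has to be released separately.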

Other Macros

macro SMOP__Object__BASE

This macro defines the top members present in every SMOP Object, basically defining the members documented in the section above. Currently that is just the .RI member, but should members be added in future versions, they will appear in this list. It should be used when declaring new types of objects.

macro SMOP__ResponderInterface__BASE

Like the above macro, except that this defines the members present in all responder interface objects, as documented further above. Note this does not include SMOP__Object__BASE. It is best not to nest such macros to keep them reusable for compound types.

macro SMOP_RI(value)

Shorthand to dereference the .RI member of a SMOP__Object structure given the address of the SMOP__Object structure.
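As a rough sketch of how these three macros might be used when declaring new types: the macro bodies below are assumptions reconstructed from the descriptions in this document, and the demo_point types are hypothetical. The second type shows why the two BASE macros are kept separate, since a compound responder interface lists both and then adds its own members.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SMOP__Object SMOP__Object;
typedef struct SMOP__ResponderInterface SMOP__ResponderInterface;

/* Assumed definitions, reconstructed from the text of this document. */
#define SMOP__Object__BASE \
    SMOP__ResponderInterface* RI;

#define SMOP__ResponderInterface__BASE \
    SMOP__Object* (*MESSAGE)  (SMOP__Object* interpreter, \
                               SMOP__ResponderInterface* self, \
                               SMOP__Object* identifier, \
                               SMOP__Object* capture); \
    SMOP__Object* (*REFERENCE)(SMOP__Object* interpreter, \
                               SMOP__ResponderInterface* self, \
                               SMOP__Object* object); \
    SMOP__Object* (*RELEASE)  (SMOP__Object* interpreter, \
                               SMOP__ResponderInterface* self, \
                               SMOP__Object* object); \
    SMOP__Object* (*WEAKREF)  (SMOP__Object* interpreter, \
                               SMOP__ResponderInterface* self, \
                               SMOP__Object* object); \
    char* id;

#define SMOP_RI(value) (((SMOP__Object*)(value))->RI)

struct SMOP__Object { SMOP__Object__BASE };
struct SMOP__ResponderInterface { SMOP__Object__BASE SMOP__ResponderInterface__BASE };

/* Hypothetical plain object type: base members first, then its own data. */
typedef struct demo_point {
    SMOP__Object__BASE
    int x;
    int y;
} demo_point;

/* Hypothetical compound type: a responder interface with extra data. */
typedef struct demo_point_ri {
    SMOP__Object__BASE
    SMOP__ResponderInterface__BASE
    int instances;
} demo_point_ri;

static char demo_point_id[] = "demo point";
static demo_point_ri demo_point_responder =
    { NULL, NULL, NULL, NULL, NULL, demo_point_id, 0 };

/* SMOP_RI works on any pointer whose target starts with the base members. */
static char* demo_type_id(SMOP__Object* value) {
    assert(value != NULL);
    return SMOP_RI(value)->id;
}
```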

Talking Trash (Garbage Collection)

SMOP uses reference counting garbage collection conventions, as you can probably tell from the above documentation for SMOP_REFERENCE and SMOP_RELEASE.

In the initial implementation, a reference counting garbage collector was selected since this type of garbage collector is considerably simpler to implement (even if considerably harder to debug and maintain.) However, when design goals expanded to include interoperability with perl5, it became evident that following reference counting conventions would be a necessity in making SMOP and perl5 work together.

One thing that might not be obvious from the above technical notes is that it's up to each responder interface to implement its own garbage collector. This means that we can have several garbage collectors coexisting within the same process. For instance, the SMOP default low-level and the perl5 garbage collectors could both manage different sets of objects. In addition, objects that do not do any garbage collection at all may be present. Even in this case, all objects at least pretend to implement the mechanisms that make reference counting possible.

That is why the .REFERENCE, .RELEASE and .WEAKREF functions are included at the base level. Relatively few objects should be responder interfaces, so it is better for them just to carry vestigial members than make the code complex by trying to do without them. This set of functions should be sufficient to interact with the majority of reference counting garbage collectors.
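A responder interface for objects that need no garbage collection at all could look roughly like this. The macro expansions are assumptions and the demo_ names are hypothetical. All three hooks simply hand the object back, so generic code can follow the reference counting conventions without knowing that nothing is being counted.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SMOP__Object SMOP__Object;
typedef struct SMOP__ResponderInterface SMOP__ResponderInterface;
typedef SMOP__Object* (*demo_hook)(SMOP__Object* interpreter,
                                   SMOP__ResponderInterface* self,
                                   SMOP__Object* object);

struct SMOP__Object { SMOP__ResponderInterface* RI; };
struct SMOP__ResponderInterface {
    SMOP__ResponderInterface* RI;
    SMOP__Object* (*MESSAGE)(SMOP__Object*, SMOP__ResponderInterface*,
                             SMOP__Object*, SMOP__Object*);
    demo_hook REFERENCE;
    demo_hook RELEASE;
    demo_hook WEAKREF;
    char* id;
};

/* Assumed expansions; the real macros may differ. */
#define DEMO_CALL(hook, interpreter, object) \
    (((SMOP__Object*)(object))->RI->hook((interpreter), \
        ((SMOP__Object*)(object))->RI, (SMOP__Object*)(object)))
#define SMOP_REFERENCE(i, o) DEMO_CALL(REFERENCE, i, o)
#define SMOP_RELEASE(i, o)   DEMO_CALL(RELEASE, i, o)
#define SMOP_WEAKREF(i, o)   DEMO_CALL(WEAKREF, i, o)

/* Pretend to count: nothing is tracked, the object is handed back as is. */
static SMOP__Object* demo_noop(SMOP__Object* interpreter,
                               SMOP__ResponderInterface* self,
                               SMOP__Object* object) {
    (void)interpreter; (void)self;
    return object;
}

static char demo_constant_id[] = "demo constant";
static SMOP__ResponderInterface demo_constant_ri =
    { NULL, NULL, demo_noop, demo_noop, demo_noop, demo_constant_id };

/* Hypothetical constant that lives for the whole process. */
static SMOP__Object demo_true = { &demo_constant_ri };

/* Generic code: it follows the protocol without knowing how, or whether,
 * the object behind the pointer is collected. */
static SMOP__Object* demo_borrow(SMOP__Object* interpreter, SMOP__Object* object) {
    SMOP__Object* held = SMOP_REFERENCE(interpreter, object);
    /* ... use held ... */
    return SMOP_RELEASE(interpreter, held);
}
```

For such an object, SMOP_WEAKREF returns the original pointer as well, which is the case the weak reference notes above warn about.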

Who owns an object?

This is the most important question: when to call SMOP_REFERENCE and when to call SMOP_RELEASE. The following documents the policy that must be followed to correctly garbage collect SMOP objects.

The text below refers to ownership "stakes", which belong either to sections of code or to other objects -- an ownership stake is a concept, not a solid object residing in memory somewhere. One stake in an object is merely an obligation by the owner to call SMOP_RELEASE once on the object, or to transfer the stake by ensuring that some other code will call SMOP_RELEASE on the object when appropriate.

There is also an obligation never to call SMOP_RELEASE on an object in which you have no ownership stakes.



Most reference counting will happen around SMOP_DISPATCH/.MESSAGE method invocations.

In general, the caller can "fire and forget" and the callee has to clean up the mess. From the caller side, the only tricky part is remembering to take an extra SMOP_REFERENCE when installing one object into a capture more than once, or if the object is to be used after a capture it is inside has been destroyed.

The callee, on the other hand, must remember to SMOP_RELEASE any objects it extracted from the capture (once for every time that object is extracted) and after that, to SMOP_RELEASE the capture itself, before returning. Alternatively it may dispose of the ownership stakes by transferring them to other code or captures, like, for example, inside its result.
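The caller and callee duties can be traced with a small sketch. The macro expansions are assumptions. The single-slot demo_capture, the convention that extracting an item takes one reference, and all demo_ names are hypothetical stand-ins for the real capture API, which is documented elsewhere.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SMOP__Object SMOP__Object;
typedef struct SMOP__ResponderInterface SMOP__ResponderInterface;
typedef SMOP__Object* (*demo_hook)(SMOP__Object*, SMOP__ResponderInterface*,
                                   SMOP__Object*);

struct SMOP__Object { SMOP__ResponderInterface* RI; };
struct SMOP__ResponderInterface {
    SMOP__ResponderInterface* RI;
    SMOP__Object* (*MESSAGE)(SMOP__Object*, SMOP__ResponderInterface*,
                             SMOP__Object*, SMOP__Object*);
    demo_hook REFERENCE;
    demo_hook RELEASE;
    demo_hook WEAKREF;
    char* id;
};

/* Assumed expansions; the real macros may differ. */
#define SMOP_DISPATCH(in, ri, ident, cap) \
    (((SMOP__ResponderInterface*)(ri))->MESSAGE((in), \
        (SMOP__ResponderInterface*)(ri), (ident), (cap)))
#define SMOP_REFERENCE(in, obj) \
    (((SMOP__Object*)(obj))->RI->REFERENCE((in), \
        ((SMOP__Object*)(obj))->RI, (SMOP__Object*)(obj)))
#define SMOP_RELEASE(in, obj) \
    (((SMOP__Object*)(obj))->RI->RELEASE((in), \
        ((SMOP__Object*)(obj))->RI, (SMOP__Object*)(obj)))

/* Hypothetical counted value and single-slot capture. */
typedef struct demo_value { SMOP__ResponderInterface* RI; int refs; } demo_value;
typedef struct demo_capture {
    SMOP__ResponderInterface* RI;
    int refs;
    SMOP__Object* item;   /* the capture holds one stake in this object */
} demo_capture;

static SMOP__Object* value_reference(SMOP__Object* in, SMOP__ResponderInterface* self,
                                     SMOP__Object* object) {
    (void)in; (void)self;
    ((demo_value*)object)->refs++;
    return object;
}
static SMOP__Object* value_release(SMOP__Object* in, SMOP__ResponderInterface* self,
                                   SMOP__Object* object) {
    (void)in; (void)self;
    ((demo_value*)object)->refs--;
    return object;
}
static SMOP__Object* capture_release(SMOP__Object* in, SMOP__ResponderInterface* self,
                                     SMOP__Object* object) {
    demo_capture* c = (demo_capture*)object;
    (void)self;
    if (--c->refs == 0) SMOP_RELEASE(in, c->item);  /* give up the capture's stake */
    return object;
}

/* Callee: release what was extracted, then the capture itself. */
static SMOP__Object* demo_method(SMOP__Object* in, SMOP__ResponderInterface* self,
                                 SMOP__Object* identifier, SMOP__Object* capture) {
    /* Extracting hands the callee one new stake in the item. */
    SMOP__Object* item = SMOP_REFERENCE(in, ((demo_capture*)capture)->item);
    (void)self; (void)identifier;
    /* ... use item ... */
    SMOP_RELEASE(in, item);
    SMOP_RELEASE(in, capture);
    return NULL;
}

static char value_id[] = "demo value";
static char capture_id[] = "demo capture";
static SMOP__ResponderInterface demo_value_ri =
    { NULL, demo_method, value_reference, value_release, NULL, value_id };
static SMOP__ResponderInterface demo_capture_ri =
    { NULL, NULL, NULL, capture_release, NULL, capture_id };

/* Caller: it wants to keep using value after the call, so it takes an
 * extra reference for the capture, dispatches, and does no cleanup.
 * Returns the capture's remaining count so the balance can be checked. */
static int demo_caller(SMOP__Object* in, SMOP__Object* value) {
    demo_capture capture = { &demo_capture_ri, 1, SMOP_REFERENCE(in, value) };
    SMOP_DISPATCH(in, value->RI, NULL, (SMOP__Object*)&capture);
    return capture.refs;
}
```

After demo_caller returns, the value is back at the count it started with and the capture's count is zero: every stake that was created has been released exactly once.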


This document describes everything that you can assume about an arbitrary object. This means that you can only introspect in more detail by either calling a method, or via special knowledge of the internals of the responder interface of the given object (for example, inside the code of the responder interface itself.)

It is erroneous to assume anything about the internal structure of any object, even responder interface objects, beyond what is described in this document.
