Michael Friendly > SAS-Parser-0.93 > SAS::Header



Module Version: 0.91


SAS::Header - Create a documentation header comment for a SAS program


 use SAS::Header;
 my $p = SAS::Header->new;
 $p->parse_file('mysas.sas');          # returns a SAS::Parser object
 my $header = $p->makeheader();        # extract info and format the header

 my @macdefs = $p->macdefs();          # any macros defined?
 foreach (@macdefs) {
   $header .= $p->macdescribe($_);     # describe args, append to header
 }


SAS::Header is a sub-class of SAS::Parser which parses a SAS file and creates a block-comment header. It can also generate descriptions of SAS macros found in the file. It is designed to make a reasonably good start on documentation, which would then be edited or extended manually.

It overrides the parse_ccomment() and parse_mdef() methods from SAS::Parser to extract additional information for this purpose.


The following methods are defined in the SAS::Header class.

$p -> parse_file($filename)

Calls the SAS::Parser parse_file() method with the options silent=>1, trim=>0, store=>'global|ccomment'.

$header = $p -> makeheader()

Call makeheader() after the file has been parsed. makeheader() extracts the following information from the SAS::Header object: name, title, doc, procs called, datasets created, macros defined, macros called, author, created, revised, version.

Each item is formatted as a line of the form:

    Key: Information

for example,

   Name: filename.sas
  Title: Description of what I do

and then the collection of lines is formatted as a boxed /* ... */ comment, whose width is determined by the variable $SAS::Header::width. Any "Information" portion that would exceed $SAS::Header::width is wrapped so that successive lines are indented.

The keys that appear in the header are determined dynamically from an array, @headitems, whose default value is

        @headitems = qw(
                name title doc SEP
                procs macdefs macros datasets modules SEP
                author created revised version);

where SEP stands for a separator line. An application may change this array to alter the items or their order in the header. However, only those keys which have some content from the current file actually appear in the header.

This boxed comment is returned as a "\n" separated string.
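The selection of header lines described above can be sketched as follows. This is a minimal illustration, not the module's own code: the helper format_items() and the %info hash are hypothetical names, and wrapping and boxing are left out. It shows only how SEP entries and keys without content might be handled.

```perl
use strict;
use warnings;

# Default item order from the documentation; SEP marks a separator line.
my @headitems = qw(
    name title doc SEP
    procs macdefs macros datasets modules SEP
    author created revised version);

# Hypothetical helper: turn a hash of extracted values into "Key: Information" lines.
sub format_items {
    my ($info, @items) = @_;
    my @lines;
    for my $item (@items) {
        if ($item eq 'SEP') {
            # An empty string stands for a separator; skip leading or doubled ones.
            push @lines, '' if @lines && $lines[-1] ne '';
            next;
        }
        my $value = $info->{$item};
        next unless defined $value && length $value;   # only keys with content appear
        push @lines, sprintf('%7s: %s', ucfirst $item, $value);
    }
    pop @lines while @lines && $lines[-1] eq '';       # no trailing separator
    return join "\n", @lines;
}

my %info = (name   => 'filename.sas',
            title  => 'Description of what I do',
            author => 'A. User');
print format_items(\%info, @headitems), "\n";
```

Reordering or trimming the array passed as @items changes the layout without touching the formatting code, which is the point of driving the header from a list.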


The parse_ccomment() method defined here overrides the default parse_ccomment() method of the SAS::Parser class, so that any information present in an existing program header can be found during the initial parsing.

As defined here, this method ignores all C-comments contained within a PROC or DATA step. For other C-comments, it looks for strings with the following keywords:


The keyword may be in upper, lower, or mixed case, but must be followed by optional whitespace and a ':'. When such a line is found, a corresponding entry is added to the SAS::Header object and later used by makeheader(). For example, the Author information would be stored or accessed as $p->{author}.
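The matching rule can be sketched with a single regular expression. The keyword list below is an assumption chosen for illustration (the module's actual list may differ), and scan_comment() is a hypothetical helper, not a SAS::Header method.

```perl
use strict;
use warnings;

# Assumed keyword list for illustration; the module's actual list may differ.
my @keywords = qw(title doc author created revised version);
my $kw_re    = join '|', @keywords;

# Hypothetical helper: scan the text of one C-comment for "Keyword : value" lines.
sub scan_comment {
    my ($comment) = @_;
    my %found;
    for my $line (split /\n/, $comment) {
        # Keyword in any case, optional whitespace, a colon, then the value.
        # A closing "*/" at the end of the line is not part of the value.
        if ($line =~ /\b($kw_re)\s*:\s*(.*?)\s*(?:\*\/)?\s*$/i) {
            $found{ lc $1 } = $2;     # stored under the lower-case key
        }
    }
    return \%found;
}

my $found = scan_comment("/* Title: Demo program\n   AUTHOR : M. Friendly */");
print "$_ => $found->{$_}\n" for sort keys %$found;
```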

$author = $p->get_author()

Returns an author string extracted from the parse. If an author was found by parse_ccomment(), that string is returned. Otherwise, the get_author() method assumes the current user is the author, and returns a string composed from the USER and HOST environment variables. On systems which have NetInfo installed, nidump(8) is called to get the user's real name.
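The fallback logic might look roughly like this. It is a sketch under assumptions: default_author() is a hypothetical name, the "user@host" layout is a guess at how the two environment variables are combined, and the NetInfo lookup is not shown.

```perl
use strict;
use warnings;

# Hypothetical helper: prefer an author found in the program header,
# otherwise build a string from the USER and HOST environment variables.
sub default_author {
    my ($found) = @_;
    return $found if defined $found && length $found;
    my $user = defined $ENV{USER} ? $ENV{USER} : 'unknown';
    my $host = defined $ENV{HOST} ? $ENV{HOST} : 'localhost';
    return "$user\@$host";            # assumed format, for illustration only
}

print default_author('M. Friendly'), "\n";   # header value wins
print default_author(undef), "\n";           # falls back to the environment
```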

$title = $p->get_title()

Returns a title string extracted from the parse. If a title was found by parse_ccomment(), that string is returned. Otherwise, the get_title() method tries harder, examining the statements returned by $p->stored() (either /* ... */ comments or TITLE statements) until something reasonable is found.

$version = $p->get_version()

Returns a version string extracted from the parse. If a version was found by parse_ccomment(), that string is returned. Otherwise, the get_version() method returns undef.

$doc = $p->get_doc()

Returns a 'Doc' string (a pointer to external documentation or a web URL) extracted from the parse. If a doc string was found by parse_ccomment(), that string is returned. Otherwise, the get_doc() method returns undef.

$desc = $p -> macdescribe('mymacro', ['plain'|'pod'|'html'])

Generates a description of a macro and its arguments from information collected by the parse_mdef() method during the parse. The (optional) second parameter determines the style of the macro description text. The current version recognizes 'plain', 'pod' and 'html' styles.

It is assumed that the goal of using macdescribe() is to generate a basic stub for documentation from the available information, which is then edited to provide more details, as necessary.

The description distinguishes between positional and keyword arguments, and has the following format in the 'plain' style:


  The COMBOS macro ...


  The COMBOS macro takes 2 positional arguments and 8 keyword arguments.

 * THINGS             The N things to combine
 * SIZE               Size (K) of each combination
 * INCLUDE=           Items which must be included

Descriptive text for each argument is taken from comments in the %macro statement. See parse_mdef() for details.

If no descriptive text is found for a given argument, an associative array (%stdargs) is consulted for an appropriate description. For example, a DATA= argument is given the description from

        'DATA' => 'The name of the input data set',

You can add to or modify the default standard text by defining (or redefining) an entry in %stdargs keyed by the argument name in uppercase, e.g.,

        $stdargs{DATA} = "Le nom de l'ensemble de donnees d'entree";
        $stdargs{VAR} = "Le(s) nom(s) de(s) variable(s) d'analyse";

(there's no support for accents yet).
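The lookup order (comment text first, then the standard text) can be sketched as below. The local %stdargs here is a stand-in for the module's table, and describe_arg() is a hypothetical helper; only the DATA entry is taken from the documentation.

```perl
use strict;
use warnings;

# Stand-in for the module's %stdargs table; only DATA comes from the documentation.
my %stdargs = (
    DATA => 'The name of the input data set',
);

# Hypothetical helper: prefer the comment text, fall back to the standard text.
sub describe_arg {
    my ($arg, $desc) = @_;
    return $desc if defined $desc && length $desc;
    (my $key = uc $arg) =~ s/=\z//;       # 'data=' and 'DATA' share one key
    return exists $stdargs{$key} ? $stdargs{$key} : '';
}

# Extending the table works the same way as overriding an entry.
$stdargs{VAR} = "Le(s) nom(s) de(s) variable(s) d'analyse";

print describe_arg('data=', ''), "\n";
print describe_arg('var=',  ''), "\n";
```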

You can also modify the names of the sections of the macro description, and the text used therein, but for now you'll have to read through the code to find out how.


The parse_mdef method parses a %macro statement to determine arguments, defaults, and brief descriptions. It stores in the parser object a list-of-lists, each item of which contains

  [$arg, $argtype, $default, $desc]

for one argument.

parse_mdef() assumes that the %macro statement has the following format, where each keyword argument is followed by an '=' sign and an optional default value. Each argument is followed by a ',', and may be followed by one or more comments, which are combined to form the argument description.

 %macro combos(
        things,         /* the N things to combine           */
        /* more descriptive text */
        /* and one more */
        size,           /* size (K) of each combination      */
        include=,       /* items which must be included      */
        out=out,        /* output data set containing combos */
        sep=%str( ),    /* separator within each combo       */
        join=%str(, ),  /* separator to join all combos      */
        result=combos,  /* name of macro result variable with all
                           combinations */
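As an illustration of the format described above, the following sketch splits a complete %macro statement into [$arg, $argtype, $default, $desc] entries. It is not the module's parse_mdef() code: parse_macro_args() is a hypothetical name, the type labels 'positional' and 'keyword' are assumed, and the statement is expected to end with ');'.

```perl
use strict;
use warnings;

sub parse_macro_args {
    my ($stmt) = @_;
    # Text between the outer parentheses of the %macro statement.
    my ($body) = $stmt =~ /%macro\s+\w+\s*\((.*)\)\s*;/s or return [];

    my (@args, @pending);
    my ($text, $depth) = ('', 0);

    # Close the argument collected so far and store it as one entry.
    my $finish = sub {
        $text =~ s/^\s+|\s+$//g;
        if (length $text) {
            my ($name, $default) = split /=/, $text, 2;
            my $type = defined $default ? 'keyword' : 'positional';
            $default = '' unless defined $default;
            push @args, [$name, $type, $default, join ' ', @pending];
        }
        ($text, @pending) = ('');
    };

    # Walk the body one token at a time: a whole comment, or a single character.
    while ($body =~ m{\G(?:/\*\s*(.*?)\s*\*/|(.))}gs) {
        if (defined $1) {
            (my $c = $1) =~ s/\s+/ /g;          # fold multi-line comments
            if ($text =~ /\S/) {                # comment before the comma
                push @pending, $c;
            }
            elsif (@args) {                     # comment after the comma
                $args[-1][3] = join ' ', grep { length } $args[-1][3], $c;
            }
        }
        elsif ($2 eq ',' && $depth == 0) {      # a comma outside %str( ... )
            $finish->();
        }
        else {
            $depth++ if $2 eq '(';
            $depth-- if $2 eq ')';
            $text .= $2;
        }
    }
    $finish->();
    return \@args;
}
```

Tracking the parenthesis depth keeps the comma inside a default such as %str(, ) from being mistaken for an argument separator.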

margs() is the accessor for the list of macro arguments. If called with one argument (the macro name), it returns the list of macro arguments for the given macro.

$self->margs('name', 'arg', 'type', 'default', 'desc')

Constructor for the list of macro arguments. It pushes the remaining (list) argument on the list of macro arguments.
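A combined accessor and constructor of this kind might be sketched as follows. The class Demo::MacroArgs is hypothetical, and the storage layout (a hash of lists keyed by macro name) and the list return value are assumptions made for illustration.

```perl
use strict;
use warnings;

package Demo::MacroArgs {            # hypothetical class, for illustration only
    sub new { return bless { margs => {} }, shift }

    sub margs {
        my ($self, $name, @fields) = @_;
        # With extra arguments: record one [$arg, $type, $default, $desc] entry.
        push @{ $self->{margs}{$name} }, [@fields] if @fields;
        # In every case: return the entries stored for this macro.
        return @{ $self->{margs}{$name} || [] };
    }
}

my $p = Demo::MacroArgs->new;
$p->margs('combos', 'things', 'positional', '',    'The N things to combine');
$p->margs('combos', 'out',    'keyword',    'out', 'Output data set');
printf "%-8s %-10s %s\n", @{$_}[0, 1, 3] for $p->margs('combos');
```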

Formatting C-comments

The following subroutines are used to process /* ... */ comments:

$boxed = &box($text)

Creates a boxed multi-line /* ... */ comment where each line is of width $SAS::Header::width, and preceded by $SAS::Header::indent spaces. The default indent is 1, because some systems (MVS) have difficulty with '/*' starting in column 1.

The default frame characters used are '--||', corresponding to top, bottom, left, and right, and may be changed by re-assigning to the variable $SAS::Header::frame. Note that any trailing frame characters which are unassigned are simply empty, so the string '--' omits the left and right frame characters, and the string ' ' omits them all.

box() assumes that all lines have already been folded, if necessary, to fit the given width. Any longer lines are silently truncated.
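The boxing step can be pictured with the sketch below. It is an approximation, not the module's box(): box_sketch() is a hypothetical name, the three lexical variables stand in for $SAS::Header::width, $SAS::Header::indent and $SAS::Header::frame, the exact box layout is assumed, and missing frame characters are rendered as spaces here to keep every line the same width.

```perl
use strict;
use warnings;

# Illustrative stand-ins for the width, indent and frame package variables.
my ($width, $indent, $frame) = (30, 1, '--||');

sub box_sketch {
    my ($text) = @_;
    # Frame characters in the documented order: top, bottom, left, right.
    my ($top, $bottom, $left, $right) = split //, substr($frame . '    ', 0, 4);
    my $pad   = ' ' x $indent;
    my $inner = $width - 5;                        # text columns inside the frame
    my @out   = ($pad . '/*' . $top x ($width - 3) . '*');
    for my $line (split /\n/, $text) {
        # Pad short lines with spaces; longer lines are truncated silently.
        my $cell = substr($line . ' ' x $inner, 0, $inner);
        push @out, "$pad $left $cell $right";
    }
    push @out, $pad . ' *' . $bottom x ($width - 4) . '*/';
    return join "\n", @out;
}

print box_sketch("   Name: filename.sas\n  Title: Demo"), "\n";
```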

$unboxed = &unbox($text)

Takes a string containing a boxed multi-line /* ... */ comment and removes the left and right frame characters and leading and trailing spaces from each line, as well as an initial and trailing line of decorators. The characters in the set '*-=|#' are treated as frame characters.
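A rough sketch of the unboxing step, under the same caveats: unbox_sketch() is a hypothetical helper rather than the module's unbox(), and it strips any run of frame characters at either end of a line, so text that itself begins or ends with one of those characters would be trimmed too.

```perl
use strict;
use warnings;

sub unbox_sketch {
    my ($text) = @_;
    my @lines;
    for my $line (split /\n/, $text) {
        $line =~ s{^\s*/\*}{};              # opening comment delimiter
        $line =~ s{\*/\s*$}{};              # closing comment delimiter
        $line =~ s/^\s*[*\-=|#]*\s*//;      # left frame characters, leading spaces
        $line =~ s/\s*[*\-=|#]*\s*$//;      # right frame characters, trailing spaces
        push @lines, $line;
    }
    # Decorator lines are empty by now; drop them at the top and bottom.
    shift @lines while @lines && $lines[0]  eq '';
    pop   @lines while @lines && $lines[-1] eq '';
    return join "\n", @lines;
}

my $boxed = join "\n",
    ' /*--------------------*',
    '  | Name: a.sas       |',
    '  | Title: Demo       |',
    '  *-------------------*/';
print unbox_sketch($boxed), "\n";
```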

$boxed = &rebox($text)

Takes a string containing a boxed multi-line /* ... */ comment and reformats it as a new boxed comment with the current width, indent, and frame variables.


There are no bugs, except those inherited from SAS::Parser.
