The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
=head1 NAME

bt_input - input/parsing functions in B<btparse> library

=head1 SYNOPSIS

   void  bt_set_stringopts (bt_metatype_t metatype, btshort options);
   AST * bt_parse_entry_s (char *    entry_text,
                           char *    filename,
                           int       line,
                           btshort    options,
                           boolean * status);
   AST * bt_parse_entry   (FILE *    infile,
                           char *    filename,
                           btshort    options,
                           boolean * status);
   AST * bt_parse_file    (char *    filename, 
                           btshort    options, 
                           boolean * overall_status);


=head1 DESCRIPTION

The functions described here are used to read and parse BibTeX data,
converting it from raw text to abstract-syntax trees (ASTs).

=over 4

=item bt_set_stringopts ()

   void bt_set_stringopts (bt_metatype_t metatype, btshort options);

Set the string-processing options for a particular entry metatype.  This
affects the entry post-processing done by C<bt_parse_entry_s()>,
C<bt_parse_entry()>, and C<bt_parse_file()>.  If C<bt_set_stringopts()>
is never called, the four metatypes default to the following sets of
string options:

   BTE_REGULAR    BTO_CONVERT | BTO_EXPAND | BTO_PASTE | BTO_COLLAPSE
   BTE_COMMENT    0
   BTE_PREAMBLE   0
   BTE_MACRODEF   BTO_CONVERT | BTO_EXPAND | BTO_PASTE

For example,

   bt_set_stringopts (BTE_COMMENT, BTO_COLLAPSE);

will cause the library to collapse whitespace in the value from all
comment entries; the AST returned by one of the C<bt_parse_*> functions
will reflect this change.

=item bt_parse_entry ()

   AST * bt_parse_entry (FILE *    infile,
                         char *    filename,
                         btshort    options,
                         boolean * status);

Scans and parses the next BibTeX entry in C<infile>.  You should supply
C<filename> to help B<btparse> generate accurate error messages; the
library keeps track of C<infile>'s current line number internally, so you
don't need to pass that in.  C<options> should be a bitmap of
non-string-processing options (currently, C<BTO_NOSTORE> to disable storing
macro expansions is the only such option).  C<*status> will be set to
C<TRUE> if the entry parsed successfully or with only minor warnings, and
C<FALSE> if there were any serious lexical or syntactic errors.  If
C<status> is C<NULL>, then the parse status will be unavailable to you.
Both minor warnings and serious errors are reported on C<stderr>.

Returns a pointer to the abstract-syntax tree (AST) describing the entry
just parsed, or C<NULL> if no more entries were found in C<infile> (this
will leave C<infile> at end-of-file).  Do not attempt to second guess
C<bt_parse_entry()> by detecting end-of-file yourself; it must be allowed
to determine this on its own so it can clean up some static data that is
preserved between calls on the same file.

C<bt_parse_entry()> has two important restrictions that you should know
about.  First, you should let B<btparse> manage all the input on the
file; this is for reasons both superficial (so the library knows the
current line number in order to generate accurate error messages) and
fundamental (the library must be allowed to detect end-of-file in order
to cleanup certain static variables and allow you to parse another
file).  Second, you cannot interleave the parsing of two different
files; attempting to do so will result in a fatal error that will crash
your program.  This is a direct result of the static state maintained
between calls of C<bt_parse_entry()>.

Because of two distinct "failures" possible for C<bt_parse_entry()>
(end-of-file, which is expected but means to stop processing the current
file; and error-in-input, which is not expected but allows you to
continue processing the same file), you should usually call it like
this:

   while (entry = bt_parse_entry (file, filename, options, &ok))
   {
      if (ok)
      {
         /* ... process entry ... */
      }
   }

At the end of this loop, C<feof (file)> will be true.

=item bt_parse_entry_s ()

   AST * bt_parse_entry_s (char *    entry_text,
                           char *    filename,
                           int       line,
                           btshort    options,
                           boolean * status)

Scans and parses a single complete BibTeX entry contained in a string,
C<entry_text>.  If you read this string from a file, you should help
B<btparse> generate accurate error messages by supplying the name of the
file as C<filename> and the line number of the beginning of the entry as
C<line>; otherwise, set C<filename> to C<NULL> and C<line> to C<1>.
C<options> and C<status> are the same as for C<bt_parse_entry()>.

Returns a pointer to the abstract-syntax tree (AST) describing the entry
just parsed, and C<NULL> if no entries were found in C<entry_text> or if
C<entry_text> was C<NULL>.

You should call C<bt_parse_entry_s()> once more than the total number of
entries you wish to parse; on the final call, set C<entry_text> to
C<NULL> so the function knows there's no more text to parse.  This final
call allows it to clean up some structures allocated on the first call.
Thus, C<bt_parse_entry_s()> is usually used like this:

   char *  entry_text;
   btshort  options = 0;
   boolean ok;
   AST *   entry_ast;

   while (entry_text = get_more_text ())
   {
      entry_ast = bt_parse_entry_s (entry_text, NULL, 1, options, &ok);
      if (ok)
      {
         /* ... process entry ... */
      }
   }

   bt_parse_entry_s (NULL, NULL, 1, options, NULL);    /* cleanup */

assuming that C<get_more_text()> returns a pointer to the text of an
entry to parse, or C<NULL> if there's no more text available.

=item bt_parse_file ()

   AST * bt_parse_file (char *    filename, 
                        btshort    options, 
                        boolean * status)

Scans and parses an entire BibTeX file.  If C<filename> is C<NULL> or
C<"-">, then C<stdin> will be read; otherwise, attempts to open the named
file.  If this attempt fails, prints an error message to C<stderr> and
returns C<NULL>.  C<options> and C<status> are the same as for
C<bt_parse_entry()>---note that C<*status> will be C<FALSE> if there were
I<any> errors in the entire file; for finer granularity of error-checking,
you should use C<bt_parse_entry()>.

Returns a pointer to a linked list of ASTs representing the entries in the
file, or C<NULL> if no entries were found in the file.  This list can
be traversed with C<bt_next_entry()>, and the individual entries then
traversed as usual (see L<bt_traversal>).

=back

=head1 SEE ALSO

L<btparse>, L<bt_postprocess>, L<bt_traversal>

=head1 AUTHOR

Greg Ward <gward@python.net>