
NAME

SAFT - create simple SAFT-XML encoded archival finding aids

VERSION

This document describes SAFT version 0.2.0

SYNOPSIS

    use SAFT;

    my $saft = SAFT->new();

    $saft->set_finding_aid_title('Guide to the archives of Gondor');
    $saft->set_abstract('This finding aid describes...');
    $saft->set_author('Gandalf the Grey');

    $saft->add_classification(1    => 'Archive of the Kings');
    $saft->add_classification(2    => 'Archive of the Stewards');
    $saft->add_classification(1.1  => 'Reign of Elendil');
    $saft->add_classification(2.26 => 'Reign of Denethor II.');

    $saft->add_file_sachakte(
        '2.26',
        {
            Signatur => '42',
            Titel    => "Of Denethor's death",
            Enthaelt => 'Includes a testimony',
            Laufzeit => '3019-3020',
        }
    );

    print $saft->to_string();
  
  

DESCRIPTION

This module provides a convenient way to create archival finding aids in a format specified by the SAFT XML standard. Archives use XML encoded finding aids to exchange metadata of their collections or to publish these metadata in web portals.

If you don't know what a finding aid is, please refer to Wikipedia or ask an archivist near you. SAFT is a standard for encoding finding aids in XML. The acronym stands for the German "Standard-Austauschformat" (roughly "standard interchange format"). You can find the SAFT DTD and more (German) documentation on SAFT XML in the German Wikipedia and on this website: http://www.archivschule.de/forschung/retrokonversion-252/vorstudien-und-saft-xml/

SAFT XML is not very widely used; in fact, since its tag names are German, probably nobody uses it outside Germany. A far more widespread format for such purposes is the American standard Encoded Archival Description (EAD).

So why bother using SAFT anyway? Three reasons: First, it might be better suited to German archival tradition or your specific needs (personal opinion). Second, it might be easier to use than EAD (again, personal opinion). Third, I haven't heard of a Perl module for EAD so far. For SAFT? Here you go.

This module does not, however, provide every feature the SAFT DTD allows. Instead, methods are provided only for common cases and rather simple structures (i.e., cases I have stumbled upon and structures I have needed so far when using SAFT XML). Anything that is allowed by the SAFT DTD but not provided by this module can be achieved with a general XML module such as XML::LibXML. In fact, much of what this module does is done by wrapping XML::LibXML.

INTERFACE

The following methods are provided by the SAFT module.

new
    $saft = SAFT->new( );

This method creates a new SAFT object, representing the SAFT XML finding aid.

set_finding_aid_title
    $saft->set_finding_aid_title( $title );

This method sets the finding aid's title (element Findmittel_Info/FM_Name).

set_finding_aid_id
    $saft->set_finding_aid_id( $id );

This method sets the finding aid's id (element Findmittel_Info/FM_Sig).

set_author
    $saft->set_author( $author_name );

This method sets the finding aid's author (element Datei_Info/Erstellung/Bearbeiter).

set_creation_date
    $saft->set_creation_date( $creation_date );

This method sets the finding aid's creation date (element Datei_Info/Erstellung/Datum). Calling the new method implicitly calls $saft->set_creation_date( scalar localtime ), thus setting a reasonable default value.

set_filename
    $saft->set_filename( $filename );

This method sets the SAFT XML file's filename (element Datei_Info/Dateiname). If you later call the to_file method with a different filename, that call overwrites the content of this element.

set_abstract
    $saft->set_abstract( $text );

This method sets the finding aid's abstract (element Findmittel_Info/Einleitung/Text).

set_bibliography
    $saft->set_bibliography( $text );

This method sets the finding aid's bibliography section (element Findmittel_Info/Einleitung/Bibliographie).

set_finding_aid_note
    $saft->set_finding_aid_note( $text );

This method sets the finding aid's note (element Findmittel_Info/Bem).

set_unit_title
    $saft->set_unit_title( $title );

This method sets the title of the unit described in the finding aid (element Findmittel_Info/Bestand_Info/Bestandsname).

set_unit_id
    $saft->set_unit_id( $id );

This method sets the id of the unit described in the finding aid (element Findmittel_Info/Bestand_Info/Bestand_Sig).

set_unit_date
    $saft->set_unit_date( $date );

This method sets the date of the unit described in the finding aid (elements Findmittel_Info/Bestand_Info/Laufzeit/LZ_Text and Findmittel_Info/Laufzeit/LZ_Text).

add_classification
    $saft->add_classification( $branch_number, $branch_title );

This method creates a new classification branch (element Klassifikation) and adds it to the finding aid. Its title (element Klass_Titel) is determined by $branch_title and its branch number (element Klass_Nr) by $branch_number (example: '2.1.3'). The parameter $branch_number also determines where the new branch will be appended in the classification structure, cf. the following example:

    branch number '1'
    branch number '2'
        branch number '2.1'
            branch number '2.1.1'
            branch number '2.1.2'
            branch number '2.1.3' <= here it is!
    branch number '3'

When building a classification, it is not necessary to create the classification branches in a specific order. If you create a branch referencing (by its branch number) a non-existing parent branch, the missing parent branch will implicitly be created with an empty title. If this implicitly created branch is later created explicitly, it will not be created again, but instead only its title will be set according to the given parameter. This means, staying with the example above, you could first create branch 2.1.3, then 2.1.1 and 2.1.2, and after that 2.1 and 2 without messing up your classification structure.
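
The implicit creation of missing parents can be pictured with a little plain Perl. The helper ancestor_numbers below is hypothetical and not part of the SAFT module; it only lists the branch numbers that have to exist above a given branch and makes no claim about how the module works internally.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the SAFT API: return the branch numbers
# of all ancestors of $branch_number, outermost first.
sub ancestor_numbers {
    my ($branch_number) = @_;
    my @parts = split /\./, $branch_number;
    pop @parts;    # drop the branch's own last component

    my ( @prefix, @ancestors );
    for my $part (@parts) {
        push @prefix, $part;
        push @ancestors, join '.', @prefix;
    }
    return @ancestors;
}

# Creating '2.1.3' first requires branches '2' and '2.1' to exist.
print join( ', ', ancestor_numbers('2.1.3') ), "\n";    # prints: 2, 2.1
```

A top-level branch such as '3' has no ancestors, so nothing would need to be created implicitly for it.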

set_classification_title
    $saft->set_classification_title( $branch_number, $branch_title );

This method sets the title of the classification branch determined by its branch number (example: '2.1.3').
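
A note on writing branch numbers in Perl source: unquoted literals such as 1.1 or 2.26 in the SYNOPSIS happen to stringify as intended, but that is not true for every branch number. The following sketch uses plain Perl only (no SAFT calls) to show why quoting branch numbers as strings is the safer habit; the variable names are just for illustration.

```perl
use strict;
use warnings;

my $number  = 2.10;      # numeric literal: the trailing zero is lost
my $string  = '2.10';    # string literal: kept verbatim
my $vstring = 2.1.3;     # two dots make a v-string, not the text "2.1.3"

print "$number\n";                                       # prints 2.1
print "$string\n";                                       # prints 2.10
print $vstring eq '2.1.3' ? "same\n" : "different\n";    # prints different
```

Passing '2.10' and '2.1.3' as quoted strings avoids both effects.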

add_file_sachakte
    $saft->add_file_sachakte( $branch_number, \%subelems );

This method creates a new record of type Sachakte and appends it at the end of the classification branch determined by $branch_number (example: '2.1.3'). Records appear in the order in which they are appended, so to get a particular order within a branch you have to call add_file_sachakte in that order (unlike add_classification, where the call order does not matter).

The new Sachakte's content is determined by the key/value pairs in the hash %subelems (which is passed as a reference). The following keys and value types will be accepted:

    key                 | value type
    --------------------+--------------------------------------
    Signatur (required) | scalar
    Laufzeit            | scalar
    Titel (required)    | scalar
    Enthaelt            | scalar
    Nr                  | scalar
    Az                  | scalar
    Bestellsig          | scalar
    Altsig              | scalar
    Provenienz          | scalar
    Vor_Prov            | scalar
    Abg_Stelle          | scalar
    Akzession           | scalar
    Sperrvermerk        | scalar
    Umfang              | scalar
    Lagerung            | scalar
    Zustand             | scalar
    FM_Seite            | scalar
    Bestand_Kurz        | scalar
    Bem                 | scalar
    Hilfsfeld           | scalar
    archref             | scalar
    bibref              | scalar
    FM_ref              | scalar
    altübform           | scalar
    Register            | scalar

A possible example for \%subelems might look like this:

    {
        Signatur    => '42',
        Titel       => "Of Denethor's death",
        Enthaelt    => 'Includes a testimony',
        Laufzeit    => '3019-3020',
        'altübform' => 'We have digitized that one!',
        # ...
    }

add_file_fallakte
    $saft->add_file_fallakte( $branch_number, \%subelems );

This method creates a new record of type Fallakte and appends it at the end of the classification branch determined by $branch_number (example: '2.1.3'). Records appear in the order in which they are appended, so to get a particular order within a branch you have to call add_file_fallakte in that order (unlike add_classification, where the call order does not matter).

The new Fallakte's content is determined by the key/value pairs in the hash %subelems (which is passed as a reference). The following keys and value types will be accepted:

    key                 | value type
    --------------------+--------------------------------------
    FA_Art (attribute)  | scalar
    Signatur (required) | scalar
    Laufzeit            | scalar
    Titel               | scalar
    Enthaelt            | scalar
    Person              | scalar or hash reference (see below)
    Institution         | scalar
    Sachverhalt         | scalar
    Datum               | scalar or hash reference (see below)
    Nr                  | scalar
    Ort                 | scalar or hash reference (see below)
    Anschrift           | scalar
    Az                  | scalar
    Prozessart          | scalar
    Instanz             | scalar
    Beweismittel        | scalar
    Formalbeschreibung  | scalar
    Bestellsig          | scalar
    Altsig              | scalar
    Provenienz          | scalar
    Vor_Prov            | scalar
    Abg_Stelle          | scalar
    Akzession           | scalar
    Sperrvermerk        | scalar
    Umfang              | scalar
    Lagerung            | scalar
    Zustand             | scalar
    FM_Seite            | scalar
    Bestand_Kurz        | scalar
    Bem                 | scalar
    Hilfsfeld           | scalar
    archref             | scalar
    bibref              | scalar
    FM_ref              | scalar
    altübform           | scalar
    Register            | scalar

Some keys accept either a simple scalar value or a hash reference; which one you choose depends on the complexity of your data. The following tables list the keys and value types accepted inside such a hash reference. The special key PCDATA can be used to mix child elements with PCDATA content, or to create elements with attributes: since attributes require a hash, you cannot pass the element's text content as a scalar and have to use the PCDATA key instead.

    Person
    key                 | value type
    --------------------+--------------------------------------
    Pers_Name           | scalar or hash reference (see below)
    Rang_Titel          | scalar
    Beruf_Funktion      | scalar
    Institution         | scalar
    Datum               | scalar or hash reference (see below)
    Ort                 | scalar or hash reference (see below)
    Nationalitaet       | scalar
    Geschlecht          | scalar
    Konfession          | scalar
    Familienstand       | scalar
    Anschrift           | scalar
    Bem                 | scalar
    Hilfsfeld           | scalar
    archref             | scalar
    bibref              | scalar
    FM_ref              | scalar
    altübform           | scalar
    Register            | scalar
    PCDATA              | scalar

    Pers_Name
    key                 | value type
    --------------------+--------------------------------------
    Vorname             | scalar
    Nachname            | scalar
    PCDATA              | scalar

    Datum
    key                 | value type
    --------------------+--------------------------------------
    Dat_Fkt (attribute) | scalar
    Jahr                | scalar
    Monat               | scalar
    Tag                 | scalar
    PCDATA              | scalar

    Ort
    key                 | value type
    --------------------+--------------------------------------
    Ort_Fkt (attribute) | scalar
    PCDATA              | scalar

A possible example for \%subelems might look like this:

    {
        Signatur    => '42',
        FA_Art      => 'Personal',
        Person      => {
            Pers_Name       => 'Aragorn',
            Rang_Titel      => 'King of Arnor and Gondor',
            Datum           => {
                Dat_Fkt     => 'Geburt',
                PCDATA      => '2931',
            }
        },
        Institution => 'Kingdom of Arnor and Gondor',
        # ...
    }
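
Nesting can go one level deeper, for instance when Pers_Name is itself given as a hash reference. The following sketch only builds and inspects such a structure in plain Perl. The keys come from the tables above; all values are made-up sample data, and the Ort_Fkt value in particular is a placeholder, since the values the DTD permits are not listed here.

```perl
use strict;
use warnings;

# Keys are taken from the tables above; all values are sample data.
my %subelems = (
    Signatur => '43',
    Person   => {
        Pers_Name => {
            Vorname  => 'Peregrin',
            Nachname => 'Took',
        },
        Datum => {
            Dat_Fkt => 'Geburt',    # attribute value as in the example above
            Jahr    => '2990',
        },
    },
    Ort => {
        Ort_Fkt => 'Wohnort',       # placeholder, check the DTD for allowed values
        PCDATA  => 'Minas Tirith',
    },
);

# The structure would then be passed by reference, e.g.
# $saft->add_file_fallakte( '2.26', \%subelems );
my $name = $subelems{Person}{Pers_Name};
print "$name->{Vorname} $name->{Nachname}\n";    # prints Peregrin Took
```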

add_file_karte
    $saft->add_file_karte( $branch_number, \%subelems );

This method creates a new record of type Karte and appends it at the end of the classification branch determined by $branch_number (example: '2.1.3'). Records appear in the order in which they are appended, so to get a particular order within a branch you have to call add_file_karte in that order (unlike add_classification, where the call order does not matter).

The new Karte's content is determined by the key/value pairs in the hash %subelems (which is passed as a reference). The following keys and value types will be accepted:

    key                 | value type
    --------------------+--------------------------------------
    Signatur (required) | scalar
    Laufzeit            | scalar
    Titel (required)    | scalar
    Beschreibung        | scalar
    Bestellsig          | scalar
    Altsig              | scalar
    Provenienz          | scalar
    Vor_Prov            | scalar
    Abg_Stelle          | scalar
    Akzession           | scalar
    Sperrvermerk        | scalar
    Umfang              | scalar
    Lagerung            | scalar
    Zustand             | scalar
    FM_Seite            | scalar
    Bestand_Kurz        | scalar
    Bem                 | scalar
    Hilfsfeld           | scalar
    archref             | scalar
    bibref              | scalar
    FM_ref              | scalar
    altübform           | scalar
    Register            | scalar
    Ort                 | scalar or hash reference (see below)
    Enthaelt            | scalar
    Kartentyp           | scalar
    Einzeichnung        | scalar
    Az                  | scalar
    Massstab            | scalar
    Topogr_Daten        | scalar or hash reference (see below)
    Person              | scalar or hash reference (see below)
    Institution         | scalar
    Ausfuehrung         | scalar
    Material            | scalar
    Entstehungsstufe    | scalar
    Auflage             | scalar
    Format              | scalar or hash reference (see below)
    Nebenkarten         | scalar

Some keys accept either a simple scalar value or a hash reference; which one you choose depends on the complexity of your data. The following tables list the keys and value types accepted inside such a hash reference. The special key PCDATA can be used to mix child elements with PCDATA content, or to create elements with attributes: since attributes require a hash, you cannot pass the element's text content as a scalar and have to use the PCDATA key instead.

    Ort
    key                 | value type
    --------------------+--------------------------------------
    Ort_Fkt (attribute) | scalar
    PCDATA              | scalar

    Topogr_Daten
    key                 | value type
    --------------------+--------------------------------------
    TK                  | scalar
    Nr                  | scalar
    GK_hoch             | scalar
    GK_rechts           | scalar
    GK_Identifikation   | scalar
    Breitengrad         | scalar
    Laengengrad         | scalar
    Bem                 | scalar
    Hilfsfeld           | scalar
    PCDATA              | scalar

    Person
    key                 | value type
    --------------------+--------------------------------------
    Pers_Name           | scalar or hash reference (see below)
    Rang_Titel          | scalar
    Beruf_Funktion      | scalar
    Institution         | scalar
    Datum               | scalar or hash reference (see below)
    Ort                 | scalar or hash reference (see above)
    Nationalitaet       | scalar
    Geschlecht          | scalar
    Konfession          | scalar
    Familienstand       | scalar
    Anschrift           | scalar
    Bem                 | scalar
    Hilfsfeld           | scalar
    archref             | scalar
    bibref              | scalar
    FM_ref              | scalar
    altübform           | scalar
    Register            | scalar
    PCDATA              | scalar

    Pers_Name
    key                 | value type
    --------------------+--------------------------------------
    Vorname             | scalar
    Nachname            | scalar
    PCDATA              | scalar

    Datum
    key                 | value type
    --------------------+--------------------------------------
    Dat_Fkt (attribute) | scalar
    Jahr                | scalar
    Monat               | scalar
    Tag                 | scalar
    PCDATA              | scalar

    Format
    key                 | value type
    --------------------+--------------------------------------
    M_Mass (attribute)  | scalar
    Hoehe               | scalar
    Breite              | scalar
    Durchmesser         | scalar
    PCDATA              | scalar

A possible example for \%subelems might look like this:

    {
        Signatur    => '42',
        Titel       => 'Map of Mordor',
        Format      => '50cm x 50cm',
        Ort         => {
            Ort_Fkt     => 'Druckort',
            PCDATA      => 'Minas Tirith',
        },
        # ...
    }

to_string
    $string = $saft->to_string();

This method returns a string representation of the XML structure stored in the $saft object (cf. the toString method of XML::LibXML).

to_file
    $saft->to_file( $filename );

This method creates a file called $filename containing the XML structure stored in the $saft object. Calling the to_file method implicitly calls $saft->set_filename( $filename ).

DIAGNOSTICS

The following errors or warnings may occur while using the SAFT module.

E: Can't understand format of branch number '...'

You have passed the add_classification method a branch number the SAFT module doesn't understand. Branch numbers must look like '3', '4.1', '1.5.2', ... (up to ten levels), or in other words like this: /[1-9]\d*(\.[1-9]\d*)*/.
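
If your branch numbers come from external data, you may want to check them before handing them to the module. The following is a minimal sketch in plain Perl. The helper check_branch_number is hypothetical and not part of the SAFT API, and the \A and \z anchors added to the pattern quoted above are an assumption: they require the whole string, not just a part of it, to match.

```perl
use strict;
use warnings;

# Anchored variant of the pattern quoted above (anchors are an assumption).
my $BRANCH_NUMBER = qr/\A[1-9]\d*(?:\.[1-9]\d*)*\z/;

# Hypothetical pre-check, not part of the SAFT API. Returns 'invalid' for
# malformed numbers, 'too deep' for more than ten levels, 'ok' otherwise.
sub check_branch_number {
    my ($number) = @_;
    return 'invalid' if $number !~ $BRANCH_NUMBER;
    my @levels = split /\./, $number;
    return @levels > 10 ? 'too deep' : 'ok';
}

for my $number ( '3', '1.5.2', '0.1', '2..1', '1.1.1.1.1.1.1.1.1.1.1' ) {
    printf "%-22s %s\n", $number, check_branch_number($number);
}
```

The 'too deep' case corresponds to the warning about more than ten levels described further down; the module itself still creates such a branch.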

E: Can't add classification branch ... '...' - a classification branch with the same number already exists

You have tried to use the add_classification method to create a classification branch that already exists (same number, different title). You probably have a clash of branch numbers here - check your input data!

E: Can't set title of nonexisting classification branch number ... . To create a new branch use add_classification

You have used the set_classification_title method to set the title of a nonexisting branch. Either you have passed the wrong branch number to set_classification_title, or you should check your input data.

E: Can't find classification branch number ... to append file

You have created a file (e.g. Sachakte or Fallakte) and passed the method a classification branch number the SAFT module can't find. Either your branch number has a format the module does not recognize or a classification branch with the given number doesn't exist. Check your input data.

W: Nested classification with more than 10 levels is not allowed by SAFT DTD

You have passed the add_classification method a branch number with more than ten levels. SAFT uses the level attribute of element Klassifikation to store the depth of a classification branch, and for this attribute only the values 01..10 are allowed. However, the classification branch will be created anyway, although it will have a level attribute with a value > 10 and will therefore no longer comply with the SAFT DTD.

W: Branch ... '...' already exists, repeated attempt to create it was ignored

You have tried to use the add_classification method to create a classification branch that already exists (same number, same title). Usually, this is no problem: You want a branch, you have the branch - nothing to be done ;-)

W: Missing entry or value for element 'Signatur'

You have created a file (e.g. Sachakte or Fallakte) element without providing a Signatur value. This will not prevent your element from being created, but usually you will not want to create files without a Signatur. Either you know what you are doing, or you should check your input data.

W: Missing entry or value for element 'Titel' (Signatur ...)

You have created a file (e.g. Sachakte) element without providing a Titel value. This will not prevent your element from being created, but usually you will not want to create files without a Titel. Either you know what you are doing, or you should check your input data.

DEPENDENCIES

The SAFT module uses Perl (minimum 5.10) and the following modules and pragmas:

  • warnings

  • strict

  • Carp

  • utf8

  • version

  • XML::LibXML (tested with minimum version 1.70)

INCOMPATIBILITIES

None reported.

BUGS AND LIMITATIONS

As mentioned above, this module does not provide every feature that would be allowed by the SAFT DTD.

If you plan to use the element altübform anywhere (e.g. in the add_file_sachakte method), you have to use the utf8 pragma in your own code as well. And don't blame me for this, I didn't write the DTD ;-)
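
A minimal sketch of what the pragma changes, in plain Perl without any SAFT calls. It assumes the script is saved as UTF-8. The likely reason for the requirement (an assumption, not confirmed by the module's source) is that the module, which lists utf8 among its dependencies, looks the key up as a character string, so your key has to be one too.

```perl
use strict;
use warnings;
use utf8;    # string literals in this file are decoded from UTF-8

my %subelems = (
    Signatur    => '42',
    Titel       => "Of Denethor's death",
    'altübform' => 'We have digitized that one!',
);

# With the pragma the key is 9 characters long and contains U+00FC.
print length('altübform'), "\n";    # prints 9
print exists $subelems{"alt\x{fc}bform"} ? "key matches\n" : "key differs\n";
```

Without use utf8 the same literal would be a sequence of 10 bytes and would not equal the 9-character key.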

Even though the DTD might allow this, you can't create multiple occurrences of the same sub-element (e.g. Datum or Az) via the add_file_foo methods. This is because you pass the sub-elements in a hash and so each hash key (aka sub-element-to-be) must be unique. I'm sorry...
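
The effect is easy to reproduce in plain Perl, independent of this module: a key that appears twice in a hash does not raise an error; the later value simply replaces the earlier one. The Az values below are made-up sample data.

```perl
use strict;
use warnings;

my %subelems = (
    Signatur => '42',
    Az       => 'A 1/3019',
    Az       => 'B 7/3019',    # same key again: replaces the first value
);

print scalar( keys %subelems ), "\n";    # prints 2
print "$subelems{Az}\n";                 # prints B 7/3019
```

So a repeated key never reaches the module as two values; only the last one is passed on.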

Please report any bugs or feature requests to bug-saft@rt.cpan.org, or through the web interface at http://rt.cpan.org.

AUTHOR

Martin Hoppenheit <mho@cpan.org>

LICENCE AND COPYRIGHT

Copyright (c) 2011, Martin Hoppenheit <mho@cpan.org>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.