Bryan Baldus > MARC-Lint > MARC::Lint

Module Version: 1.48

NAME

MARC::Lint - Perl extension for checking validity of MARC records

SYNOPSIS

    use MARC::File::USMARC;
    use MARC::Lint;

    my $lint = MARC::Lint->new;
    my $filename = shift;

    my $file = MARC::File::USMARC->in( $filename );
    while ( my $marc = $file->next() ) {
        $lint->check_record( $marc );

        # Print the title tag
        print $marc->title, "\n";

        # Print the errors that were found
        print join( "\n", $lint->warnings ), "\n";
    } # while

Given the following MARC record:

    LDR 00000nam  22002538a 4500
    040    _aMdSSJTT
           _cMdSSJTT
    040    _aMdSSJTT
           _beng
           _cMdSSJTT
    100 14 _aWall, Larry.
    110 1  _aO'Reilly & Associates.
    245 90 _aProgramming Perl /
           _aBig Book of Perl /
           _cLarry Wall, Tom Christiansen & Jon Orwant.
    250    _a3rd ed.
    250    _a3rd ed.
    260    _aCambridge, Mass. :
           _bO'Reilly,
           _r2000.
    590 4  _aPersonally signed by Larry.
    856 43 _uhttp://www.perl.com/

the following errors are generated:

    1XX: Only one 1XX tag is allowed, but I found 2 of them.
    100: Indicator 2 must be blank but it's "4"
    245: Indicator 1 must be 0 or 1 but it's "9"
    245: Subfield _a is not repeatable.
    040: Field is not repeatable.
    260: Subfield _r is not allowed.
    856: Indicator 2 must be blank, 0, 1, 2 or 8 but it's "3"

DESCRIPTION

Module for checking the validity of MARC records. 99% of users will want to do something like what is shown in the synopsis. The other intrepid 1% will override the MARC::Lint module's methods and provide their own special field-level checking.

What this means is that if you have certain requirements, such as making sure that all 952 tags have a certain call number in them, you can write a function that checks for that, and still get all the benefits of the MARC::Lint framework.
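As a rough illustration of the 952 scenario above, the following sketch subclasses MARC::Lint and adds a local rule on top of the standard checks. The package name My::Lint, the choice of subfield _a as the call number location, and the warning wording are all assumptions made for this example. It overrides check_record() rather than defining a check_952() method, so that the sketch does not depend on how per-tag dispatch treats local tags.

```perl
package My::Lint;    # hypothetical subclass name
use strict;
use warnings;
use parent 'MARC::Lint';

# Run the standard checks, then add a local rule for 952 fields.
# Assumes $marc is a MARC::Record object.
sub check_record {
    my ( $self, $marc ) = @_;

    # The parent method clears old warnings and runs the built-in checks.
    $self->SUPER::check_record($marc);

    for my $field ( $marc->field('952') ) {
        # Assumption: the call number is expected in subfield _a.
        my $call_number = $field->subfield('a');
        $self->warn('952: Subfield _a (call number) is missing.')
            unless defined $call_number && length $call_number;
    }
    return;
}

package main;

use MARC::Record;
use MARC::Field;

my $record = MARC::Record->new();
$record->append_fields(
    MARC::Field->new( '245', '1', '0', a => 'Programming Perl.' ),
    MARC::Field->new( '952', ' ', ' ', b => 'Main stacks' ),
);

my $lint = My::Lint->new;
$lint->check_record($record);
print "$_\n" for $lint->warnings;
```

Because the local rule reports through warn(), its messages come back from warnings() alongside those of the built-in checks.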

EXPORT

None. Everything is done through objects.

METHODS

new()

No parameters needed. The MARC::Lint object is little more than a list of warnings and a bunch of rules.

warnings()

Returns a list of warnings found by check_record() and its brethren.

clear_warnings()

Clear the list of warnings for this linter object. It's automatically called when you call check_record().

warn( $str [, $str...] )

Create a warning message, built from strings passed, like a print statement.

Typically, you'll leave this to check_record(), but industrious programmers may want to do their own checking as well.

check_record( $marc )

Does all sorts of lint-like checks on the MARC record $marc, both on the record as a whole, and on the individual fields & subfields.

check_xxx( $field )

Various functions to check the different fields. If no check_xxx function exists for a field, that field gets no field-specific checking.

check_020()

Looks at 020$a and reports an error if the check digit is wrong. Looks at 020$z and validates the number if hyphens are present.

Uses Business::ISBN to do validation. Thirteen digit checking is currently done with the internal sub _isbn13_check_digit(), based on code from Business::ISBN.

TO DO (check_020):

 Fix 13-digit ISBN checking.

_isbn13_check_digit($ean)

Internal sub to determine whether a 13-digit ISBN has a valid checksum. The code is taken from Business::ISBN::as_ean. It is expected to be temporary until Business::ISBN is updated to check 13-digit ISBNs itself.
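For reference, the standard ISBN-13 (EAN-13) checksum can be sketched as follows. This is not the module's internal code: the sub name isbn13_is_valid is hypothetical, and the sketch only shows the usual arithmetic, with weights alternating 1 and 3 over the first twelve digits.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's internal sub): returns 1 if the
# 13-digit string has a valid check digit, 0 otherwise.
sub isbn13_is_valid {
    my ($ean) = @_;
    $ean =~ tr/0-9//cd;    # drop hyphens, spaces, and other non-digits
    return 0 unless length($ean) == 13;

    my @digits = split //, $ean;
    my $sum    = 0;
    for my $i ( 0 .. 11 ) {
        # Weights alternate 1, 3, 1, 3, ... across the first 12 digits.
        $sum += $digits[$i] * ( $i % 2 ? 3 : 1 );
    }

    # The check digit brings the weighted sum up to a multiple of 10.
    my $expected = ( 10 - $sum % 10 ) % 10;
    return $expected == $digits[12] ? 1 : 0;
}

print isbn13_is_valid('978-0-306-40615-7') ? "valid\n" : "invalid\n";    # prints "valid"
```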

check_041( $field )

Warns if the length of a subfield is not evenly divisible by 3, unless the second indicator is 7. (A future implementation would ensure that each subfield is exactly 3 characters unless ind2 is 7, since subfields are now repeatable. This is not implemented here due to the large number of records needing to be corrected.) Validates against the MARC Code List for Languages (http://www.loc.gov/marc/) using the MARC::Lint::CodeData data pack for MARC::Lint (%LanguageCodes, %ObsoleteLanguageCodes).
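The length rule can be sketched in isolation as shown here. The sub name check_041_lengths, the [ code, data ] pair representation, and the warning wording are assumptions for illustration only; the sketch does not validate codes against the language code list.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of warning strings for one 041 field.
# $ind2 is the second indicator; @subfields is a list of [ code, data ] pairs.
sub check_041_lengths {
    my ( $ind2, @subfields ) = @_;

    # With ind2 = 7 the codes come from another source, so skip the length rule.
    return () if $ind2 eq '7';

    my @warnings;
    for my $pair (@subfields) {
        my ( $code, $data ) = @$pair;
        push @warnings, "041: Subfield _$code length is not divisible by 3 ($data)."
            if length($data) % 3 != 0;
    }
    return @warnings;
}

# Stacked codes ("engfre") pass; a four-character value is flagged.
print "$_\n" for check_041_lengths( ' ', [ a => 'engfre' ], [ h => 'engl' ] );
```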

check_043( $field )

Warns if any subfield $a is not exactly 7 characters. Validates each code against the MARC Code List for Geographic Areas (http://www.loc.gov/marc/) using the MARC::Lint::CodeData data pack for MARC::Lint (%GeogAreaCodes, %ObsoleteGeogAreaCodes).

check_245( $field )

 -Makes sure $a exists (and is first subfield).
 -Warns if last character of field is not a period
 --Follows LCRI 1.0C, Nov. 2003 rather than MARC21 rule
 -Verifies that $c is preceded by / (space-/)
 -Verifies that initials in $c are not spaced
 -Verifies that $b is preceded by :;= (space-colon, space-semicolon, space-equals)
 -Verifies that $h is not preceded by space unless it is dash-space
 -Verifies that data of $h is enclosed in square brackets
 -Verifies that $n is preceded by . (period)
  --As part of that, looks for no-space period, or dash-space-period (for replaced ellipses)
 -Verifies that $p is preceded by , (no-space-comma) when following $n and . (period) when following other subfields.
 -Performs rudimentary article check of 245 2nd indicator vs. 1st word of 245$a (for manual verification).

 Article checking is done by the internal _check_article method, which should work for 130, 240, 245, 440, 630, 730, and 830.
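One of the punctuation rules listed above, that $c must be preceded by space-slash, might look like this when written on its own. The sub name check_245_c_punctuation, the [ code, data ] pair representation, and the warning wording are assumptions for this sketch, not the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: given the subfields of a 245 field as [ code, data ]
# pairs in order, warn when $c is not preceded by " /".
sub check_245_c_punctuation {
    my (@subfields) = @_;

    my @warnings;
    for my $i ( 1 .. $#subfields ) {
        my ( $code, $data ) = @{ $subfields[$i] };
        next unless $code eq 'c';

        # The slash belongs at the end of whatever subfield comes before $c.
        my $previous = $subfields[ $i - 1 ][1];
        push @warnings, '245: Subfield _c must be preceded by /'
            unless $previous =~ m{\s/$};
    }
    return @warnings;
}

print "$_\n" for check_245_c_punctuation(
    [ a => 'Programming Perl' ],
    [ c => 'Larry Wall, Tom Christiansen & Jon Orwant.' ],
);
```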

_check_article

Check of articles is based on code from Ian Hamilton. This version is more limited in that it focuses on English, Spanish, French, Italian and German articles. Certain possible articles have been removed if they are valid English non-articles. This version also disregards 008_language/041 codes and just uses the list of articles to provide warnings/suggestions.

Source for articles: http://www.loc.gov/marc/bibliographic/bdapp-e.html

Should work with fields 130, 240, 245, 440, 630, 730, and 830. Reports an error if another field is passed in.
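The general idea of comparing the non-filing indicator with the first word of the title can be sketched as follows. The three-entry English article list, the sub names, and the warning wording are assumptions made for illustration; the module's own list is larger and covers more languages, and any result is a suggestion for manual verification.

```perl
use strict;
use warnings;

# Illustrative subset only: article => number of non-filing characters,
# counting the space that follows the article.
my %NONFILING = ( a => 2, an => 3, the => 4 );

# Hypothetical helper: suggested non-filing indicator for a title string.
sub suggest_nonfiling {
    my ($title) = @_;
    my ($first) = $title =~ /^([A-Za-z]+) /;
    return 0 unless defined $first;
    return $NONFILING{ lc $first } // 0;
}

# Hypothetical helper: returns a warning string, or nothing if they agree.
sub check_article_sketch {
    my ( $ind2, $title ) = @_;
    my $expected = suggest_nonfiling($title);
    return if $ind2 eq $expected;
    return "245: Second indicator is $ind2 but the first word suggests $expected.";
}

print "$_\n" for check_article_sketch( '0', 'The Big Book of Perl /' );
```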

SEE ALSO

Check the docs for MARC::Record. All software links are there.

TODO

LICENSE

This code may be distributed under the same terms as Perl itself.

Please note that these modules are not products of or supported by the employers of the various contributors to the code.
