BYU TRG > Convert-MRC > Convert::MRC

Download:
Convert-MRC-4.03.tar.gz

Dependencies

Annotate this POD

Website

View/Report Bugs
Module Version: 4.03   Source  

NAME ^

Convert::MRC - CONVERT MRC TO TBX-BASIC

VERSION ^

version 4.03

SYNOPSIS ^

        use strict;
        use warnings;

        my $converter = Convert::MRC->new;
        $converter->input_fh('/path/to/MRC/file.mrc');
        $converter->tbx_fh('/path/to/output/file.tbx');
        $converter->log_fh('/path/to/log/file.log');
        $converter->convert;

DESCRIPTION ^

MRC

The MRC format is fully described in an article by Alan K. Melby which appeared in Tradumatica. At an approximation, it is a file of tab-separated rows, each consisting of an ID, a data category, and a value to be stored for that category in the object with the given ID. The file should be sorted on its first column. If it is not, the converter may skip rows (if they are at too high a level) or end processing early (if the order of A-rows, C-rows, and R-rows is broken).

CONVERSION TO TBX-BASIC

This translator receives a file or list of files in this format and emits TBX-Basic, a standard format for terminology interchange. Incorrect or unusable input is skipped, with one exception, and the problem is noted in a log file. The outputs generally have the same filename as the inputs, and a suffix of .tbx and .warnings, but a number may be added to the filename to ensure the output filenames are unique.

The exception noted is this: If the user documents a party responsible for some change in the termbase, but does not state whether that party is a person or an organization, the party will be included in the TBX as a "respParty". This designation does not conform to the TBX-Basic standard and will need to be changed (to "respPerson" or "respOrg") before the file will validate. This is one of the circumstances in which the converter will output invalid TBX-Basic.

The other circumstance is that a file might not contain a definition, a part of speech, or a context sentence for some term, or might not contain a term itself. The converter detects these and warns about them, but there is no way it could fix them. It does not detect or warn about concepts containing no langSet or langSets containing no term, but these are also invalid.

NAME ^

Convert::MRC- Perl extension for converting MRC files into TBX-Basic.

METHODS ^

new

Creates and returns a new instance of Convert::MRC.

tbx_fh

Optional argument: string file path or GLOB

Sets and/or returns the file handle used to print the converted TBX.

log_fh

Optional argument: string file path or GLOB

Sets and/or returns the file handle used to log any messages.

input_fh

Optional argument: string file path or GLOB; '-' means STDIN

Sets and/or returns the file handle used to read the MRC data from.

batch

Processes each of the input files, printing the converted TBX file to a file with the same name and the suffix ".tbx". Warnings are also printed to a file with the same name and the suffix ".log".

convert

Converts the input MRC data into TBX-Basic:

SEE ALSO ^

AUTHOR ^

Nathan Rasmussen, Nathan Glenn <garfieldnate@gmail.com>

COPYRIGHT AND LICENSE ^

This software is copyright (c) 2013 by Alan K. Melby.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

syntax highlighting: