NAME

parsepica - fetch, parse and transform PICA+ data

SYNOPSIS

parsepica [options] [input file(s) or SRU server(s) and queries]

OPTIONS

 -input FILE     file with input files on each line ('-': STDIN)
 -files FILE     read input files from another file ('-': STDIN)
 -output FILE    print all valid records to a given file ('-': STDOUT)
 -xml [FILE]     print records in XML
 -pxml [FILE]    print records in pretty XML (with linebreaks)
 -pretty [FILE]  print records in pretty format
 -null           suppress record output
 -quiet          suppress logging
 -select FIELD   select a specific field or subfield (not if XML output)
 -count          print simple statistics
 -stats 0|1|2    print full statistics (1: fields, 2: subfields)
 -config FILE    read configuration from a file ('-': search default file)
 -auto           use default config file $PICASOURCE or ./pica.conf
 -log [FILE]     print logging to a given file ('-': STDOUT, default)
 -help           brief help message
 -limit N        limit the result set to N records (only for SRU)
 -man            full documentation with examples

DESCRIPTION

This script provides a simple command line client to fetch and transform PICA+ records. You can parse and transform local files (compressed .gz files can directly be read) or query records from a server via various protocols. You can also specify a configuration file for PICA::Source which includes a pointer to an SRU, Z39.50, PSI, or unAPI source.

The records can then be written to a file or STDOUT in PICA+ or PICA/XML format. Instead of writing full records you can select single PICA+ fields. Selecting fields with parsepica is around half as fast as using grep, but grep only matches text: it does not parse the records or check them for well-formedness.
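
To illustrate the comparison with grep, here is a minimal sketch of line-based field selection. It assumes plain PICA+ input with one field per line, where each line starts with the field tag, optionally followed by an occurrence such as "/01", and then a space. The function name select_field and the sample data are made up for this illustration. This is not how parsepica selects fields.

```shell
#!/bin/sh
# select_field TAG: print every line from STDIN that starts with the given
# PICA+ field tag, optionally followed by an occurrence such as "/01".
# This is plain text matching; nothing is parsed or validated.
select_field() {
    grep -E "^$1(/[0-9]+)? "
}

# Made-up sample data, one field per line.
printf '%s\n' \
    '003@ $0123456789' \
    '021A $aEin Beispiel$hJane Doe' \
    '028A $dJane$aDoe' |
    select_field 021A
```

A broken line that happens to start with the tag would pass through this filter unnoticed, which is the trade-off described above. The parsepica counterpart is the -select option, as in parsepica -se 021A -o - -q picadata.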

By default input is read from STDIN and written to STDOUT ('-') without logging. On request, logging information is printed to STDOUT or to a specified logfile. Records that cannot be parsed produce error messages on STDERR.

EXAMPLES

parsepica file1 -o file2: Read from 'file1' and print parseable records to 'file2'.
parsepica file1 -px file2.xml: Parse from 'file1' and pretty print XML format to 'file2.xml'.
parsepica http://gso.gbv.de/sru/DB=2.1/ pica.isb=3-423-31039-1: Get records with ISBN 3-423-31039-1 via SRU.
parsepica -c pica.isb=3-423-31039-1: Get records with ISBN 3-423-31039-1 via SRU if the default config file contains SRU=http://gso.gbv.de/sru/DB=2.1/.
parsepica -se 021A -o - -q picadata: Select all fields '021A' from 'picadata' and write to STDOUT.
parsepica -log -count -null file1: Parse from 'file1' and count fields.
parsepica -log -stat 2 file1: Parse from 'file1' and print detailed statistics.

LIMITATIONS

Error handling for broken records is not fully implemented. If you want to parse PICA+ records downloaded via WinIBW, you may need to first clean them with the script winibw2pica.

The limit parameter should also be implemented for sources other than SRU, and an offset parameter would be useful. Fetching records via protocols other than SRU has not been tested. The statistics method can be improved a lot.

AUTHOR

Jakob Voss <jakob.voss@gbv.de>

