
get_entity_Publication

Annotators attach publications to ProteinSequences. The criteria we have used to gather such connections are a bit nonstandard. We have sought to attach a publication to a ProteinSequence when the publication includes an expert asserting a belief or estimate of function. The paper may not be the original characterization. Further, it may not even discuss a sequenced protein (much of the literature is very valuable, but reports work on proteins in strains that have not yet been sequenced). On the other hand, reports of sequencing regions of a chromosome (with no specific assertion of a clear function) should not be attached. Each attached publication gives an ID (usually a PubMed ID), a URL to the paper (when we have it), and a title (when we have it).

Example:

    get_entity_Publication -a < ids > table.with.fields.added

would read in a file of ids and add a column for each field in the entity.

The standard input should be a tab-separated table (i.e., each line is a tab-separated set of fields). Normally, the last field in each line would contain the id. If some other column contains the id, use

    -c N

where N is the column (counting from 1) that contains the id.
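
The column rule above can be illustrated with a short Python sketch. This is not the command's actual implementation; the function name pick_id and the sample row are made up for illustration, and the only behavior modeled is "last field by default, column N counted from 1 when given".

```python
def pick_id(line, column=None):
    """Return the id field from one tab-separated line.

    column is 1-based, mirroring the -c N option; None means
    "use the last field", which is the documented default.
    """
    fields = line.rstrip("\n").split("\t")
    if column is None:
        return fields[-1]
    return fields[column - 1]


# Made-up sample row: an id in the first column, a note in the second.
row = "12345678\tsome note\n"
print(pick_id(row))            # default: the last field
print(pick_id(row, column=1))  # as with -c 1: the first field
```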

This is a pipe command. The input is taken from the standard input, and the output is to the standard output.

The Publication entity has the following relationship links:

Concerns ProteinSequence

Command-Line Options

-c Column

Use the specified column to define the id of the entity to retrieve.

-h

Display a list of the fields available for use.

-fields field-list

Choose a set of fields to return. Field-list is a comma-separated list of strings. The following fields are available:

title
pubdate

Output Format

The standard output is a tab-delimited file. It consists of the input file with an extra column added for each requested field. Input lines that cannot be extended are written to stderr.
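
As a rough model of the output rule above, the following Python sketch extends each input line with the requested fields and separates out the lines that cannot be extended. It is an assumption-laden illustration, not the real command: the function extend_rows, the in-memory lookup dictionary (standing in for the actual entity store), and all sample ids and values are hypothetical.

```python
import sys


def extend_rows(lines, lookup, fields, column=None):
    """Append the requested fields to each tab-separated line.

    lookup is a hypothetical dict mapping id -> {field: value}; the
    real command queries a database instead. Returns two lists:
    the extended lines, and the lines whose id was not found.
    """
    extended, rejected = [], []
    for line in lines:
        row = line.rstrip("\n").split("\t")
        # Last field by default, or the 1-based column when given.
        key = row[-1] if column is None else row[column - 1]
        record = lookup.get(key)
        if record is None:
            rejected.append("\t".join(row))
            continue
        extra = [record.get(name, "") for name in fields]
        extended.append("\t".join(row + extra))
    return extended, rejected


# Made-up data: one id that resolves and one that does not.
lookup = {"111": {"title": "Example title", "pubdate": "2001"}}
rows = ["geneA\t111\n", "geneB\t999\n"]

ok, bad = extend_rows(rows, lookup, ["title", "pubdate"])
for line in ok:
    print(line)                   # extended lines go to stdout
for line in bad:
    print(line, file=sys.stderr)  # unextendable lines go to stderr
```

Keeping the two streams separate is what lets the command sit in a pipeline: downstream commands see only rows that were successfully extended.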