
protein_families_to_proteins

protein_families_to_proteins can be used to access the set of proteins (i.e., the set of MD5 values) represented by each of a set of protein_families. We define protein_families as sets of fids (rather than sets of MD5s). This may, or may not, be a mistake.

Example:

    protein_families_to_proteins [arguments] < input > output

The standard input should be a tab-separated table (i.e., each line is a tab-separated set of fields). Normally, the last field in each line contains the protein family identifier. If another column contains the identifier, use

    -c N

where N is the column (counting from 1) that contains the protein family identifier.
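The column numbering can be illustrated without the script itself. The following is a minimal sketch that uses awk as a stand-in to show which field would be read by default and which with -c 1; the table contents (famA, famB, and the notes) are made-up placeholders, not real identifiers.

```shell
# Two-line, tab-separated sample table; all values are hypothetical placeholders.
sample=$(printf 'famA\tnote one\nfamB\tnote two\n')

# Default behaviour: the identifier is taken from the last field of each line.
last_field=$(printf '%s\n' "$sample" | awk -F'\t' '{ print $NF }')

# Equivalent of "-c 1": the identifier is taken from column 1 (counting from 1).
first_field=$(printf '%s\n' "$sample" | awk -F'\t' -v col=1 '{ print $col }')

printf 'last field:\n%s\n' "$last_field"
printf 'column 1:\n%s\n' "$first_field"
```

For this sample the identifiers are in the first column, so the script would need -c 1 to find them.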

This is a pipe command. The input is taken from the standard input, and the output is to the standard output.

Documentation for underlying call

This script is a wrapper for the CDMI-API call protein_families_to_proteins. It is documented as follows:

  $return = $obj->protein_families_to_proteins($protein_families)

Parameter and return types

    $protein_families is a protein_families
    $return is a reference to a hash where the key is a protein_family and the value is a proteins
    protein_families is a reference to a list where each element is a protein_family
    protein_family is a string
    proteins is a reference to a list where each element is a protein
    protein is a string

Command-Line Options

-c Column

This is used only if the column containing the protein family identifier is not the last column.

-i InputFile

Use InputFile, rather than stdin, as the input.

Output Format

The standard output is a tab-delimited file. Each input line produces one output line for each protein in the family, with the protein appended as an extra field at the end of the line.

Input lines that cannot be extended are written to stderr.
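
The shape of the output can be sketched with a small simulation. This is not the script's implementation: awk stands in for it, and a hand-written lookup table stands in for the CDMI call. The function name expand, the family names, and the md5_* values are all hypothetical.

```shell
# Hypothetical family-to-protein table (family<TAB>protein); placeholder values only.
lookup=$(printf 'famA\tmd5_1\nfamA\tmd5_2\nfamB\tmd5_3\n')

# Input table with the family identifier in the last column (the default).
input=$(printf 'row1\tfamA\nrow2\tfamB\nrow3\tfamZ\n')

# expand: reads the input table on stdin; $1 is the lookup table text.
expand() {
  LOOKUP="$1" awk -F'\t' -v OFS='\t' '
    BEGIN {
      # Load the lookup table from the environment into per-family lists.
      n = split(ENVIRON["LOOKUP"], rows, "\n")
      for (i = 1; i <= n; i++) {
        split(rows[i], f, "\t")
        count[f[1]]++
        prot[f[1], count[f[1]]] = f[2]
      }
    }
    {
      id = $NF
      # Lines whose family has no proteins cannot be extended: send them to stderr.
      if (!(id in count)) { print > "/dev/stderr"; next }
      # Otherwise emit one output line per protein, appended as the last field.
      for (j = 1; j <= count[id]; j++) print $0, prot[id, j]
    }'
}

out=$(printf '%s\n' "$input" | expand "$lookup" 2>/dev/null)
err=$(printf '%s\n' "$input" | expand "$lookup" 2>&1 >/dev/null)

printf 'stdout:\n%s\n' "$out"
printf 'stderr:\n%s\n' "$err"
```

In this sketch the famA line is written twice (once per protein), the famB line once, and the famZ line goes to stderr unchanged because no proteins are known for it.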