Mecom - A Perl module for the evolutionary analysis of protein contact interfaces
Version 1.11
    use Mecom;

    # Create the object
    my $coe = Mecom->new(
        pdb       => 'pdb/files/path/2occ.pdb',
        alignment => 'aln/files/path/chainM.aln',
        chain     => 'M',
    );

    # Run calcs
    $coe->run;

    # Write HTML Report
    open my $rep, '>', 'report.html' or die "Cannot write report.html: $!";
    print {$rep} $coe->run_report;
    close $rep;
This module integrates a workflow for assessing the evolvability of the contact interfaces within a protein complex. The method Mecom->run launches the whole analysis. The workflow is divided into the following steps:
Mecom->run_struct
Mecom->run_filtering
Mecom->run_subalign
Mecom->run_yang
Mecom->run_stats1
A detailed explanation of these methods is given below.
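The step list above can be sketched as a small driver that calls each step in turn. This is a minimal sketch, not part of Mecom: run_steps is a hypothetical helper, the execution order is assumed to match the order in which the steps are listed, and it relies on each step returning a true value on success, as documented below.

```perl
use strict;
use warnings;

# Step methods named in the documentation, in the order they are listed.
my @STEPS = qw(run_struct run_filtering run_subalign run_yang run_stats1);

# run_steps is a hypothetical helper, not part of Mecom.
# It calls each step on $obj and stops at the first one that fails.
sub run_steps {
    my ($obj) = @_;
    for my $step (@STEPS) {
        my $method = $obj->can($step)
            or die "object does not implement $step\n";
        $obj->$method() or die "step $step did not return a true value\n";
    }
    return 1;
}
```

With an object built by Mecom->new, the helper would be called as run_steps($coe). Running the steps one by one like this makes it possible to inspect or adjust attributes between steps, which Mecom->run does not allow.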
$obj = Mecom->new(%input_data);
The new class method constructs a new Mecom object. The returned object can be used to perform several evolutionary analyses. new accepts the following parameters, passed as a hash (%input_data above):
pdbfilepath (required if contactfile is missing)
A valid pdb file path to be opened for reading.
contactfilepath (required if pdb is missing)
A valid contact file path. This file must contain the structural information retrieved by a previous analysis of the same chain.
alignfilepath (required)
A valid DNA multiple alignment file path. The alignment must correspond to the specified chain, and its length in nucleotides must be at least three times the number of residues in the PDB chain.
chain (required)
The identifier of the subunit (chain) of interest within the studied complex.
pth (default 4 Angstroms)
Proximity threshold. The maximum distance between two residues for them to be considered a contact pair.
sth (default 0.05)
Exposure threshold. The maximum exposure fraction for a residue to be considered buried.
sthmargin (default 0)
An error margin for sth. For instance, with sth at 0.05 and sthmargin set to 0.01, residues with an exposure higher than 0.06 are considered exposed, those with an exposure lower than 0.04 are considered buried, and those with an exposure between 0.04 and 0.06 are not considered.
contactwith
A string of valid chain identifiers separated by commas:
$contactwith = "A,B,D";
If it is set, the program will only consider as contact residues those in close proximity to the specified chains. The others will be excluded.
informat (default fasta)
Specify the format of the input alignment file. Supported formats include fasta, genbank, embl, swiss (SwissProt), Entrez Gene and tracefile formats such as abi (ABI) and scf. There are many more, for a complete listing see the SeqIO HOWTO (http://bioperl.open-bio.org/wiki/HOWTO:SeqIO).
If no format is specified and a filename is given then the module will attempt to deduce the format from the filename suffix. If there is no suffix that Bioperl understands then it will attempt to guess the format based on file content. If this is unsuccessful then SeqIO will throw a fatal error.
The format name is case-insensitive: 'FASTA', 'Fasta' and 'fasta' are all valid.
Currently, the tracefile formats (except for SCF) require installation of the external Staden "io_lib" package, as well as the Bio::SeqIO::staden::read package available from the bioperl-ext repository.
oformat (default clustalw)
Specify the format of the output sub-alignments. The accepted values are described under informat above.
gc (default 0)
The genetic code. The attribute must be one of the following integers, which correspond with the indicated genetic code:
0: Standard
1: Mammalian mitochondrial
2: Yeast mitochondrial
3: Mold mitochondrial
4: Invertebrate mitochondrial
5: Ciliate nuclear
6: Echinoderm mitochondrial
7: Euplotid mitochondrial
8: Alternative yeast nuclear
9: Ascidian mitochondrial
10: Blepharisma nuclear
These codes correspond to transl_table 1 to 11 of GenBank.
ocontact (default ocontact)
A valid file path to write the structural results
dsspbin (default dssp)
The path to the DSSP binary
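The interplay between sth and sthmargin described above can be sketched as a three-way classification. This is an illustration only: classify_exposure is a hypothetical helper, not a Mecom function, and the handling of values that fall exactly on a boundary is an assumption, since the documentation does not specify it.

```perl
use strict;
use warnings;

# classify_exposure is a hypothetical helper, not part of Mecom.
# $exposure  : exposed fraction of a residue (0..1)
# $sth       : exposure threshold (documented default 0.05)
# $sthmargin : error margin around $sth (documented default 0)
sub classify_exposure {
    my ($exposure, $sth, $sthmargin) = @_;
    $sth       = 0.05 unless defined $sth;
    $sthmargin = 0    unless defined $sthmargin;

    # Without a margin, $sth is the maximum exposure of a buried residue.
    if ($sthmargin == 0) {
        return $exposure <= $sth ? 'buried' : 'exposed';
    }

    return 'exposed' if $exposure > $sth + $sthmargin;
    return 'buried'  if $exposure < $sth - $sthmargin;
    return 'ignored';    # inside the margin: residue is not considered
}
```

A non-zero margin trades coverage for confidence: residues whose exposure is too close to the threshold are dropped from both categories instead of being assigned to one of them arbitrarily.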
 Title   : run
 Usage   : $obj->run
 Function: Launch the whole workflow analysis
 Returns :
 Args    :

 Title   : run_struct
 Usage   : $obj->run_struct
 Function: Launch the structural analysis and store the result in the
           attribute "structdata"
 Returns : True if success
 Args    :
 Title   : run_filtering
 Usage   : $obj->run_filtering
 Function: Build different categories of sets (Contact, NonContact ...)
           and set the attribute "lists" with the result
 Returns : True if success
 Args    :
 Title   : run_subalign
 Usage   : $obj->run_subalign
 Function: Build new alignments from the input chain alignment and the
           categories built by the run_filtering method. Stores the
           result in the "subalns" attribute
 Returns : True if success
 Args    :
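The idea of building a sub-alignment from a list of residue numbers can be illustrated on a single sequence. This is a sketch under stated assumptions, not the module's implementation: extract_codons is a hypothetical helper, and it assumes that residue number N of the chain corresponds to the N-th codon of the aligned coding sequence.

```perl
use strict;
use warnings;

# extract_codons is a hypothetical helper, not part of Mecom.
# $dna       : one aligned coding sequence (string of nucleotides)
# $positions : array ref of 1-based residue numbers to keep
sub extract_codons {
    my ($dna, $positions) = @_;
    my $sub = '';
    for my $pos (@$positions) {
        my $offset = 3 * ($pos - 1);    # first nucleotide of the codon
        die "residue $pos lies outside the sequence\n"
            if $pos < 1 or $offset + 3 > length $dna;
        $sub .= substr($dna, $offset, 3);
    }
    return $sub;
}
```

Applying the same list of positions to every sequence of the alignment yields one sub-alignment per category. The range check also reflects why the input alignment has to be at least three times as long as the chain.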
 Title   : run_yang
 Usage   : $obj->run_yang
 Function: Launch PAML for each alignment stored in the "subalns"
           attribute and store the results in "pamlres"
 Returns : True if success
 Args    :
 Title   : run_stats1
 Usage   : $obj->run_stats1
 Function: Run a Z-Test with the obtained evolutionary data and store
           the results in the "stats" attribute
 Returns : True if success
 Args    :
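The documentation does not state which quantities run_stats1 compares. As a generic illustration only, a two-sample Z statistic for two estimates with known standard errors can be computed as follows. z_score is a hypothetical helper, and treating the inputs as independent estimates with standard errors is an assumption, not a description of what Mecom does.

```perl
use strict;
use warnings;

# z_score is a hypothetical helper, not part of Mecom.
# It compares two independent estimates given their standard errors:
#   z = (a - b) / sqrt(se_a^2 + se_b^2)
sub z_score {
    my ($est_a, $se_a, $est_b, $se_b) = @_;
    my $denom = sqrt($se_a**2 + $se_b**2);
    die "standard errors must not both be zero\n" if $denom == 0;
    return ($est_a - $est_b) / $denom;
}
```

An absolute value of z above 1.96 corresponds to a two-sided significance level of 5% under the normal approximation.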
 Title   : run_report
 Usage   : $obj->run_report
 Function: Write a HTML report
 Returns : [String] HTML report with the results and input data
 Args    :
 Title   : cat_aln
 Usage   : $obj->cat_aln(@alns)
 Function: Concatenates alignment objects. Sequences are identified by
           id. An error will be thrown if the sequence ids are not
           unique in the first alignment. If any ids are not present or
           not unique in any of the additional alignments then those
           sequences are omitted from the concatenated alignment, and a
           warning is issued. An error will be thrown if any of the
           alignments are not flush, since concatenating such
           alignments is unlikely to make biological sense.
 Returns : A single Bio::SimpleAlign object
 Args    : A list of Bio::SimpleAlign objects
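The concatenation rules described for cat_aln can be sketched with plain hashes instead of Bio::SimpleAlign objects. This is a simplified stand-in for illustration: cat_aln_sketch is a hypothetical function, each alignment is assumed to be a hash reference mapping a sequence id to its aligned string, and the uniqueness check is implicit because hash keys cannot repeat.

```perl
use strict;
use warnings;

# cat_aln_sketch is a hypothetical stand-in, not part of Mecom.
# Each alignment is a hash ref: sequence id => aligned string.
sub cat_aln_sketch {
    my @alns = @_;
    die "need at least one alignment\n" unless @alns;

    # Every alignment must be flush: all sequences of equal length.
    for my $aln (@alns) {
        my %len;
        $len{ length($_) } = 1 for values %$aln;
        die "alignment is not flush\n" if scalar(keys %len) > 1;
    }

    my ($first, @rest) = @alns;
    my %cat = %$first;
    for my $aln (@rest) {
        for my $id (sort keys %cat) {
            if (exists $aln->{$id}) {
                $cat{$id} .= $aln->{$id};
            }
            else {
                warn "sequence $id missing; omitted from the result\n";
                delete $cat{$id};
            }
        }
    }
    return \%cat;
}
```

Dropping a sequence that is missing from any input keeps the result flush, which is the property the real method protects by raising an error on non-flush input.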
Once each analysis has been performed, the resulting data are stored in other settable attributes:
structdata
[Array] A table with the structural information calculated by Mecom::Contact.pm and DSSP
lists
[Hash] Each item contains the list of residue numbers that belong to one category of residue. The key of each item is the name of the category:
Contact
NonContact
ExposedNonContact
ContactWith_$specified_chains
[...]
subalns
[Hash] Each item contains a sub-alignment for a given category (see above)
pamlres
[Hash] Results of the evolutionary analysis. Each item contains the results for a given sub-alignment (see above)
stats
[Hash] Statistical results
All attributes are accessible and mutable through methods named get_<attribute> and set_<attribute>, respectively. For example:
    # Set the proximity threshold ("pth") to 3 Angstroms
    $obj->set_pth(3);

    # Print the current value of the attribute "pth"
    print $obj->get_pth;
The processed data are also stored in attributes, so the same kind of methods can be used to access and modify the results.
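Once a result attribute has been read, it can be processed like any other Perl data structure. The sketch below counts the residues in each category of a hash shaped like the "lists" attribute. count_by_category is a hypothetical helper, and obtaining the hash through a getter named get_lists is an assumption based on the accessor naming pattern described above.

```perl
use strict;
use warnings;

# count_by_category is a hypothetical helper, not part of Mecom.
# $lists : hash ref shaped like the "lists" attribute,
#          category name => array ref of residue numbers
sub count_by_category {
    my ($lists) = @_;
    my %count;
    for my $category (keys %$lists) {
        $count{$category} = scalar @{ $lists->{$category} };
    }
    return \%count;
}
```

A quick count per category is a cheap sanity check before the slower PAML step, for example to spot a category that ended up empty after filtering.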
Hector Valverde, <hvalverde@uma.es>
Juan Carlos Aledo, <caledo@uma.es>
Please report any bugs or feature requests to bug-Mecom-Complex at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Mecom-Complex. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
This module is the core of the MECOM Perl program. Further information about this project is available at:
http://mecom.hval.es/
You can find documentation for this module with the UNIX man command.
man Mecom
Copyright 2013 Hector Valverde and Juan Carlos Aledo.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.
To install Mecom, copy and paste the appropriate command into your terminal.
cpanm
cpanm Mecom
CPAN shell
perl -MCPAN -e shell
install Mecom
For more information on module installation, please visit the detailed CPAN module installation guide.