fafind-eq-seq - find equal sequences
./fafind-eq-seq [--help] [--eval 'perlcode'] <file1> [<file2> ... <fileN>] >file_with_results.txt
./fafind-eq-seq [--help] [--eval 'perlcode'] --filter <file1> [<file2> ... <fileN>] >file_with_only_unique_seqs.fasta
Find identical / equal sequences in a given set of fasta files. Info messages go to standard error (stderr), results to standard output (stdout).
The result output (file_with_results.txt in the first command above) consists of lines following the pattern
<ID> <DESCRIPTION><TAB><FILE> <TAB><ID> <DESCRIPTION><TAB><FILE> <TAB><ID> <DESCRIPTION><TAB><FILE> <ID> <DESCRIPTION><TAB><FILE> <TAB><ID> <DESCRIPTION><TAB><FILE> <TAB><ID> <DESCRIPTION><TAB><FILE> <ID> <DESCRIPTION><TAB><FILE> <TAB><ID> <DESCRIPTION><TAB><FILE> <TAB><ID> <DESCRIPTION><TAB><FILE>
where each unindented line, together with the <TAB>-indented lines that follow it, marks one group of identical sequences.
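For downstream processing, the grouped output described above can be read back into a nested structure. The following Python sketch is not part of fafind-eq-seq; the function name parse_groups and the sample text are hypothetical, and it assumes that the file name is the last <TAB>-separated field on each line.

```python
def parse_groups(text):
    """Parse grouped result text into a list of groups.

    Each group is a list of (id_and_description, file) tuples. An
    unindented line starts a new group; TAB-indented lines are added
    to the group opened by the preceding unindented line.
    """
    groups = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        indented = line.startswith("\t")
        body = line.lstrip("\t")
        if "\t" in body:
            # Assumption: the file name is the last TAB-separated field.
            header, filename = body.rsplit("\t", 1)
        else:
            header, filename = body, ""
        entry = (header.strip(), filename.strip())
        if indented and groups:
            groups[-1].append(entry)
        else:
            groups.append([entry])
    return groups


# Hypothetical sample in the documented layout: two groups.
sample = (
    "seq1 first protein\ta.fa\n"
    "\tseq9 other name\tb.fa\n"
    "seq2 second protein\ta.fa\n"
    "\tseq7 x\tb.fa\n"
    "\tseq8 y\tc.fa\n"
)

for group in parse_groups(sample):
    print(len(group), [entry[0].split()[0] for entry in group])
```

Each printed line shows the size of one group and the ids of its members.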
--filter
Do not print the groups; print the sequences in fasta format instead. Duplicated sequences are omitted. The resulting fasta output is not checked for duplicate ids or similar problems.
Synonyms: -f
--help
Display this message.
Synonyms: -?, -h
--eval 'perlcode'
Manipulate input sequences on the fly. The current sequence string is available in $_. This does not change the actual output sequence, e.g. when filtering.
This can be very handy for comparing amino acid sequences from two different files where one file marks the stop codon with * and the other does not:
./fafind-eq-seq --eval 's/\*$//' <file1> <file2> >file_with_results.txt
Synonyms: -e
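The interplay of --eval and --filter can be illustrated with a small Python sketch. This is not the script's implementation: the names unique_records and strip_stop are hypothetical, Perl's s/\*$// is mirrored with re.sub, and keeping the first occurrence of each sequence is an assumption (the description above only states that duplicates are omitted). The point is that the normalized string is used for comparison only, while the original sequence is what gets kept.

```python
import re


def unique_records(records, normalize=None):
    """Keep the first record of each distinct (normalized) sequence."""
    seen = set()
    kept = []
    for seq_id, seq in records:
        # The normalized string serves only as the comparison key.
        key = normalize(seq) if normalize else seq
        if key not in seen:
            seen.add(key)
            kept.append((seq_id, seq))  # original sequence, not the key
    return kept


def strip_stop(seq):
    """Python counterpart of the Perl expression s/\\*$//."""
    return re.sub(r"\*$", "", seq)


# Hypothetical records: a1 and b1 differ only by the trailing '*'.
records = [("a1", "MKVL*"), ("b1", "MKVL"), ("b2", "MKAA")]

print(unique_records(records))
# [('a1', 'MKVL*'), ('b1', 'MKVL'), ('b2', 'MKAA')]
print(unique_records(records, strip_stop))
# [('a1', 'MKVL*'), ('b2', 'MKAA')]
```

Without normalization all three records count as distinct; with the trailing * stripped for comparison, a1 and b1 are treated as equal, and a1 is kept with its * intact.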
jw bargsten, <joachim.bargsten at wur.nl>