Bio::Gonzales::Seq::IO - fast utility functions for sequence IO
    use Bio::Gonzales::Seq::IO qw( faslurp faspew fahash fasubseq faiterate );
faslurp reads all sequences from @filenames and returns them as a list in list context or as an array reference in scalar context. The sequences are stored as FAlite2::Entry objects.
Creates an iterator for the FASTA file $filename. The iterator lets you loop over the sequence file without reading all of its content at once. Iterator usage:
    while (my $sequence_object = $iterator->()) {
        # do something with the sequence object
    }
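The calling convention above (invoke the code reference until it returns a false value) can be illustrated with a minimal, self-contained sketch. The function make_fasta_iterator and the plain hash it returns are hypothetical stand-ins chosen for illustration. They are not the module's implementation, and the real iterator returns sequence objects rather than hashes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a FASTA read iterator (NOT the module's code).
# Returns a closure that yields one record per call and undef at end of input.
sub make_fasta_iterator {
    my ($fh) = @_;
    my $next_header;    # header line already read while finishing the previous record

    return sub {
        my $id = $next_header;
        undef $next_header;
        my $seq = '';

        while (my $line = <$fh>) {
            chomp $line;
            if ($line =~ /^>(\S+)/) {
                if (defined $id) {
                    # Start of the following record: remember it and stop here.
                    $next_header = $1;
                    last;
                }
                $id = $1;
            }
            elsif (defined $id) {
                $seq .= $line;
            }
        }

        return unless defined $id;    # no more records
        return { id => $id, seq => $seq };
    };
}

# Usage with an in-memory filehandle, so the sketch needs no input file.
my $fasta = ">s1\nACGT\nAC\n>s2\nGG\n";
open my $fh, '<', \$fasta or die "cannot open in-memory file: $!";

my $iterator = make_fasta_iterator($fh);
while (my $entry = $iterator->()) {
    print "$entry->{id}\t$entry->{seq}\n";
}
```

Because only one record is held in memory at a time, this pattern scales to files that are too large to slurp.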
    # ARRAY OF ARRAYS
    @ids_with_locations = ( [ $id, $begin, $end, $strand ], ... );
Config options can be:
    %c = (
        keep_id       => 1,    # keeps the original id of the sequence
        wrap          => 1,    # see further down
        relaxed_range => 1,    # substitute 0 or undef for $begin with '^' and for $end with '$'
    );
There are several possibilities for $begin and $end:
    GGCAAAGGA ATGATGGTGT GCAGGCTTGG CATGGGAGAC
    ^..........^                                (1,11) OR ('^', 11)
       ^.....................................^  (4,'$')
                          ^..............^      (21,35) { with wrap on: OR (-19,35) OR (-19, -5) }
                          ^..................^  (21,'$') { with wrap on: OR (-19,'$') }
wrap: By default, all negative values are limited to the sequence boundaries, so a negative begin is treated as 1 or '^' and a negative end is treated as '$'. With wrap on, negative values count back from the end of the sequence, as in the (-19, -5) example above.
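The coordinate rules described above can be sketched as a small helper. This is an illustration under stated assumptions, not the module's code: resolve_range is a hypothetical name, coordinates are taken to be 1-based and inclusive, and with wrap on a negative value v is assumed to map to length + v + 1, which matches the (-19, -5) and (21, 35) pair in the diagram for the 39-base example sequence.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Bio::Gonzales::Seq::IO): turn the symbolic
# or negative $begin/$end values into plain 1-based inclusive coordinates.
sub resolve_range {
    my ($len, $begin, $end, $wrap) = @_;

    # '^' marks the sequence start, '$' the sequence end.
    $begin = 1    if $begin eq '^';
    $end   = $len if $end   eq '$';

    if ($wrap) {
        # Assumption: negative values count back from the sequence end.
        $begin = $len + $begin + 1 if $begin < 0;
        $end   = $len + $end + 1   if $end < 0;
    }
    else {
        # Default: negative values are limited to the sequence boundaries.
        $begin = 1    if $begin < 0;
        $end   = $len if $end < 0;
    }

    return ($begin, $end);
}

my $seq = 'GGCAAAGGAATGATGGTGTGCAGGCTTGGCATGGGAGAC';    # 39 bases, as in the diagram

my ($from, $to) = resolve_range(length $seq, -19, -5, 1);
print "$from $to\n";    # prints "21 35"

# Extract the subsequence with Perl's 0-based substr.
print substr($seq, $from - 1, $to - $from + 1), "\n";
```

Resolving the symbolic and negative forms to plain coordinates first keeps the actual extraction step a single substr call.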
Does the same as faslurp, but returns a hash with the sequence ids as keys and the sequence objects as values.
"Spews" out the given sequences to a file. Every $seqN argument can be a hash reference with FAlite2::Entry objects as values, an array reference of FAlite2::Entry objects, or a plain FAlite2::Entry object.
Creates an iterator that writes the sequences to the given $filename or $fh.
    for my $sequence_object (@sequences) {
        $iterator->($sequence_object)
    }
    # DO NOT FORGET THIS, THIS CALL WILL CLOSE THE FILEHANDLE
    $iterator->();

    # this is equal to:
    $iterator->(@sequences);
    $iterator->();

    # or
    $iterator->(\@sequences);
    $iterator->();

    # DO NOT DO THIS:
    $iterator->();
The filehandle is not closed if you supply a filehandle $fh instead of a $filename.
    $Bio::Gonzales::Seq::IO::WIDTH = 60;    # sequence width in fasta output
    # but only if set to 'all_pretty' ('all' is default)
    $Bio::Gonzales::Seq::IO::SEQ_FORMAT = 'all_pretty';
jw bargsten, <joachim.bargsten at wur.nl>