

Bio::Community::IO::Driver::unifrac - Driver to read and write files in the (Fast)Unifrac format


   my $in = Bio::Community::IO->new( -file => 'unifrac_communities.txt', -format => 'unifrac' );

   # See Bio::Community::IO for more information


This Bio::Community::IO::Driver::unifrac driver reads and writes Unifrac environment files and FastUnifrac sample ID mapping files, whose format is described at http://bmf2.colorado.edu/unifrac/help.psp#env_file and http://bmf2.colorado.edu/fastunifrac/help.psp#sample_id_mapping_file. In this tab-delimited format, the first column is a sequence ID, the second is the name of a community, and the optional third column contains the number of observations of this sequence in the community. Multiple communities can be written to a single Unifrac-formatted file. Spaces are not supported in community names or member descriptions. Example:

  Sequence.1    Sample.1        1
  Sequence.1    Sample.2        2
  Sequence.2    Sample.1        15
  Sequence.3    Sample.1        2
  Sequence.4    Sample.2        8
  Sequence.5    Sample.1        4
  Sequence.6    Sample.3        1
  Sequence.6    Sample.2        1
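
The tab-delimited layout above can be read with a few lines of plain Perl. The following is a minimal sketch for illustration only, not the driver's actual implementation: parse_unifrac() is a hypothetical helper that does not exist in Bio::Community, and it returns a plain nested hash rather than Bio::Community objects. A missing third column is treated as a count of 1.

```perl
use strict;
use warnings;

# Parse Unifrac-style text into a nested hash: community => sequence ID => count.
# parse_unifrac() is a hypothetical helper, not part of Bio::Community.
sub parse_unifrac {
    my ($text) = @_;
    my %counts;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/;            # skip blank lines
        $line =~ s/^\s+|\s+$//g;             # trim surrounding whitespace
        my ($seq_id, $community, $count) = split /\t/, $line;
        $count = 1 unless defined $count;    # no third column: presence/absence
        $counts{$community}{$seq_id} += $count;
    }
    return \%counts;
}

my $text = join "\n",
    "Sequence.1\tSample.1\t1",
    "Sequence.1\tSample.2\t2",
    "Sequence.2\tSample.1\t15",
    "Sequence.6\tSample.3";

my $counts = parse_unifrac($text);
for my $community (sort keys %$counts) {
    for my $seq_id (sort keys %{ $counts->{$community} }) {
        print join("\t", $community, $seq_id, $counts->{$community}{$seq_id}), "\n";
    }
}
```

Keying the hash by community first mirrors the fact that a single file can describe several communities.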

For each Bio::Community::Member $member generated from a Unifrac file, $member->desc() contains the content of the first field, i.e. the first column. Since the Unifrac format does not specify a member ID, one is automatically generated and can be retrieved using $member->id().

Note that member counts (the third column) are optional. Example:

  Sequence.1    Sample.1
  Sequence.1    Sample.2
  Sequence.2    Sample.1
  Sequence.3    Sample.1
  Sequence.4    Sample.2
  Sequence.5    Sample.1
  Sequence.6    Sample.3
  Sequence.6    Sample.2

In this case, the data are interpreted as presence/absence data. When reading a Unifrac file without counts, every member is given a count of 1. Conversely, when writing a Unifrac file, the third column is omitted if all members have a count of 1. Also, when writing Unifrac files, any space in a community name or member description is replaced by a dot.
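
The writing rules above can be sketched in plain Perl as follows. This is an illustration under stated assumptions, not the driver's code: format_unifrac() is a hypothetical helper that does not exist in Bio::Community, and each record is assumed to be an array reference holding a member description, a community name and a count.

```perl
use strict;
use warnings;

# Format records as Unifrac-style lines.
# format_unifrac() is a hypothetical helper, not part of Bio::Community.
# Each record is [ member description, community name, count ].
sub format_unifrac {
    my (@records) = @_;
    # Keep the count column unless every count is exactly 1
    my $write_counts = grep { $_->[2] != 1 } @records;
    my @lines;
    for my $record (@records) {
        my ($desc, $community, $count) = @$record;
        # Spaces are not supported, so replace each one with a dot
        tr/ /./ for $desc, $community;
        my @fields = ($desc, $community);
        push @fields, $count if $write_counts;
        push @lines, join("\t", @fields);
    }
    return join '', map { "$_\n" } @lines;
}

print format_unifrac(
    [ 'Sequence 1', 'Gut sample', 1 ],
    [ 'Sequence 2', 'Gut sample', 1 ],
);
```

Because the count column is dropped only when every count is 1, a single member with a different count keeps the third column for all lines.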


See Bio::Community::IO.


Florent Angly florent.angly@gmail.com


User feedback is an integral part of the evolution of this and other Bioperl modules. Please direct usage questions or support issues to the mailing list, bioperl-l@bioperl.org, rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

If you have found a bug, please report it on the BioPerl bug tracking system to help us keep track of the bugs and their resolution: https://redmine.open-bio.org/projects/bioperl/


Copyright 2011-2014 by Florent Angly <florent.angly@gmail.com>

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.1 or, at your option, any later version of Perl 5 you may have available.
