NAME

CracTools - A set of tools designed to extract data from CRAC's SAM files and to provide annotations.

VERSION

version 1.251

DESCRIPTION

CracTools-core is the cornerstone of CracTools. It is a toolbox that aims to ease the building of bioinformatics pipelines. It was originally built to produce pipelines on top of the CRAC software, but you can use the CracTools-core tools in other contexts. It has many built-in features for parsing files, intersecting biological events, integrating annotations, and sharing configuration.

CracTools-core is also shipped with some binaries that are directly based on the CracTools-core API:

cractools extract: this tool extracts biological events (splices, SNPs, indels, chimeras) from BAM files produced by CRAC's analysis.
cractools gtf2togff3: this tool converts GTF annotation files to GFF3, which is the standard annotation format in CracTools.
More tools are coming soon.

SPECIFICITIES

In CracTools, strands are encoded as 1 and -1 for forward and reverse, respectively. CracTools works on closed intervals [a,b] in a 0-based coordinate system.
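
These conventions matter when data comes from formats that use other coordinate systems. The sketch below is not part of the CracTools API; all helper names are hypothetical. It shows how 1-based closed coordinates (as in GFF/GTF) and 0-based half-open coordinates (as in BED) could be mapped to 0-based closed intervals, and how a '+'/'-' strand symbol could be mapped to 1/-1.

```perl
use strict;
use warnings;

# Hypothetical helper: map a '+'/'-' strand symbol to the 1/-1 encoding.
sub strand_to_int {
    my ($symbol) = @_;
    return $symbol eq '+' ? 1 : $symbol eq '-' ? -1 : undef;
}

# Hypothetical helper: 1-based closed interval (GFF/GTF style)
# to a 0-based closed interval [a,b].
sub one_based_to_closed0 {
    my ($start, $end) = @_;
    return ($start - 1, $end - 1);
}

# Hypothetical helper: 0-based half-open interval (BED style)
# to a 0-based closed interval [a,b].
sub half_open_to_closed0 {
    my ($start, $end) = @_;
    return ($start, $end - 1);
}

# Two closed intervals overlap when each one starts
# no later than the other one ends.
sub closed_overlap {
    my ($a1, $b1, $a2, $b2) = @_;
    return ($a1 <= $b2 && $a2 <= $b1) ? 1 : 0;
}

my @from_gff = one_based_to_closed0(101, 200);
my @from_bed = half_open_to_closed0(100, 200);
print "GFF 101..200 -> [$from_gff[0],$from_gff[1]]\n";
print "BED 100..200 -> [$from_bed[0],$from_bed[1]]\n";
print "strand '-'   -> ", strand_to_int('-'), "\n";
```

With closed intervals, [100,199] and [199,250] overlap because they share position 199, whereas the same numbers read as half-open intervals would not.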

MODULES

File parsing

CracTools::Utils

Is a module that provides useful functions for opening files (I/O) with iterators, simple parsing of standard file formats (VCF, BED, GTF, GFF), and performing transformations such as reverse-complementing.
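
To illustrate what reverse-complementing means, here is a minimal stand-alone sketch. It is not the CracTools::Utils implementation, and the function name reverse_complement is hypothetical; it only handles the four standard bases.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper, not the CracTools::Utils function.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;           # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base, preserving case
    return $rc;
}

print reverse_complement("ATGCC"), "\n";   # prints GGCAT
```

Characters outside A, C, G and T (for example N) are left unchanged by this sketch.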

CracTools::SAMReader and CracTools::SAMReader::SAMline

Are modules that provide iterators and objects to easily read SAM/BAM files generated by CRAC, along with dedicated methods to extract the additional fields that CRAC adds to each record.

CracTools::GFF::Annotation

Is a module to parse and access GFF3 files.

Genomic-based data structures

CracTools::Interval::Query

Is a module to store and query genomic intervals associated with variables. It is based on the interval tree data structure provided by Set::IntervalTree.

CracTools::Interval::Query::File

Acts like CracTools::Interval::Query but reads intervals from files and returns the lines of the file that match the query. It has built-in methods to parse SAM, G{T|F}F, BED and VCF files, but you can provide your own method for other file formats.

CracTools::GenomeMask

Is a module that defines a BitVector mask over a whole genome and provides methods to query this mask. It can read the genome sequences and their lengths from various sources (SAM headers, CRAC index, user input).
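
The idea of a genome-wide bit mask can be sketched as one bit string per sequence, with one bit per position. The example below does not use the CracTools::GenomeMask API: it relies on Perl's built-in vec function instead, and the chromosome names, lengths and helper names are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sequence lengths; a real mask would read them
# from a SAM header, an index, or user input.
my %length = (chr1 => 1000, chr2 => 500);

# One bit string per sequence, all bits initially 0.
my %mask;
for my $chr (keys %length) {
    $mask{$chr} = '';
    vec($mask{$chr}, $length{$chr} - 1, 1) = 0;   # extend to full length
}

# Set every position of the closed, 0-based interval [start,end].
sub set_region {
    my ($chr, $start, $end) = @_;
    vec($mask{$chr}, $_, 1) = 1 for $start .. $end;
}

# Count how many positions of [start,end] are set in the mask.
sub count_set {
    my ($chr, $start, $end) = @_;
    my $n = 0;
    $n += vec($mask{$chr}, $_, 1) for $start .. $end;
    return $n;
}

set_region('chr1', 100, 199);
print count_set('chr1', 150, 249), "\n";   # prints 50
```

A bit string costs one bit per genomic position, which keeps whole-genome masks compact compared with storing one Perl scalar per position.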

Annotation

CracTools::Annotator

Is a module based on CracTools::Interval::Query::File that provides powerful methods to query annotation files and prioritize hits to fit specific application needs.

Utilities

CracTools::Config

Is a module that aims to provide a common configuration file shared by all the cractools pipelines. It automatically loads the configuration file by looking in several locations, then provides methods to retrieve the variables declared in it.

CracTools::Output

Is a module that provides methods to write customized column-based output files with pre-defined headers.

AUTHORS

  • Nicolas PHILIPPE <nphilippe.research@gmail.com>

  • Jérôme AUDOUX <jaudoux@cpan.org>

  • Sacha BEAUMEUNIER <sacha.beaumeunier@gmail.com>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2017 by IRMB/INSERM (Institute for Regenerative Medicine and Biotherapy / Institut National de la Santé et de la Recherche Médicale) and AxLR/SATT (Languedoc-Roussillon / Société d'Accélération du Transfert de Technologie).

This is free software, licensed under:

  The GNU Affero General Public License, Version 3, November 2007