VCF.pm. Module for validation, parsing and creating VCF files. Supported versions: 3.2, 3.3, 4.0, 4.1, 4.2
From the command line:
    perl -MVCF -e validate example.vcf
    perl -I/path/to/the/module/ -MVCF -e validate_v32 example.vcf
From a script:
    use VCF;

    my $vcf = VCF->new(file=>'example.vcf.gz',region=>'1:1000-2000');
    $vcf->parse_header();

    # Do some simple parsing. This is the most thorough but slowest way to get the data.
    while (my $x=$vcf->next_data_hash())
    {
        for my $gt (keys %{$$x{gtypes}})
        {
            my ($al1,$sep,$al2) = $vcf->parse_alleles($x,$gt);
            print "\t$gt: $al1$sep$al2\n";
        }
        print "\n";
    }

    # This will split the fields and print a list of CHR:POS
    while (my $x=$vcf->next_data_array())
    {
        print "$$x[0]:$$x[1]\n";
    }

    # This will return the lines as they were read, including the newline at the end
    while (my $x=$vcf->next_line())
    {
        print $x;
    }

    # Only the columns NA00001, NA00002 and NA00003 will be printed.
    my @columns = qw(NA00001 NA00002 NA00003);
    print $vcf->format_header(\@columns);
    while (my $x=$vcf->next_data_array())
    {
        # this will recalculate AC and AN counts, unless $vcf->recalc_ac_an was set to 0
        print $vcf->format_line($x,\@columns);
    }

    $vcf->close();
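To make the ($al1,$sep,$al2) triple above more concrete, here is a minimal stand-alone sketch in plain Perl of how a diploid genotype string could be translated into allele sequences. It is not the module's implementation of parse_alleles: the helper name gt_to_alleles and its arguments are hypothetical, and it assumes VCF 4.x genotype syntax, where indices are separated by "/" or "|" and "." marks a missing allele.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of VCF.pm): translate a diploid GT string
# such as "0|2" into allele sequences. Index 0 refers to REF, indices
# 1, 2, ... refer to the comma-separated ALT alleles.
sub gt_to_alleles {
    my ($ref, $alt, $gt) = @_;
    my @alleles = ($ref, split /,/, $alt);
    my ($i1, $sep, $i2) = $gt =~ m{^(\d+|\.)([/|])(\d+|\.)$}
        or return;    # not a diploid genotype in the assumed syntax
    # Missing alleles ('.') are passed through unchanged.
    my @out = map { $_ eq '.' ? '.' : $alleles[$_] } ($i1, $i2);
    return ($out[0], $sep, $out[1]);
}

# Made-up record: REF=A, ALT=G,T, genotype 0|2
my ($al1, $sep, $al2) = gt_to_alleles('A', 'G,T', '0|2');
print "$al1$sep$al2\n";
```

The sketch returns an empty list for anything that does not look like a diploid genotype, so callers can test the result before printing.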

validate
    About : Validates the VCF file.
    Usage : perl -MVCF -e validate example.vcf.gz     # (from the command line)
            validate('example.vcf.gz');               # (from a script)
            validate(\*STDIN);
    Args  : File name or file handle. When no argument is given, the first command line argument is interpreted as the file name.

validate_v32
    About : Same as validate, but assumes VCF version 3.2.
    Usage : perl -MVCF -e validate_v32 example.vcf.gz     # (from the command line)
    Args  : File name or file handle. When no argument is given, the first command line argument is interpreted as the file name.

new
    About : Creates new VCF reader/writer.
    Usage : my $vcf = VCF->new(file=>'my.vcf', version=>'3.2');
    Args  : fh      .. Open file handle. If neither file nor fh is given, open in write mode.
            file    .. The file name. If neither file nor fh is given, open in write mode.
            region  .. Optional region to parse (requires tabix indexed VCF file)
            silent  .. Unless set to 0, warning messages may be printed.
            strict  .. Unless set to 0, the reader will die when the file violates the specification.
            version .. If not given, '4.0' is assumed. The header information overrides this setting.

open
    About : (Re)Open file. No need to call this explicitly unless reading from a different region is requested.
    Usage : $vcf->open();     # Read from the start
            $vcf->open(region=>'1:12345-92345');
    Args  : region .. Supported only for tabix indexed files

close
    About   : Close the filehandle
    Usage   : $vcf->close();
    Args    : none
    Returns : close exit status

next_line
    About : Reads next VCF line.
    Usage : my $vcf = VCF->new();
            my $x   = $vcf->next_line();
    Args  : none

next_data_array
    About : Reads next VCF line and splits it into an array. The last element is chomped.
    Usage : my $vcf = VCF->new();
            $vcf->parse_header();
            my $x = $vcf->next_data_array();
    Args  : Optional line to parse
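
The array reference described above can be pictured as the tab-separated fields of one data line with the trailing newline removed. The following is a minimal stand-alone sketch in plain Perl, not the module's actual code; the helper name split_vcf_line and the example line are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of VCF.pm): split one VCF data line
# into its tab-separated fields, dropping the trailing newline.
sub split_vcf_line {
    my ($line) = @_;
    chomp $line;
    # The -1 limit keeps trailing empty fields instead of discarding them.
    return [ split /\t/, $line, -1 ];
}

# Made-up line: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
my $line = join("\t", qw(1 1500 . A G 50 PASS DP=10 GT 0|1)) . "\n";
my $x    = split_vcf_line($line);
print "$$x[0]:$$x[1]\n";
```

As in the synopsis, element 0 holds the chromosome and element 1 the position.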

set_samples
    About : Parsing big VCF files with many sample columns is slow; skipping unwanted samples may speed things up a bit.
    Usage : my $vcf = VCF->new();
            $vcf->set_samples(include=>['NA0001']);   # Exclude all but this sample. When the array is empty, all samples will be excluded.
            $vcf->set_samples(exclude=>['NA0003']);   # Include all but this sample. When the array is empty, all samples will be included.
            my $x = $vcf->next_data_hash();
    Args  : include .. Array ref of sample names to parse; all other samples are skipped.
            exclude .. Array ref of sample names to skip; all other samples are parsed.
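
The include/exclude semantics can be illustrated with a small stand-alone sketch in plain Perl. It only models which sample names survive each kind of filter and is not how VCF.pm implements set_samples; the helper name select_samples is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of VCF.pm): given all sample names and
# either an include list or an exclude list, return the samples kept.
sub select_samples {
    my ($all, %args) = @_;
    if ( exists $args{include} ) {
        my %keep = map { $_ => 1 } @{ $args{include} };
        return [ grep { $keep{$_} } @$all ];    # empty include => nothing kept
    }
    my %drop = map { $_ => 1 } @{ $args{exclude} || [] };
    return [ grep { !$drop{$_} } @$all ];       # empty exclude => everything kept
}

my @samples = qw(NA00001 NA00002 NA00003);
print join(',', @{ select_samples(\@samples, include => ['NA00001']) }), "\n";
print join(',', @{ select_samples(\@samples, exclude => ['NA00003']) }), "\n";
```

The first call keeps only NA00001; the second keeps every sample except NA00003.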
To install VCF, copy and paste the appropriate command into your terminal.
cpanm
cpanm VCF
CPAN shell
perl -MCPAN -e shell
install VCF