BS_PCRTagger.pl
Version 2.00
This utility creates unique tags for open reading frames (ORFs) to aid the analysis of synthetic content in a nascent synthetic genome. Each tag in a gene has a wild type and a synthetic version that correspond to the same offset in the gene. Tags can be paired to form gene-specific amplicons that are also specific to either wild type or synthetic sequence, depending on which tags are used.
To pick tags for a chromosome, each open reading frame over I<MINORFLEN> base pairs long is slightly recoded to contain a set of PCR tags. The locations and sequences of these tags are chosen to maximize the selectivity of the tags for either wild type or synthetic sequence. Each wild type or synthetic tag and its reverse complement are unique in the entire wild type genome; this is accomplished by creating a BLAST database for the entire wild type genome and BLASTing each potential tag against it, which requires that a complete wild type genome be available in the BioStudio repository. Pairs of tags are selected so that they will not amplify any other genomic sequence under 1000 bases long.
Each synthetic counterpart to a wild type tag is recoded with GeneDesign's “most different” algorithm to guarantee maximum nucleotide sequence difference while maintaining identical protein sequence and, hopefully, minimizing any effect on gene expression. The synthetic tags are all at least I<MINPERDIFF> percent recoded from the wild type tags.
Each tag is positioned so that its first and last nucleotides fall on the wobble position of a codon whose wobble can be changed without changing its amino acid. This automatically excludes methionine and tryptophan, which have only one codon each in the standard genetic code, and it can exclude other amino acids when a I<MINRSCUVAL> filter is in place. The wobble restriction ensures that the synthetic and wild type counterparts have different 5’ and 3’ nucleotides, minimizing the chances that they (and their complements) will cross-prime.
Because both ends of a tag fall on wobble positions, tags are between I<MINTAGLEN> and I<MAXTAGLEN> base pairs long, and every tag length is a multiple of 3, plus 1. All tags have a melting temperature between I<MINTAGMELT> and I<MAXTAGMELT> so they can be used in a single set of PCR conditions. Tag pairs are chosen to form amplicons specific for each ORF, with at least one amplicon chosen per kilobase of ORF. Each amplicon is between I<MINAMPLEN> and I<MAXAMPLEN> base pairs long, ensuring that all amplicons fall within an easily identifiable range on an agarose gel. No amplicon is chosen within the first I<FIVEPRIMESTART> base pairs of an ORF, to avoid disrupting unknown regulatory features. Amplicons may not overlap each other by more than I<MAXAMPOLAP> percent.
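The tag rules described above (a length that is a multiple of 3, plus 1; differing 5’ and 3’ nucleotides; a minimum percent difference) can be illustrated with a few lines of Perl. This is a minimal sketch and not BioStudio's or GeneDesign's actual code: the function names percent_diff, is_valid_tag_length, and tags_are_compatible are hypothetical, the two sequences are made up for illustration, and the thresholds are simply the documented defaults. Uniqueness (BLAST) and melting temperature checks are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage of positions at which two
# equal-length sequences differ (case-insensitive).
sub percent_diff {
    my ($wt, $syn) = @_;
    die "tags must be non-empty and of equal length\n"
        unless length($wt) && length($wt) == length($syn);
    my $diffs = 0;
    for my $i (0 .. length($wt) - 1) {
        $diffs++ if uc(substr($wt, $i, 1)) ne uc(substr($syn, $i, 1));
    }
    return 100 * $diffs / length($wt);
}

# A tag runs from one wobble position to another, so its length must
# be a multiple of 3, plus 1, and lie within the allowed range.
sub is_valid_tag_length {
    my ($len, $min, $max) = @_;
    return $len >= $min && $len <= $max && $len % 3 == 1;
}

# Hypothetical combined check for a wild type / synthetic tag pair.
sub tags_are_compatible {
    my ($wt, $syn, $min_len, $max_len, $min_per_diff) = @_;
    return 0 unless length($wt) == length($syn);
    return 0 unless is_valid_tag_length(length($wt), $min_len, $max_len);
    # The 5' and 3' nucleotides must differ between the two versions.
    return 0 if uc(substr($wt, 0, 1))  eq uc(substr($syn, 0, 1));
    return 0 if uc(substr($wt, -1, 1)) eq uc(substr($syn, -1, 1));
    return percent_diff($wt, $syn) >= $min_per_diff ? 1 : 0;
}

# Made-up 19-base example tags (not taken from any real genome).
my $wt_tag  = 'ATTGAGCAGAGGTCCAGTT';
my $syn_tag = 'TCTCTCTCGTGGACCTGTC';

printf "%.1f%% different, compatible: %s\n",
    percent_diff($wt_tag, $syn_tag),
    tags_are_compatible($wt_tag, $syn_tag, 19, 28, 33) ? 'yes' : 'no';
```

The arguments 19, 28, and 33 in the last call stand in for I<MINTAGLEN>, I<MAXTAGLEN>, and I<MINPERDIFF>.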
Required arguments:
  -C,  --CHROMOSOME : The chromosome to be modified
  -E,  --EDITOR     : The person responsible for the edits
  -ME, --MEMO       : Justification for the edits
Optional arguments:
  -SCA, --SCALE     : [genome, chrom (def)] Which version number to increment
  -SCO, --SCOPE     : [seg, chrom (def)] How much sequence the edit will affect. seg requires STARTPOS and STOPPOS
  -STA, --STARTPOS  : The first base for analysis; ignored unless SCOPE = seg
  -STO, --STOPPOS   : The last base for analysis; ignored unless SCOPE = seg
  --MINTAGMELT      : (default 58) Minimum melting temperature for tags
  --MAXTAGMELT      : (default 60) Maximum melting temperature for tags
  --MINPERDIFF      : (default 33) Minimum percent base pair difference between synthetic and wild type versions of a tag
  --MINTAGLEN       : (default 19) Minimum length for tags. Must be a multiple of 3, plus 1
  --MAXTAGLEN       : (default 28) Maximum length for tags. Must be a multiple of 3, plus 1
  --MINAMPLEN       : (default 200) Minimum span for a pair of tags
  --MAXAMPLEN       : (default 500) Maximum span for a pair of tags
  --MAXAMPOLAP      : (default 25) Maximum percentage of overlap allowed between different tag pairs
  --MINORFLEN       : (default 501) Minimum size of gene for tagging eligibility
  --FIVEPRIMESTART  : (default 101) The first base in a gene eligible for a tag
  --MINRSCUVAL      : (default 0.06) The minimum RSCU value for any replacement codon in a tag
  --OUTPUT          : [html, txt (def)] Format of reporting and output
  -h,   --help      : Display this message
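As an illustration of how the numeric options and their defaults fit together, the sketch below parses them with the core Getopt::Long module and applies the one rule stated in the list above (tag lengths must be a multiple of 3, plus 1). It is not the script's real option handling: the function name parse_tagger_options is hypothetical, and the checks that each minimum does not exceed its maximum are an assumption rather than documented behavior.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Defaults copied from the option list above.
my %DEFAULTS = (
    MINTAGMELT => 58,  MAXTAGMELT     => 60,
    MINPERDIFF => 33,  MINTAGLEN      => 19,
    MAXTAGLEN  => 28,  MINAMPLEN      => 200,
    MAXAMPLEN  => 500, MAXAMPOLAP     => 25,
    MINORFLEN  => 501, FIVEPRIMESTART => 101,
    MINRSCUVAL => 0.06,
);

# Assumption: these options may be fractional; the rest are integers.
my %IS_FLOAT = map { $_ => 1 } qw(MINTAGMELT MAXTAGMELT MINRSCUVAL);

# Hypothetical helper: parse the numeric options from a list of
# command-line words and return a hash reference of settings.
sub parse_tagger_options {
    my @args = @_;
    my %p    = %DEFAULTS;

    # Map each option spec ("NAME=i" or "NAME=f") to a slot in %p.
    my %spec;
    for my $name (keys %p) {
        my $type = $IS_FLOAT{$name} ? 'f' : 'i';
        $spec{"$name=$type"} = \$p{$name};
    }
    GetOptionsFromArray(\@args, %spec)
        or die "could not parse options\n";

    # Documented rule: tag lengths must be a multiple of 3, plus 1.
    for my $name (qw(MINTAGLEN MAXTAGLEN)) {
        die "$name must be a multiple of 3, plus 1\n"
            unless $p{$name} % 3 == 1;
    }

    # Assumed sanity checks: a minimum must not exceed its maximum.
    for my $pair ([qw(MINTAGLEN MAXTAGLEN)],
                  [qw(MINTAGMELT MAXTAGMELT)],
                  [qw(MINAMPLEN MAXAMPLEN)]) {
        my ($min, $max) = @$pair;
        die "$min must not exceed $max\n" if $p{$min} > $p{$max};
    }
    return \%p;
}

my $settings = parse_tagger_options('--MINTAGLEN', 22, '--MAXAMPLEN', 600);
print "$_ = $settings->{$_}\n" for sort keys %$settings;
```

Options that are not given keep their documented defaults, so only the values being changed need to appear on the command line.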