Download:
Bio-SDRS-0.08.tar.gz

NAME

sdrs.pl - Command line script that uses Bio::SDRS to run a Sigmoidal Dose Response Search.

SYNOPSIS

    sdrs.pl -multiple=1.05 \
            -step=20 \
            -ldose=0.4 \
            -hdose=25000 \
            -trim=0 \
            -significance=0.05 \
            -outdir=results \
            data/OVCAR4_HCS_avg.txt

OPTIONS

-multiple=<float>

Specifies the multiplicative factor by which the dose is increased during the search. It must be greater than one.

-ldose=<float>

Specifies the minimum dose for the search. Must be greater than zero.

-hdose=<float>

Specifies the maximum dose for the search. Must be greater than the ldose above.

-step=<float>

This value specifies the maximum change between successive doses in the search. The search starts at the ldose value. At each iteration it multiplies the current dose by the multiple value, but the increase is capped at the step value specified here. It must be positive.
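The interaction between -multiple, -step, -ldose and -hdose can be sketched as follows. This is only an illustration of the rule described above, not the Bio::SDRS implementation: the function name search_doses is hypothetical, and stopping once the dose exceeds hdose is an assumption.

```python
def search_doses(ldose, hdose, multiple, step):
    """Return the search doses from ldose up to hdose (illustrative only)."""
    if not (multiple > 1 and step > 0 and 0 < ldose < hdose):
        raise ValueError("need multiple > 1, step > 0 and 0 < ldose < hdose")
    doses = []
    dose = ldose
    # Assumption: the search stops once the dose exceeds hdose.
    while dose <= hdose:
        doses.append(dose)
        # Grow multiplicatively, but never by more than `step` at once.
        dose = min(dose * multiple, dose + step)
    return doses


# Option values taken from the SYNOPSIS.
grid = search_doses(ldose=0.4, hdose=25000, multiple=1.05, step=20)
print(len(grid), grid[:3], grid[-1])
```

With these values the search grows by 5% per iteration at low doses and switches to fixed increments of 20 once 5% of the current dose exceeds 20, that is, above a dose of 400.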

-maxproc=<integer>

Specifies the maximum number of processes to be used in the search.

-trim=<float>

This value specifies the proportion of assays to be omitted in the analysis.

First, the maximal response across the doses is calculated for every assay. Then the assays with the lowest maximal responses are removed from the analysis, in the proportion given by this value. It must be a real number between zero and one, where zero (the default) means all assays are analyzed and one means none are.

This factor is useful when the SDRS result will be fed into a multiple test correction procedure such as a false discovery rate (FDR) algorithm. For example, for genome-scale transcriptional dose response data, a non-zero value is suggested to remove non-expressed transcripts, which not only speeds up the process, but also improves the FDR output.
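The trimming rule can be pictured with a short sketch. It is not taken from Bio::SDRS: the function name trim_assays, the rounding of the number of dropped assays (floor) and the handling of ties are all assumptions made for illustration.

```python
import math


def trim_assays(responses, trim):
    """Drop the fraction `trim` of assays with the lowest maximal response.

    responses maps an assay name to its list of responses, one per dose.
    """
    if not 0 <= trim <= 1:
        raise ValueError("trim must be between zero and one")
    # Rank assays by their maximal response, lowest first.
    ranked = sorted(responses, key=lambda name: max(responses[name]))
    # Assumption: the number of dropped assays is rounded down.
    n_drop = math.floor(trim * len(ranked))
    kept = set(ranked[n_drop:])
    return {name: vals for name, vals in responses.items() if name in kept}


# Hypothetical assay names and values, for illustration only.
demo = {
    "assayA": [0.1, 0.2, 0.1],
    "assayB": [1.0, 2.5, 3.0],
    "assayC": [0.3, 0.2, 0.4],
    "assayD": [5.0, 4.0, 6.0],
}
print(sorted(trim_assays(demo, 0.5)))
```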

-significance=<float>

This value specifies the minimum permitted significance value for the F score cutoff. It must be between zero and one.

-outdir=<directory>

Specifies the directory where the results are written. If this directory does not exist, it is created.

-[no]debug

Controls display of debugging information. Normally should be left off.
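The constraints stated for the numeric options can be collected into one check, for example before launching a long run. This is a sketch only: check_options is a hypothetical helper, not part of sdrs.pl, and treating the bounds on -trim and -significance as inclusive is an assumption.

```python
def check_options(multiple, ldose, hdose, step, significance, trim=0.0):
    """Return a list of messages for option values that break the documented rules."""
    errors = []
    if not multiple > 1:
        errors.append("multiple must be greater than one")
    if not ldose > 0:
        errors.append("ldose must be greater than zero")
    if not hdose > ldose:
        errors.append("hdose must be greater than ldose")
    if not step > 0:
        errors.append("step must be positive")
    # Assumption: both bounds are inclusive for trim and significance.
    if not 0 <= trim <= 1:
        errors.append("trim must be between zero and one")
    if not 0 <= significance <= 1:
        errors.append("significance must be between zero and one")
    return errors


# Option values taken from the SYNOPSIS.
print(check_options(multiple=1.05, ldose=0.4, hdose=25000, step=20,
                    significance=0.05, trim=0))
```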

DESCRIPTION

This program provides a simple command line interface to the Bio::SDRS Perl module.

INPUT FILE

This script takes a single input file, which provides the data for a number of assays: the doses of compound used in the experiment and the measurement at each dose for every assay. The file uses Tab Separated Value (TSV) format, with values separated by the Tab character (ASCII code 9).

The first line of the input file must contain the doses as floating point numbers, with a comment word as the first word. For example, the following line would describe an eight-dose experiment:

Dose 0.42 1.27 3.81 11.43 34.29 102.88 308.64 925.92

The succeeding lines of the input file are the measured responses for each assay, with no limit to the number of assays. The first word in each line is the assay name, and the remaining fields are the responses for each dose, in the same order as the doses on the first line of the file. Here is an example response line consistent with the dose example above:

Caspase3 2.1 1.695 1.675 1.735 1.56 1.77 2.34 2.595
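To make the layout concrete, here is a minimal sketch of a reader for this format, using the two example lines above joined with Tab characters. The function name read_sdrs_input is hypothetical, and skipping blank lines is an assumption; sdrs.pl itself may be stricter or more lenient.

```python
import io


def read_sdrs_input(handle):
    """Parse an SDRS-style TSV file into (doses, {assay name: responses})."""
    header = handle.readline().rstrip("\r\n").split("\t")
    # The first field of the dose line is a comment word and is ignored.
    doses = [float(x) for x in header[1:]]
    assays = {}
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # assumption: blank lines are skipped
        fields = line.split("\t")
        values = [float(x) for x in fields[1:]]
        if len(values) != len(doses):
            raise ValueError(f"{fields[0]}: expected {len(doses)} responses")
        assays[fields[0]] = values
    return doses, assays


example = (
    "Dose\t0.42\t1.27\t3.81\t11.43\t34.29\t102.88\t308.64\t925.92\n"
    "Caspase3\t2.1\t1.695\t1.675\t1.735\t1.56\t1.77\t2.34\t2.595\n"
)
doses, assays = read_sdrs_input(io.StringIO(example))
print(len(doses), assays["Caspase3"])
```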

OUTPUT FILES

The output files are written to the directory specified by the outdir option above. The names are composed in part from the options: in the names below, multiple and step stand for the values given to those options.

sdrs.multiple.step.out

This file contains the best (local) fitting model at every search dose for every assay. Each line is a tab-separated record containing the assay name, the search dose, the F score for the best fit at that dose, and the corresponding A, B, and D values.

sdrs.multiple.step.EC50.out

This file contains the estimated EC50 data for each assay in the input file as a tab separated record. The columns are as follows:

 1  assay name
 2  MAX         Maximum F score
 3  MIN         Minimum F score
 4  LOW         Lower bound of 95% confidence interval for the estimated EC50. 
 5  HIGH        Upper bound of 95% confidence interval for the estimated EC50.
 6  EC50        estimated EC50.
 7  PVALUE      P value of the best fitting model
 8  EC50RANGE   range of 95% confidence interval for the estimated EC50.
 9  PEAK        Number of peaks in the F scores at search doses across experimental dose range.
10  A           estimated value for A in the best model.
11  B           estimated value for B in the best model.
12  D           estimated value for D in the best model.
13  FOLD        Ratio of B/A or 99999.0 if A == 0. Positive if D < 0, negative otherwise.
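A record from this file can be turned into a labelled structure using the column order above. The sketch below is illustrative: parse_ec50_line and the label ASSAY for the first column are hypothetical, the sample record contains invented values rather than real SDRS output, and whether the file begins with a header line is not stated here, so the caller is assumed to pass data lines only.

```python
EC50_COLUMNS = [
    "ASSAY", "MAX", "MIN", "LOW", "HIGH", "EC50", "PVALUE",
    "EC50RANGE", "PEAK", "A", "B", "D", "FOLD",
]


def parse_ec50_line(line):
    """Convert one tab-separated EC50 record into a dict keyed by column name."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(EC50_COLUMNS):
        raise ValueError(f"expected {len(EC50_COLUMNS)} fields, got {len(fields)}")
    record = {}
    for name, value in zip(EC50_COLUMNS, fields):
        if name == "ASSAY":
            record[name] = value          # assay name stays a string
        elif name == "PEAK":
            record[name] = int(value)     # PEAK is a count
        else:
            record[name] = float(value)
    return record


# Invented values, for illustration only.
sample = "Caspase3\t25.0\t1.0\t10.0\t40.0\t20.0\t0.001\t30.0\t1\t1.5\t3.0\t-2.0\t2.0\n"
record = parse_ec50_line(sample)
print(record["ASSAY"], record["EC50"], record["PEAK"])
```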

sdrs.sorted_probes.out

For every calculated dose in the search, list the assays in order of F score. Each line in this file is a TSV record where the first field is the calculated dose, and the remaining fields are the assay names.

sdrs.pval_FDR.out

For every calculated dose in the search, list the P values of the fit for the assays, in exactly the same order as in the sdrs.sorted_probes.out file (that is, in order of F score). Each line in this file is a TSV record where the first field is the calculated dose, and the remaining fields are the P values. This file is meant to be used in conjunction with the sdrs.sorted_probes.out file.
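Using the two files together amounts to matching fields by position. The following sketch assumes that the fields after the dose in sdrs.pval_FDR.out are P values aligned with the assay names in sdrs.sorted_probes.out, and that both files list the doses in the same line order. The function name and the sample lines (hypothetical assay names, invented values) are for illustration only.

```python
def pair_probes_with_pvalues(probe_lines, pval_lines):
    """Map each search dose to a list of (assay name, P value) pairs."""
    paired = {}
    for probe_line, pval_line in zip(probe_lines, pval_lines):
        probe_fields = probe_line.rstrip("\r\n").split("\t")
        pval_fields = pval_line.rstrip("\r\n").split("\t")
        # Assumption: line N of both files refers to the same dose.
        if probe_fields[0] != pval_fields[0]:
            raise ValueError("dose mismatch between the two files")
        names = probe_fields[1:]
        pvalues = [float(x) for x in pval_fields[1:]]
        if len(names) != len(pvalues):
            raise ValueError("field count mismatch at dose " + probe_fields[0])
        # The dose is kept as text to avoid float formatting differences.
        paired[probe_fields[0]] = list(zip(names, pvalues))
    return paired


# Invented records, for illustration only.
probes = ["0.4\tassayB\tassayA\n", "0.42\tassayA\tassayB\n"]
pvals = ["0.4\t0.001\t0.2\n", "0.42\t0.003\t0.5\n"]
paired = pair_probes_with_pvalues(probes, pvals)
print(paired["0.4"])
```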

SEE ALSO

Bio::SDRS.

AUTHORS

 Ruiru Ji <ruiruji@gmail.com>
 Nathan O. Siemers <nathan.siemers@bms.com>
 Lei Ming <lei.ming@bms.com>
 Liang Schweizer <liang.schweizer@bms.com>
 Robert Bruccoleri <bruc@acm.org>