ForkBatch {Rolexa}    R Documentation

Multi-threaded Probabilistic Base Calling

Description

Performs multi-threaded base calling on a collection of intensity files generated by the Solexa image analysis software.

Usage

ForkBatch(nthreads=3, nfiles=2, prefix="rseq_", normal=TRUE,
          decorrelate=c('both','cycle','channel','none'), fit=FALSE,
          gzip=TRUE, pattern="s_[0-9_]+_int.txt",
          IntDirectory=Rolexa.env$IntDir, OutDirectory=Rolexa.env$OutDir,
          SeqDirectory=Rolexa.env$SeqDir)
OneBatchFit(files, gzip=TRUE, normal=TRUE,
            decorrelate=c('both','cycle','channel','none'))
OneBatchEval(files, SeqDirectory, gzip=TRUE, normal=TRUE,
             decorrelate=c('both','cycle','channel','none'))

Arguments

nthreads      number of threads to use
nfiles        number of input files to concatenate into one batch
prefix        output file prefix, see BuildOutfileList
fit           if TRUE, perform a complete EM optimization; if FALSE (the default), only evaluate the model on a given classification
pattern       regular expression matching intensity file names
IntDirectory  directory containing the intensity files
OutDirectory  output directory
SeqDirectory  directory containing the sequence files from a previous base calling
files         list of file indices into the vector Rolexa.env$infiles, see LoadIntensities
gzip          write output to a compressed (gzipped) file
normal        apply TileNormalize before base calling
decorrelate   apply DeCorrelateChannels and/or DeCorrelateCycles before base calling
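
The decorrelate default is a vector of four choices. In R such a default is conventionally resolved with match.arg, which picks the first element when the argument is omitted. Whether Rolexa resolves it this way is an assumption; the wrapper below is a hypothetical stand-in that only illustrates the idiom.

```r
# Hypothetical wrapper, not Rolexa code: illustrates the match.arg idiom
# commonly used with a vector-valued default such as decorrelate.
choose_decorrelate <- function(decorrelate = c('both', 'cycle', 'channel', 'none')) {
  # match.arg returns the first choice when the argument is missing,
  # and validates any value that is supplied explicitly.
  match.arg(decorrelate)
}

default_choice <- choose_decorrelate()         # "both"
cycle_choice   <- choose_decorrelate('cycle')  # "cycle"
```

Under this idiom, a value outside the four listed choices raises an error instead of being silently ignored.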

Details

The function ForkBatch runs through the list Rolexa.env$infiles, groups the files into batches of nfiles, then calls OneBatchFit (if fit=TRUE) or OneBatchEval (if fit=FALSE) in each of the nthreads threads until all batches have been processed. The results of each batch are passed to FilterResults and saved to an output file from the list Rolexa.env$outfiles.
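
The batching and dispatch described above can be sketched in base R. This is a minimal illustration and not the package's implementation: the file names and process_batch are hypothetical stand-ins, and parallel::mclapply is used here as a readily available substitute for whatever forking mechanism ForkBatch uses internally.

```r
# Hypothetical inputs: these file names are illustrative only.
infiles  <- sprintf("s_1_%04d_int.txt", 1:5)
nfiles   <- 2
nthreads <- 3

# Group file indices into consecutive batches of at most nfiles.
idx     <- seq_along(infiles)
batches <- unname(split(idx, ceiling(idx / nfiles)))

# Placeholder for the per-batch work (OneBatchFit or OneBatchEval in Rolexa);
# here it only joins the names of the files in the batch.
process_batch <- function(files) paste(infiles[files], collapse = "+")

# Forked workers are unavailable on Windows, so fall back to one core there.
cores   <- if (.Platform$OS.type == "windows") 1L else nthreads
results <- parallel::mclapply(batches, process_batch, mc.cores = cores)
```

With five files and nfiles=2, the last batch holds a single file, so the number of batches is the file count divided by nfiles, rounded up.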

The functions BuildInfileList and BuildOutfileList are called at the start if the corresponding file lists are empty.

The function OneBatchFit takes a list of indices into the vector Rolexa.env$infiles, calls LoadIntensities on the corresponding files, then runs SeqFitScore. When fit=FALSE, OneBatchEval is called instead and runs SeqEvalScore. In both cases the results then go through FilterResults and SaveResults.

Author(s)

Jacques Rougemont, Arnaud Amzallag, Christian Iseli, Laurent Farinelli, Ioannis Xenarios, Felix Naef

References

Probabilistic base calling of Solexa sequencing data, BMC Bioinformatics 2008, 9:431

See Also

BuildInfileList, BuildOutfileList, FilterResults, SaveResults

Examples

library(Rolexa.demo)
datadir <- system.file("data", package="Rolexa.demo")
## Not run: 
# This will take some time to complete:
ForkBatch(nthreads=2, nfiles=2, prefix="demo_", normal=TRUE,
          decorrelate='both', fit=FALSE, gzip=TRUE,
          pattern="s_[0-9_]+_int.txt",
          IntDirectory=datadir, OutDirectory=datadir)
## End(Not run)
## End(Not run)
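
To check which file names the default pattern selects before launching a long run, the regular expression can be tested on its own with grepl. The candidate names below are hypothetical, and how ForkBatch applies the pattern internally is not shown here.

```r
pattern <- "s_[0-9_]+_int.txt"

# Hypothetical file names, for illustration only.
candidates <- c("s_1_0001_int.txt", "s_1_0001_seq.txt",
                "s_1_0001_int.txt.gz", "notes.txt")

# grepl reports, for each name, whether the pattern matches anywhere in it.
matched <- grepl(pattern, candidates)
selected <- candidates[matched]
```

Because the pattern is not anchored with ^ and $, the gzipped name also matches. The unescaped dot matches any single character, not only a literal period.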

[Package Rolexa version 1.1.7 Index]