FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem
Abstract
FlipFlop implements a fast method for de novo transcript
discovery and abundance estimation from RNA-Seq data. It differs
from Cufflinks by simultaneously performing the identification and
quantitation tasks using a convex penalized maximum likelihood approach,
which leads to improved precision/recall.
Other software packages taking this approach have a complexity that is
exponential in the number of exons of a gene. We use a novel
algorithm based on a network flow formalism, which gives us a
polynomial runtime.
In practice, FlipFlop was shown to outperform other
penalized maximum likelihood based software in terms of speed and
to perform transcript discovery in less than half a second, even for
large genes.
FlipFlop also includes an extension to a multiple-sample framework.
It can jointly identify and quantify isoforms across several samples.
To do so, it uses a multi-task estimation procedure with the group-lasso penalty.
This increases the statistical power of the estimation.
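As a rough illustration of what a group-lasso penalty looks like in a multi-sample setting, the generic form can be written as below. The notation is ours and is only a sketch, not necessarily the exact objective used in FlipFlop: T is the number of samples, P the set of candidate isoforms, \theta_p^t the abundance of isoform p in sample t, L_t the negative log-likelihood of sample t, and \lambda a regularization parameter.

```latex
\min_{\theta \ge 0} \;
  \sum_{t=1}^{T} \mathcal{L}_t\!\left(\theta^{t}\right)
  \;+\; \lambda \sum_{p \in \mathcal{P}}
  \left\| \left(\theta_p^{1}, \dots, \theta_p^{T}\right) \right\|_2
```

Because the Euclidean norm groups the abundances of one isoform across all samples, an isoform tends to be selected or discarded jointly in every sample, while its abundance can still differ from sample to sample.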
Download
FlipFlop is a user-friendly Bioconductor R package. The released version is freely available
here,
and can be directly installed by typing in R:
source("http://bioconductor.org/biocLite.R")
biocLite("flipflop")
The vignette is
available here
and the manual is available
here.
Note that the devel version of FlipFlop is available here. All code updates
are made in the devel version, so check the devel web page for the latest version.
User Guide
Please check this User Guide Page for a detailed
description of the usage of FlipFlop, and its different options.
This version of FlipFlop handles both single-end and paired-end
reads. For paired-end reads, you currently need to pass the mean fragment
size and its standard deviation to the main function. We
are also working on a new version that will include a GC and
mappability correction.
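If you do not know the fragment size statistics of a paired-end library, one possible way to get a rough estimate is to look at the TLEN column (column 9) of the SAM file. The sketch below is only an illustration: `estimate_fragment_stats` and its `max_len` cutoff are hypothetical names, not part of FlipFlop, and the cutoff is an assumption used to discard templates that span introns.

```r
# Hypothetical helper (not part of FlipFlop): rough fragment-size estimate
# from the TLEN field of SAM records.
estimate_fragment_stats <- function(sam_lines, max_len = 1000) {
  # Drop header lines, which start with "@".
  records <- sam_lines[!startsWith(sam_lines, "@")]
  fields <- strsplit(records, "\t", fixed = TRUE)
  # Column 9 of a SAM record is TLEN, the signed observed template length.
  tlen <- vapply(fields, function(f) as.numeric(f[9]), numeric(1))
  # Keep one mate per pair (positive TLEN) and discard very long templates,
  # which in RNA-Seq usually span introns (assumed cutoff: max_len).
  tlen <- tlen[tlen > 0 & tlen <= max_len]
  c(mean = mean(tlen), sd = sd(tlen))
}

# Toy records: only the TLEN column matters for this example.
toy_records <- paste("read", 99, "chr1", 100, 60, "50M", "=", 250,
                     c(200, -200, 180, -180, 220, -220, 5000),
                     "*", "*", sep = "\t")
frag_stats <- estimate_fragment_stats(c("@HD\tVN:1.0", toy_records))
print(frag_stats)
```

On real data you would call it on `readLines("your_file.sam")` (or on a subsample of the file) and pass the two resulting values to the main function, under the argument names given in the package manual.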
Note also that, if you are using strand-specific libraries, you should separate
your reads from the two strands in advance and run FlipFlop on two distinct SAM files
(FlipFlop does not yet handle strand-specificity automatically).
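The strand separation described above could be sketched in R as follows for single-end reads, using the reverse-strand bit (0x10) of the SAM FLAG field. `split_sam_by_strand` is a hypothetical helper, not a FlipFlop function. Which group corresponds to which transcript strand depends on your library protocol, and paired-end strand-specific data would also require looking at the first/second mate bits, which this sketch does not do.

```r
# Hypothetical helper (not part of FlipFlop): split single-end SAM records
# into forward- and reverse-strand alignments.
split_sam_by_strand <- function(sam_lines) {
  is_header <- startsWith(sam_lines, "@")
  header <- sam_lines[is_header]
  records <- sam_lines[!is_header]
  fields <- strsplit(records, "\t", fixed = TRUE)
  # Column 2 of a SAM record is the bitwise FLAG.
  flag <- vapply(fields, function(f) as.integer(f[2]), integer(1))
  # Bit 0x10 (16) is set when the read aligns to the reverse strand.
  # Unmapped reads are not treated specially here.
  is_reverse <- bitwAnd(flag, 16L) != 0
  # The header is copied into both outputs so each stays a valid SAM file.
  list(forward = c(header, records[!is_reverse]),
       reverse = c(header, records[is_reverse]))
}

# Toy input: one header line and three alignments with FLAG 0, 16 and 0.
toy_sam <- c("@HD\tVN:1.0",
             paste(c("a", "b", "c"), c(0, 16, 0), "chr1", 100, 60, "50M",
                   "*", 0, 0, "*", "*", sep = "\t"))
parts <- split_sam_by_strand(toy_sam)
# writeLines(parts$forward, "forward.sam")  # then run FlipFlop on each file
# writeLines(parts$reverse, "reverse.sam")
```

This version reads the whole file into memory, so for large SAM files a streaming tool such as samtools is likely a better fit.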
References
Elsa Bernard, Laurent Jacob, Julien Mairal and Jean-Philippe Vert. Efficient RNA Isoform Identification and Quantification from RNA-Seq Data with Network Flows. Bioinformatics, 2014.
Elsa Bernard, Laurent Jacob, Julien Mairal, Eric Viara and Jean-Philippe Vert. A convex formulation for joint RNA isoform detection and quantification from multiple RNA-seq samples. Techreport HAL-01123141, March 2015.
Experiments
If you would like to reproduce the experimental
results shown in the Bioinformatics paper,
please visit this page: Experiments and Tutorials.
NEWS
22/03/2015 (version > 1.5.12)   FlipFlop can be run on multiple cores and easily parallelized.
05/03/2015 (version > 1.5.10)   FlipFlop estimates isoforms from multiple samples.
Contact
If you have any questions or suggestions regarding this software, please
contact Elsa
Bernard.
© Elsa Bernard