FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem

Abstract

FlipFlop implements a fast method for de novo transcript discovery and abundance estimation from RNA-Seq data. It differs from Cufflinks by simultaneously performing the identification and quantitation tasks using a convex penalized maximum likelihood approach, which leads to improved precision/recall.

Other software taking this approach has a complexity that is exponential in the number of exons of a gene. We use a novel algorithm based on the network flow formalism, which gives a polynomial runtime. In practice, FlipFlop was shown to be faster than other penalized maximum likelihood tools and to perform transcript discovery in less than half a second, even for large genes.
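As a rough illustration of the kind of objective described above, the estimation problem can be sketched in generic lasso notation. The symbols below are assumptions chosen for this sketch and are not copied from the FlipFlop papers.

```latex
% Generic sketch; notation assumed for illustration only.
% \mathcal{P}    : set of candidate isoforms of a gene (paths through its exons)
% \theta_p \ge 0 : abundance of candidate isoform p
% \mathcal{L}    : convex negative log-likelihood of the observed reads
% \lambda > 0    : regularization parameter controlling sparsity
\[
  \min_{\theta \in \mathbb{R}_{+}^{|\mathcal{P}|}}
    \; \mathcal{L}(\theta) + \lambda \sum_{p \in \mathcal{P}} \theta_p
\]
% Because \theta \ge 0, the sum equals the lasso penalty \|\theta\|_1.
% With n exons, every non-empty subset of exons taken in genomic order
% is a possible candidate, so |\mathcal{P}| can be as large as 2^n - 1.
```

Enumerating all candidates explicitly is what makes naive approaches exponential in the number of exons; a flow-based formulation avoids listing them one by one.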

FlipFlop also includes an extension to a multiple-sample framework, in which it jointly identifies and quantifies isoforms across several samples. To do so, it uses a multi-task estimation procedure with the group-lasso penalty, which increases the statistical power of the estimation.
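The multi-sample idea can be sketched with the standard group-lasso penalty, where each group collects the abundances of one isoform across all samples. This is the textbook form under assumed notation, not necessarily the exact objective used in the technical report.

```latex
% Generic group-lasso sketch; notation assumed for illustration only.
% T               : number of samples
% \theta_p^{t}    : abundance (\ge 0) of candidate isoform p in sample t
% \mathcal{L}_t   : convex negative log-likelihood of the reads of sample t
% \lambda > 0     : regularization parameter
\[
  \min_{\theta \ge 0}
    \; \sum_{t=1}^{T} \mathcal{L}_t\big(\theta^{t}\big)
    + \lambda \sum_{p \in \mathcal{P}}
      \Big( \sum_{t=1}^{T} \big(\theta_p^{t}\big)^{2} \Big)^{1/2}
\]
% The inner Euclidean norm couples the T abundances of isoform p:
% the whole group is either set to zero or kept for all samples.
```

Because an isoform is selected or discarded for all samples at once, evidence from every sample contributes to the decision, which is where the gain in statistical power comes from.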

Download

FlipFlop is a user-friendly Bioconductor R package. The released version is freely available here, and can be installed directly by typing in R:

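# Legacy installer, for Bioconductor releases before 3.8:
source("http://bioconductor.org/biocLite.R")
biocLite("flipflop")
# Bioconductor 3.8 and later use BiocManager::install("flipflop") instead.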


The vignette is available here and the manual is available here.

Note that the devel version of FlipFlop is available here. All code updates are made in the devel version, so check the devel web page for the latest version.

User Guide

Please check this User Guide Page for a detailed description of the usage of FlipFlop, and its different options.

This version of FlipFlop handles both single-end and paired-end reads. For paired-end reads, you currently need to pass the mean fragment size and its standard deviation to the main function. We are also working on a new version that will include GC and mappability corrections. Note also that FlipFlop does not yet handle strand specificity automatically: if you are using strand-specific libraries, separate your reads by strand in advance and run FlipFlop on the two resulting SAM files.
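The strand separation step described above could be sketched in base R as follows. The helper `split_sam_by_strand` and the toy SAM records are hypothetical and not part of FlipFlop; the sketch assumes that the second mate of a pair aligns opposite to its fragment, and whether "forward" corresponds to the transcript strand depends on your library protocol.

```r
# Hypothetical helper (not part of FlipFlop): split a SAM file into two
# files by fragment strand, using only base R.
split_sam_by_strand <- function(sam_in, fwd_out, rev_out) {
  lines <- readLines(sam_in)
  is_header <- startsWith(lines, "@")
  header <- lines[is_header]
  reads <- lines[!is_header]

  # FLAG is the second tab-separated field of a SAM record.
  fields <- strsplit(reads, "\t", fixed = TRUE)
  flags <- as.integer(vapply(fields, function(f) f[2], character(1)))

  is_reverse <- bitwAnd(flags, 16L) != 0L   # 0x10: read aligned to reverse strand
  is_mate2   <- bitwAnd(flags, 128L) != 0L  # 0x80: second mate of a pair

  # Assumption: the second mate aligns opposite to its fragment, so flip it.
  frag_reverse <- xor(is_reverse, is_mate2)

  # Both output files keep the full header.
  writeLines(c(header, reads[!frag_reverse]), fwd_out)
  writeLines(c(header, reads[frag_reverse]), rev_out)
  invisible(c(forward = sum(!frag_reverse), reverse = sum(frag_reverse)))
}

# Toy input: two single-end reads and one read pair (flags 99 and 147).
toy_sam <- c(
  "@HD\tVN:1.0\tSO:unsorted",
  "@SQ\tSN:chr1\tLN:1000",
  "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
  "r2\t16\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII",
  "r3\t99\tchr1\t300\t255\t4M\t=\t400\t104\tACGT\tIIII",
  "r3\t147\tchr1\t400\t255\t4M\t=\t300\t-104\tACGT\tIIII"
)
sam_file <- tempfile(fileext = ".sam")
fwd_file <- tempfile(fileext = ".sam")
rev_file <- tempfile(fileext = ".sam")
writeLines(toy_sam, sam_file)

counts <- split_sam_by_strand(sam_file, fwd_file, rev_file)
print(counts)

# With the flipflop package installed, each file would then be processed
# separately. Argument names below are as recalled from the package manual
# (check ?flipflop); the values 200 and 20 are placeholders for your own
# mean fragment size and standard deviation:
#   flipflop(data.file = fwd_file, out.file = "forward.gtf",
#            paired = TRUE, frag = 200, std = 20)
```

Reading the whole file with `readLines` keeps the sketch short but is only practical for small files; for real data, a streaming tool is usually a better fit.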

References

  • Elsa Bernard, Laurent Jacob, Julien Mairal and Jean-Philippe Vert. Efficient RNA Isoform Identification and Quantification from RNA-Seq Data with Network Flows. Bioinformatics, 2014.
  • Elsa Bernard, Laurent Jacob, Julien Mairal, Eric Viara and Jean-Philippe Vert. A convex formulation for joint RNA isoform detection and quantification from multiple RNA-seq samples. Techreport HAL-01123141, March 2015.

Experiments

If you would like to reproduce the experimental results shown in the Bioinformatics paper, please visit this page: Experiments and Tutorials.

NEWS

  • 22/03/2015 (version > 1.5.12)   FlipFlop can be run on multiple cores and easily parallelized.
  • 05/03/2015 (version > 1.5.10)   FlipFlop estimates isoforms from multiple samples.

Contact

If you have any questions or suggestions regarding this software, please contact Elsa Bernard.

© Elsa Bernard