KBase offers a powerful suite of expression analysis tools. Starting with short reads, you can use the tool suite to assemble and quantify full-length transcripts and to identify differentially expressed genes. When studying metabolic models in KBase, you can also compare the expression data with reaction flux to identify pathways where expression and flux agree or conflict.
These Narrative Tutorials guide you through RNA-seq workflows using a HISAT2/StringTie/DESeq2 pipeline:
E. coli RNA-seq Analysis Tutorial - bacterium-based example
Arabidopsis RNA-seq Analysis Tutorial - plant-based example
KBase supports both the original and the new Tuxedo suite of tools for RNA-seq analysis. Because these tools work from reads aligned to a reference, KBase requires a reference genome to guide the analysis of short reads.
Import Short Reads: Use the Import tab or any of the reads uploader apps from the Apps Panel to import the short reads from your experiment into your Narrative. Example reads are also available from the Public tab. The reads must be a set of single-end, paired-end, or interleaved paired-end reads in FASTA, FASTQ, or SRA format.
Create Sample Set: Run the Create RNA-seq Sample Set App to group your reads into an RNA-seq sample set with associated experimental metadata, so that you can run the RNA-seq Apps in batch mode wherever appropriate.
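The RNA-seq pipeline in KBase is modular and consists of three steps: read alignment, transcriptome assembly and quantification, and differential gene expression. For each step, you can pick any of the available Apps depending on your preference or the individual characteristics of the App.
Read Alignment: Align the short reads to the reference genome with an alignment App such as HISAT2 to produce the read alignments used in the next step.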
Transcriptome Assembly and Quantification: Run the Cufflinks or StringTie App on the read alignments from the previous step to assemble full-length transcripts and quantify expression at the transcript and gene level. You can view and download the full normalized expression matrices in FPKM (fragments per kilobase of exon model per million mapped reads) and TPM (transcripts per million).
Differential Gene Expression: Run the Cuffdiff, Ballgown, or DESeq2 App to compute gene-level or transcript-level differential expression from the quantification produced in the previous step. Then run the Create Feature Set/Filtered Expression Matrix From Differential Expression App, with appropriate q-value and fold-change cutoffs as input parameters, to filter the differential gene expression results.
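The FPKM and TPM units named in the quantification step can be illustrated with a small calculation. The sketch below is not the code that Cufflinks or StringTie runs; it only shows the standard definitions, assuming you already have a fragment count and an effective length in base pairs for each feature. The function names and the toy numbers are illustrative assumptions.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of feature per million mapped fragments.

    Assumes at least one mapped fragment and non-zero feature lengths.
    """
    total_fragments = sum(counts)
    # count / (length in kb) / (total in millions) == count * 1e9 / (length * total)
    return [c * 1e9 / (length * total_fragments)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Toy example (made-up numbers): three features in one sample.
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 3000]

fpkm_values = fpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

for name, f, t in zip(["geneA", "geneB", "geneC"], fpkm_values, tpm_values):
    print(f"{name}\tFPKM={f:.1f}\tTPM={t:.1f}")
```

Because TPM values always sum to one million within a sample, they are often easier to compare across samples than FPKM values, whose total varies with the length distribution of the expressed features.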
Figure 1: The original and new Tuxedo RNA-seq analysis suites in KBase have modular Apps for building flexible analysis workflows.
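The cutoff-based filtering in the differential expression step can be pictured with a short sketch. This is not the implementation of the KBase filtering App; it assumes the differential expression results have been exported as one record per gene, and the field names `q_value` and `log2_fold_change`, the gene IDs, and the default cutoffs are all hypothetical.

```python
import math


def filter_differential_expression(results, q_cutoff=0.05, min_abs_log2fc=1.0):
    """Return the sorted IDs of genes that pass both cutoffs.

    results: dict mapping gene ID -> {"q_value": float or None,
                                      "log2_fold_change": float}
    Genes with a missing or NaN q-value are skipped rather than kept.
    """
    selected = []
    for gene_id, row in results.items():
        q = row["q_value"]
        if q is None or math.isnan(q):
            continue
        if q <= q_cutoff and abs(row["log2_fold_change"]) >= min_abs_log2fc:
            selected.append(gene_id)
    return sorted(selected)


# Made-up results for four genes.
example_results = {
    "gene1": {"q_value": 0.001, "log2_fold_change": 2.5},   # passes both cutoffs
    "gene2": {"q_value": 0.20, "log2_fold_change": 3.0},    # fails the q-value cutoff
    "gene3": {"q_value": 0.01, "log2_fold_change": -0.4},   # fails the fold-change cutoff
    "gene4": {"q_value": None, "log2_fold_change": -2.0},   # missing q-value, skipped
}

feature_set = filter_differential_expression(example_results)
print(feature_set)
```

Taking the absolute value of the log2 fold change keeps both up-regulated and down-regulated genes; a one-sided filter would compare the signed value instead.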
KBase offers a number of Apps to filter, cluster, and visualize feature sets derived from RNA-seq differential expression analysis and to assess their functional enrichment. The expression data from RNA-seq can also be assimilated into metabolic models to identify pathways where expression and flux agree or conflict.
Filtering: You can create a filtered expression matrix and associated feature set based on fold change or adjusted p-value. You can also filter an expression matrix based on log-odds ratio (LOR) or ANOVA.
Functional Enrichment: Assess the functional enrichment in plant genomes for a set of features using associated GO terms.
Integration into Metabolic Models: Assimilate the expression data from RNA-seq into the metabolic models to compare reaction fluxes with gene expression and thus identify pathways where expression and flux agree or conflict.
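To make the idea of functional enrichment concrete, the sketch below computes a one-sided hypergeometric p-value for a single GO term. The text above does not state which statistic the KBase App uses, so treat the hypergeometric test as one common choice rather than a description of the App; the function name and the counts are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, sample_annotated):
    """P(X >= sample_annotated) when drawing `sample` genes without replacement.

    population:        number of genes in the genome (background)
    annotated:         background genes carrying the GO term
    sample:            number of genes in the feature set
    sample_annotated:  feature-set genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(sample_annotated, upper + 1)
    )
    return tail / total


# Made-up counts: 20 genes in the background, 5 carry the term,
# and 2 of the 4 genes in the feature set carry it.
p_value = hypergeom_enrichment_p(population=20, annotated=5,
                                 sample=4, sample_annotated=2)
print(f"enrichment p-value: {p_value:.4f}")
```

In practice this test is repeated for every GO term associated with the feature set, so the resulting p-values should be corrected for multiple testing before they are interpreted.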
Figure 2: The complete differential RNA-seq workflow in KBase
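The comparison of reaction flux with gene expression can be sketched as a simple classification. This is only an illustration of the agree/conflict idea, not the KBase App's method: it assumes expression has already been mapped from genes to reactions, and the thresholds, reaction IDs, and label strings are hypothetical.

```python
def compare_flux_with_expression(expression, flux,
                                 expr_threshold=1.0, flux_epsilon=1e-6):
    """Label each reaction by whether its expression and flux states match.

    expression: dict reaction ID -> reaction-level expression value
    flux:       dict reaction ID -> predicted flux (sign gives direction)
    Reactions missing from `flux` are left out of the result.
    """
    labels = {}
    for rxn, expr in expression.items():
        if rxn not in flux:
            continue
        expressed = expr >= expr_threshold
        active = abs(flux[rxn]) > flux_epsilon
        if expressed == active:
            labels[rxn] = "agree"
        elif expressed:
            labels[rxn] = "conflict: expressed, no flux"
        else:
            labels[rxn] = "conflict: flux, not expressed"
    return labels


# Made-up values for four reactions.
example_expression = {"rxnA": 12.0, "rxnB": 0.1, "rxnC": 8.0, "rxnD": 0.0}
example_flux = {"rxnA": 3.2, "rxnB": 0.0, "rxnC": 0.0, "rxnD": -1.5}

labels = compare_flux_with_expression(example_expression, example_flux)
for rxn, label in sorted(labels.items()):
    print(rxn, label)
```

Grouping these labels by pathway then shows where the model and the expression data agree and where they conflict.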