KBase Documentation
FAQ: Assembly and Annotation


Last updated 1 year ago


JGI/IMG recommends KBase for genome assembly before submission to them for annotation. Is there a direct way to import to JGI or do we have to save FASTA files and separately submit to JGI/IMG?

This page on transferring data from JGI to KBase takes you through the process. You can also start from the KBase search and use the JGI tab.

What is the typical threshold to determine whether our assembled genome is contaminated?

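In general, a genome that is >90% complete and <5% contaminated is considered high quality. A rough guide to the quality of metagenome-assembled genomes (MAGs) and single amplified genomes (SAGs) can be found here: https://www.nature.com/articles/nbt.3893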
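As a minimal sketch, the rule of thumb above can be applied to a table of quality estimates. The bin names and percentages below are hypothetical, and the sketch assumes you already have completeness and contamination estimates from a quality-assessment tool (for example CheckM); it is not part of any KBase app.

```python
def is_high_quality(completeness: float, contamination: float) -> bool:
    """Return True when a genome is >90% complete and <5% contaminated."""
    return completeness > 90.0 and contamination < 5.0


# Hypothetical (completeness %, contamination %) estimates per genome bin.
bins = {
    "bin_1": (97.2, 1.3),
    "bin_2": (88.5, 2.0),
    "bin_3": (95.0, 7.4),
}

high_quality = [
    name for name, (comp, cont) in bins.items() if is_high_quality(comp, cont)
]
print(high_quality)
```

Both thresholds are strict inequalities here, matching the ">90%" and "<5%" wording; a genome at exactly 90% completeness is not counted as high quality.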

Is there a limit on the size of the FASTQ files I can upload to KBase?

Assemblers currently have an upper limit of between 180,263,840 paired reads and 240,351,788 reads, depending on complexity. If the job has been run twice, has exceeded the 7-day limit, and your data is in this size range, it may be too big for KBase at this time.
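To see where a library falls relative to these figures before uploading, you can count its reads locally. The sketch below is only an illustration: it assumes a standard four-line-per-record FASTQ file, treats both quoted figures as plain read counts (the answer above mixes "paired reads" and "reads"), and uses hypothetical function names.

```python
import gzip

# Figures quoted in the answer above; the real cutoff depends on data complexity.
LOWER_LIMIT = 180_263_840
UPPER_LIMIT = 240_351_788


def count_fastq_reads(path: str) -> int:
    """Count records in a FASTQ file, assuming four lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


def size_category(n_reads: int) -> str:
    """Place a read count relative to the quoted assembler limits."""
    if n_reads < LOWER_LIMIT:
        return "below the quoted range"
    if n_reads <= UPPER_LIMIT:
        return "within the quoted range"
    return "above the quoted range"
```

For example, `size_category(count_fastq_reads("reads.fastq.gz"))` reports which side of the range a (hypothetical) file `reads.fastq.gz` falls on.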

What is the best assembler?

The “best” assembler often depends on the user's goals. One user may want the most contiguous metagenome, while another may want an assembly with minimal assembly artifacts. Users can run multiple assemblers and choose whichever produces the best assembly for their purpose.

How do I compare the results of various assemblers?

You can use the QUAST App to compare the contig distributions of Assembly objects.
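Contig-length statistics such as N50 are one common way to summarize a contig distribution, and assembly comparison tools such as QUAST report them. The sketch below shows how N50 is computed from a list of contig lengths. The assembler names and lengths are made up for illustration, and this is not how any KBase app is implemented.

```python
def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of an assembly.

    N50 is the length of the shortest contig in the smallest set of longest
    contigs that together cover at least half of the total assembly length.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths (bp) from two assemblers run on the same reads.
assemblies = {
    "assembler_a": [500, 400, 300, 200, 100],
    "assembler_b": [900, 300, 200, 50, 50],
}

for name, lengths in assemblies.items():
    print(name, "contigs:", len(lengths), "total:", sum(lengths), "N50:", n50(lengths))
```

Both hypothetical assemblies have the same total length and contig count, but the second has the higher N50 because more of its sequence sits in one long contig. A higher N50 indicates greater contiguity, not necessarily fewer assembly errors.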

Does KBase support co-assembly?

Yes, but the results are currently unstable above 200 M reads (Illumina 150bp x 2). Use the Merge Reads Libraries App to get a combined Reads Library object.

JGI/IMG recommends KBase for genome assembly before submission to them for annotation. Is there a direct way to import to JGI or do we have to save FASTA files and separately submit to JGI/IMG?

These instructions on transferring data from JGI to KBase take you through the process. You can also start from the KBase search and use the JGI tab.

What is the difference between RAST and Prokka annotations?

The two apps offer different options and use different methods to assign annotations. Which one is better is in the eye of the beholder. The primary advantage of RAST is its link to our metabolic modeling: the RAST functional roles are a controlled vocabulary, and we map specific RAST annotations to biochemical reactions in the model. If you plan to build metabolic models, you should annotate with RAST. Because RAST tends to assign more hypothetical proteins, some people run Prokka first and then reannotate with RAST using the “Retain old annotation for hypotheticals” option.

What is the difference between Annotate Microbial Genome and Annotate Microbial Assembly?

Annotate Microbial Assembly takes an Assembly object, calls genes using algorithms from Prodigal and Glimmer, and then performs functional annotation.

Annotate Microbial Genome takes a Genome object, does not call genes, and instead preserves the original gene calls. It then re-annotates the genes, overwriting previous annotations.

Is it possible or necessary to manually curate annotations, or are the RAST and Prokka annotations sufficient?

Manual curation of annotations is not supported on-system. RAST and Prokka are likely sufficient for many applications, but for difficult-to-annotate or highly divergent metabolic genes you may need to use additional tools. In addition to RAST and Prokka, there are on-system tools for feature annotation using pre-generated hidden Markov models, which could be useful for higher-resolution annotation.

Does RAST annotate archaea and protists?

RAST primarily annotates bacteria and archaea, and may be limited with protists.

Can I annotate plants or fungi?

Tools available for plant annotations include Annotate Plant Transcripts with Metabolic Functions and Annotate Plant Enzymes with OrthoFinder. These tools use the PlantSEED curated database.

For fungi, it is recommended to use external tools and then import the annotated genomes into KBase.

Are there tools specialized for fungal data?

Currently there is a tool to construct draft metabolic models of fungal species (Build Fungal Model) that uses highly curated published fungal models as the underlying biochemistry data.

Apps and resources referenced on this page:
  • Quality guide for MAGs and SAGs: https://www.nature.com/articles/nbt.3893
  • QUAST App
  • Merge Reads Libraries App
  • Instructions on transferring data from JGI to KBase
  • Annotate Microbial Assembly
  • Annotate Microbial Genome
  • RAST
  • Annotate Plant Transcripts with Metabolic Functions
  • Annotate Plant Enzymes with OrthoFinder
  • Build Fungal Model