KBase Documentation


Genome

Formatting and uploading annotated assemblies from GenBank files or from paired GFF and FASTA files.


Last updated 7 months ago


In KBase, a Genome object is the annotated version of an Assembly, or a set of annotated predicted coding sequences, and can encompass several types of feature calls. If you want to upload only the DNA sequence from a FASTA file (without annotations), go to the Assembly page.

The Genome importer supports only GenBank and GFF formatted files.

A GenBank-formatted input file should include sequence contig(s), feature calls (annotations), and taxonomy information for the organism. KBase parses the input file into two data objects: an assembly object with the sequence and a genome object containing the original feature calls and annotations.

GenBank-formatted files with no features can be uploaded as Genomes.

A GenBank-formatted file can be uploaded into the Staging Area from your local computer (files with .gb, .gbff, or .gbk extensions) or directly from an FTP or HTTP URL.
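Before uploading a batch of files, it can help to sort them by extension. The following is a minimal sketch, not part of KBase: the function name `genome_file_kind` is hypothetical, the GenBank extensions are the ones listed above, and the GFF extensions are an assumption because this page does not list them.

```python
from pathlib import Path

# Extensions this page lists for GenBank files.
GENBANK_EXTENSIONS = {".gb", ".gbff", ".gbk"}
# Assumption: common GFF extensions; the page itself does not list any.
GFF_EXTENSIONS = {".gff", ".gff3"}


def genome_file_kind(filename: str) -> str:
    """Classify a file name as 'genbank', 'gff', or 'unknown' by its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in GENBANK_EXTENSIONS:
        return "genbank"
    if suffix in GFF_EXTENSIONS:
        return "gff"
    return "unknown"


for name in ["ecoli.gbff", "ecoli.GBK", "ecoli.gff3", "ecoli.fasta"]:
    print(name, "->", genome_file_kind(name))
```

The check looks only at the file name, so it cannot tell whether the contents are actually valid GenBank or GFF.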

A GFF-formatted file must be paired with a corresponding FASTA file of the DNA sequence. These will be parsed into two data objects: an assembly object with the sequence and a genome object containing the original feature calls and annotations.
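Because the GFF file and the FASTA file are parsed together, it is worth confirming locally that they describe the same sequences. The sketch below assumes the usual GFF convention that the first column of each feature line names a sequence, and that FASTA sequence IDs are the first token after ">". The function names are hypothetical, and this page does not describe what validation the KBase importer itself performs.

```python
def fasta_ids(fasta_text: str) -> set[str]:
    """Collect sequence IDs: the first whitespace-delimited token after '>'."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def gff_seqids(gff_text: str) -> set[str]:
    """Collect the sequence IDs from column 1 of each GFF feature line."""
    ids = set()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        ids.add(line.split("\t")[0])
    return ids


def unmatched_seqids(gff_text: str, fasta_text: str) -> set[str]:
    """Return GFF sequence IDs that have no matching FASTA record."""
    return gff_seqids(gff_text) - fasta_ids(fasta_text)


# Made-up example data: contig2 has features but no sequence.
gff = (
    "##gff-version 3\n"
    "contig1\tsrc\tgene\t1\t90\t.\t+\t.\tID=g1\n"
    "contig2\tsrc\tgene\t5\t50\t.\t-\t.\tID=g2\n"
)
fasta = ">contig1 sample\nACGT\n"
print(unmatched_seqids(gff, fasta))
```

An empty result means every feature in the GFF file refers to a sequence present in the FASTA file.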

Further instructions for adding data to your Staging Area can be found in the guide to importing data into the Narrative.

Importing a GenBank-formatted genome

To use a file on your computer, open the Import tab in the Data Browser, then drag and drop the genome file into your Staging Area.

Open the Import As... pulldown menu to the right of the filename in your Staging Area and select “GenBank Genome.”

Make sure the correct file type is selected and the checkbox is active, then click the “Import Selected” button.

The Data Browser will close and the “Import GenBank File as Genome from Staging Area” App will be added to your Narrative.

The name of the Genome file is filled in automatically, along with suggested names for the Genome and Assembly data objects that the import will create; you can change these names. Adjust the Genome Type, the source of the GenBank file, or any advanced parameters if needed.

Click the green "Run" button to start the import. When the import is finished, your Data Panel will update to show the new Genome and Assembly objects, and a report will appear in the Import App.
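As noted earlier, a GenBank input file should include sequence contigs, feature calls, and taxonomy information. A quick local check before importing might look like the sketch below. It is a rough line-based scan of the GenBank flat-file layout (LOCUS records, ORGANISM lines, feature keys starting in column 6), not a full parser, and `summarize_genbank` is a hypothetical helper rather than a KBase function.

```python
import re

# In the feature table, a feature key starts in column 6 (after five spaces).
FEATURE_KEY = re.compile(r"^ {5}(\S+) +\S")


def summarize_genbank(text: str) -> dict:
    """Count contigs and features and collect organism names from GenBank text."""
    contigs = 0
    features = 0
    organisms = set()
    in_features = False
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            contigs += 1
            in_features = False
        elif line.startswith("  ORGANISM"):
            organisms.add(line[len("  ORGANISM"):].strip())
        elif line.startswith("FEATURES"):
            in_features = True
        elif line.startswith(("ORIGIN", "CONTIG", "//")):
            in_features = False
        elif in_features:
            match = FEATURE_KEY.match(line)
            # The 'source' entry describes the contig itself, so skip it.
            if match and match.group(1) != "source":
                features += 1
    return {"contigs": contigs, "organisms": sorted(organisms), "features": features}


# Made-up, heavily shortened record for illustration only.
sample = """LOCUS       contig1   120 bp    DNA     linear   BCT 01-JAN-2000
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
            Bacteria; Pseudomonadota.
FEATURES             Location/Qualifiers
     source          1..120
                     /organism="Escherichia coli"
     gene            10..90
                     /gene="abc"
     CDS             10..90
                     /gene="abc"
ORIGIN
        1 acgt
//
"""
summary = summarize_genbank(sample)
print(summary)
```

A feature count of zero is not necessarily a problem, since GenBank-formatted files with no features can still be uploaded as Genomes.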

Importing a GFF-formatted Genome

Open the Import tab within the Data Browser and drag and drop both the GFF file and the corresponding FASTA file into your Staging Area. In your Staging Area, open the Import As... pulldown menu to the right of the GFF filename and select “GFF Genome.”

Note the name of the corresponding FASTA file.

Make sure the correct file type is selected and the checkbox is active, then click the "Import Selected" button. The data slide-out will close and the “Import GFF/FASTA File as Genome from Staging Area” App will be added to your Narrative. The GFF File Path will be filled in.

You will need to fill in the name of the FASTA file. Using the dropdown for the “FASTA File Path”, select the FASTA file in the Staging Area. Ensure the file type is a FASTA file type.

The Scientific Name may be filled in, along with a suggested name for the Genome data object that the import will create. You can edit the Genome Object Name, the Scientific Name, and any advanced options as needed. Click the green "Run" button. When the import is finished, your Data Panel will update to show the new Genome object, and a report will appear in the Import App.

Uploading a Genome from other sources

You can upload data into your KBase Staging Area using Globus, or by supplying a URL for a publicly accessible FTP location, Google Drive, Dropbox, or a direct HTTP link to import into the Narrative. Options for adding data to your Staging Area are described in the guide to importing data into the Narrative.

Bulk Import

Both GenBank and GFF genomes can be imported as one of the supported bulk import types. You can select multiple genome files simultaneously from the Staging Area to import them at once. See the bulk import section of the guide to importing data into the Narrative.

Uploading a Genome using a URL

Open the Import tab in the Data Browser and click on the Upload with URL button (below the drag & drop area) to open an Upload App Cell.

The Data Browser will close and the “Upload a File to Staging from Web” App will appear in your Narrative. Alternatively, you can open the app directly from the Apps Panel. In the app, click on the dropdown for the URL Type and select the type that matches your link.

When uploading a GenBank Genome, you need only one link. When uploading a GFF Genome, you need two links: one for the GFF file and one for the FASTA file.

In the App, click the "+" button for the URLs and paste in the link to the Genome file (GenBank or GFF). For a GFF Genome, click the "+" button again and paste in the link to the FASTA file.

Then click the green "Run" button to start the upload. After the App completes, the files will appear in your Staging Area, which you can access via the Import tab in the Data Browser.

The genome file(s) are now in your Staging Area. Next, import them into your Narrative, following the GenBank or GFF import steps above, so that you can use them in analyses.
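The link requirements for a URL upload (one link for a GenBank Genome, two for a GFF Genome) can be checked before pasting anything into the App. This is a minimal sketch under those stated rules; `check_upload_urls` and the example.org links are hypothetical, and the scheme check reflects only the FTP and HTTP options mentioned on this page.

```python
from urllib.parse import urlparse

# Number of links each genome format needs, per this page:
# GenBank needs one file; GFF needs the GFF file plus its FASTA file.
REQUIRED_LINKS = {"genbank": 1, "gff": 2}


def check_upload_urls(genome_format: str, urls: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the set looks fine."""
    expected = REQUIRED_LINKS.get(genome_format)
    if expected is None:
        return [f"unknown genome format: {genome_format!r}"]
    problems = []
    if len(urls) != expected:
        problems.append(
            f"{genome_format} upload needs {expected} link(s), got {len(urls)}"
        )
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in {"ftp", "http", "https"} or not parsed.netloc:
            problems.append(f"not an FTP/HTTP link: {url}")
    return problems


# Hypothetical links for illustration only.
print(check_upload_urls("genbank", ["https://example.org/genome.gbff"]))
print(check_upload_urls("gff", ["https://example.org/genome.gff"]))
```

The check validates only the shape of each link; it does not confirm that the file is publicly accessible or that the URL Type chosen in the App matches it.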
