Formatting and uploading annotated assemblies as GenBank or GFF and FASTA files.
The Genome importer supports only GenBank and GFF formatted files.
A GenBank-formatted input file should include sequence contig(s), feature calls (annotations), and taxonomy information for the organism. KBase parses the input file into two data objects: an assembly object with the sequence and a genome object containing the original feature calls and annotations.
GenBank-formatted files with no features can be uploaded as Genomes.
A GenBank-formatted file can be uploaded into the Staging Area from your local computer (files with .gb, .gbff, or .gbk extensions) or directly from an FTP or HTTP URL.
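Before uploading, it can help to confirm locally that a GenBank file has the pieces described above. The following is a minimal sketch, not part of KBase: the function names `has_genbank_extension` and `summarize_genbank` are hypothetical, the case-insensitive extension match is an assumption, and the parsing relies only on the standard GenBank flat-file layout (LOCUS, ORGANISM, and FEATURES lines, with feature keys starting in column 6).

```python
from pathlib import Path

# Extensions the text lists for GenBank files uploaded from a local computer.
GENBANK_EXTENSIONS = {".gb", ".gbff", ".gbk"}


def has_genbank_extension(filename):
    """Return True if the filename ends in .gb, .gbff, or .gbk (case-insensitive, an assumption)."""
    return Path(filename).suffix.lower() in GENBANK_EXTENSIONS


def summarize_genbank(text):
    """Count records and annotated features, and note whether an ORGANISM line exists."""
    summary = {"records": 0, "features": 0, "has_organism": False}
    in_features = False
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            summary["records"] += 1
            in_features = False
        elif line.startswith("FEATURES"):
            in_features = True
        elif line.startswith(("ORIGIN", "CONTIG", "//")):
            in_features = False
        elif not in_features and line.strip().startswith("ORGANISM"):
            summary["has_organism"] = True
        elif in_features and len(line) > 5 and line[:5] == "     " and line[5] != " ":
            # Feature keys start in column 6; qualifier lines are indented further.
            key = line[5:].split()[0]
            if key != "source":  # "source" describes the sequence, not an annotation
                summary["features"] += 1
    return summary
```

For example, `summarize_genbank(Path("my_genome.gbff").read_text())` (with a hypothetical filename) returns a small dictionary you can inspect before dragging the file into the Staging Area.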
A GFF-formatted file must be paired with a corresponding FASTA file of the DNA sequence. These will be parsed into two data objects: an assembly object with the sequence and a genome object containing the original feature calls and annotations.
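Because the GFF file and the FASTA file must correspond, one quick local check is that every sequence ID in the GFF's first column also appears as a FASTA header. The sketch below is illustrative only and is not how the KBase importer validates files; the function names are hypothetical, and it assumes tab-separated, nine-column GFF feature lines and FASTA IDs taken as the first token after `>`.

```python
def fasta_ids(fasta_text):
    """Return the set of sequence IDs (first token after '>') in FASTA text."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def gff_seqids(gff_text):
    """Return the set of seqid values (column 1) from GFF feature lines."""
    ids = set()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # GFF3 permits embedded sequence after this directive
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        columns = line.split("\t")
        if len(columns) == 9:
            ids.add(columns[0])
    return ids


def unmatched_seqids(gff_text, fasta_text):
    """List GFF seqids that have no matching FASTA record, sorted for stable output."""
    return sorted(gff_seqids(gff_text) - fasta_ids(fasta_text))
```

An empty list from `unmatched_seqids` suggests the two files refer to the same contigs; any names it returns are worth fixing before upload.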
To import a GenBank file from your computer, open the Import tab within the Data Browser. Then drag and drop the GenBank file into your Staging Area.
Open the Import As... pulldown menu to the right of the filename in your Staging Area and select “GenBank Genome.”
Make sure the correct file type is selected and the checkbox is active, then click the “Import Selected” button.
Import GenBank Genome
The Data Browser will close and the “Import GenBank File as Genome from Staging Area” App will be added to your Narrative.
Notice that the name of the Genome file is filled in, along with suggested names for the Genome and Assembly data objects that the import will create; you can change these names. Adjust the Genome Type, the source of the GenBank file, or any advanced parameters if needed.
Import GenBank Genome App
Click the green "Run" button to start the import. When the import is finished, your Data Panel will update to show the new Genome and Assembly objects, and a report will appear in the Import App.
To import a GFF genome, open the Import tab in the Data Browser and drag and drop both the GFF file and the corresponding FASTA file into your Staging Area. In your Staging Area, open the Import As... pulldown menu to the right of the GFF filename and select “GFF Genome.”
Note the name of the corresponding FASTA file.
Make sure the correct file type is selected and the checkbox is active, then click the "Import Selected" button. The data slide-out will close and the “Import GFF/FASTA File as Genome from Staging Area” App will be added to your Narrative. The GFF File Path name will be filled in.
You will need to fill in the name of the FASTA file. Using the dropdown for the “FASTA File Path”, select the FASTA file in the Staging Area. Ensure the file type is a FASTA file type.
GFF/FASTA Genome Import
The Scientific Name may be filled in, along with a suggested name for the Genome data object that the import will create. You can edit the Genome Object Name, the Scientific Name, and any advanced options as needed. Click the green "Run" button. When the import is finished, your Data Panel will update to show the new Genome object, and a report will appear in the Import App.
Drag and Drop, Upload with Globus, Upload with URL options to upload and import data into KBase
The drag & drop option from your local computer works for many files, but there is a size limit that depends on your computer and browser. For larger files (around 20 gigabases), use the Globus Online transfer.
Both GenBank and GFF genomes can be imported as one of the supported bulk import types. You can select multiple genome files in the Staging Area and import them at once. See the bulk import section of the guide to importing data into the Narrative.
Open the Import tab in the Data Browser and click on the Upload with URL button (below the drag & drop area) to open an Upload App Cell.
The Data Browser will close and the “Upload a File to Staging from Web” App will appear in your Narrative. Alternatively, you can open the app directly from the Apps Panel. From the app, click on the dropdown for the URL Type and select the URL type.
Upload File to Staging from Web App
When uploading a GenBank Genome, you will only need to use one link. When uploading a GFF Genome, you will need to use two links for the GFF file and the FASTA file.
In the App, click the "+" button for the URLs and paste in the URL of the Genome file (GenBank or GFF). For a GFF Genome, click the "+" button again and paste in the URL of the FASTA file.
Then click the green "Run" button to start the upload. After the App completes, the files will appear in your Staging Area, which you can access via the Import tab in the Data Browser.
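The one-link versus two-link rule can be sketched as a small helper that assembles the URLs to paste into the App. This is only an illustration and does not call any KBase API: `urls_for_upload` is a hypothetical name, and accepting `https` alongside the FTP and HTTP URLs mentioned in this guide is an assumption.

```python
from urllib.parse import urlparse

# FTP and HTTP come from the guide; https is assumed to be accepted as well.
ALLOWED_SCHEMES = {"ftp", "http", "https"}


def urls_for_upload(genome_url, fasta_url=None):
    """Return the URL list for the upload: one for GenBank, two for GFF plus FASTA."""
    urls = [genome_url] if fasta_url is None else [genome_url, fasta_url]
    for url in urls:
        parsed = urlparse(url)
        # Reject bare filenames and unsupported schemes before pasting them in.
        if parsed.scheme not in ALLOWED_SCHEMES or not parsed.netloc:
            raise ValueError(f"Not an FTP or HTTP URL: {url!r}")
    return urls
```

Calling it with a single GenBank URL yields a one-item list, while passing both a GFF URL and a FASTA URL yields the two entries the App expects for a GFF Genome.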
The genome file(s) are now in your Staging Area. Now you need to import them to your Narrative to use them in analyses.