Add Data to Your Narrative
Now that you are familiar with ways to find and explore data in KBase, you can select or upload data to analyze. The Data Panel in a Narrative shows the data objects that are currently available in that particular Narrative.
From the Data Panel, you can access the data slide-out, which allows you to search for data of interest and add it to your Narrative. In the Data Panel, click the "Add Data" button, the “+” button, or the arrow at the upper right of the panel to access the Data Browser slide-out.
Data Privacy: Any data that you upload to KBase is kept private unless you explicitly choose to share it. You can share any of your Narratives (including their associated data) with one or more specific users, or make them publicly available to all KBase users. Please see the Sharing page for more information about how to do that. The Terms and Conditions page describes the KBase data policy.
The first four tabs of the Data Browser (My Data, Shared With Me, Public, and Example) let you search data that is already in KBase. The Import tab lets you import data from your computer to your Narrative so that you can analyze it in KBase.
- The My Data tab shows data objects that you have added to your project. You may need to refresh this tab to see your most recently added data.
- The Shared With Me and Public tabs display datasets that others have loaded and made accessible to you (or to everyone). Data within each group is searchable and can be filtered. Since there are a large number of public datasets, you may wish to filter them by data type (using the pulldown selector on the left) or narrow the list by searching (in the “Search data” text box) for specific text in the data objects’ names.
- The Example tab shows datasets that have been pre-loaded for use with particular apps. These can be handy for trying out the Narrative Interface.
- The Import tab allows you to upload your own datasets for analysis.
If you hover your cursor over any data object under the first four tabs, options will appear allowing you to add that object to the Narrative or find out more about it.
In the previous section, we described the process of adding a genome to your Narrative from the public data in KBase. Now let’s check out the different data types available under the Example tab. The icon to the left of each data object represents its data type.
As described in the previous section, the blue "< Add" button next to these icons lets you add the data object to your Narrative. You can add more data to your Narrative from the My Data, Shared With Me, Public, and Example tabs to try out with various KBase Apps.
The Import tab lets you drag & drop data from your computer into your Staging Area to import into your Narrative, where you can then analyze it using KBase’s analysis apps.
You can then click the “?” icon just below the drop zone to the right to launch a short interactive tour that shows the different parts of this user interface.
Getting data from your computer to your KBase Narrative is a three-step process:
- 1. Drag and drop the data file(s) from your computer to the new Import tab to upload them to your Staging Area.
- 2. Choose a format for importing data from your Staging Area into your Narrative.
- 3. Run the Import app that is created.
What is a Staging Area? Your Staging Area is a temporary holding place for your uploaded data files. It is private to you: no one else can see the data in it. The files in your Staging Area stay there and can be seen from any of your Narratives. When you’re ready to add data to a Narrative, choose a data type from the pulldown menu next to a data file in your Staging Area to add it to your Narrative as an object of that type (see below for instructions).
Why have a Staging Area? It’s more robust and extensible this way. Unlike our previous importer, the staging upload can handle large data files without timing out. The user interface is more intuitive: you can drag and drop one or more data files into your Staging Area, or even whole directories or zip files. Finally, the new importer is easier for our developers to extend to new data types.
What's the difference between 'Upload' and 'Import'? In this documentation, we will use “Upload” to refer to getting data from your computer into your Staging Area and “Import” to refer to the process of converting a data file from your Staging Area into a data object in your Narrative that you can analyze with any of our analysis apps. Sometimes we use the term “Import” to cover the Upload+Import process.
Drag & Drop Limitations: Drag & drop from your local computer works for many files, but there is a size limit that depends on your computer and browser. Some users have reported problems with files of around 20 GB. For larger files, use the Globus Online transfer.
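As a rough illustration of the guidance above, the following Python sketch checks local file sizes before you upload and suggests drag & drop or Globus. It runs on your own computer and does not call any KBase service. The 20 GB threshold and the function names are assumptions made for this example; the real limit depends on your computer and browser.

```python
import os

# Assumed threshold: the guide only says some users reported problems
# around 20 GB, so treat this value as a rule of thumb, not a hard limit.
DRAG_DROP_LIMIT_BYTES = 20 * 1000**3


def suggest_upload_method(size_bytes, limit_bytes=DRAG_DROP_LIMIT_BYTES):
    """Return 'drag-and-drop' for files under the limit, otherwise 'globus'."""
    return "drag-and-drop" if size_bytes < limit_bytes else "globus"


def plan_uploads(paths):
    """Map each local file path to a suggested upload method."""
    return {path: suggest_upload_method(os.path.getsize(path)) for path in paths}
```

For example, calling `plan_uploads` on a list of local read files returns a dictionary that tells you which ones are better sent through Globus.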
Find the file(s) you want to import into your KBase account, and drag them into the drop zone (the rectangular area surrounded by a dashed line). You can select multiple files from your computer and drag them all at once. (In the example below, the user is dragging two files into the dashed area.) You can also select a folder of data files and drag the folder into the Staging Area drop zone.
If you don’t like using drag & drop, you can instead click in the upload area to open a file chooser and select a file from your computer to upload.
While files are being uploaded to your Staging Area, you’ll see a green progress bar.
When the file is done uploading, you will see it appear in the list of files in your Staging Area. If you don’t see your file, try clicking the refresh button.
By default, these are sorted by age, with the most recently uploaded file at the top. To sort the list by other fields, such as name or size, click a column header.
90-day lifetime for files in your Staging Area: Your Staging Area is meant to be a temporary holding area for data you want to import into your KBase account. After adding files to your Staging Area, be sure to import them into your Narrative soon, as files in the Staging Area are automatically removed after 90 days. Data objects imported into a Narrative will exist indefinitely.
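If you keep your own record of upload dates, a small sketch like the one below can estimate when a staged file is due for removal. It assumes the 90 days are counted from the upload date; the exact removal time is not stated here, and the function names are hypothetical.

```python
from datetime import date, timedelta

# From the guide: files in the Staging Area are removed after 90 days.
STAGING_LIFETIME_DAYS = 90


def staging_expiry(uploaded_on):
    """Estimate the removal date, assuming the clock starts at upload."""
    return uploaded_on + timedelta(days=STAGING_LIFETIME_DAYS)


def days_left(uploaded_on, today):
    """Days remaining before the estimated removal date (negative if past)."""
    return (staging_expiry(uploaded_on) - today).days
```

A file uploaded on 1 January 2024 would, under this assumption, be due for removal on 31 March 2024.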
Globus is a data management and file transfer system that can facilitate bulk transfer of data (either large data files or a large number of files) between two endpoints. The endpoints that apply here are KBase, JGI, and your local computer. The KBase endpoint is called “KBase Bulk Share,” and JGI has its own way to link to Globus. To do any transfer using Globus, you will need a Globus account.
Uploading data from JGI: If you are a JGI user, you can transfer public genome reads and assemblies (as well as your private data and annotated genomes) from JGI to your KBase account. See the JGI data transfer page for instructions.
Below the link to Globus, another link says “Click here to use an App to upload from a public URL” (for example, a GenBank ftp URL, or a Dropbox or Google Drive URL that is publicly accessible).
Clicking this link adds the “Upload File to Staging from Web” App to your Narrative:
There are also several apps that import specific file types (single- or paired-end reads or SRA files) from a URL directly to your Narrative, bypassing your Staging Area. These are available from the Apps Panel and the App Catalog.
The files in your Staging Area are ready to import into your Narrative as KBase data objects that can be used in your analyses. To import a file from your Staging Area, choose a format (data type) from the pulldown menu to the right of the file’s age. (You can find out more about KBase data types and accepted formats in the Upload/Download Guide.) Then click the import icon to the right of the format menu.
When you click the import icon, the Data Browser slides shut and your Narrative gains either an Import App cell for each data type or a single Bulk Import App cell, with the appropriate parameters filled in. For example, here’s an import app created by choosing “GenBank” as the import format:
Importing different types of data: This example shows how to import GenBank data. Please see the Upload/Download Guide for detailed instructions for other supported data types and importing multiple files.
If the GenBank file came from a different source, use the pulldown menu to select it. You can change the output object name, if desired, and then click the "Run" button to start the import. When the import is done, you should see the message “Finished with success” near the top of the app cell, and some information about the app run.
If you look at your Data Panel, you should see the new data object created by the import.
You can now use this data object as input into the relevant KBase apps. If you want to see which apps accept a particular data type as input, you can click the “…” menu in the data object cell that appears when you hover over it, and then use the “Show Apps with this as input” icon to filter the apps in the Apps Panel.
What if my import fails? Sometimes an import doesn’t work. One of the most common causes of failure is attempting to import a file that’s the wrong data type, or not the expected format for that data type. For example, the screenshot below shows what a user got when they tried to import a GenBank file as Media.
In some cases, the cause of an import error will not be obvious. If you can’t figure out why your import isn’t working, please contact us (via the Help Board) for help. Note, however, that no one besides you has access to your Staging Area, so we will not be able to see the files you uploaded to your Staging Area. You may need to attach your input file to your Help Board ticket in order for us to diagnose the problem.
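Because choosing the wrong format is a common cause of failed imports, it can help to look at a file on your own computer before picking a format. The sketch below guesses a format from the first non-empty line using widely used conventions (GenBank records begin with `LOCUS`, FASTA headers with `>`, FASTQ records with `@`, and GFF3 files with a `##gff-version` directive). It is only a heuristic written for this example, not the validation KBase performs, and it does not handle compressed files.

```python
def guess_format(first_line):
    """Guess a file format from its first non-empty line (heuristic only)."""
    line = first_line.strip()
    if line.startswith("LOCUS"):
        return "GenBank"
    if line.startswith("##gff-version"):
        return "GFF"
    if line.startswith(">"):
        return "FASTA"
    if line.startswith("@"):
        return "FASTQ"
    return "unknown"


def guess_file_format(path):
    """Read a local text file and guess its format from the first non-empty line."""
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                return guess_format(line)
    return "unknown"
```

If the guess disagrees with the format you were about to choose (for example, a file that looks like GenBank when you intended to import Media), double-check the file before running the import.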
Multiple files can be imported together in a single Import App as part of the bulk import feature. When you select the file type from "Import As...," you will be able to check multiple files in the staging area and import them all by clicking "Import Selected."
Bulk Import Staging Area. You can select the appropriate type for multiple files, then "Import Selected" to import them simultaneously.
When you click "Import Selected," a new cell is created that contains a tab for each file type you selected in the Staging Area. Set the parameters for each file type by clicking it in the "Data Types" column on the left; these are the same parameters as for single imports.
Bulk Import Cell. File types can be selected under the data type column to set the parameters for each type.
Supported file types include:
- Assembly - FASTA
- SRA Reads
- FASTQ Reads Interleaved - FASTQ interleaved reads
- FASTQ Reads Noninterleaved - FASTQ paired-end reads or FASTQ single-end reads library
- GenBank Genome - GenBank
- GFF Metagenome - FASTA and GFF3
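When sorting many local files before a bulk import, it may help to have the list above as a lookup table. The sketch below restates the list as a Python dictionary and finds the import types that accept a given file format. The dictionary layout and function name are hypothetical, and the "SRA" format for SRA Reads is an assumption, since the list does not name a format for that type.

```python
# Import type -> file formats it takes, restating the list above.
# "SRA" for "SRA Reads" is an assumption; the list gives no format for it.
SUPPORTED_IMPORTS = {
    "Assembly": ["FASTA"],
    "SRA Reads": ["SRA"],
    "FASTQ Reads Interleaved": ["FASTQ"],
    "FASTQ Reads Noninterleaved": ["FASTQ"],
    "GenBank Genome": ["GenBank"],
    "GFF Metagenome": ["FASTA", "GFF3"],
}


def import_types_for(file_format):
    """Return the import types whose listed formats include file_format."""
    return [
        import_type
        for import_type, formats in SUPPORTED_IMPORTS.items()
        if file_format in formats
    ]
```

A FASTA file, for instance, could belong to an Assembly import or be one half of a GFF Metagenome import, so the lookup returns both candidates.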
The list of files in your Staging Area includes their name, size, and age (from when they were uploaded). If you have many files in your Staging Area, use the Search box to locate specific files.
For more information about a file in your staging area, click the arrow to the left of the filename to open a tab like this:
You can click “First 10 lines” or “Last 10 lines” to preview that portion of the file.
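For a comparable preview of a file that is still on your computer, a short Python sketch such as the following prints the first or last lines of a text file. It is a local convenience only and is not how the Staging Area produces its preview; the function names are chosen for this example.

```python
from collections import deque
from itertools import islice


def first_lines(path, n=10):
    """Return the first n lines of a local text file, without newlines."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


def last_lines(path, n=10):
    """Return the last n lines; the deque keeps only n lines in memory."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]
```

Both functions read the file as a stream, so they stay usable on large files, although `last_lines` still has to scan the whole file.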
Opening the information about a file in your Staging Area also reveals a trash can icon that lets you remove the file from your Staging Area.
You will be asked to confirm that you want to delete the file. This action is not reversible.
Note that if you had already imported the file to your Narrative as a data object, that object won’t go away when you delete the file in your Staging Area. If you want to delete a data object, you can do that in your Data Panel.