Chemical Abundance Matrix

Metabolomics, exometabolite, and chemical abundance data can be integrated with metabolic modeling and flux balance analysis tools in KBase.

Last updated 7 months ago

What is "chemical abundance" in KBase?

“Chemical abundance” is a broad term that we use for a wide array of measurements associated with chemicals. This data type can be used to upload and store diverse chemical data, such as metabolomics data (intracellular and/or extracellular) derived from microbiome or isolate growth experiments, computationally predicted compounds, or data on concentrations or from elemental analysis. These data could be collected from environmental samples, such as soil, sediment, or water. Currently, metabolomics data derived from samples are the most common data uploaded and stored in the system.

Once chemical abundance data matrices are uploaded, they can be analyzed using KBase Apps for metabolomics, such as Escher mapping. Additional statistical analysis of the chemical abundance attribute maps, such as PCA and clustering, can also be performed.

A Chemical Abundance Matrix can be uploaded from a TSV (tab-separated values) file with a .tsv or .tab file extension, or from an Excel spreadsheet with a .xls extension.

Each Chemical Type can be a specific compound or element, an aggregate (totals), or an exometabolite (a measurement of compounds or elements that are consumed or excreted into the medium).

Formatting chemical abundance matrices

The minimal set of metadata in a chemical abundance matrix includes an ID (unique value) field, a chemical type (aggregate, exometabolite, or specific), and one or more of the following: compound ID (e.g., ModelSEED, KEGG, ChEBI), mass, formula, InChIKey, InChI, SMILES, or compound name. We strongly encourage providing additional metadata, such as units, with values that fit your scientific use case; otherwise leave them as ‘unknown’ (see the section "Template Fields Descriptions" for an explanation of each field). Additional metadata may enhance downstream analysis for you and other readers.
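The minimal metadata rule can be checked before upload with a short script. The following is a minimal sketch, not part of KBase: the column names (`ID`, `Chemical Type`, `Compound ID`, and so on) and the function `check_minimal_metadata` are assumptions chosen for illustration, so adjust them to match the headers in your own template.

```python
import csv
import io

# Hypothetical column names; compare them with the headers in your own template.
ID_COLUMN = "ID"
TYPE_COLUMN = "Chemical Type"
CHEMICAL_TYPES = {"aggregate", "exometabolite", "specific"}
IDENTIFYING_COLUMNS = [
    "Compound ID", "Mass", "Formula", "InChIKey", "InChI", "SMILES", "Compound Name",
]


def check_minimal_metadata(tsv_text):
    """Return a list of problems found in the metadata rows of a TSV string."""
    problems = []
    seen_ids = set()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        chem_id = (row.get(ID_COLUMN) or "").strip()
        if not chem_id:
            problems.append(f"row {row_number}: missing ID")
        elif chem_id in seen_ids:
            problems.append(f"row {row_number}: duplicate ID {chem_id!r}")
        seen_ids.add(chem_id)

        chem_type = (row.get(TYPE_COLUMN) or "").strip().lower()
        if chem_type not in CHEMICAL_TYPES:
            problems.append(f"row {row_number}: unknown chemical type {chem_type!r}")

        # At least one identifying field must be non-empty.
        if not any((row.get(col) or "").strip() for col in IDENTIFYING_COLUMNS):
            problems.append(f"row {row_number}: no compound ID, mass, formula, structure, or name")
    return problems
```

Calling `check_minimal_metadata(open("matrix.tsv").read())` on a file of your own (the file name is a placeholder) returns an empty list when every row has a unique ID, a recognized chemical type, and at least one identifying field.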

If a SampleSet exists, it can be applied to the chemical abundance data. Chemical abundance data needs to be formatted to ensure Samples are correctly linked.

Note that linking to Samples is not required, but it is highly recommended. When you link to a SampleSet using the Create Chemical Abundance Matrix Template App, the template is automatically populated with Sample IDs to ensure the chemical abundance data is properly linked to the corresponding Samples in the system.

This App generates a spreadsheet onto which you can copy your data to ensure it links to the SampleSet when uploaded.

When creating the chemical abundance template, there are 10 columns in the default sheet, described in the table below. Column headings in italics come pre-filled with validated values to choose from a dropdown.

Column Heading | Description
ID (unique value) | The identifier for the chemical, element, compound, or peak value. This can be user-defined but must be unique for each compound.
Chemical Type | Specific: specific compounds in the sample (e.g., TiO2, Tyrosine). Aggregate: measurement of the total concentration of individual elements (e.g., Total C, N). Exometabolite: measurements of compounds that are consumed or excreted into the medium by an organism or a microbial community.
Measurement Type | unknown, FTICR, Orbitrap, Quadrapole
Units | mg/Kg, g/Kg, mg/L, mg/g DW, mM (millimolar), M (molar), % (percentage), Total Weight %, unknown
Unit Medium | soil, solvent, water
Chemical Ontology Class | Used for grouping chemicals by their functional groups (e.g., aromatics) or according to ontologies defined in public databases (e.g., ChEBI).
Chromatography Type | unknown, HPLC, MS/MS, LCMS, GS
Chemical Class | Used for defining major classes, such as lanthanides
Protocol | Any term or sentence defining the protocol used in your lab
Identifier | An optional identifier or abbreviation that you use for the compound and/or element (e.g., Menaquinone => MK-8, Fatty acid => (Iso-C16:1))
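Because several columns accept only validated dropdown values, it can help to check a row against those lists before pasting data into the template. The sketch below is an illustration only: the allowed values are copied as written in the column descriptions above (the live template may differ), Units is left out because the exact dropdown strings for entries such as "mM (millimolar)" are not clear from this page, and `find_invalid_values` is a hypothetical helper name.

```python
# Allowed values copied as written in the column descriptions; the live
# template's dropdowns are the authority and may differ from this list.
VALIDATED_VALUES = {
    "Measurement Type": {"unknown", "FTICR", "Orbitrap", "Quadrapole"},
    "Unit Medium": {"soil", "solvent", "water"},
    "Chromatography Type": {"unknown", "HPLC", "MS/MS", "LCMS", "GS"},
}


def find_invalid_values(row):
    """Return {column: value} for every entry outside its allowed set.

    `row` is a dict mapping column headings to cell values. Columns that are
    absent from the row are skipped rather than reported.
    """
    return {
        column: row[column]
        for column, allowed in VALIDATED_VALUES.items()
        if column in row and row[column] not in allowed
    }


example_row = {
    "Measurement Type": "Orbitrap",
    "Unit Medium": "air",          # not one of soil, solvent, water
    "Chromatography Type": "HPLC",
}
invalid = find_invalid_values(example_row)
```

Here `invalid` contains only the `Unit Medium` entry, since the other two values appear in their allowed lists.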

Select chemical data to include, such as aggregate M/Z, compound name, predicted formula, and more, depending on what data you have for upload.

Column Heading | Description
Aggregate M/Z | M/Z represents the ratio of mass (M) divided by net charge (Z) of the chemical.
Compound Name | Optional name for the compound
Predicted Formula | Formula of the compound (if available). The formula could be a predicted formula based on techniques such as Fourier-transform ion cyclotron resonance (FTICR) mass spectrometry, computationally predicted formula based on molecular weight, derived from instruments based on known protocols, or based on the chemical structure determination from NMR.
Predicted Structure (smiles) | Predicted structure using Simplified Molecular Input Line Entry System (SMILES) notation
Predicted Structure (inchi-key) | Predicted structure using International Chemical Identifier (InChI) key notation.
Theoretical Mass | Theoretical molecular mass of the compound
Retention Time | The retention time of the chemical; not validated, so users can include units or not depending on needs
Polarity | Polarity of the chemical.

If you plan to use your data with metabolic modeling analysis pipelines (see the Use cases section), we highly encourage listing at least one type of compound identifier (if available), as compound identifiers are used to map compounds in metabolic models. Alternatively, you can provide InChIKeys, which we are able to map to compounds in metabolic models.
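To see why identifiers matter for modeling, consider how a lookup with an InChIKey fallback might work. This is a simplified sketch under stated assumptions, not a description of how KBase performs the mapping internally: the function `map_to_model`, the dictionary keys `compound_id` and `inchikey`, and all identifiers in the example data are hypothetical.

```python
def map_to_model(chemical, by_compound_id, by_inchikey):
    """Return the model compound for a chemical row, or None if it is unmapped.

    The compound ID is tried first; the InChIKey is used only as a fallback.
    """
    compound_id = chemical.get("compound_id")
    if compound_id and compound_id in by_compound_id:
        return by_compound_id[compound_id]
    inchikey = chemical.get("inchikey")
    if inchikey:
        return by_inchikey.get(inchikey)
    return None


# Hypothetical model lookups; real models use database IDs and real InChIKeys.
by_compound_id = {"cpdA": "model compound A"}
by_inchikey = {"EXAMPLE-KEY-B": "model compound B"}

mapped_by_id = map_to_model({"compound_id": "cpdA"}, by_compound_id, by_inchikey)
mapped_by_key = map_to_model({"inchikey": "EXAMPLE-KEY-B"}, by_compound_id, by_inchikey)
unmapped = map_to_model({"compound_name": "mystery peak"}, by_compound_id, by_inchikey)
```

A row that carries neither a compound ID nor an InChIKey, like the last one, cannot be matched to a model compound in this sketch, which is the situation the recommendation above is meant to avoid.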

Uploading and Importing

Once the matrix is in your Staging Area, you can import the data into your Narrative.

In the first section, Input Objects, select the previously imported SampleSet from the dropdown menu for linking metadata.

Under Parameters, select the Chemical Abundance Matrix using the FilePath dropdown.

Fill in the Matrix Object Name and click the "Run" button to add the metabolomics data to the Narrative.

Using the Uploaded Data

The Create Chemical Abundance Matrix Template App creates an Excel spreadsheet for direct download that can be populated with chemical abundance data. While chemical abundance data works best and is more meaningful when linked with an existing SampleSet in the system, linking a SampleSet is not required (see the section Linking SampleSet).

Finally, select at least one form of standard chemical ID to include in the template. These can be selected from the KEGG, ChEBI, and ModelSEED databases. You can use their respective websites to get the identifier for your compound. If you are creating a chemical abundance sheet from scratch, you don't have to include one of these chemical IDs, but doing so is recommended so that similar chemicals can be compared once uploaded.

For full functionality of analysis tools using metadata, first upload and import the SampleSet.

Once you have added and formatted your data in the Chemical Abundance Matrix template, you can upload it using the Import Chemical Abundance Matrix from CSV/Excel/TSV File in Staging Area App.

Using a file on your computer, open the Import tab within the Data Browser. Then drag & drop the chemical abundance matrix file into the Staging Area.

Once you've uploaded your chemical abundance data, you can explore Apps that use that type of data or use one of our use case demonstration Narratives. See the chemical abundance section in Using Apps for more info.
