KBase Documentation
Ontologies and Validated Terms

Make the most out of your metadata by using controlled vocabulary

Last updated 1 month ago

TL;DR: Use vocabulary that can be identified in KBase so your metadata is functional.

When uploading Samples to KBase, many built-in metadata terms can be validated. Validation makes your data easier to compare with other data and lets KBase integrate data from sources that use different conventions.

Many standards, such as NCBI BioSamples and IGSN SESAR, aim to describe sample metadata. These standards often overlap and may differ significantly depending on the focus of the organization or institution that manages the samples. At the same time, many labs and smaller institutions have their own templates and ways of organizing their data.

Thus, there must be a way to integrate similar metadata from different sources and account for the differences between the sources.

When Samples are uploaded into KBase via a spreadsheet, they are transformed into an internal representation. Columns within the spreadsheet are mapped and transformed based on the template used. Columns that correspond to recognized terms are validated to ensure they are properly formatted, for example by checking that the value is the proper type (e.g. a string versus a number), falls in the correct range, matches an enumerated list, or appears in an ontology.
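The kinds of checks described above can be sketched in Python. This is only an illustration under stated assumptions: the `validate_value` helper, the rule keys (`type`, `min`, `max`, `enum`), and the example rules are hypothetical and are not KBase's actual validator configuration. The ontology lookup is omitted because it would require a call to an ontology service.

```python
from typing import Any, Optional


def validate_value(value: Any, rule: dict) -> Optional[str]:
    """Return an error message if `value` violates `rule`, otherwise None."""
    expected = rule.get("type")
    if expected == "number":
        # Spreadsheet cells often arrive as text, so try to parse them first.
        try:
            value = float(value)
        except (TypeError, ValueError):
            return f"expected a number, got {value!r}"
        if "min" in rule and value < rule["min"]:
            return f"{value} is below the minimum {rule['min']}"
        if "max" in rule and value > rule["max"]:
            return f"{value} is above the maximum {rule['max']}"
    elif expected == "string" and not isinstance(value, str):
        return f"expected a string, got {value!r}"
    if "enum" in rule and value not in rule["enum"]:
        return f"{value!r} is not one of {rule['enum']}"
    return None


# Hypothetical rules for two columns (not taken from KBase).
RULES = {
    "latitude": {"type": "number", "min": -90, "max": 90},
    "material": {"type": "string", "enum": ["soil", "sediment", "water"]},
}

for column, cell in [("latitude", "45.2"), ("latitude", "120"), ("material", "rock")]:
    error = validate_value(cell, RULES[column])
    print(column, cell, "OK" if error is None else f"invalid: {error}")
```

Returning a message instead of raising an exception lets a caller collect every problem in a spreadsheet in one pass rather than stopping at the first bad cell.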

Controlled terms are useful both because they undergo this validation and because they give each value a more precise meaning and allow comparisons that account for units.

When the uploader encounters terms it doesn’t recognize, those terms and their values are stored in a user section of the samples. These values still serve a purpose: they can be used in analysis within that data set, and they can provide contextual information that the original uploader understands. However, unrecognized terms cannot be reliably compared across SampleSets and other samples in the system. For example, two projects may use the same term to represent different concepts (e.g. depth below sea level versus depth below surface).
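The separation between recognized and unrecognized terms can be pictured with a short Python sketch. The function name, the `recognized` set, and the column names are hypothetical, chosen only for illustration; the actual internal representation of a Sample may differ.

```python
def split_columns(row: dict, recognized: set) -> tuple:
    """Split one spreadsheet row into (controlled, user) metadata dictionaries."""
    controlled = {name: value for name, value in row.items() if name in recognized}
    user = {name: value for name, value in row.items() if name not in recognized}
    return controlled, user


# Hypothetical example: "latitude" is a recognized term, "my_depth" is not.
controlled, user = split_columns(
    {"latitude": "45.2", "my_depth": "3"},
    recognized={"latitude"},
)
print(controlled)  # {'latitude': '45.2'}
print(user)        # {'my_depth': '3'}
```

Nothing is discarded in this sketch: unrecognized columns are kept alongside the controlled ones, but only the controlled ones would go through validation.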

Using templates and controlled terms clarifies the exact meaning of a term and enforces additional validation. For a full list of terms KBase recognizes, see the validated metadata table below. If there are terms you would like added, please contact us at engage@kbase.us to suggest additions.

Ontology Landing Pages

The Ontology API supports multiple ontology systems. Currently supported systems are the Gene Ontology (GO) and the Environment Ontology (ENVO). You can find more information about each ontology on its own home page, or view the KBase landing page for a given ontology term using the URL formats below, where each # stands for one digit of the term ID.

    • KBase: https://narrative.kbase.us/#ontology/term/go_ontology/GO:#######

    • KBase: https://narrative.kbase.us/#ontology/term/envo_ontology/ENVO:########
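As a minimal sketch, the two URL formats above can be filled in programmatically. The helper name `landing_page_url` is hypothetical, and the digit counts (seven for GO, eight for ENVO) are read off the `#` placeholders in the formats above.

```python
import re

BASE_URL = "https://narrative.kbase.us/#ontology/term"

# Ontology prefix -> (URL namespace, expected term ID pattern).
ONTOLOGIES = {
    "GO": ("go_ontology", re.compile(r"GO:\d{7}")),
    "ENVO": ("envo_ontology", re.compile(r"ENVO:\d{8}")),
}


def landing_page_url(term_id: str) -> str:
    """Build the KBase landing page URL for a GO or ENVO term ID."""
    prefix = term_id.split(":", 1)[0]
    if prefix not in ONTOLOGIES:
        raise ValueError(f"unsupported ontology prefix in {term_id!r}")
    namespace, pattern = ONTOLOGIES[prefix]
    if not pattern.fullmatch(term_id):
        raise ValueError(f"{term_id!r} does not match the expected ID format")
    return f"{BASE_URL}/{namespace}/{term_id}"


print(landing_page_url("GO:0008150"))
# https://narrative.kbase.us/#ontology/term/go_ontology/GO:0008150
```

Checking the ID format before building the URL catches common spreadsheet problems, such as leading zeros being stripped from a term ID.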

Validated Metadata Terms

If you have Sample Attribute columns that are represented as validated terms in SESAR, ENIGMA, or KBase formats, you may request that the terms be added to KBase's vocabulary.

KBase will review newly proposed attributes to assess: overlap with currently supported attributes; existing or relevant ontological frameworks; and the presence and interoperability of units for measured data.

Unit Conversions

Terms

Commonly used units:

Length

  • Base: meter, m

  • Micron/micrometer: µm or um, 1*10^-6 m

  • Millimeter: mm, 1*10^-3 m

  • Centimeter: cm, 1*10^-2 m

  • Kilometer: km, 1*10^3 m

Time

  • Base: second, s

  • Minute: min, 60 s

  • Hour: hr, 3600 s or 60 min

  • Day: d, 24 hr

  • Year: yr or a, 365.25 d

Mass

  • Base: gram, g

  • Microgram: µg, 1*10^-6 g

  • Milligram: mg, 1*10^-3 g

  • Kilogram: kg, 1*10^3 g
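The factors listed above are enough to convert between any two units of the same dimension by going through the base unit. The sketch below is a hypothetical illustration of that idea, not KBase's own conversion code; the table and the `convert` function are assumptions built only from the values in the lists above.

```python
# Factor that converts each unit to its base unit (meter, second, gram).
TO_BASE = {
    "length": {"m": 1.0, "µm": 1e-6, "um": 1e-6, "mm": 1e-3, "cm": 1e-2, "km": 1e3},
    "time": {
        "s": 1.0,
        "min": 60.0,
        "hr": 3600.0,
        "d": 24 * 3600.0,
        "yr": 365.25 * 24 * 3600.0,
        "a": 365.25 * 24 * 3600.0,
    },
    "mass": {"g": 1.0, "µg": 1e-6, "mg": 1e-3, "kg": 1e3},
}


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert `value` between two units of the same dimension."""
    for factors in TO_BASE.values():
        if from_unit in factors and to_unit in factors:
            # Go to the base unit first, then to the target unit.
            return value * factors[from_unit] / factors[to_unit]
    raise ValueError(f"cannot convert {from_unit!r} to {to_unit!r}")


print(convert(2.5, "km", "m"))  # 2500.0
print(convert(1, "d", "hr"))    # 24.0
```

Units from different dimensions, such as `km` and `s`, share no table, so the function raises `ValueError` instead of returning a meaningless number.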

The TSV file below lists the metadata terms currently supported for validation, along with a description of each that provides general formatting direction. You can view the most up-to-date version of this list in the KBase Samples GitHub.

Validated Metadata TSV: metadata_validation.tsv (19KB)

You can see a full list of the units and their defined relations to each other in GitHub.

To determine whether your term is valid, the easiest way is to search the ENVO Database.