Data Types

Data in KBase spans a wide range of types relevant to systems biology research, including genomes and their annotations, metagenomes, expression data, protein-protein interactions, inferred models of organismal and community metabolism and gene regulation, and even geographical information about populations.

Remember to follow the KBase Data Policy and Sources agreement.

When working with data in KBase and using Apps, remember that reusing an output name overwrites any existing data object of the same type with that name.
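One way to avoid accidental overwrites is to pick an output name that is not already in use. The sketch below is illustrative only: `unique_output_name` and the `existing_names` set are hypothetical and not part of any KBase API. It also treats every existing name as taken regardless of data type, which is stricter than the rule described above.

```python
def unique_output_name(base: str, existing_names: set[str]) -> str:
    """Return `base` if it is unused, otherwise `base` with a numeric suffix.

    Hypothetical helper: `existing_names` stands in for the object names
    already present, however you choose to collect them.
    """
    if base not in existing_names:
        return base
    n = 2
    # Increase the suffix until the candidate name is free.
    while f"{base}_{n}" in existing_names:
        n += 1
    return f"{base}_{n}"
```

For example, if `my_assembly` already exists, the helper suggests `my_assembly_2` instead of reusing the name.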

Data Type Descriptions

KBase handles data as objects rather than files, for interoperability with KBase Apps. The following data types are listed with their corresponding file types and extensions:

  • Assembly ⏤ FASTA files; extensions .fasta, .fna, .fa, .fas

  • FASTQ/SRA Reads ⏤ interleaved or non-interleaved paired-end reads, or single-end reads; extensions .fastq, .fq, .sra

  • Genome ⏤ GenBank, or GFF3 with a FASTA file; extensions .genbank, .gb, .gbk, .gbff (GenBank), .gff (GFF3), and .fasta, .fna, .fa, .fas (FASTA)

  • Metagenome ⏤ GFF Metagenome with a FASTA file; extensions .gbff, .gff, and .fasta, .fna, .fa, .fas

  • FBA Model ⏤ SBML, Excel, or TSV; extensions .sbml, .xml, .tsv, .xls, .xlsx

  • Media ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv

  • Expression Matrix ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv

  • Phenotype Set ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv, .tab

  • Amplicon Matrix ⏤ Excel and FASTA; extensions .xls, .xlsx, and .fasta, .fna, .fa, .fas

  • Chemical Abundance Matrix ⏤ Excel, CSV, or TSV; extensions .xls, .xlsx, .csv, .tsv

  • SampleSet ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv
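Because several data types share extensions (a FASTA file may be an Assembly or one part of a Genome, Metagenome, or Amplicon Matrix), the extension alone does not determine the data type. The sketch below transcribes the list above into a lookup table and returns every data type that lists a given extension. The table contents come from the list; the names `DATA_TYPE_EXTENSIONS` and `candidate_data_types` are illustrative assumptions and not part of KBase.

```python
from pathlib import Path

# Shared extension groups, as given in the list above.
FASTA = (".fasta", ".fna", ".fa", ".fas")
SPREADSHEET = (".xls", ".xlsx")

# Data type -> extensions listed for it above (illustrative table, not a KBase API).
DATA_TYPE_EXTENSIONS = {
    "Assembly": FASTA,
    "FASTQ/SRA Reads": (".fastq", ".fq", ".sra"),
    "Genome": (".genbank", ".gb", ".gbk", ".gbff", ".gff") + FASTA,
    "Metagenome": (".gbff", ".gff") + FASTA,
    "FBA Model": (".sbml", ".xml", ".tsv") + SPREADSHEET,
    "Media": SPREADSHEET + (".tsv",),
    "Expression Matrix": SPREADSHEET + (".tsv",),
    "Phenotype Set": SPREADSHEET + (".tsv", ".tab"),
    "Amplicon Matrix": SPREADSHEET + FASTA,
    "Chemical Abundance Matrix": SPREADSHEET + (".csv", ".tsv"),
    "SampleSet": SPREADSHEET + (".tsv",),
}


def candidate_data_types(filename: str) -> list[str]:
    """Return the data types whose listed extensions include this file's extension."""
    # Only the final suffix is inspected, compared case-insensitively.
    ext = Path(filename).suffix.lower()
    return [name for name, exts in DATA_TYPE_EXTENSIONS.items() if ext in exts]
```

Calling `candidate_data_types("reads.fq")` yields only `FASTQ/SRA Reads`, while a `.tsv` file matches several data types, so the intended type still has to be chosen by the user.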

The Data Type Catalog lists and describes the key data types representing different classifications of biological data within KBase.
