Data Types

Data in KBase spans a wide range of types relevant to systems biology research: genomes and their annotations, metagenomes, expression data, protein-protein interactions, inferred models of organismal and community metabolism and gene regulation, and even geographical information about populations.

Remember to follow the KBase Data Policy and Sources agreement.

Data Type Descriptions

KBase handles data as objects rather than files so that it is interoperable with KBase Apps. The following are the data types and their corresponding file types:

  • Assembly ⏤ FASTA files; extensions .fasta, .fna, .fa, .fas

  • FASTQ/SRA Reads ⏤ interleaved or non-interleaved paired-end reads, or single-end reads; extensions .fastq, .fq, .sra

  • Genome ⏤ GenBank, or GFF3 with a FASTA file; extensions .genbank, .gb, .gbk, .gbff for GenBank, or .gff together with .fasta, .fna, .fa, .fas

  • Metagenome ⏤ GFF Metagenome with a FASTA file; extensions .gbff, .gff, and .fasta, .fna, .fa, .fas

  • FBA Model ⏤ SBML, Excel, or TSV; extensions .sbml, .xml, .tsv, .xls, .xlsx

  • Media ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv

  • Expression Matrix ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv

  • Phenotype Set ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv, .tab

  • Amplicon Matrix ⏤ Excel and FASTA; extensions .xls, .xlsx, and .fasta, .fna, .fa, .fas

  • Chemical Abundance Matrix ⏤ Excel, CSV, or TSV; extensions .xls, .xlsx, .csv, .tsv

  • SampleSet ⏤ Excel or TSV; extensions .xls, .xlsx, .tsv
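
Because several data types share the same extensions, a file's extension alone does not determine which data type it should be imported as. The sketch below shows this by transcribing the list above into a lookup table. It is only an illustration: the names EXTENSIONS_BY_TYPE and candidate_types are hypothetical and not part of any KBase API, and the table contains only what the list above states.

```python
from pathlib import PurePath

# Extension table transcribed from the list above.
# Hypothetical helper for illustration; this is not a KBase API.
EXTENSIONS_BY_TYPE = {
    "Assembly": {".fasta", ".fna", ".fa", ".fas"},
    "FASTQ/SRA Reads": {".fastq", ".fq", ".sra"},
    "Genome": {".genbank", ".gb", ".gbk", ".gbff", ".gff",
               ".fasta", ".fna", ".fa", ".fas"},
    "Metagenome": {".gbff", ".gff", ".fasta", ".fna", ".fa", ".fas"},
    "FBA Model": {".sbml", ".xml", ".tsv", ".xls", ".xlsx"},
    "Media": {".xls", ".xlsx", ".tsv"},
    "Expression Matrix": {".xls", ".xlsx", ".tsv"},
    "Phenotype Set": {".xls", ".xlsx", ".tsv", ".tab"},
    "Amplicon Matrix": {".xls", ".xlsx", ".fasta", ".fna", ".fa", ".fas"},
    "Chemical Abundance Matrix": {".xls", ".xlsx", ".csv", ".tsv"},
    "SampleSet": {".xls", ".xlsx", ".tsv"},
}


def candidate_types(filename: str) -> list[str]:
    """Return every data type whose listed extensions include the file's suffix."""
    # Only the final suffix is checked, case-insensitively.
    suffix = PurePath(filename).suffix.lower()
    return [name for name, exts in EXTENSIONS_BY_TYPE.items() if suffix in exts]


if __name__ == "__main__":
    for example in ("reads.fq", "contigs.fna", "growth.tsv"):
        print(example, "->", candidate_types(example))
```

A file such as contigs.fna matches several data types, so the choice of type still has to be made when importing. The sketch looks only at the final suffix, so a compressed file such as reads.fastq.gz would match nothing.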

The Data Type Catalog lists and describes the key data types representing different classifications of biological data within KBase.
