Data Matrices - Amplicon, Stats

KBase provides tools for rarefying, standardizing, and analyzing amplicon matrices and OTU tables.

Data Cleaning

Performing rarefaction and standardization in KBase records data provenance, that is, how the data was transformed. These Apps can be used in any order and repeatedly, for instance to remove singletons, rarefy samples, and then calculate the relative abundance of the rarefied samples.
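As a rough illustration of such a sequence, the Python sketch below removes singletons, rarefies each sample to a fixed depth with a seeded random generator, and converts the result to relative abundance. It is not the code behind the KBase Apps. The function names, the toy matrix, the definition of a singleton (an amplicon with a single read across all samples), and subsampling without replacement are all assumptions made for the example.

```python
import random

# Toy amplicon matrix: rows are amplicons (OTUs), columns are samples S1..S3.
# All identifiers and counts are made up for illustration.
COUNTS = {
    "otu_a": [10, 4, 6],
    "otu_b": [1, 0, 0],  # singleton: one read in the whole matrix
    "otu_c": [5, 8, 3],
    "otu_d": [0, 3, 9],
}


def remove_singletons(counts):
    """Drop amplicons whose total count across all samples is 1 or less."""
    return {otu: row for otu, row in counts.items() if sum(row) > 1}


def rarefy(counts, depth, seed):
    """Subsample every sample (column) to `depth` reads without replacement."""
    rng = random.Random(seed)  # fixed seed makes the subsample repeatable
    n_samples = len(next(iter(counts.values())))
    out = {otu: [0] * n_samples for otu in counts}
    for j in range(n_samples):
        # One pool entry per read, labelled with the amplicon it belongs to.
        pool = [otu for otu, row in counts.items() for _ in range(row[j])]
        if len(pool) < depth:
            raise ValueError(f"sample {j} has fewer than {depth} reads")
        for otu in rng.sample(pool, depth):
            out[otu][j] += 1
    return out


def relative_abundance(counts):
    """Divide each count by its sample (column) total."""
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        otu: [row[j] / totals[j] if totals[j] else 0.0 for j in range(n_samples)]
        for otu, row in counts.items()
    }


filtered = remove_singletons(COUNTS)
rarefied = rarefy(filtered, depth=15, seed=42)
rel = relative_abundance(rarefied)
```

Calling `rarefy` again with the same seed returns the same subsample, which is the property a user-selected seed is meant to provide.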

  • Transform Matrix App – uses an AmpliconMatrix object as input

    • provides abundance filtering, which can remove rows or columns if all values fall below a user-specified threshold and/or if the sum of all values falls below a user-specified threshold

    • performs standardization, with options for centering the data before scaling, scaling the data to unit variance, and choosing whether to standardize columns or rows

    • performs ratio transformation, by either the Centered Log-Ratio (CLR) or Isometric Log-Ratio (ILR) method, applied to columns or rows

  • Rarefy Matrix App – rarefies columns or rows, and allows manual selection of a seed value to aid reproducibility
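The standardization and ratio-transformation options listed above can be sketched in plain Python as follows. This is an illustrative sketch and not the Transform Matrix App's implementation. The names `standardize`, `clr`, and `apply_by`, the use of the population standard deviation, the pseudocount added before taking logs (so that zero counts are defined), and the toy matrix are all assumptions. Only the centered log-ratio is shown; the isometric log-ratio is omitted.

```python
import math
import statistics


def standardize(values, center=True, scale=True):
    """Optionally subtract the mean and divide by the population std dev."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    shift = mean if center else 0.0
    # A constant vector has zero spread; leave it unscaled (assumed convention).
    divisor = sd if scale and sd > 0 else 1.0
    return [(v - shift) / divisor for v in values]


def clr(values, pseudocount=1.0):
    """Centered log-ratio: log of each value minus the mean of the logs."""
    logs = [math.log(v + pseudocount) for v in values]
    mean_log = statistics.fmean(logs)
    return [x - mean_log for x in logs]


def apply_by(matrix, func, dimension="col"):
    """Apply func to every column ("col") or every row ("row") of a matrix."""
    if dimension == "row":
        return [func(list(row)) for row in matrix]
    if dimension != "col":
        raise ValueError("dimension must be 'col' or 'row'")
    transformed_columns = [func(list(col)) for col in zip(*matrix)]
    # Transpose back so the result has the same orientation as the input.
    return [list(row) for row in zip(*transformed_columns)]


# Toy matrix (made-up counts): rows are amplicons, columns are samples.
matrix = [
    [10, 4, 6],
    [5, 8, 3],
    [0, 3, 9],
]

z_by_column = apply_by(matrix, standardize, dimension="col")
clr_by_column = apply_by(matrix, clr, dimension="col")
```

After standardization with both options enabled, each column has mean 0 and unit variance; after the CLR transform, the values in each column sum to 0.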

Taxonomic Classification

  • Classify rRNA with taxonomy using naïve Bayes with RDP Classifier App (in dev/beta) – assigns taxonomies to amplicon matrices based on several gene databases, including RDP Classifier 2.13 - 16srna, RDP Classifier - fungallsu, and SILVA 138 - SSU, Full Length. The SILVA databases are currently under development.

  • Taxonomy Abundance Barplot App (in beta) – visualizes taxonomic abundance

Statistical Analysis

Statistical analyses, such as clustering (hierarchical or k-means) and principal component analysis (PCA), can be run on these matrices.
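To give a sense of what PCA computes, the sketch below finds the first principal component of a small matrix by power iteration on its covariance matrix. It is a teaching sketch under stated assumptions, not the method used by any KBase App: the helper names, the made-up data (samples in rows, features in columns), and the choice of power iteration are all illustrative.

```python
import math


def center_columns(rows):
    """Subtract each column's mean so every feature has mean 0."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n = len(centered)
    p = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector and eigenvalue by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # assumed start; must not be orthogonal to PC1
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance along v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, eigenvalue


# Made-up data: four samples, two features that are almost collinear.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]

centered = center_columns(data)
cov = covariance(centered)
pc1, var1 = first_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = var1 / total_variance  # fraction of variance captured by PC1
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
```

Because the two features are nearly collinear, almost all of the variance falls on the first component, and `scores` gives each sample's coordinate along it.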

Functional Prediction for Amplicon Matrices


Louca Lab FAPROTAX site (http://www.loucalab.com/archive/FAPROTAX/)

Louca, S., Parfrey, L.W., Doebeli, M. (2016). Decoupling function and taxonomy in the global ocean microbiome. Science 353: 1272–1277. https://www.science.org/doi/10.1126/science.aaf4507


Douglas, G.M., Maffei, V.J., Zaneveld, J.R. et al. (2020). PICRUSt2 for prediction of metagenome functions. Nat Biotechnol 38: 685–688. https://doi.org/10.1038/s41587-020-0548-6

PICRUSt2: An improved and extensible approach for metagenome inference (preprint)

Most of the relevant KBase Apps can be found in the Utilities section of the Apps panel, either by searching for "GenericsAPI" or through the GenericsAPI Module Catalog.
