Software and Tools

Software utilities leveraged by CAMRA

JCVI Software

CAMRA GitHub Repository

The GitHub repository of tools used by the CAMRA project.

The repository includes Dockerfiles for JCVI-produced and third-party bioinformatics tools, alongside workflows written in the Workflow Description Language (WDL) and tailored for Terra, a cloud-native research platform.

GGRaSP

An R package for selecting representative genomes using Gaussian mixture models.

GGRaSP (Gaussian Genome Representative Selector with Prioritization) is an R package that generates a reduced subset of genomes, prioritizing the retention of genomes of interest to the user while minimizing the loss of genetic variation. GGRaSP also supports unsupervised clustering: it models the genomic relationships with a Gaussian mixture model to select an appropriate cluster threshold, which makes it suitable for both generalizable high-throughput use and more dataset-specific use.
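As a rough illustration of how a Gaussian mixture model can suggest a cluster threshold, the following Python sketch fits two Gaussians to a list of pairwise genome distances and places a cutoff between the "close" and "distant" groups. It is not GGRaSP's implementation (GGRaSP is an R package); the function name `gmm_threshold`, the two-component assumption, the midpoint rule, and the toy distance values are all assumptions made for this example.

```python
import math


def gmm_threshold(distances, iterations=100, var_floor=1e-8):
    """Fit a two-component 1D Gaussian mixture by EM and return a cutoff.

    Hypothetical helper for illustration only; not part of GGRaSP.
    """
    xs = sorted(distances)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distances")
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, var_floor)

    # Start one component at each end of the observed range.
    means = [xs[0], xs[-1]]
    variances = [var_all, var_all]
    weights = [0.5, 0.5]

    def weighted_density(x, k):
        norm = math.sqrt(2 * math.pi * variances[k])
        return weights[k] * math.exp(-((x - means[k]) ** 2) / (2 * variances[k])) / norm

    def responsibilities():
        # Posterior probability that each point belongs to component 0.
        out = []
        for x in xs:
            p0, p1 = weighted_density(x, 0), weighted_density(x, 1)
            total = p0 + p1
            out.append(p0 / total if total > 0 else 0.5)
        return out

    for _ in range(iterations):
        resp0 = responsibilities()  # E-step
        for k in (0, 1):  # M-step
            r = resp0 if k == 0 else [1 - v for v in resp0]
            nk = sum(r)
            if nk == 0:
                continue
            weights[k] = nk / n
            means[k] = sum(ri * x for ri, x in zip(r, xs)) / nk
            variances[k] = max(
                sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, xs)) / nk,
                var_floor,
            )

    # Assign each point to its more likely component, then cut between groups.
    resp0 = responsibilities()
    low_is_0 = means[0] <= means[1]
    low = [x for x, r in zip(xs, resp0) if (r >= 0.5) == low_is_0]
    high = [x for x, r in zip(xs, resp0) if (r >= 0.5) != low_is_0]
    if not low or not high:
        raise ValueError("distances do not separate into two groups")
    return (max(low) + min(high)) / 2


# Toy pairwise distances (made up): five near-identical pairs, five distant pairs.
toy_distances = [0.010, 0.012, 0.015, 0.018, 0.020, 0.19, 0.20, 0.21, 0.22, 0.23]
cutoff = gmm_threshold(toy_distances)
print(cutoff)
```

Genome pairs with a distance below the returned cutoff would be treated as belonging to the same cluster. A real dataset may need more than two components or a different rule for turning the fitted mixture into a threshold.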

LOCUST

A custom sequence locus typer for classifying microbial genotypic and phenotypic attributes.

LOCUST (LOcus CUstom Sequence Typer) is a custom sequence locus typer for classifying microbial genomes. It offers a fully automated way to customize the classification of the genome-wide nucleotide variant data most relevant to biological research.

OMeta

OMeta is a data-driven application that can be configured to track project, study, sample, experiment, or clinical data.

OMeta (Ontologies based metadata tracking system) is highly flexible and can be configured to track any metadata. Metadata can be configured around events such as subject registration, vaccination, physical exam, treatment, diagnosis, and lab test. OMeta records all events and updates to maintain a complete audit trail (who, what, and when).

PanACEA

PanACEA is a tool for pan-genome visualization that uses locally computed interactive web pages to view ordered pan-genome data.

PanACEA (Pan-genome Atlas with Chromosome Explorer and Analyzer) is an interactive tool for pan-genome visualization. It consists of multi-tiered, hierarchical display pages that let users navigate both detailed and high-level views of the data, from core and variable pan-chromosome regions down to single genes. Regions and genes are functionally annotated to allow rapid searching and visual identification of regions of interest, and user-supplied genomic phylogenies and metadata can optionally be incorporated.

PanOCT

PanOCT is a program written in Perl for pan-genomic analysis of closely related prokaryotic species or strains.

PanOCT (Pan-genome Ortholog Clustering Tool) is a program written in Perl for pan-genomic analysis of closely related prokaryotic species or strains. Unlike traditional graph-based ortholog detection programs, it uses microsynteny, or conserved gene neighborhood (CGN), in addition to homology to accurately place proteins into orthologous clusters.
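To make the conserved gene neighborhood idea concrete, here is a small Python sketch that scores a candidate ortholog pair by counting how many flanking genes also have homologs in the other gene's neighborhood. It is an illustration only, not PanOCT's scoring scheme (PanOCT itself is written in Perl); the function `cgn_score`, the window size, the linear-chromosome assumption, and the gene names are all hypothetical.

```python
def cgn_score(order_a, gene_a, order_b, gene_b, homolog_pairs, window=2):
    """Count neighbors of gene_a that have a homolog among the neighbors of gene_b.

    order_a / order_b: gene IDs in chromosomal order (treated as linear here).
    homolog_pairs: set of (gene_in_a, gene_in_b) tuples judged homologous.
    Hypothetical helper for illustration; not PanOCT's actual scoring.
    """
    ia = order_a.index(gene_a)
    ib = order_b.index(gene_b)
    # Up to `window` genes on each side, excluding the gene itself.
    neigh_a = order_a[max(0, ia - window):ia] + order_a[ia + 1:ia + 1 + window]
    neigh_b = set(order_b[max(0, ib - window):ib] + order_b[ib + 1:ib + 1 + window])
    return sum(
        1 for a in neigh_a if any((a, b) in homolog_pairs for b in neigh_b)
    )


# Made-up example: a3 is homologous to both b3 and a paralog b3p in genome B.
genome_a = ["a1", "a2", "a3", "a4", "a5"]
genome_b = ["b1", "b2", "b3", "b4", "b5", "b8", "b3p", "b9"]
homologs = {
    ("a1", "b1"), ("a2", "b2"), ("a3", "b3"),
    ("a3", "b3p"), ("a4", "b4"), ("a5", "b5"),
}

score_b3 = cgn_score(genome_a, "a3", genome_b, "b3", homologs)
score_b3p = cgn_score(genome_a, "a3", genome_b, "b3p", homologs)
print(score_b3, score_b3p)
```

Homology alone cannot distinguish `b3` from `b3p` as the ortholog of `a3`, but the neighborhood score favors the copy whose flanking genes are also conserved.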

Phage_Finder

Phage_Finder is a heuristic computer program written in Perl to identify prophage regions within bacterial genomes.

The goal of this software is to provide an open-source, standardized, and automated system to identify and classify prophages within prokaryotic genomes. It is hoped that this package will facilitate future studies on the biology and evolution of these prophages by providing a level of microbial genome annotation that was previously missing.

Third-party Software

BAGEP (a.k.a. BAGPIPE)

Workflow for downstream analysis of whole-genome sequencing of bacterial samples.

Bacterial Genome Pipeline (BAGEP) is an automated and scalable workflow for downstream analysis of whole-genome sequencing of bacterial samples. It also generates an interactive graphical heatmap for the visualization of SNPs and their positions across the genome.

CARD/RGI

A bioinformatic database of resistance genes, their products, and associated phenotypes.

The Comprehensive Antibiotic Resistance Database ("CARD") provides data, models, and algorithms relating to the molecular basis of antimicrobial resistance. The CARD provides curated reference sequences and SNPs organized via the Antibiotic Resistance Ontology ("ARO"). These data can be browsed on the website or downloaded in a number of formats. They are also associated with detection models, in the form of curated homology cut-offs and SNP maps, for prediction of the resistome from molecular sequences. These models can be downloaded or used to analyze genome sequences with the Resistance Gene Identifier ("RGI"), either online or as a stand-alone tool.
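The role of a curated homology cut-off can be sketched in a few lines of Python. This is not RGI's code or data format: it assumes the cut-off is expressed as a minimum alignment bit score per detection model, and the class names, model names, and score values below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionModel:
    # Hypothetical stand-in for a curated detection model.
    name: str
    bitscore_cutoff: float


@dataclass(frozen=True)
class Hit:
    # Hypothetical alignment hit of a query sequence against a model's reference.
    query: str
    model: str
    bitscore: float


def passing_hits(hits, models):
    """Keep hits whose bit score meets the curated cut-off of their model."""
    by_name = {m.name: m for m in models}
    return [
        h for h in hits
        if h.model in by_name and h.bitscore >= by_name[h.model].bitscore_cutoff
    ]


# Placeholder models and hits (values are made up, not taken from CARD).
models = [DetectionModel("model_A", 500.0), DetectionModel("model_B", 250.0)]
hits = [
    Hit("gene_1", "model_A", 612.0),  # above the model_A cut-off
    Hit("gene_2", "model_A", 310.0),  # below the model_A cut-off
    Hit("gene_3", "model_B", 250.0),  # exactly at the model_B cut-off
]

predicted = passing_hits(hits, models)
print([h.query for h in predicted])
```

Because each model carries its own cut-off, the same bit score can pass for one resistance determinant and fail for another, which is the point of curating thresholds per model.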