Data Format Overview

ResistoXplorer accepts two main types of input: a list of antimicrobial resistance genes (ARGs), or data tables from metagenomic-based AMR studies. The list is a set of ARGs with optional abundance or fold-change values. A data table is a table or matrix in tab-separated text (.txt) or comma-separated values (.csv) file format containing information on features (ARGs or taxa) and samples. Three types of data tables (files) are required: an abundance profile (resistome or microbiome), an annotation (functional or taxonomic) file and a metadata file.

Resistome profiles derived from whole-genome shotgun metagenomic data can be uploaded. A resistome profile is a data table or matrix of abundance values (raw read counts from metagenomic data), with rows for features (ARGs) and columns for samples, saved as a tab-delimited text (.txt) or comma-separated values (.csv) file. Such a file can be generated with any spreadsheet or text editor software and must follow the format described below:

  • The first row should contain the sample names or IDs and begin with "#NAME" in the first column;
  • Both sample and feature names must be unique and consist of a combination of common English letters, underscores and numbers. Other special characters (e.g. single (') or double (") quotes) can also be used in feature (ARG) names. Non-English characters (e.g. accented Latin or Greek letters) are not supported;
  • Data values (read counts) must be numeric and non-negative. Blank cells and NA values are not allowed and should be replaced by zero;
  • Non-specific feature names (e.g. ARG_0001) can also be used in the first column. In that case, a tab-delimited (.txt) or comma-separated (.csv) annotation mapping file containing functional annotation information at multiple levels for each feature (ARG) must also be uploaded;
  • Lastly, when selecting an already compiled database for functional annotation, the user should make sure that the feature (ARG) names in the abundance table are in the format required by the selected database. For more details on the format for each database, refer to the "Annotation" tab above.
  • Resistome abundance profile with features (ARGs) annotated through the ResFinder database (Download here)
    #NAME                  Sample1  Sample2  Sample3  Sample4  Sample5
    dfrA1_2_AJ419168       21       4        4        0        0
    tet(O)_1_M18896        424      232      191      786      189
    tet(T)_1_L42544        0        45       0        0        1
    aph(3')-III_1_M26832   47       48       50       51       46
  • Resistome abundance profile with non-specific feature (ARG) names (Download here) along with the functional annotation mapping file (Download here)
    #NAME                                                                                Sample1  Sample2  Sample3  Sample4  Sample5
    222|JQ394987.1|JQ394987|Multi-drug_resistance|Multi-drug_efflux_pumps|MDFA           21       4        4        0        0
    424|D85892.1|D85892|MLS|Macrolide_phosphotransferases|MPHB                           424      232      191      786      189
    518|AJ007350.1|AJ007350|betalactams|Class_A_betalactamases|ACI                       0        45       0        0        1
    AGly|AY712687.1|gene1|Aminoglycosides|Aminoglycoside_O-nucleotidyltransferases|ANT6  47       48       50       51       46
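
The formatting rules above can be checked locally before upload. The following Python sketch is not part of ResistoXplorer: the function name validate_abundance_table and the wording of its messages are assumptions made for illustration. It covers only the "#NAME" header, name uniqueness, sample name characters and the numeric, non-negative value rule; it does not check database-specific feature name formats.

```python
import csv
import io
import math
import re

# Sample names: common English letters, digits and underscores only.
SAMPLE_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def validate_abundance_table(text, delimiter="\t"):
    """Return a list of problems found in an abundance table (empty if none)."""
    problems = []
    # QUOTE_NONE keeps quotes in feature names (e.g. aph(3')-III) as plain text.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]
    if not rows:
        return ["file is empty"]

    header, body = rows[0], rows[1:]
    if header[0].strip() != "#NAME":
        problems.append('first cell of the header row must be "#NAME"')

    samples = [name.strip() for name in header[1:]]
    if len(set(samples)) != len(samples):
        problems.append("sample names are not unique")
    for name in samples:
        if not SAMPLE_NAME.match(name):
            problems.append(f"sample name {name!r} has unsupported characters")

    seen = set()
    for row in body:
        feature, values = row[0].strip(), row[1:]
        if feature in seen:
            problems.append(f"duplicate feature name {feature!r}")
        seen.add(feature)
        if len(values) != len(samples):
            problems.append(f"{feature!r}: expected {len(samples)} values, got {len(values)}")
        for value in values:
            value = value.strip()
            try:
                number = float(value)
                ok = math.isfinite(number) and number >= 0
            except ValueError:
                ok = False  # blank cells and NA end up here
            if not ok:
                problems.append(f"{feature!r}: invalid value {value!r}")
    return problems


example = "#NAME\tSample1\tSample2\ntet(O)_1_M18896\t424\t232\n"
print(validate_abundance_table(example))  # prints []
```

For a comma-separated file, pass delimiter="," instead of the default tab.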

For the Integration module, the user is also required to upload a taxonomic abundance profile along with the resistome profile.

Taxonomic profiles derived from either 16S rRNA marker gene survey data or whole-genome shotgun metagenomic data can be uploaded. In a taxonomic abundance profile, the data values are the read counts (abundance) of taxa in each sample. The required file formats and data formatting are exactly the same as stated above for the resistome profile. Additionally, the user can provide a separate taxonomic annotation mapping file to perform analysis at multiple taxonomic levels (e.g. species, genus, phylum). Please note that feature (taxa) names containing multiple taxonomic levels cannot be parsed from the abundance profile, so an additional annotation file must always be provided in such cases.

  • Taxonomic abundance profile (Download here) along with the taxonomic annotation mapping file (Download here)
    #NAME                        Sample1  Sample2  Sample3  Sample4  Sample5  Sample6  Sample7  Sample8
    Acidobacterium capsulatum    219      49       42       50       6        17       22       21
    Acidimicrobium ferrooxidans  424      0        191      0        0        0        0        0
    Actinomyces oris             32       4        4        22       76       16       1        0
    Bifidobacterium animalis     47       0        0        4        0        0        0        0
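
Because the taxonomic profile follows the same rules as the resistome profile, blank or NA cells have to be replaced by zero before upload. Below is a minimal Python sketch of that clean-up step, assuming a tab-delimited table. The helper name fill_missing_with_zero and the set of missing-value markers are illustrative choices, not something ResistoXplorer defines.

```python
# Markers treated as missing; extend this set if your data uses others.
MISSING = {"", "NA"}


def fill_missing_with_zero(text, delimiter="\t"):
    """Return the table text with blank or NA abundance cells replaced by 0."""
    cleaned = []
    for index, line in enumerate(text.splitlines()):
        cells = line.split(delimiter)
        # Leave the header row (index 0) and the feature name column untouched.
        if index > 0 and line.strip():
            cells = [cells[0]] + [
                "0" if cell.strip() in MISSING else cell for cell in cells[1:]
            ]
        cleaned.append(delimiter.join(cells))
    return "\n".join(cleaned) + "\n"


raw = "#NAME\tSample1\tSample2\tSample3\nActinomyces oris\t32\t\tNA\n"
print(fill_missing_with_zero(raw), end="")
```

Whether a missing value really means "zero reads" depends on how the table was produced, so this replacement is worth reviewing rather than applying blindly.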
