ariba-prepareref(1) Prepare reference data for running the pipeline


ariba prepareref [options] <outdir>


ARIBA is a tool that identifies antibiotic resistance genes by running local assemblies.

More information on the individual subtools can be found on the ARIBA wiki:

positional arguments:

outdir
Output directory (must not already exist)

optional arguments:

-h, --help
show this help message and exit

input files options:

--ref_prefix FILENAME_PREFIX
Prefix of input files (same as was used with getref), to save listing --presabs, --varonly, etc. Will look for files called "ref_prefix." followed by: metadata.tsv, presence_absence.fa, noncoding.fa, variants_only.fa. Using this will cause these to be ignored if used: --presabs, --varonly, --noncoding, --metadata
--presabs FILENAME
FASTA file of presence absence genes
--varonly FILENAME
FASTA file of variants only genes
--noncoding FILENAME
FASTA file of noncoding sequences
--metadata FILENAME
tsv file of metadata about the reference sequences
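
As a hedged illustration of the two input styles, the sketch below lists the file names that --ref_prefix would look for, then prints (without executing) one invocation of each style. The prefix "myref", the output directory names, and the helper function expected_ref_files are hypothetical names chosen only for this example.

```shell
#!/bin/sh
# Hypothetical helper: list the files that --ref_prefix PREFIX looks for,
# using the four suffixes given in the --ref_prefix description.
expected_ref_files() {
    for suffix in metadata.tsv presence_absence.fa noncoding.fa variants_only.fa; do
        printf '%s.%s\n' "$1" "$suffix"
    done
}

expected_ref_files myref

# Dry run: the commands are printed, not executed. Remove "echo" to run them.
# "myref", "out_prefix" and "out_explicit" are placeholder names.
echo ariba prepareref --ref_prefix myref out_prefix
echo ariba prepareref --presabs myref.presence_absence.fa --metadata myref.metadata.tsv out_explicit
```

Note that the output directory must not already exist, so each real run needs a fresh directory name.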

cd-hit options:

--no_cdhit
Do not run cd-hit. Each input sequence is put into its own "cluster". Incompatible with --cdhit_clusters.
--cdhit_clusters FILENAME
File specifying how the sequences should be clustered. Will be used instead of running cd-hit. Format is one cluster per line, with sequence names separated by whitespace. The first name on each line is the cluster representative. Incompatible with --no_cdhit.
--cdhit_min_id FLOAT
Sequence identity threshold (cd-hit option -c) [0.9]
--cdhit_min_length FLOAT
Length difference cutoff (cd-hit option -s) [0.9]
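
The file format expected by --cdhit_clusters can be sketched as follows. This example writes a small clusters file and summarises it with awk; all sequence names and the helper functions summarise_clusters and duplicated_names are made up for illustration. The duplicate-name check is an assumed sanity check, not a requirement stated in this page.

```shell
#!/bin/sh
# Write a small clusters file: one cluster per line, names separated by
# whitespace, first name on each line is the cluster representative.
# All sequence names here are invented for the example.
clusters_file="$(mktemp)"
printf '%s\n' \
    'geneA.1 geneA.2 geneA.3' \
    'geneB.1' \
    'geneC.1 geneC.2' > "$clusters_file"

# Hypothetical helper: print each representative and its cluster size.
summarise_clusters() {
    awk 'NF > 0 { print $1 "\t" NF }' "$1"
}

# Hypothetical helper: print any name that occurs more than once in the
# file (prints nothing when every name is unique). Assumed sanity check.
duplicated_names() {
    awk '{ for (i = 1; i <= NF; i++) if (seen[$i]++ == 1) print $i }' "$1"
}

summarise_clusters "$clusters_file"
duplicated_names "$clusters_file"
```

A file like this would then be passed as --cdhit_clusters FILENAME in place of running cd-hit.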

other options:

--min_gene_length INT
Minimum allowed length in nucleotides of reference genes [6]
--max_gene_length INT
Maximum allowed length in nucleotides of reference genes [10000]
--genetic_code INT
Number of genetic code to use. Currently supported 1,4,11 [11]
--threads INT
Number of threads (currently only applies to cdhit) [1]
--verbose
Be verbose
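
To see which reference sequences would fall outside the --min_gene_length and --max_gene_length bounds before running prepareref, a stand-alone pre-check along these lines may help. This is a minimal sketch and not ARIBA's own filtering code; the function name flag_out_of_range and the example sequences are hypothetical, and the bounds are assumed to be inclusive.

```shell
#!/bin/sh
# Print the name and length of every FASTA record whose length falls
# outside [MIN, MAX]. Stand-alone pre-check, not part of ARIBA.
# Usage: flag_out_of_range FASTA MIN MAX
flag_out_of_range() {
    awk -v min="$2" -v max="$3" '
        # Report the previous record if its length is out of range.
        function report() {
            if (name != "" && (len < min || len > max)) print name "\t" len
        }
        /^>/ { report(); name = substr($1, 2); len = 0; next }
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { report() }
    ' "$1"
}

# Tiny invented example: one 3 nt record and one 15 nt record.
example_fa="$(mktemp)"
printf '%s\n' '>short_gene' 'ATG' '>ok_gene' 'ATGAAATTT' 'GGGTAA' > "$example_fa"
flag_out_of_range "$example_fa" 6 10000
rm -f "$example_fa"
```

With the default bounds of 6 and 10000, only the 3-nucleotide record is reported here.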

REQUIRED: either --ref_prefix, or at least one of --presabs, --varonly, --noncoding


This tool is developed and maintained by the Pathogen Informatics group at the Wellcome Trust Sanger Institute under the GNU General Public License, version 3.