vcf-annotate(1)
annotate VCF file, add filters or custom annotations
SYNOPSIS
cat in.vcf | vcf-annotate [OPTIONS] > out.vcf
DESCRIPTION
About: Annotates a VCF file, adding filters or custom annotations. Requires a tabix-indexed file with annotations.
-
Currently annotates only the INFO column, but it will be extended on demand.
OPTIONS
- -a, --annotations <file.gz>
-
The tabix indexed file with the annotations: CHR\tFROM[\tTO][\tVALUE]+.
- -c, --columns <list>
-
The list of columns in the annotation file, e.g. CHROM,FROM,TO,-,INFO/STR,INFO/GN. The dash
in this example indicates that the fourth column should be ignored. If TO is not
present, TO is assumed to equal FROM.
- -d, --description <file|string>
-
Header annotation, e.g. key=INFO,ID=HM2,Number=0,Type=Flag,Description='HapMap2 membership'.
The descriptions can be read from a file, one annotation per line.
- -f, --filter <list>
-
Apply filters. The list has the format flt1=value/flt2/flt3=value/etc.
- -h, -?, --help
-
This help message.
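The annotation file described under -a and -c could be prepared along the following lines. This is a minimal sketch, not part of the original manual: the file name annotations.tab and the example rows are hypothetical, and the bgzip and tabix steps (with -s, -b and -e naming the sequence, start and end columns) run only if both tools are installed.

```shell
# Hypothetical annotation rows: CHR, FROM, TO, then two VALUE columns
# (strand and gene name). Tabix expects the rows sorted by chromosome
# and start position.
TAB="$(printf '\t')"
printf '2\t300\t900\t1\tGENE_C\n1\t5000\t5000\t-1\tGENE_B\n1\t1000\t2000\t1\tGENE_A\n' \
  | sort -t "$TAB" -k1,1 -k2,2n > annotations.tab

# Compress and index only when bgzip and tabix are available.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
  bgzip -c annotations.tab > annotations.tab.gz
  tabix -f -s 1 -b 2 -e 3 annotations.tab.gz
fi
```

With a file laid out like this, the matching column list would presumably be -c CHROM,FROM,TO,INFO/STR,INFO/GN.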
Filters:
- +
-
Apply all filters with default values (can be overridden, see the example below).
- -X
-
Exclude the filter X
- 1, StrandBias
-
FLOAT Min P-value for strand bias (given PV4) [0.0001]
- 2, BaseQualBias
-
FLOAT Min P-value for baseQ bias [1e-100]
- 3, MapQualBias
-
FLOAT Min P-value for mapQ bias [0]
- 4, EndDistBias
-
FLOAT Min P-value for end distance bias [0.0001]
- a, MinAB
-
INT Minimum number of alternate bases [2]
- c, SnpCluster
-
INT1,INT2 Filters clusters of 'INT1' or more SNPs within a run of 'INT2' bases []
- D, MaxDP
-
INT Maximum read depth [10000000]
- d, MinDP
-
INT Minimum read depth [2]
- q, MinMQ
-
INT Minimum RMS mapping quality for SNPs [10]
- Q, Qual
-
INT Minimum value of the QUAL field [10]
- r, RefN
-
Reference base is N []
- W, GapWin
-
INT Window size for filtering adjacent gaps [10]
- w, SnpGap
-
INT SNP within INT bp around a gap to be filtered [10]
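To make the -f list format concrete, the sketch below splits a filter list on "/" and prints how each token reads according to the descriptions above. The helper describe_filters is hypothetical and written only for illustration; it is not part of vcf-annotate and does not reproduce the tool's own parsing.

```shell
# describe_filters: hypothetical helper that splits a -f filter list on "/"
# and reports the meaning of each token as documented in this manual.
describe_filters() {
  printf '%s\n' "$1" | tr '/' '\n' | while IFS= read -r tok; do
    case "$tok" in
      +)   printf '%s\n' "+: apply all filters with default values" ;;
      -?*) printf '%s\n' "${tok#-}: exclude this filter" ;;
      *=*) printf '%s\n' "${tok%%=*}: apply with value ${tok#*=}" ;;
      *)   printf '%s\n' "$tok: apply without an explicit value" ;;
    esac
  done
}

describe_filters '+/-a/c=3,10/q=3/d=5/-D'
```

Read against the table above, this list turns on all defaults, drops MinAB and MaxDP, and overrides SnpCluster (3,10), MinMQ (3) and MinDP (5).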
Example:
-
zcat in.vcf.gz | vcf-annotate -a annotations.gz -d descriptions.txt | bgzip -c >out.vcf.gz
zcat in.vcf.gz | vcf-annotate -f +/-a/c=3,10/q=3/d=5/-D -a annotations.gz -d descriptions.txt | bgzip -c >out.vcf.gz
Where descriptions.txt contains:
-
key=INFO,ID=GN,Number=1,Type=String,Description='Gene Name'
key=INFO,ID=STR,Number=1,Type=Integer,Description='Strand'
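The descriptions file can be written by hand or generated. The sketch below creates descriptions.txt with the two lines shown above and runs a simple check on the field layout. The check is a hypothetical convenience, not a vcf-annotate feature, and it assumes every line carries the key, ID, Number, Type and Description fields in the order used in this manual's examples.

```shell
# Write the two header descriptions from the example to descriptions.txt.
cat > descriptions.txt <<'EOF'
key=INFO,ID=GN,Number=1,Type=String,Description='Gene Name'
key=INFO,ID=STR,Number=1,Type=Integer,Description='Strand'
EOF

# Hypothetical sanity check: count lines that do not follow the
# key=...,ID=...,Number=...,Type=...,Description='...' layout.
# "|| true" keeps the script going when grep finds no such lines.
bad=$(grep -c -v -E "^key=[^,]+,ID=[^,]+,Number=[^,]+,Type=[^,]+,Description='.*'\$" descriptions.txt || true)
echo "malformed lines: $bad"
```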