SYNOPSIS
gt shredder [option ...] [sequence_file ...]
DESCRIPTION
-coverage [value]
- set the number of times the sequence_file is shredded (default: 1)
-minlength [value]
- set the minimum length of the shredded fragments (default: 300)
-maxlength [value]
- set the maximum length of the shredded fragments (default: 700)
-overlap [value]
- set the overlap between consecutive pieces (default: 0)
-sample [value]
- take samples of the generated sequence pieces with the given probability (default: 1.000000)
-clipdesc [yes|no]
- clip descriptions after the first space (fooled by \t, \n, etc.); offset and length are added to ensure a unique identifier (default: no)
-width [value]
- set output width for FASTA sequence printing (0 disables formatting) (default: 0)
-o [filename]
- redirect output to specified file (default: undefined)
-gzip [yes|no]
- write gzip compressed output file (default: no)
-bzip2 [yes|no]
- write bzip2 compressed output file (default: no)
-force [yes|no]
- force writing to output file (default: no)
-help
- display help and exit
-version
- display version information and exit
Each sequence given in sequence_file is shredded into consecutive pieces of random length (between -minlength and -maxlength) until it is consumed. As a result, the last fragment of a given sequence can be shorter than the argument to option -minlength. To get rid of such fragments, use gt seqfilter (see the example below).
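The fragmentation described above can be sketched in Python. This is an illustration only, not the GenomeTools implementation: the function name shred is hypothetical, piece lengths are assumed to be drawn uniformly from the inclusive range, and the options -overlap, -coverage and -sample are ignored.

```python
import random


def shred(sequence, minlength=300, maxlength=700, rng=None):
    """Cut `sequence` into consecutive pieces of random length.

    Returns a list of (offset, fragment) tuples. Hypothetical helper that
    only illustrates the behaviour described in the text.
    """
    if rng is None:
        rng = random.Random()
    fragments = []
    offset = 0
    while offset < len(sequence):
        # Assumption: the piece length is uniform in [minlength, maxlength].
        length = rng.randint(minlength, maxlength)
        # Slicing stops at the end of the sequence, so the last piece
        # may be shorter than minlength.
        fragments.append((offset, sequence[offset:offset + length]))
        offset += length
    return fragments
```

Concatenating the fragments in order gives back the input sequence, and only the final fragment can fall below minlength.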
EXAMPLES
Shred a given BAC:
$ gt shredder U89959_genomic.fas > fragments.fas
Shred an EST collection into pieces between 50 and 100 bp and get rid of all (terminal) fragments shorter than 50 bp:
$ gt shredder -minlength 50 -maxlength 100 U89959_ests.fas \
  | gt seqfilter -minlength 50 - > fragments.fas
# 130 out of 1260 sequences have been removed (10.317%)
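The filtering step of the pipeline above can be pictured as follows. This is a minimal Python sketch that assumes fragments are held as plain strings. The helper drop_short is hypothetical and not part of GenomeTools, and its summary line only imitates the format of the message shown above.

```python
def drop_short(fragments, minlength):
    """Keep fragments of at least `minlength` characters.

    Returns (kept, summary), where summary is a report line modelled on
    the message in the example. Hypothetical helper for illustration.
    """
    kept = [f for f in fragments if len(f) >= minlength]
    removed = len(fragments) - len(kept)
    # Avoid a division by zero when no fragments were given.
    percent = 100.0 * removed / len(fragments) if fragments else 0.0
    summary = (f"# {removed} out of {len(fragments)} sequences "
               f"have been removed ({percent:.3f}%)")
    return kept, summary
```

A fragment whose length equals minlength is kept. Only strictly shorter fragments are dropped, matching "shorter than 50 bp" in the example.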
Shred an EST collection and show only a random sample of about 10% of the resulting fragments:
$ gt shredder -sample 0.1 U89959_ests.fas
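Because -sample takes a probability, the number of fragments shown presumably varies from run to run instead of being exactly 10%. The Python sketch below illustrates that reading. It assumes each fragment is kept independently with the given probability, which the text does not confirm as the actual GenomeTools sampling procedure. The name sample_fragments is hypothetical.

```python
import random


def sample_fragments(fragments, probability, rng=None):
    """Keep each fragment independently with the given probability.

    Hypothetical helper; independent per-fragment sampling is an assumption.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    # rng.random() lies in [0.0, 1.0), so a probability of 1.0 keeps
    # every fragment and a probability of 0.0 keeps none.
    return [f for f in fragments if rng.random() < probability]
```

With the default probability of 1.0 the output is unchanged, which is consistent with the documented default for -sample.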