Transdecoder(1) Transcriptome Protein Prediction

USAGE

Required:

 -t <string>                            transcripts.fasta

Common options:

 --retain_long_orfs <int>               retain all ORFs that are equal to or longer than this many nucleotides, even if no other evidence
                                        marks them as coding (default: 900 bp => 300aa)
 --retain_pfam_hits <string>            domain table output file from running hmmscan to search Pfam (see transdecoder.github.io for info)
                                        Any ORF with a Pfam domain hit will be retained in the final output.
 
 --retain_blastp_hits <string>          blastp output in '-outfmt 6' format.
                                        Any ORF with a blast match will be retained in the final output.
 --single_best_orf                      Retain only the single best ORF per transcript.
                                        (Best is defined as the longest ORF, preferring ORFs with Pfam and/or blast support where that evidence is supplied.)
 --cpu <int>                            Use multiple cores for cd-hit-est. (default: 1)
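
Example (illustrative only): the retention rules described above can be sketched in Python as follows. This is a
reading of the option text, not TransDecoder's own code. The Orf class, its field names, and the scored_coding flag
are hypothetical, and ranking supported ORFs ahead of longer unsupported ones is one interpretation of "best".

```python
from dataclasses import dataclass


@dataclass
class Orf:
    """Hypothetical record for one candidate ORF."""
    transcript: str
    length_nt: int
    scored_coding: bool = False  # assumed: other evidence marks the ORF as coding
    pfam_hit: bool = False       # ORF appears in the --retain_pfam_hits table
    blastp_hit: bool = False     # ORF appears in the --retain_blastp_hits table


def is_retained(orf, retain_long_orfs=900):
    """Keep an ORF if any one of the retention criteria applies."""
    return (
        orf.scored_coding
        or orf.length_nt >= retain_long_orfs  # 900 nt / 3 = 300 codons
        or orf.pfam_hit
        or orf.blastp_hit
    )


def single_best_orf(orfs):
    """Pick one ORF per transcript: homology support first, then length."""
    def rank(o):
        return (o.pfam_hit or o.blastp_hit, o.length_nt)

    best = {}
    for orf in orfs:
        current = best.get(orf.transcript)
        if current is None or rank(orf) > rank(current):
            best[orf.transcript] = orf
    return best
```

Under these assumptions, a 600 nt ORF with a Pfam hit is retained and is preferred over an unsupported 1200 nt ORF
on the same transcript, while an unsupported 300 nt ORF is dropped.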

Advanced options:

 --train <string>                       FASTA file with ORFs to train the Markov Model for protein identification; otherwise,
                                        the longest non-redundant ORFs are used
 -T <int>                               If no --train, the number of longest ORFs used to train the Markov Model (hexamer stats) (default: 500)
                                        Note: 10x this many of the longest ORFs are first selected and passed to cd-hit to remove redundancies,
                                        and then the -T longest ORFs are selected from the non-redundant set.
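
Example (illustrative only): the -T selection described above can be sketched in Python as follows. The function
names are hypothetical, and exact-duplicate removal stands in for cd-hit, which clusters by sequence similarity
rather than identity.

```python
def drop_exact_duplicates(orfs):
    """Stand-in for cd-hit: keep only the first ORF for each distinct sequence."""
    seen = set()
    kept = []
    for orf_id, seq in orfs:
        if seq not in seen:
            seen.add(seq)
            kept.append((orf_id, seq))
    return kept


def select_training_orfs(orfs, top_n=500, remove_redundant=drop_exact_duplicates):
    """orfs is a list of (id, sequence) pairs; returns the training subset."""
    def by_length_desc(items):
        return sorted(items, key=lambda o: len(o[1]), reverse=True)

    # Step 1: take the 10 * top_n longest ORFs as candidates.
    candidates = by_length_desc(orfs)[:10 * top_n]
    # Step 2: remove redundant candidates.
    nonredundant = remove_redundant(candidates)
    # Step 3: keep the top_n longest ORFs of the non-redundant set.
    return by_length_desc(nonredundant)[:top_n]
```

A different redundancy filter can be passed through the remove_redundant argument without changing the three steps.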