$ sga preprocess --help
Usage: sga preprocess [OPTION] READS1 READS2 ...
Prepare READS1, READS2, ... data files for assembly
If pe-mode is turned on (pe-mode=1) then if a read is discarded its pair will be discarded as well.

          --help                       display this help and exit
      -v, --verbose                    display verbose output
          --seed                       set random seed

Input/Output options:
      -o, --out=FILE                   write the reads to FILE (default: stdout)
      -p, --pe-mode=INT                0 - do not treat reads as paired (default)
                                       1 - reads are paired with the first read in READS1 and the second
                                       read in READS2. The paired reads will be interleaved in the output file
                                       2 - reads are paired and the records are interleaved within a single file.
          --pe-orphans=FILE            if one half of a read pair fails filtering, write the passed half to FILE

Conversions/Filtering:
          --phred64                    convert quality values from phred-64 to phred-33.
          --discard-quality            do not output quality scores
      -q, --quality-trim=INT           perform Heng Li's BWA quality trim algorithm.
                                       Reads are trimmed according to the formula:
                                       argmax_x{\sum_{i=x+1}^l(INT-q_i)} if q_l<INT
                                       where l is the original read length.
      -f, --quality-filter=INT         discard the read if it contains more than INT low-quality bases.
                                       Bases with phred score <= 3 are considered low quality. Default: no filtering.
                                       The filtering is applied after trimming so bases removed are not counted.
                                       Do not use this option if you are planning to use the BCR algorithm for indexing.
      -m, --min-length=INT             discard sequences that are shorter than INT
                                       this is most useful when used in conjunction with --quality-trim. Default: 40
      -h, --hard-clip=INT              clip all reads to be length INT. In most cases it is better to use
                                       the soft clip (quality-trim) option.
          --permute-ambiguous          Randomly change ambiguous base calls to one of possible bases.
                                       If this option is not specified, the entire read will be discarded.
      -s, --sample=FLOAT               Randomly sample reads or pairs with acceptance probability FLOAT.
          --dust                       Perform dust-style filtering of low complexity reads.
          --dust-threshold=FLOAT       filter out reads that have a dust score higher than FLOAT (default: 4.0).
          --suffix=SUFFIX              append SUFFIX to each read ID

Adapter/Primer checks:
          --no-primer-check            disable the default check for primer sequences
      -r, --remove-adapter-fwd=STRING
      -c, --remove-adapter-rev=STRING  Remove the adapter STRING from input reads.

Report bugs to js18@sanger.ac.uk
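
The --quality-trim formula in the help text is terse, so here is a minimal Python sketch of one literal reading of it. It is not taken from the SGA or BWA source. The function name quality_trim, the use of a list of integer phred scores, and the tie-breaking rule (keep the longer read when two cut points score equally) are assumptions made for illustration. The read is cut at the position x that maximizes the sum of (INT - q_i) over the bases after x, and only when the last base has quality below INT.

```python
def quality_trim(seq, quals, threshold):
    """Trim the 3' end of a read using a literal reading of the help-text formula.

    seq       -- read sequence as a string
    quals     -- phred scores as a list of ints, one per base (assumed already decoded)
    threshold -- the INT passed to --quality-trim
    Returns the trimmed (seq, quals) pair.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")

    # The formula only applies if the last base is below the threshold (q_l < INT).
    if not quals or quals[-1] >= threshold:
        return seq, quals

    best_sum = 0
    best_x = len(quals)  # x = l means nothing is trimmed
    running = 0

    # Walk from the 3' end. After adding the base at 0-based index i,
    # `running` equals the sum over 1-based positions i+1..l, i.e. the score for x = i.
    for i in range(len(quals) - 1, -1, -1):
        running += threshold - quals[i]
        # Strict '>' keeps the longer read on ties (an assumption of this sketch).
        if running > best_sum:
            best_sum = running
            best_x = i

    return seq[:best_x], quals[:best_x]


if __name__ == "__main__":
    trimmed = quality_trim("ACGTACGT", [30, 30, 30, 30, 30, 5, 3, 2], 20)
    print(trimmed)  # -> ('ACGTA', [30, 30, 30, 30, 30])
```

In this example the three low-quality bases at the 3' end contribute 18, 17 and 15 to the sum, and including any of the quality-30 bases would lower it, so the read is cut after the fifth base. A read trimmed this way may then fall below --min-length and be discarded, which is why the help text suggests using the two options together.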