Usage: sga correct [OPTION] ... READSFILE
Correct sequencing errors in all the reads in READSFILE

      --help                           display this help and exit
  -v, --verbose                        display verbose output
  -p, --prefix=PREFIX                  use PREFIX for the names of the index files
                                       (default: prefix of the input file)
  -o, --outfile=FILE                   write the corrected reads to FILE
                                       (default: READSFILE.ec.fa)
  -t, --threads=NUM                    use NUM threads for the computation (default: 1)
      --discard                        detect and discard low-quality reads
  -d, --sample-rate=N                  use occurrence array sample rate of N in the FM-index.
                                       Higher values use significantly less memory at the
                                       cost of higher runtime. This value must be a power
                                       of 2 (default: 128)
  -a, --algorithm=STR                  specify the correction algorithm to use. STR must be
                                       one of kmer, hybrid, overlap. (default: kmer)
      --metrics=FILE                   collect error correction metrics (error rate by
                                       position in read, etc) and write them to FILE

Kmer correction parameters:
  -k, --kmer-size=N                    The length of the kmer to use. (default: 31)
  -x, --kmer-threshold=N               Attempt to correct kmers that are seen less than
                                       N times. (default: 3)
  -i, --kmer-rounds=N                  Perform N rounds of k-mer correction, correcting up
                                       to N bases (default: 10)
      --learn                          Attempt to learn the k-mer correction threshold
                                       (experimental). Overrides -x parameter.

Overlap correction parameters:
  -e, --error-rate                     the maximum error rate allowed between two sequences
                                       to consider them overlapped (default: 0.04)
  -m, --min-overlap=LEN                minimum overlap required between two reads
                                       (default: 45)
  -c, --conflict=INT                   use INT as the threshold to detect a conflicted base
                                       in the multi-overlap (default: 5)
  -l, --seed-length=LEN                force the seed length to be LEN. By default, the seed
                                       length in the overlap step is calculated to guarantee
                                       all overlaps with --error-rate differences are found.
                                       This option removes the guarantee but will be (much)
                                       faster. As SGA can tolerate some missing edges, this
                                       option may be preferable for some data sets.
  -s, --seed-stride=LEN                force the seed stride to be LEN. This parameter will
                                       be ignored unless --seed-length is specified (see
                                       above). This parameter defaults to the same value as
                                       --seed-length
  -b, --branch-cutoff=N                stop the overlap search at N branches. This parameter
                                       is used to control the search time for
                                       highly-repetitive reads. If the number of branches
                                       exceeds N, the search stops and the read will not be
                                       corrected. This is not enabled by default.
  -r, --rounds=NUM                     iteratively correct reads up to a maximum of NUM
                                       rounds (default: 1)

Report bugs to js18@sanger.ac.uk
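
The options above can be assembled into a command line programmatically. The following is a minimal sketch, not part of SGA itself: the helper name `build_correct_command` and the file name `reads.fastq` are hypothetical, and only flags listed in the help text are used. It checks the two constraints the help text states (the sample rate must be a power of 2, and the algorithm must be one of kmer, hybrid, overlap) and returns an argument list without running anything. Omitting the k-mer options for the other algorithms is a choice made for this sketch, not documented behavior.

```python
# Hypothetical helper: builds an argument list for `sga correct`
# using only options that appear in the help text. It does not run SGA.

ALGORITHMS = ("kmer", "hybrid", "overlap")


def build_correct_command(reads_file, algorithm="kmer", kmer_size=31,
                          kmer_threshold=3, threads=1, sample_rate=128,
                          outfile=None):
    """Return the argv list for one `sga correct` invocation."""
    if algorithm not in ALGORITHMS:
        raise ValueError("algorithm must be one of: " + ", ".join(ALGORITHMS))
    # The help text requires the sample rate to be a power of 2.
    if sample_rate < 1 or (sample_rate & (sample_rate - 1)) != 0:
        raise ValueError("sample rate must be a power of 2")

    cmd = [
        "sga", "correct",
        "--algorithm=" + algorithm,
        "--threads=%d" % threads,
        "--sample-rate=%d" % sample_rate,
    ]
    # Sketch choice (an assumption): pass the k-mer options only
    # when the kmer algorithm is selected.
    if algorithm == "kmer":
        cmd.append("--kmer-size=%d" % kmer_size)
        cmd.append("--kmer-threshold=%d" % kmer_threshold)
    if outfile is not None:
        cmd.append("--outfile=" + outfile)
    cmd.append(reads_file)  # READSFILE is the final positional argument
    return cmd


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(build_correct_command("reads.fastq", threads=4)))
```

The resulting list could be handed to `subprocess.run` once SGA is installed and the index files for the reads exist; that step is left out here because it depends on the local setup.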