archive:bioinformatic_tools:pluck-scripts

Differences

This shows you the differences between two versions of the page.

archive:bioinformatic_tools:pluck-scripts [2010/04/16 05:40]
karplus created
archive:bioinformatic_tools:pluck-scripts [2015/07/28 06:25] (current)
ceisenhart ↷ Page moved from bioinformatic_tools:pluck-scripts to archive:bioinformatic_tools:pluck-scripts
Line 6: Line 6:
   * check-inversions FIXME
   * classify-blast-reads FIXME
 +  * differences2stitcher Given a reference and difference format, output changes in stitcher format. Can have problems with many SNPs close together. Will try to change multiple SNPs in a single expression if possible.
   * extract-fragments FIXME
   * filter-blat FIXME
-  * find-dna-differences FIXME
+  * find-dna-differences compares a genome (or set of contigs) to a reference genome and reports differences in three formats: alignments of matching regions in a human-readable format; BED format for the location in the reference genome (which loses some information about long insertions or replacements); and a short form that gives old_seq:reference location:new_seq for each change. This program is intended only for small sets of differences, not for large rearrangements or distant relationships. It may be buggy at the moment, as some of the Pog contigs that mapped completely to the genome were reported as not mapping (perhaps they were exact repeats?).
   * find-frequent-color-kmers FIXME
-  * kstitcher FIXME
+  * kstitcher David Bernick originally created a program called "stitcher", which stitches together newbler contigs. Stitcher format: "@name {+contigname|-contigname}+ 15*N" starts a new contig called "name comment". The format allows nested parentheses. There is no operator precedence; i.e., 15*(-contig1) and 15*-contig1 have very different results. The numeric value does not need to be an integer: 0.5*contig will report the first half of the contig. Use parentheses whenever possible. Contig names should not be made solely of bases (e.g. GATACA). The strange operator "expr1 < expr2 > expr3" is analogous to "expr1 =~ /expr2/expr3/" iff expr2 is unique. If expr2 is not unique, the replacement will not take place.
 +  * [[archive:bioinformatic_tools:pluck-scripts:look-for-exit|look-for-exit]] used in the mitochondrial genome to find variants of the repeats and exits from the repeats.
   * make-contig-lengths FIXME
   * make-inversion-hypotheses FIXME
   * make-pseudoreads FIXME
-  * make-scaffold-from-blat FIXME
+  * make-scaffold-from-blat Takes a psl file as input and tries to create a scaffold. Useful for compiling data from many sources that have a lot of overlap. For example, after the initial contigs are made, many reads are left over. De novo assembly of those reads can create new contigs that span the gaps between the initially assembled contigs. make-scaffold-from-blat can create new scaffolds from these two sets of contigs.
   * map-colorspace FIXME
   * pair-contigs FIXME
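
The kstitcher entry describes two behaviours that are easy to misread: a fractional multiplier ("0.5*contig" reports the first half) and the "expr1 < expr2 > expr3" operator, which replaces expr2 only when it is unique. The Python sketch below illustrates those two rules only; it is not kstitcher's code. The function names are hypothetical, expr2 is treated as a literal string, odd lengths are rounded down, and overlapping matches are counted as separate occurrences, all of which are assumptions not stated on this page.

```python
def take_fraction(seq: str, fraction: float) -> str:
    """Return the leading fraction of seq (0.5 gives the first half).

    Assumption: odd lengths round down; the page does not say how
    kstitcher rounds.
    """
    return seq[: int(len(seq) * fraction)]


def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    if not pattern:
        return 0
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Advance by one base so overlapping matches are also counted.
        start = text.find(pattern, start + 1)
    return count


def replace_if_unique(target: str, pattern: str, replacement: str) -> str:
    """Mimic "target < pattern > replacement".

    The substitution happens only if pattern occurs exactly once in
    target; otherwise target is returned unchanged.
    """
    if count_occurrences(target, pattern) != 1:
        return target
    return target.replace(pattern, replacement, 1)


if __name__ == "__main__":
    print(take_fraction("GATTACAT", 0.5))
    print(replace_if_unique("ACGTTGCA", "GTT", "C"))
    print(replace_if_unique("ACACAC", "AC", "T"))
```

Refusing to replace a non-unique expr2 keeps an edit from silently landing on the wrong copy of a repeat, which may be why differences2stitcher can have problems when many SNPs sit close together.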
Line 25: Line 27:
   * fasta.py Input/Output module for fasta files, together with alphabet definitions and utility functions like reverse complement.
   * subst_matrix.py for creating DNA substitution matrices from a small number of parameters.
- 
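
For readers unfamiliar with the reverse complement utility mentioned for fasta.py, here is a minimal Python sketch of the operation. It is not the code from fasta.py: the function name and the handling of non-ACGT symbols (left unchanged) are assumptions made for illustration.

```python
# Translation table mapping each base to its complement, preserving case.
# Symbols not in the table (e.g. N) pass through unchanged; this is an
# assumption, not necessarily how fasta.py treats ambiguity codes.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("GATTACA"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check.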
archive/bioinformatic_tools/pluck-scripts.1271396428.txt.gz · Last modified: 2010/04/16 05:40 by karplus