This shows you the differences between two versions of the page.
archive:bioinformatic_tools:allpaths [2010/04/07 07:14] jstjohn |
archive:bioinformatic_tools:allpaths [2010/04/19 06:21] jstjohn |
||
---|---|---|---|
Line 1: | Line 1: | ||
===== ALLPATHS ===== | ===== ALLPATHS ===== | ||
+ | Attached is the Allpaths3 version 1.0 documentation converted from .docx format to .pdf. You can download this file by clicking the following link: {{:bioinformatic_tools:allpathsv3_manual_r1.0_2.pdf|allpathsv3_manual}}. | ||
+ | |||
+ | ====Potential Pitfalls==== | ||
+ | ALLPATHS is designed to work with paired-end reads of at least 100 bp from a minimum of one short and one long set of libraries. The program also expects 40X coverage from each of those libraries. Additionally, the authors say it requires a minimum of 32 GB of RAM, which I assume means shared memory, so it may not work on our cluster. Maybe it would still be useful to run it on small portions of our data? | ||
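To judge whether a library reaches the 40X figure mentioned above, fold coverage can be estimated as (number of reads × read length) / genome size. The following is only a rough sketch: the function name `estimate_coverage` and all the numbers are made up for illustration and are not taken from our data or from the ALLPATHS manual.

```shell
# Estimate fold coverage for one library.
# Hypothetical helper, not part of ALLPATHS.
estimate_coverage() {
    # $1 = number of reads, $2 = read length in bp, $3 = genome size in bp
    awk -v n="$1" -v len="$2" -v g="$3" \
        'BEGIN { printf "%.1f\n", (n * len) / g }'
}

# Illustration values only: 400 million 100 bp reads on a 1 Gbp genome.
estimate_coverage 400000000 100 1000000000   # prints 40.0
```

Running this once per library gives a quick sanity check before committing cluster time to an assembly.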
+ | |||
====High Level Overview==== | ====High Level Overview==== | ||
ALLPATHS is the most recent (as of this writing) tool developed by the Broad Institute to assemble shotgun sequences[(cite:broad>http://www.broadinstitute.org/science/programs/genome-biology/computational-rd/computational-research-and-development)]. The Broad Institute claims that version 3 of the program can assemble genomes up to mammalian size if the reads are at least 100 base pairs long[(cite:broad)]. Version 3 (currently 3.2) of the program may be downloaded from [[ftp://ftp.broad.mit.edu/pub/crd/ALLPATHS/Release-3-0/|here]]; that folder also contains some documentation on how to use the program. The program also ships with test data that you can assemble. | ALLPATHS is the most recent (as of this writing) tool developed by the Broad Institute to assemble shotgun sequences[(cite:broad>http://www.broadinstitute.org/science/programs/genome-biology/computational-rd/computational-research-and-development)]. The Broad Institute claims that version 3 of the program can assemble genomes up to mammalian size if the reads are at least 100 base pairs long[(cite:broad)]. Version 3 (currently 3.2) of the program may be downloaded from [[ftp://ftp.broad.mit.edu/pub/crd/ALLPATHS/Release-3-0/|here]]; that folder also contains some documentation on how to use the program. The program also ships with test data that you can assemble. | ||
Line 19: | Line 24: | ||
Configure error, requires boost with at least the Boost.System binaries installed. Now I am installing that... | Configure error, requires boost with at least the Boost.System binaries installed. Now I am installing that... | ||
+ | | ||
+ | Boost successfully installed. Now installing allpaths. Configure successful and currently building. --- //[[jstjohn@soe.ucsc.edu|John St. John]] 2010/04/07 00:53// | ||
+ | |||
+ | |||
+ | Build unsuccessful: | ||
+ | |||
+ | ./ParallelVecUtilities.h: In function 'void ParallelSort(vec<T>&)':\\ | ||
+ | ./ParallelVecUtilities.h:27: error: '__gnu_parallel' has not been declared\\ | ||
+ | ./ParallelVecUtilities.h: In function 'void ParallelSort(vec<T>&, StrictWeakOrdering)':\\ | ||
+ | ./ParallelVecUtilities.h:35: error: '__gnu_parallel' has not been declared\\ | ||
+ | ./ParallelVecUtilities.h: In function 'void ParallelReverseSort(vec<T>&)':\\ | ||
+ | ./ParallelVecUtilities.h:42: error: '__gnu_parallel' has not been declared\\ | ||
+ | ./ParallelVecUtilities.h: In function 'void ParallelReverseSort(vec<T>&, StrictWeakOrdering)':\\ | ||
+ | ./ParallelVecUtilities.h:50: error: '__gnu_parallel' has not been declared\\ | ||
+ | ./ParallelVecUtilities.h: In function 'void ParallelWhatPermutation(const V&, vec<T3>&, C, bool)':\\ | ||
+ | ./ParallelVecUtilities.h:316: error: '__gnu_parallel' has not been declared | ||
+ | |||
+ | Reading deeper into the documentation (the PDF attached to this page), I see that it requires gcc-4.3+. Campusrocks currently has gcc-4.1 installed. Perhaps if we compile the latest gcc we can install this program? | ||
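Before starting a long build it may be worth checking the compiler version up front. This is a minimal sketch, assuming `gcc` is on the PATH; `version_ge` is a hypothetical helper written for this page, and it compares only the major and minor numbers.

```shell
# Succeed if version $1 is at least version $2 (major.minor only).
# Hypothetical helper, not part of ALLPATHS or gcc.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        if (x[1] + 0 > y[1] + 0) exit 0
        if (x[1] + 0 == y[1] + 0 && x[2] + 0 >= y[2] + 0) exit 0
        exit 1
    }'
}

# Compare the installed gcc against the 4.3 minimum from the manual.
if command -v gcc >/dev/null 2>&1; then
    have=$(gcc -dumpversion)
    if version_ge "$have" 4.3; then
        echo "gcc $have is new enough"
    else
        echo "gcc $have is too old; need 4.3 or later"
    fi
fi
```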
+ | |||
+ | |||
+ | Installed gcc-4.5! (had to do it myself, the sys admins wouldn't try) | ||
+ | |||
+ | The gcc/g++-4.5 libraries are installed in: | ||
+ | /campusdata/BME235/lib | ||
+ | /campusdata/BME235/lib64 | ||
+ | |||
+ | To compile with the gcc 4.5 compilers, your environment needs to be set up so that everything knows where to look for the linked libraries. I did this by setting my LD_LIBRARY_PATH variable in my .profile as follows: | ||
+ | |||
+ | LD_LIBRARY_PATH=/campusdata/BME235/lib:/campusdata/BME235/lib64:$LD_LIBRARY_PATH | ||
+ | export LD_LIBRARY_PATH | ||
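If .profile is sourced more than once, the plain assignment above keeps stacking the same directories onto the variable. A possible variant that only prepends a directory when it is not already listed is sketched here; `prepend_lib_dir` is a hypothetical helper name, and the two directories are the ones given on this page.

```shell
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# Hypothetical helper; also avoids a stray ":" when the variable is empty.
prepend_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Called in reverse order so that lib ends up ahead of lib64, as above.
prepend_lib_dir /campusdata/BME235/lib64
prepend_lib_dir /campusdata/BME235/lib
```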
+ | |||
+ | Note: if you want to run the install via a script, an example script that sets up the environment variables is here: | ||
+ | |||
+ | /campusdata/BME235/programs/allpaths/allpaths3-3.2/installallpaths.sh | ||
+ | | ||
+ | |||
+ | I have encountered a new error much further along in the compilation. To view the error, see the last 40 lines of the file: | ||
+ | /campus/BME235/programs/allpaths/allpaths3-3.2/installallpaths.sh.o2730 | ||
+ | |||
+ | /campusdata/BME235/programs/allpaths/allpaths3-3.2/src/paths/MuxWalkGraph.cc:740: undefined reference to `digraphE<KmerPath>::TransferEdges(int, int, unsigned char)' | ||
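To pull just the failure lines out of the end of a build log like the one named above, something along these lines should work. It is a sketch: `last_errors` is a hypothetical helper, the search patterns are simply the two kinds of message seen on this page, and the log path is copied as written above.

```shell
# Show compiler and linker error lines from the tail of a build log.
# Hypothetical helper; exits non-zero when no matching line is found.
last_errors() {
    # $1 = log file, $2 = number of trailing lines to scan (default 40)
    tail -n "${2:-40}" "$1" | grep -E 'error:|undefined reference'
}

# Path as written on this page; skipped quietly if the file is absent.
log=/campus/BME235/programs/allpaths/allpaths3-3.2/installallpaths.sh.o2730
if [ -f "$log" ]; then
    last_errors "$log" || echo "no error lines in the tail of $log"
fi
```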
+ | |||
===== References ===== | ===== References ===== | ||
<refnotes>notes-separator: none</refnotes> | <refnotes>notes-separator: none</refnotes> | ||
~~REFNOTES cite~~ | ~~REFNOTES cite~~ |