Note that I originally started this process on campusrocks-1-0.local with two cores allocated per available compute node on campusrocks. The process was eventually killed, either by a cluster admin or by something else. I then decided to re-run the program from campusrocks-0-6.local due to its larger amount of available RAM. The assembly picked up where it left off, and the step that had previously been killed finished within a fairly short period of time.

After completely crashing campusrocks-0-6.local with my process, I realized that the makefile was taking the -j j=2 arguments, probably ignoring the j=2 part, and parallelizing as much as possible at each step (on each core). On my head node I was running 8 huge processes simultaneously, which probably led to node 6 going down. I almost did the same to node 1-20 before I realized what was going on and stopped the script. I have reissued the makefile with the following command, which doesn't try to pump more parallelization out of the head node:
+ | |||
+ | <code> | ||
+ | /campus/BME235/programs/abyss_tmp/bin/abyss-pe mpirun="/opt/openmpi/bin/mpirun -machinefile machines -x PATH=/campus/BME235/bin/programs/abyss_tmp/bin:$PATH" np=23 n=8 k=28 name=slugAbyss lib='lane1 lane2 lane3 lane5 lane6 lane7 lane8' lane1='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_1_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_1_2_all_qseq.fastq' lane2='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_2_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_2_2_all_qseq.fastq' lane3='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_3_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_3_2_all_qseq.fastq' lane5='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_5_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_5_2_all_qseq.fastq' lane6='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_6_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_6_2_all_qseq.fastq' lane7='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_7_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_7_2_all_qseq.fastq' lane8='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_8_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_8_2_all_qseq.fastq' | ||
+ | </code> | ||
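
For reference, this matches how GNU make parses those arguments: a bare -j imposes no job limit at all, while j=2 merely defines a make variable named j and has no effect on parallelism. A minimal illustration (generic invocations, not commands from this run):

<code>
make -j      # no argument: run as many jobs in parallel as possible (what took the node down)
make -j2     # limit make to two parallel jobs
make j=2     # only defines a make variable named "j"; does not limit parallelism
</code>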
+ | |||
+ | Also because the makefile crashed, it didn't get a chance to clean up the output from the previous step. I had to manually delete the lane-x-3.hist files (which were all of size 0 anyway). After doing this the makefile was able to pick up where it left off and re-generate the lane-x-3.hist files. | ||
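
A sketch of that cleanup, assuming the empty histogram files all match a lane*-3.hist pattern in the working directory:

<code>
# remove the zero-size histogram files left behind by the crashed step,
# then rerun the same abyss-pe command so make regenerates them
find . -maxdepth 1 -name 'lane*-3.hist' -size 0 -delete
</code>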
+ | |||
+ | There is some error where the laneX.hist files are empty... | ||
+ | ====Attempt 4==== | ||
+ | I have access to kolossus which has 1.1tb of ram. I will now run the program on kolossus to see if it will assemble there... | ||
+ | |||
+ | Step1: | ||
+ | Install ABySS on kolossus. Following the exactly same process as listed above except with --prefix=/scratch/jstjohn on kolossus. The installation was straightforward and went without a hitch. | ||
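
For the record, a minimal sketch of that build, assuming the same ABySS source tree and MPI setup as the earlier install (the --with-mpi path here is a guess, not taken from the actual session):

<code>
# configure, build, and install ABySS into /scratch/jstjohn
./configure --prefix=/scratch/jstjohn --with-mpi=/opt/openmpi
make
make install
</code>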
+ | |||
+ | Binaries and libraries are located here: | ||
+ | /scratch/jstjohn/bin | ||
+ | /scratch/jstjohn/lib | ||
+ | |||
+ | Step2: | ||
+ | Galt has already coppied the banana slug illumina reads to /scratch/galt/bananaSlug, I added the 454 fastq reads to that folder as well. | ||
+ | |||
+ | Step3: | ||
+ | from screen on kolossus execute the following command: | ||
<code>
set path = ( /scratch/jstjohn/bin $path )
abyss-pe -j j=4 k=35 n=2 mpirun="/scratch/jstjohn/bin/mpirun -machinefile machinefile -x PATH=/scratch/jstjohn/bin:$PATH" np=30 lib='lib1' lib1='/scratch/galt/bananaSlug/slug_1.fastq /scratch/galt/bananaSlug/slug_2.fastq' se='/scratch/galt/bananaSlug/GAZ7HUX02.fastq /scratch/galt/bananaSlug/GAZ7HUX03.fastq /scratch/galt/bananaSlug/GAZ7HUX04.fastq /scratch/galt/bananaSlug/GCLL8Y406.fastq' name=slugAbyss3
</code>
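
The -machinefile argument points mpirun at a plain-text file listing the hosts (and slot counts) MPI may use. Since this run stays on kolossus with np=30, the file presumably looks something like the following; its contents are an assumption, not copied from the real file:

<code>
# hypothetical Open MPI machinefile: one host with 30 slots to match np=30
kolossus slots=30
</code>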
+ | |||
+ | |||
+ | Note that this run combines both the illumina runs and the 454 data for banana slug. I am also experimenting with a k=35 since Galt had better luck with a kmer size of 31 using SOAPdenovo than a kmer size of 23, perhaps the trend continues into larger kmers. If this doesn't work for whatever reason, I will also try shorter and longer kmers. | ||
+ | |||
+ | We combined all fastq files into two large files representing the two read pairs. Each of these files is approximately 50GB and contain roughly 20GB of reads. Even on kolossus I am getting some out of disk space errors in the following step: | ||
+ | |||
+ | <code> | ||
+ | KAligner -j4 -k35 /scratch/galt/bananaSlug/slug_1.fastq /scratch/galt/bananaSlug/slug_2.fastq slugAbyss3-3.fa \ | ||
+ | |ParseAligns -k35 -h lib1-3.hist \ | ||
+ | |sort -nk2,2 \ | ||
+ | |gzip >lib1-3.pair.gz | ||
+ | </code> | ||
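
The concatenation mentioned above was along these lines; a sketch only, since the exact per-lane filenames are assumptions based on the earlier lane paths:

<code>
# concatenate the per-lane first and second mates into one file each
cat s_*_1_all_qseq.fastq > slug_1.fastq
cat s_*_2_all_qseq.fastq > slug_2.fastq
</code>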
+ | |||
+ | Near the height I have observed this is eating up about 50G of ram, but the issue appears to be in available space for the sort algorithm in kolossus's /tmp/ directory. I am trying this again so I can copy down the error and send it to cluster-admin because kolossus should have around 400GB free of local HD space on top of its 1.1TB of ram. (kolossus has more ram than HD space: 1.1TB of ram vs 750GB hd) | ||
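
One possible workaround, untested here: GNU sort accepts -T (--temporary-directory) and also honors TMPDIR, so its spill files could be pointed at scratch space instead of the small /tmp. The /scratch/jstjohn/tmp directory below is hypothetical:

<code>
# rerun the failing step with sort spilling to scratch instead of /tmp
mkdir -p /scratch/jstjohn/tmp
KAligner -j4 -k35 /scratch/galt/bananaSlug/slug_1.fastq /scratch/galt/bananaSlug/slug_2.fastq slugAbyss3-3.fa \
|ParseAligns -k35 -h lib1-3.hist \
|sort -T /scratch/jstjohn/tmp -nk2,2 \
|gzip >lib1-3.pair.gz
</code>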
+ | |||
+ | |||
===== References =====
<refnotes>notes-separator: none</refnotes>
~~REFNOTES cite~~