Wed Jun 03 12:55:08 2015 run on kolossus, pid=15858 [Jun 2 2015 19:08:18 R52488 ]
DiscovarDeNovo \
READS=sample:MK::/scratch/ceisenhart/bams/BS-MK_seqprep_dupRemo \
ved.bam+sample:tag::/scratch/ceisenhart/bams/BS_tag_noAdap_noDu \
p.bam+sample:Mi19::/scratch/ceisenhart/bams/SW019_MiSeq_adapter \
Trimmed_dupRemoved.bam+sample:UC18::/scratch/ceisenhart/bams/UC \
SF_SW018_noAdap_noDup.bam+sample:UC19::/scratch/ceisenhart/bams \
/UCSF_SW019_noAdap_noDup.bam NUM_THREADS=16 MEMORY_CHECK=True \
OUT_DIR=/hive/users/ceisenhart/06.03.2015Assembly \
MAX_MEM_GB=750
SYSTEM INFO
- OS: Linux :: 2.6.32-504.3.3.el6.x86_64 :: #1 SMP Wed Dec 17 01:55:02 UTC 2014
- node name: kolossus
- hardware type: x86_64
- cache size: 24576 KB
- cpu MHz: 2260.913
- cpu model name: Intel(R) Xeon(R) CPU X7560 @ 2.27GHz
- physical memory: 1009.71 GB
MEMORY CHECK (typically takes several minutes; could cause
machine to become sluggish or result in this job being killed)
- Apparently able to allocate 100% of nominally available memory.
- Can access at least 750 GB.
Wed Jun 03 12:57:05 2015: finding input files
Wed Jun 03 12:57:05 2015: reading 5 files (which may take a while)
Wed Jun 03 12:57:05 2015: processing /scratch/ceisenhart/bams/BS-MK_seqprep_dupRemoved.bam.
Wed Jun 03 12:57:05 2015: memory in use = 0.01 GB, peak = 0.01 GB
Wed Jun 03 14:01:01 2015: there are 249,569,918 reads of mean length 250
Wed Jun 03 14:01:01 2015: memory in use = 111.46 GB, peak = 109.00 GB
Wed Jun 03 14:01:19 2015: reads sorted
Wed Jun 03 14:01:19 2015: memory in use = 112.40 GB, peak = 110.68 GB
Wed Jun 03 14:04:02 2015: data stashed in output structures
Wed Jun 03 14:04:02 2015: memory in use = 68.22 GB, peak = 120.57 GB
Wed Jun 03 14:04:02 2015: processing /scratch/ceisenhart/bams/BS_tag_noAdap_noDup.bam.
Wed Jun 03 14:04:02 2015: memory in use = 68.22 GB, peak = 120.57 GB
Wed Jun 03 14:36:30 2015: there are 135,779,818 reads of mean length 249
Wed Jun 03 14:36:30 2015: memory in use = 106.51 GB, peak = 120.57 GB
Wed Jun 03 14:36:40 2015: reads sorted
Wed Jun 03 14:36:40 2015: memory in use = 107.02 GB, peak = 120.57 GB
Wed Jun 03 14:38:13 2015: data stashed in output structures
Wed Jun 03 14:38:13 2015: memory in use = 100.67 GB, peak = 120.57 GB
Wed Jun 03 14:38:13 2015: processing /scratch/ceisenhart/bams/SW019_MiSeq_adapterTrimmed_dupRemoved.bam.
Wed Jun 03 14:38:13 2015: memory in use = 100.67 GB, peak = 120.57 GB
Wed Jun 03 14:55:49 2015: there are 54,509,730 reads of mean length 300
Wed Jun 03 14:55:49 2015: memory in use = 117.09 GB, peak = 120.57 GB
Wed Jun 03 14:55:53 2015: reads sorted
Wed Jun 03 14:55:53 2015: memory in use = 117.29 GB, peak = 120.57 GB
Wed Jun 03 14:56:43 2015: data stashed in output structures
Wed Jun 03 14:56:43 2015: memory in use = 115.79 GB, peak = 120.98 GB
Wed Jun 03 14:56:43 2015: processing /scratch/ceisenhart/bams/UCSF_SW018_noAdap_noDup.bam.
Wed Jun 03 14:56:43 2015: memory in use = 115.79 GB, peak = 120.98 GB
Wed Jun 03 15:21:40 2015: there are 113,862,306 reads of mean length 232
Wed Jun 03 15:21:40 2015: memory in use = 155.64 GB, peak = 152.54 GB
Wed Jun 03 15:21:48 2015: reads sorted
Wed Jun 03 15:21:48 2015: memory in use = 156.06 GB, peak = 152.83 GB
Wed Jun 03 15:23:10 2015: data stashed in output structures
Wed Jun 03 15:23:10 2015: memory in use = 148.62 GB, peak = 163.18 GB
Wed Jun 03 15:23:10 2015: processing /scratch/ceisenhart/bams/UCSF_SW019_noAdap_noDup.bam.
Wed Jun 03 15:23:10 2015: memory in use = 148.62 GB, peak = 163.18 GB
Wed Jun 03 15:59:42 2015: there are 150,162,130 reads of mean length 249
Wed Jun 03 15:59:42 2015: memory in use = 206.37 GB, peak = 202.74 GB
Wed Jun 03 15:59:53 2015: reads sorted
Wed Jun 03 15:59:53 2015: memory in use = 206.93 GB, peak = 202.74 GB
Wed Jun 03 16:01:41 2015: data stashed in output structures
Wed Jun 03 16:01:41 2015: memory in use = 198.18 GB, peak = 222.28 GB
INPUT FILES:
[1,type=frag,sample=MK,lib=1,frac=1] /scratch/ceisenhart/bams/BS-MK_seqprep_dupRemoved.bam
[2,type=frag,sample=tag,lib=1,frac=1] /scratch/ceisenhart/bams/BS_tag_noAdap_noDup.bam
[3,type=frag,sample=Mi19,lib=1,frac=1] /scratch/ceisenhart/bams/SW019_MiSeq_adapterTrimmed_dupRemoved.bam
[4,type=frag,sample=UC18,lib=1,frac=1] /scratch/ceisenhart/bams/UCSF_SW018_noAdap_noDup.bam
[5,type=frag,sample=UC19,lib=1,frac=1] /scratch/ceisenhart/bams/UCSF_SW019_noAdap_noDup.bam
Wed Jun 03 16:01:41 2015: found 5 samples
Wed Jun 03 16:01:41 2015: starts = 0,249569918,385349736,439859466,553721772
Wed Jun 03 16:11:44 2015: using 703,883,902 reads
Wed Jun 03 16:11:44 2015: data extraction complete, peak mem = 222.28 GB
3.24 hours used extracting reads
Wed Jun 03 16:25:48 2015: see total physical memory of 1,084,169,011,200 bytes
Wed Jun 03 16:25:48 2015: see user-imposed limit on memory of 805,306,368,000 bytes
Wed Jun 03 16:25:48 2015: 4.56 bytes per read base, assuming max memory available
We need 7 passes.
Expect 87893563 keys per batch.
Provide 98925162 keys per batch.
We need 9 passes.
Expect 68361660 keys per batch.
Provide 85294165 keys per batch.
Wed Jun 03 23:16:11 2015: back from buildReadQGraph
Wed Jun 03 23:16:11 2015: memory in use = 249.86 GB, peak = 583.25 GB
checksum_60 = 1500661769504106748
Wed Jun 03 23:21:36 2015: constructing places
Wed Jun 03 23:23:46 2015: sorting places
Wed Jun 03 23:39:21 2015: building all
Thu Jun 04 00:00:20 2015: calling LongReadsToPaths
Thu Jun 04 00:56:43 2015: writing
Thu Jun 04 01:00:50 2015: translating paths
Thu Jun 04 01:12:34 2015: final stage of path translation
Thu Jun 04 02:43:59 2015: writing paths
2.67 minutes used reloading assembly
Thu Jun 04 03:13:06 2015: start walking
memory in use = 217,802,301,440
Thu Jun 04 03:40:40 2015: start walking
memory in use = 206,994,972,672
1.04 hours used cleaning 200-mer graph
11.9 hours used in ReadQGrapher
4.05e-06 seconds used reloading reads
checksum_200 = 1499603233170572930
1 peak mem usage = 583.25 GB
2.56 minutes used loading stuff
2 peak mem usage = 583.25 GB
launching gap assemblies, mem usage = 189,699,788,800
Thu Jun 04 04:25:49 2015: finding unsatisfieds
Thu Jun 04 04:26:44 2015: creating multiplicity map
Thu Jun 04 04:26:53 2015: economizing links
Thu Jun 04 04:26:54 2015: forming neighborhoods
Thu Jun 04 04:28:24 2015: forming initial clusters
Thu Jun 04 04:29:57 2015: start sort
5.34 seconds used sorting
Thu Jun 04 04:30:04 2015: merging clusters
xs.size( ) = 21565962
45.5 minutes used merging
xs.size( ) = 1545255
Thu Jun 04 05:16:33 2015: start overlap-based merging
Thu Jun 04 05:17:51 2015: start overlap-based merging
LR.size( ) = 1327468
LR.size( ) = 665564
Thu Jun 04 05:40:49 2015: now processing 665564 blobs
Thu Jun 04 05:40:49 2015: memory in use = 204.69 GB, peak = 583.25 GB
.......... .......... .......... .......... ..........
.......... .......... .......... .......... ..........
3.34 days spent in local assemblies, memory in use = 228.88 GB, peak = 583.25 GB
Sun Jun 07 13:44:52 2015: patch reserving space
Sun Jun 07 13:44:52 2015: memory in use = 228.89 GB
35.4 seconds used patching, peak mem usage = 583.25 GB
new_stuff.size( ) = 7741492
Sun Jun 07 13:50:24 2015: building hb2
3.58 minutes used in new stuff 1 test
memory in use now = 211,686,506,496
Warning: HashSet initial size too small.
Sun Jun 07 14:37:56 2015: back from buildBigKHBVFromReads
47.6 minutes used in new stuff 2 test
peak mem usage = 583.25 GB
8 minutes used in new stuff 5
Sun Jun 07 14:51:06 2015: finding interesting reads
Sun Jun 07 14:51:06 2015: memory in use = 179.70 GB, peak = 583.25 GB
Sun Jun 07 15:19:25 2015: building dictionary
Sun Jun 07 15:19:25 2015: memory in use = 179.87 GB, peak = 583.25 GB
Sun Jun 07 15:21:35 2015: reducing
Sun Jun 07 15:21:35 2015: memory in use = 382.82 GB, peak = 583.25 GB
We need 1 passes.
Expect 19835364 keys per batch.
Provide 49588410 keys per batch.
Sun Jun 07 15:26:00 2015: kmerizing
Sun Jun 07 15:26:00 2015: memory in use = 403.09 GB, peak = 583.25 GB
We need 1 passes.
Expect 53444203 keys per batch.
Provide 133610506 keys per batch.
Sun Jun 07 15:31:52 2015: cleaning
Sun Jun 07 15:31:52 2015: memory in use = 403.09 GB, peak = 583.25 GB
Sun Jun 07 15:33:51 2015: finding uniquely aligning edges
Sun Jun 07 15:33:51 2015: memory in use = 403.10 GB, peak = 583.25 GB
1.88 hours used in new phase
hb.N( ) = 63606048, hb.EdgeObjectCount( ) = 48291486
107401 paths improved by rerouting
Sum(invalid) = 1610553, npids = 351941951
22051 edges tamped down
Sun Jun 07 17:00:54 2015: checking involution
Sun Jun 07 17:00:54 2015: done
WARNING: 75298 suspicious read-paths.
Sum(invalid) = 296088, npids = 351941951
11865 edges tamped down
Sun Jun 07 17:34:43 2015: making paths index for pull apart
Sun Jun 07 17:37:11 2015: pulling apart repeats
4.85 seconds used separating paths 1
2.98 minutes used in fixing mToLeft, mToRight, and mEdgeToPathIds
Sun Jun 07 17:44:25 2015: there were 65104 repeats pulled apart.
Sun Jun 07 17:44:25 2015: there were 534710 read paths removed during separation.
Sun Jun 07 17:46:20 2015: improving paths
Sun Jun 07 18:10:36 2015: done
10902131 paths extended
Sun Jun 07 18:26:51 2015: start degloop
Sun Jun 07 18:26:51 2015: creating path index
Sun Jun 07 18:29:43 2015: starting loop
Sun Jun 07 18:34:34 2015: degloop complete
Sun Jun 07 18:40:51 2015: unwinding three-edge plasmids
Sun Jun 07 18:40:59 2015: removing small components
Sun Jun 07 18:50:10 2015: writing a.fin files
Sun Jun 07 19:09:01 2015: determining candidates
Sun Jun 07 19:09:22 2015: determining candidates
Sun Jun 07 19:09:39 2015: determining candidates
Sun Jun 07 19:09:58 2015: determining candidates
Sun Jun 07 19:10:14 2015: determining candidates
CN fraction good = 0.57
Sun Jun 07 19:16:36 2015: deleting 4 gaps and adding 57 gaps to force symmetry
Sun Jun 07 19:20:17 2015: done making gaps, time used = 6 minutes
Sun Jun 07 19:32:19 2015: determining candidates
Sun Jun 07 19:32:35 2015: determining candidates
Sun Jun 07 19:32:47 2015: determining candidates
Sun Jun 07 19:33:01 2015: determining candidates
Sun Jun 07 19:33:14 2015: determining candidates
20.6 seconds using setting up final fasta
3.34 minutes using printing final fasta
assembly has 5580699 edges of mean length 747.2
contig line N50: 10,427
scaffold line N50: 12,549
total bases in 1 kb+ scaffolds: 1,885,373,341
total bases in 10 kb+ scaffolds: 1,106,140,476
There are 703,883,902 reads of mean length 250.9 and mean base quality 31.4.
MPL1 = mean length of first read in pair up to first error = 169
(normal range is 175-225 for 250 base reads)
Estimated chimera rate in read pairs (including mismapping) = 0.08%.
genomic read coverage, using 1 kb+ scaffolds for genome size estimate: 93.7
run started Wed Jun 03 12:55:08 2015, completed Sun Jun 07 19:50:13 2015
peak mem usage = 583.25 GB, total time = 103 hours
final checksum = 638735405986598733
DiscovarDeNovo READS=sample:MK::/scratch/ceisenhart/bams/BS-MK_seqprep_dupRemoved.bam+sample:tag::/scratch/ceisenhart/bams/BS_tag_noAdap_noDup.bam+sample:Mi19::/scratch/ceisenhart/bams/SW019_MiSeq_adapterTrimmed_dupRemoved.bam+sample:UC18::/scratch/ceisenhart/bams/UCSF_SW018_noAdap_noDup.bam+sample:UC19::/scratch/ceisenhart/bams/UCSF_SW019_noAdap_noDup.bam NUM_THREADS=16 MEMORY_CHECK=True OUT_DIR=/hive/users/ceisenhart/06.03.2015Assembly MAX_MEM_GB=750
Sun Jun 07 19:50:29 2015: done