Wed Jun 03 12:55:08 2015 run on kolossus, pid=15858 [Jun 2 2015 19:08:18 R52488 ]

DiscovarDeNovo \

             READS=sample:MK::/scratch/ceisenhart/bams/BS-MK_seqprep_dupRemo \
             ved.bam+sample:tag::/scratch/ceisenhart/bams/BS_tag_noAdap_noDu \
             p.bam+sample:Mi19::/scratch/ceisenhart/bams/SW019_MiSeq_adapter \
             Trimmed_dupRemoved.bam+sample:UC18::/scratch/ceisenhart/bams/UC \
             SF_SW018_noAdap_noDup.bam+sample:UC19::/scratch/ceisenhart/bams \
             /UCSF_SW019_noAdap_noDup.bam NUM_THREADS=16 MEMORY_CHECK=True   \
             OUT_DIR=/hive/users/ceisenhart/06.03.2015Assembly               \
             MAX_MEM_GB=750

SYSTEM INFO

- OS: Linux :: 2.6.32-504.3.3.el6.x86_64 :: #1 SMP Wed Dec 17 01:55:02 UTC 2014

- node name: kolossus

- hardware type: x86_64

- cache size: 24576 KB

- cpu MHz: 2260.913

- cpu model name: Intel(R) Xeon(R) CPU X7560 @ 2.27GHz

- physical memory: 1009.71 GB

MEMORY CHECK (typically takes several minutes; could cause

machine to become sluggish or result in this job being killed)

- Apparently able to allocate 100% of nominally available memory.

- Can access at least 750 GB.

Wed Jun 03 12:57:05 2015: finding input files

Wed Jun 03 12:57:05 2015: reading 5 files (which may take a while)

Wed Jun 03 12:57:05 2015: processing /scratch/ceisenhart/bams/BS-MK_seqprep_dupRemoved.bam.

Wed Jun 03 12:57:05 2015: memory in use = 0.01 GB, peak = 0.01 GB

Wed Jun 03 14:01:01 2015: there are 249,569,918 reads of mean length 250

Wed Jun 03 14:01:01 2015: memory in use = 111.46 GB, peak = 109.00 GB

Wed Jun 03 14:01:19 2015: reads sorted

Wed Jun 03 14:01:19 2015: memory in use = 112.40 GB, peak = 110.68 GB

Wed Jun 03 14:04:02 2015: data stashed in output structures

Wed Jun 03 14:04:02 2015: memory in use = 68.22 GB, peak = 120.57 GB

Wed Jun 03 14:04:02 2015: processing /scratch/ceisenhart/bams/BS_tag_noAdap_noDup.bam.

Wed Jun 03 14:04:02 2015: memory in use = 68.22 GB, peak = 120.57 GB

Wed Jun 03 14:36:30 2015: there are 135,779,818 reads of mean length 249

Wed Jun 03 14:36:30 2015: memory in use = 106.51 GB, peak = 120.57 GB

Wed Jun 03 14:36:40 2015: reads sorted

Wed Jun 03 14:36:40 2015: memory in use = 107.02 GB, peak = 120.57 GB

Wed Jun 03 14:38:13 2015: data stashed in output structures

Wed Jun 03 14:38:13 2015: memory in use = 100.67 GB, peak = 120.57 GB

Wed Jun 03 14:38:13 2015: processing /scratch/ceisenhart/bams/SW019_MiSeq_adapterTrimmed_dupRemoved.bam.

Wed Jun 03 14:38:13 2015: memory in use = 100.67 GB, peak = 120.57 GB

Wed Jun 03 14:55:49 2015: there are 54,509,730 reads of mean length 300

Wed Jun 03 14:55:49 2015: memory in use = 117.09 GB, peak = 120.57 GB

Wed Jun 03 14:55:53 2015: reads sorted

Wed Jun 03 14:55:53 2015: memory in use = 117.29 GB, peak = 120.57 GB

Wed Jun 03 14:56:43 2015: data stashed in output structures

Wed Jun 03 14:56:43 2015: memory in use = 115.79 GB, peak = 120.98 GB

Wed Jun 03 14:56:43 2015: processing /scratch/ceisenhart/bams/UCSF_SW018_noAdap_noDup.bam.

Wed Jun 03 14:56:43 2015: memory in use = 115.79 GB, peak = 120.98 GB

Wed Jun 03 15:21:40 2015: there are 113,862,306 reads of mean length 232

Wed Jun 03 15:21:40 2015: memory in use = 155.64 GB, peak = 152.54 GB

Wed Jun 03 15:21:48 2015: reads sorted

Wed Jun 03 15:21:48 2015: memory in use = 156.06 GB, peak = 152.83 GB

Wed Jun 03 15:23:10 2015: data stashed in output structures

Wed Jun 03 15:23:10 2015: memory in use = 148.62 GB, peak = 163.18 GB

Wed Jun 03 15:23:10 2015: processing /scratch/ceisenhart/bams/UCSF_SW019_noAdap_noDup.bam.

Wed Jun 03 15:23:10 2015: memory in use = 148.62 GB, peak = 163.18 GB

Wed Jun 03 15:59:42 2015: there are 150,162,130 reads of mean length 249

Wed Jun 03 15:59:42 2015: memory in use = 206.37 GB, peak = 202.74 GB

Wed Jun 03 15:59:53 2015: reads sorted

Wed Jun 03 15:59:53 2015: memory in use = 206.93 GB, peak = 202.74 GB

Wed Jun 03 16:01:41 2015: data stashed in output structures

Wed Jun 03 16:01:41 2015: memory in use = 198.18 GB, peak = 222.28 GB

INPUT FILES:

[1,type=frag,sample=MK,lib=1,frac=1] /scratch/ceisenhart/bams/BS-MK_seqprep_dupRemoved.bam

[2,type=frag,sample=tag,lib=1,frac=1] /scratch/ceisenhart/bams/BS_tag_noAdap_noDup.bam

[3,type=frag,sample=Mi19,lib=1,frac=1] /scratch/ceisenhart/bams/SW019_MiSeq_adapterTrimmed_dupRemoved.bam

[4,type=frag,sample=UC18,lib=1,frac=1] /scratch/ceisenhart/bams/UCSF_SW018_noAdap_noDup.bam

[5,type=frag,sample=UC19,lib=1,frac=1] /scratch/ceisenhart/bams/UCSF_SW019_noAdap_noDup.bam

Wed Jun 03 16:01:41 2015: found 5 samples

Wed Jun 03 16:01:41 2015: starts = 0,249569918,385349736,439859466,553721772

Wed Jun 03 16:11:44 2015: using 703,883,902 reads

Wed Jun 03 16:11:44 2015: data extraction complete, peak mem = 222.28 GB

3.24 hours used extracting reads

Wed Jun 03 16:25:48 2015: see total physical memory of 1,084,169,011,200 bytes

Wed Jun 03 16:25:48 2015: see user-imposed limit on memory of 805,306,368,000 bytes

Wed Jun 03 16:25:48 2015: 4.56 bytes per read base, assuming max memory available

We need 7 passes.

Expect 87893563 keys per batch.

Provide 98925162 keys per batch.

We need 9 passes.

Expect 68361660 keys per batch.

Provide 85294165 keys per batch.

Wed Jun 03 23:16:11 2015: back from buildReadQGraph

Wed Jun 03 23:16:11 2015: memory in use = 249.86 GB, peak = 583.25 GB

checksum_60 = 1500661769504106748

Wed Jun 03 23:21:36 2015: constructing places

Wed Jun 03 23:23:46 2015: sorting places

Wed Jun 03 23:39:21 2015: building all

Thu Jun 04 00:00:20 2015: calling LongReadsToPaths

Thu Jun 04 00:56:43 2015: writing

Thu Jun 04 01:00:50 2015: translating paths

Thu Jun 04 01:12:34 2015: final stage of path translation

Thu Jun 04 02:43:59 2015: writing paths

2.67 minutes used reloading assembly

Thu Jun 04 03:13:06 2015: start walking

memory in use = 217,802,301,440

Thu Jun 04 03:40:40 2015: start walking

memory in use = 206,994,972,672

1.04 hours used cleaning 200-mer graph

11.9 hours used in ReadQGrapher

4.05e-06 seconds used reloading reads

checksum_200 = 1499603233170572930

1 peak mem usage = 583.25 GB

2.56 minutes used loading stuff

2 peak mem usage = 583.25 GB

launching gap assemblies, mem usage = 189,699,788,800

Thu Jun 04 04:25:49 2015: finding unsatisfieds

Thu Jun 04 04:26:44 2015: creating multiplicity map

Thu Jun 04 04:26:53 2015: economizing links

Thu Jun 04 04:26:54 2015: forming neighborhoods

Thu Jun 04 04:28:24 2015: forming initial clusters

Thu Jun 04 04:29:57 2015: start sort

5.34 seconds used sorting

Thu Jun 04 04:30:04 2015: merging clusters

xs.size( ) = 21565962

45.5 minutes used merging

xs.size( ) = 1545255

Thu Jun 04 05:16:33 2015: start overlap-based merging

Thu Jun 04 05:17:51 2015: start overlap-based merging

LR.size( ) = 1327468

LR.size( ) = 665564

Thu Jun 04 05:40:49 2015: now processing 665564 blobs

Thu Jun 04 05:40:49 2015: memory in use = 204.69 GB, peak = 583.25 GB

………. ………. ………. ………. ……….

………. ………. ………. ………. ……….

3.34 days spent in local assemblies, memory in use = 228.88 GB, peak = 583.25 GB

Sun Jun 07 13:44:52 2015: patch reserving space

Sun Jun 07 13:44:52 2015: memory in use = 228.89 GB

35.4 seconds used patching, peak mem usage = 583.25 GB

new_stuff.size( ) = 7741492

Sun Jun 07 13:50:24 2015: building hb2

3.58 minutes used in new stuff 1 test

memory in use now = 211,686,506,496

Warning: HashSet initial size too small.

Sun Jun 07 14:37:56 2015: back from buildBigKHBVFromReads

47.6 minutes used in new stuff 2 test

peak mem usage = 583.25 GB

8 minutes used in new stuff 5

Sun Jun 07 14:51:06 2015: finding interesting reads

Sun Jun 07 14:51:06 2015: memory in use = 179.70 GB, peak = 583.25 GB

Sun Jun 07 15:19:25 2015: building dictionary

Sun Jun 07 15:19:25 2015: memory in use = 179.87 GB, peak = 583.25 GB

Sun Jun 07 15:21:35 2015: reducing

Sun Jun 07 15:21:35 2015: memory in use = 382.82 GB, peak = 583.25 GB

We need 1 passes.

Expect 19835364 keys per batch.

Provide 49588410 keys per batch.

Sun Jun 07 15:26:00 2015: kmerizing

Sun Jun 07 15:26:00 2015: memory in use = 403.09 GB, peak = 583.25 GB

We need 1 passes.

Expect 53444203 keys per batch.

Provide 133610506 keys per batch.

Sun Jun 07 15:31:52 2015: cleaning

Sun Jun 07 15:31:52 2015: memory in use = 403.09 GB, peak = 583.25 GB

Sun Jun 07 15:33:51 2015: finding uniquely aligning edges

Sun Jun 07 15:33:51 2015: memory in use = 403.10 GB, peak = 583.25 GB

1.88 hours used in new phase

hb.N( ) = 63606048, hb.EdgeObjectCount( ) = 48291486

107401 paths improved by rerouting

Sum(invalid) = 1610553, npids = 351941951

22051 edges tamped down

Sun Jun 07 17:00:54 2015: checking involution

Sun Jun 07 17:00:54 2015: done

WARNING: 75298 suspicious read-paths.

Sum(invalid) = 296088, npids = 351941951

11865 edges tamped down

Sun Jun 07 17:34:43 2015: making paths index for pull apart

Sun Jun 07 17:37:11 2015: pulling apart repeats

4.85 seconds used separating paths 1

2.98 minutes used in fixing mToLeft, mToRight, and mEdgeToPathIds

Sun Jun 07 17:44:25 2015: there were 65104 repeats pulled apart.

Sun Jun 07 17:44:25 2015: there were 534710 read paths removed during separation.

Sun Jun 07 17:46:20 2015: improving paths

Sun Jun 07 18:10:36 2015: done

10902131 paths extended

Sun Jun 07 18:26:51 2015: start degloop

Sun Jun 07 18:26:51 2015: creating path index

Sun Jun 07 18:29:43 2015: starting loop

Sun Jun 07 18:34:34 2015: degloop complete

Sun Jun 07 18:40:51 2015: unwinding three-edge plasmids

Sun Jun 07 18:40:59 2015: removing small components

Sun Jun 07 18:50:10 2015: writing a.fin files

Sun Jun 07 19:09:01 2015: determining candidates

Sun Jun 07 19:09:22 2015: determining candidates

Sun Jun 07 19:09:39 2015: determining candidates

Sun Jun 07 19:09:58 2015: determining candidates

Sun Jun 07 19:10:14 2015: determining candidates

CN fraction good = 0.57

Sun Jun 07 19:16:36 2015: deleting 4 gaps and adding 57 gaps to force symmetry

Sun Jun 07 19:20:17 2015: done making gaps, time used = 6 minutes

Sun Jun 07 19:32:19 2015: determining candidates

Sun Jun 07 19:32:35 2015: determining candidates

Sun Jun 07 19:32:47 2015: determining candidates

Sun Jun 07 19:33:01 2015: determining candidates

Sun Jun 07 19:33:14 2015: determining candidates

20.6 seconds using setting up final fasta

3.34 minutes using printing final fasta

assembly has 5580699 edges of mean length 747.2

contig line N50: 10,427

scaffold line N50: 12,549

total bases in 1 kb+ scaffolds: 1,885,373,341

total bases in 10 kb+ scaffolds: 1,106,140,476

There are 703,883,902 reads of mean length 250.9 and mean base quality 31.4.

MPL1 = mean length of first read in pair up to first error = 169

(normal range is 175-225 for 250 base reads)

Estimated chimera rate in read pairs (including mismapping) = 0.08%.

genomic read coverage, using 1 kb+ scaffolds for genome size estimate: 93.7

run started Wed Jun 03 12:55:08 2015, completed Sun Jun 07 19:50:13 2015

peak mem usage = 583.25 GB, total time = 103 hours

final checksum = 638735405986598733

DiscovarDeNovo READS=sample:MK::/scratch/ceisenhart/bams/BS-MK_seqprep_dupRemoved.bam+sample:tag::/scratch/ceisenhart/bams/BS_tag_noAdap_noDup.bam+sample:Mi19::/scratch/ceisenhart/bams/SW019_MiSeq_adapterTrimmed_dupRemoved.bam+sample:UC18::/scratch/ceisenhart/bams/UCSF_SW018_noAdap_noDup.bam+sample:UC19::/scratch/ceisenhart/bams/UCSF_SW019_noAdap_noDup.bam NUM_THREADS=16 MEMORY_CHECK=True OUT_DIR=/hive/users/ceisenhart/06.03.2015Assembly MAX_MEM_GB=750

Sun Jun 07 19:50:29 2015: done