This shows you the differences between two versions of the page.
lecture_notes:04-20-2015 [2015/04/25 02:58] calef [Meraculous algorithm] |
lecture_notes:04-20-2015 [2015/04/25 03:06] calef [Running Meraculous] |
||
---|---|---|---|
Line 16: | Line 16: | ||
* Searches unaligned reads as potential gap-closers using mate-pair data | * Searches unaligned reads as potential gap-closers using mate-pair data | ||
=====Meraculous limitations===== | =====Meraculous limitations===== | ||
- | * The assembler relies on data with high quality in order to avoid error correction | + | * The assembler relies on high-quality data in order to avoid error correction; it also requires high coverage |
* Initial release did not support polyploid genome assembly due to allowing for linear subgraphs of the de Bruijn graph only | * Initial release did not support polyploid genome assembly due to allowing for linear subgraphs of the de Bruijn graph only | ||
- | * Low memory footprint | + | * High disk space usage |
=====User experience===== | =====User experience===== | ||
* Requires an array of other scripts in other languages | * Requires an array of other scripts in other languages | ||
* Most of the high-level scripts are written in Perl | * Most of the high-level scripts are written in Perl | ||
- | * Tested the program in small dataset and obtained contigs | + | * Tested the program with the packaged test data and obtained contigs |
=====Installation===== | =====Installation===== | ||
- | * Main issue was get all dependencies together | + | * Main issues were a new version of GCC and getting all the dependencies together (~16 hrs) |
- | * There was one non-standard perl mode needed | + | * There was one non-standard perl module needed |
+ | * Files with carriage returns | ||
* Some scripts contain errors, but they aren't hard to fix. | * Some scripts contain errors, but they aren't hard to fix. | ||
=====Running Meraculous===== | =====Running Meraculous===== | ||
- | * Execute run_meraculous.sh scripts along with the configuration file | + | * Execute the run_meraculous.sh script along with a user-provided configuration file |
* Configuration file contains info on where the data is and what format it comes in | * Configuration file contains info on where the data is and what format it comes in | ||
- | * It creates a timestamped folder that includes directories containing results of each step and executables to modify the run | + | * Creates a timestamped folder that includes directories containing results of each step and executables to suspend, resume, or restart the run from that step |
- | * Then you can check the errors that made a run fail and resume the run | + | * Thorough error-logging at each step, allowing you to check the errors that made a run fail and then resume the run after fixing the errors |
- | * Logs are informative | + | * SGE-aware, handles qsub and monitoring jobs |
=====Overall impression===== | =====Overall impression===== | ||
* Straightforward to figure out what went wrong; this requires only a basic understanding of Perl | * Straightforward to figure out what went wrong; this requires only a basic understanding of Perl | ||
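
The installation notes mention files with carriage returns as one of the problems encountered. The notes do not say which files were affected or how they were fixed, so the following is only a minimal sketch of one way to normalize line endings in a script before running it. The function name `strip_carriage_returns` and the command-line usage are hypothetical and are not part of Meraculous.

```python
import sys
from pathlib import Path


def strip_carriage_returns(path):
    """Rewrite `path` in place with CRLF and lone CR line endings converted to LF.

    Returns the number of carriage-return bytes found in the original file.
    (Hypothetical helper for illustration; not shipped with Meraculous.)
    """
    p = Path(path)
    data = p.read_bytes()
    found = data.count(b"\r")
    if found:
        # Convert Windows-style CRLF first, then any remaining lone CRs.
        cleaned = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
        p.write_bytes(cleaned)
    return found


if __name__ == "__main__":
    # Usage: pass the paths of the affected files as arguments.
    for name in sys.argv[1:]:
        count = strip_carriage_returns(name)
        print(f"{name}: {count} carriage return(s) found")
```

Working on raw bytes avoids any guess about the text encoding of the scripts. Because it rewrites files in place, it is safer to run it on a copy of the installation directory first.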