This shows you the differences between two versions of the page.
archive:bioinformatic_tools:euler [2010/04/11 22:25] jstjohn |
archive:bioinformatic_tools:euler [2015/07/28 06:23] (current) ceisenhart ↷ Page moved from bioinformatic_tools:euler to archive:bioinformatic_tools:euler |
||
---|---|---|---|
Line 29: | Line 29: | ||
and symlinked the executables and scripts into that directory. | and symlinked the executables and scripts into that directory. | ||
- | to execute program or scripts one option is to do one of the following. | ||
- | - add /campusdata/BME235/bin/euler to your path | + | ==== Environment Variables ==== |
+ | |||
+ | Add the following to your path: | ||
+ | /campusdata/BME235/bin/euler | ||
+ | |||
+ | Set the following environment variable: | ||
+ | EULERBIN=/campusdata/BME235/bin/euler | ||
+ | |||
+ | Make sure that your LD_LIBRARY_PATH and LD_RUN_PATH variables do NOT point to our /campusdata/BME235/lib directory. That directory seems to get in the way of some of the standard libraries used by the 4.1 compiler, and since this program was compiled against 4.1, these environment variables need to stay at their defaults. * I think the system has a default LD_LIBRARY_PATH variable that works fine, so an //unset// is not appropriate in this case unless you re-set the default minus our /campusdata/BME235/lib/../lib64 path. | ||
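The settings above could be applied from a shell startup file along the following lines. This is only a sketch for a POSIX-style shell: the two paths come from the notes above, while the helper name `strip_bme235_libs` is made up for illustration, and treating every entry that starts with /campusdata/BME235/lib as "ours" is an assumption (it also catches the lib/../lib64 form).

```shell
# Paths taken from the notes above.
EULER_DIR=/campusdata/BME235/bin/euler
export EULERBIN="$EULER_DIR"

# Prepend the euler directory to PATH only if it is not already there.
case ":$PATH:" in
  *":$EULER_DIR:"*) ;;
  *) export PATH="$EULER_DIR:$PATH" ;;
esac

# Hypothetical helper: print a colon-separated path list with every entry
# that starts with /campusdata/BME235/lib removed (assumption: all such
# entries are the course library directories).
strip_bme235_libs() {
  old_ifs=$IFS
  IFS=:
  kept=
  for entry in $1; do
    case "$entry" in
      /campusdata/BME235/lib*) ;;                # drop the course lib dirs
      *) kept="${kept:+$kept:}$entry" ;;         # keep everything else
    esac
  done
  IFS=$old_ifs
  printf '%s\n' "$kept"
}

# Filter both variables, but only if they are set, so that a system
# default is left alone rather than blindly unset.
if [ -n "${LD_LIBRARY_PATH:-}" ]; then
  LD_LIBRARY_PATH=$(strip_bme235_libs "$LD_LIBRARY_PATH")
  if [ -n "$LD_LIBRARY_PATH" ]; then export LD_LIBRARY_PATH; else unset LD_LIBRARY_PATH; fi
fi
if [ -n "${LD_RUN_PATH:-}" ]; then
  LD_RUN_PATH=$(strip_bme235_libs "$LD_RUN_PATH")
  if [ -n "$LD_RUN_PATH" ]; then export LD_RUN_PATH; else unset LD_RUN_PATH; fi
fi
```

Filtering the existing value, instead of unsetting it, keeps whatever default the system provides.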
+ | |||
+ | ==== Assembly ==== | ||
+ | |||
+ | I am currently doing an assembly on one of the test datasets provided on the euler home page. To see this dataset and the script used to run it go to: | ||
+ | /campusdata/BME235/programs/euler/testData | ||
+ | |||
+ | Once I can get this one test dataset working, I won't bother with the others they provided and will move on to testing on Pog. My main goals in running the program on this test data are to eliminate as many confounding variables as possible in getting proper execution, and also to explore and better understand the format that Euler expects. Here is what I have discovered about Euler's required data format so far: | ||
+ | |||
+ | * All reads need to be in a single fasta file; however, this file may contain reads from different runs/technologies. | ||
+ | * The name of the read matters and is used by Euler to determine whether each sequence has a mate pair, what the mate pair is, and also the distance range of that mate to its pair. This is how you differentiate between reads from different technologies or library protocols. | ||
+ | * How fasta ID lines are deciphered is defined by the user in a file called //names.rul//, and this file must be placed in the directory you are running your assembly from (your cwd when you start execution). | ||
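The requirements in the list above can be checked before starting a run. The following is a minimal sketch, not part of Euler: `check_euler_inputs` is a hypothetical helper and `reads.fasta` is a placeholder file name. It only confirms that //names.rul// is in the current directory and that the single reads file contains fasta header lines; it does not validate the contents of //names.rul// or the read names themselves.

```shell
# Hypothetical pre-flight check; run it from the directory where the
# assembly will be started.
check_euler_inputs() {
  reads=$1

  # names.rul must be in the cwd when execution starts.
  if [ ! -f ./names.rul ]; then
    echo "names.rul not found in $(pwd)" >&2
    return 1
  fi

  # All reads must be in one fasta file that exists and is non-empty.
  if [ ! -s "$reads" ]; then
    echo "reads file '$reads' is missing or empty" >&2
    return 1
  fi

  # Count fasta records (lines starting with '>').
  n=$(grep -c '^>' "$reads" || true)
  if [ "$n" -eq 0 ]; then
    echo "no fasta header lines found in '$reads'" >&2
    return 1
  fi

  # Print the number of reads found.
  echo "$n"
}
```

Reads from several runs can be combined beforehand with plain concatenation, for example `cat run1.fasta run2.fasta > reads.fasta` (file names are placeholders), followed by `check_euler_inputs reads.fasta`.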
+ | |||