====== Parallel Environments (PE) ======

Parallel environments (PE's) are environments used to execute shared- and/or distributed-memory applications.

===== Documentation =====

[[http://gridscheduler.sourceforge.net/htmlman/htmlman5/sge_pe.html|sge_pe man page]]

===== PE's and Campus Rocks =====

Useful commands:

  * **qconf -spl** List the PE's installed on the cluster.
  * **qconf -sp PE_NAME** Show the attributes of a specific PE.
  * **qconf -Ap PE_FILE** Create a new PE from a definition file (note that on Campus Rocks, an ITTicket is required to install a new PE).
  * **-pe PE_NAME N** Request a PE and N slots on the qsub command line.
  * **#$ -pe PE_NAME N** Request a PE and N slots as an embedded directive in a qsub job script.

==== PE Attributes ====

  * pe_name
  * slots
  * user_lists
  * xuser_lists
  * start_proc_args
  * stop_proc_args
  * allocation_rule
  * control_slaves
  * job_is_first_task
  * urgency_slots
  * accounting_summary

The key attributes to check when using a PE for a parallel application are //allocation_rule// and //control_slaves//. Depending on your specific application, you may need to ensure slots are allocated in a specific way.

=== Allocation Types ===

The //allocation_rule// option of a PE can take several different values:

  * **$pe_slots**: All slots are allocated on a single node.
  * **$fill_up**: Slots fill up the most suitable node before spilling over onto the next node.
  * **$round_robin**: Slots are allocated one at a time, each on a different node.
  * **[INT value]**: Each node is allocated at most the specified number of slots.

Choosing the best allocation scheme naturally depends on the structure of the cluster being used. For example, Campus Rocks is divided into two queues, small.q and all.q, with 1 node and 6 nodes respectively. The largest nodes have 64 slots available and the smallest have 8. Depending on the application, a job may therefore need more or fewer nodes, and the allocation scheme should be chosen with this in mind.

=== Tight Integration ===

The //control_slaves// attribute, when set to TRUE, indicates a parallel application with a tight integration: SGE itself starts, monitors, and accounts for the slave tasks of the job, rather than the application spawning them outside the scheduler's control. Certain applications are built to run under the cluster scheduler (in our case SGE) in exactly this way, allowing SGE a degree of internal management over the execution of the application. One such example is Open MPI.
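To make these attributes concrete, here is a minimal sketch of a PE definition file of the kind passed to **qconf -Ap**. The PE name ''example'' and its values are hypothetical (this is not an installed Campus Rocks PE); they illustrate a tightly integrated PE that packs slots onto as few nodes as possible:

<code>
pe_name            example
slots              128
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE
</code>

Saved as ''example.pe'', it would be installed with ''qconf -Ap example.pe'' (and attached to a queue by the administrators), after which jobs could request it with ''-pe example N''.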
==== Currently Installed PE's ====

<code>
user$ qconf -spl
mpi
orte
mpich
denovo
</code>

==== mpi ====

<code>
user$ qconf -sp mpi
pe_name            mpi
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpi/startmpi.sh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
allocation_rule    $fill_up
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary TRUE
</code>

==== orte ====

<code>
user$ qconf -sp orte
pe_name            orte
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE
</code>

==== mpich ====

<code>
user$ qconf -sp mpich
pe_name            mpich
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE
</code>

==== denovo ====

<code>
user$ qconf -sp denovo
pe_name            denovo
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpi/startmpi.sh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE
</code>
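As a usage sketch, the script below submits a hypothetical Open MPI program through the tightly integrated ''orte'' PE listed above. The executable ''./my_mpi_app'' and the slot count of 8 are placeholders; note that because ''orte'' uses the ''$pe_slots'' allocation rule, all requested slots must fit on a single node:

<code bash>
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N orte_test
#$ -pe orte 8

# $NSLOTS is set by SGE to the number of slots granted by the PE.
# An SGE-aware Open MPI detects the allocation from the environment,
# so no hostfile needs to be passed to mpirun.
mpirun -np $NSLOTS ./my_mpi_app
</code>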