====== Parallel Environments (PE) ======
Parallel environments (PEs) are the Grid Engine mechanism for running shared-memory and/or distributed-memory parallel applications.
===== Documentation =====
[[http://gridscheduler.sourceforge.net/htmlman/htmlman5/sge_pe.html|sge_pe man page]]
===== PEs and Campus Rocks =====
Useful commands:
* **qconf -spl** List the PEs configured on the cluster.
* **qconf -sp PE_NAME** Show the configuration of a specific PE.
* **qconf -Ap PE_FILE** Create a new PE from a definition file (note that on Campus Rocks, an IT ticket is required to install a new PE).
* **-pe PE_NAME N** Request the named PE and N slots when submitting a job with qsub (see the example below).
* **#$ -pe PE_NAME N** The same request as an embedded directive inside the job script.
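For example, a minimal submission requesting 8 slots in the **mpi** PE (the script name is a placeholder):

user$ qsub -pe mpi 8 myjob.sh

The equivalent request embedded in the script itself:

#!/bin/bash
#$ -pe mpi 8
#$ -cwd
# $NSLOTS is set by Grid Engine to the number of slots actually granted
echo "Running on $NSLOTS slots"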
==== PE Attributes ====
* **pe_name** The name by which the PE is requested.
* **slots** The total number of slots that all jobs in this PE may occupy at once.
* **user_lists** Access lists of users allowed to use the PE (NONE means everyone).
* **xuser_lists** Access lists of users excluded from the PE.
* **start_proc_args** Command line of a startup procedure run before the job (e.g. to build a machine file).
* **stop_proc_args** Command line of a shutdown procedure run after the job.
* **allocation_rule** How slots are distributed across nodes (see below).
* **control_slaves** Whether slave tasks are started under Grid Engine control (tight integration, see below).
* **job_is_first_task** Whether the job script itself counts as one of the parallel tasks.
* **urgency_slots** How the slot count is estimated for priority calculation when a slot range is requested.
* **accounting_summary** Whether accounting is written as a single summary record for the whole job.
Key attributes to check when using a PE for a parallel application are //allocation_rule// and //control_slaves//. Depending on your specific application, you may need to ensure slots are allocated in a specific way.
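The file passed to **qconf -Ap** is plain text with one attribute per line, in the same format that **qconf -sp** prints (see the listings below). A hypothetical definition:

pe_name example_pe
slots 64
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary TRUE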
=== Allocation Types ===
The //allocation_rule// option of a PE can have several different values:
* **$pe_slots**: All slots are allocated on a single node.
* **$fill_up**: Slots fill up the most suitable node before spilling over into the next.
* **$round_robin**: Slots are allocated one at a time, cycling through the available nodes.
* **[INT value]**: Each node is allocated exactly the specified number of slots (for example, a rule of 1 places one task per node).
The best allocation scheme depends on the structure of the cluster. Campus Rocks, for example, is divided into two queues, small.q and all.q, with 1 node and 6 nodes respectively; the largest nodes have 64 slots and the smallest have 8. Depending on how many slots an application needs, a job may span more or fewer nodes, so the allocation rule is worth checking before submitting; the sketch below shows how to inspect the resulting allocation.
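A job can print its **$PE_HOSTFILE**, which Grid Engine writes with one line per allocated host, to see how a given rule actually distributed the slots. A minimal sketch using the **mpi** PE:

#!/bin/bash
#$ -pe mpi 16
#$ -cwd
# Each line of $PE_HOSTFILE: hostname  slots  queue  processor-range
cat $PE_HOSTFILE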
=== Tight Integration ===
When //control_slaves// is set to TRUE, the PE provides a tight integration between Grid Engine and the parallel application: slave tasks are started under Grid Engine's control (via qrsh -inherit) rather than by plain rsh/ssh, so the scheduler can track, account for, and cleanly terminate every process of the job. The application must be built to cooperate with the scheduler for this to work; Open MPI is one such example.
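As a sketch, assuming an Open MPI build with Grid Engine support, a job submitted to the //orte// PE (shown below) needs no machine file; **mpirun** detects the SGE environment and launches the slave processes itself:

#!/bin/bash
#$ -pe orte 8
#$ -cwd
# Open MPI reads the granted allocation from the SGE environment,
# so no -machinefile argument is needed; ./my_mpi_app is a placeholder.
mpirun -np $NSLOTS ./my_mpi_app

Note that //orte// uses the $pe_slots allocation rule, so all 8 slots will come from a single node.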
==== Currently Installed PEs ====
user$ qconf -spl
mpi
orte
mpich
denovo
==== mpi ====
user$ qconf -sp mpi
pe_name mpi
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /opt/gridengine/mpi/startmpi.sh $pe_hostfile
stop_proc_args /opt/gridengine/mpi/stopmpi.sh
allocation_rule $fill_up
control_slaves FALSE
job_is_first_task TRUE
urgency_slots min
accounting_summary TRUE
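Because //control_slaves// is FALSE here, this PE is a loose integration: the startmpi.sh startup procedure builds a machine file from $pe_hostfile, which the job must then pass to mpirun explicitly. A sketch, assuming the stock Grid Engine template that writes the file to $TMPDIR/machines (the application name is a placeholder):

#!/bin/bash
#$ -pe mpi 16
#$ -cwd
# startmpi.sh has already written the machine file before the job starts
mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./my_mpi_app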
==== orte ====
user$ qconf -sp orte
pe_name orte
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $pe_slots
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary TRUE
==== mpich ====
user$ qconf -sp mpich
pe_name mpich
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args /opt/gridengine/mpi/stopmpi.sh
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary TRUE
==== denovo ====
user$ qconf -sp denovo
pe_name denovo
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /opt/gridengine/mpi/startmpi.sh $pe_hostfile
stop_proc_args /opt/gridengine/mpi/stopmpi.sh
allocation_rule $round_robin
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary TRUE