Parallel environments (PEs) are used to execute shared-memory and/or distributed-memory applications.
Useful commands:
Key attributes to check when using a PE for a parallel application are the allocation_rule and control_slaves options. Depending on your application, you may need to ensure that slots are allocated in a specific way.
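As a minimal sketch of checking these two attributes, the helper below pulls a single attribute out of `qconf -sp` output. The function name `pe_attr` is hypothetical (it is not part of SGE), and the loop at the end only runs on a machine where `qconf` is available.

```shell
#!/bin/sh
# pe_attr (hypothetical helper): read `qconf -sp <pe_name>` output on stdin
# and print the value of the attribute named in $1.
pe_attr() {
    awk -v key="$1" '$1 == key { $1 = ""; sub(/^ +/, ""); print }'
}

# On a host with SGE installed, report both attributes for every PE.
if command -v qconf >/dev/null 2>&1; then
    for pe in $(qconf -spl); do
        printf '%s: allocation_rule=%s control_slaves=%s\n' "$pe" \
            "$(qconf -sp "$pe" | pe_attr allocation_rule)" \
            "$(qconf -sp "$pe" | pe_attr control_slaves)"
    done
fi
```

For a single PE, the same helper can be used directly, for example `qconf -sp orte | pe_attr allocation_rule`.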
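The allocation_rule option of a PE can take one of the following values:
  <int>: a fixed number of slots is allocated on each host; the special value 1 places every process on a different host.
  $pe_slots: all requested slots are allocated on a single host.
  $fill_up: each host is filled to capacity, starting with the most suitable one, before the next host is used.
  $round_robin: one slot at a time is taken from each suitable host in turn until the request is satisfied.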
Choosing the best allocation scheme depends on the structure of the cluster being used. For example, Campus Rocks is divided into two queues, small.q and all.q, with 1 node and 6 nodes respectively. The largest nodes have 64 slots available and the smallest have 8. Depending on the application, a job may therefore need more or fewer nodes, so the allocation scheme has to be chosen with that in mind.
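The arithmetic behind this can be sketched as follows. The 100-slot request is a made-up example, the function names are hypothetical, and the calculation assumes that every host used is otherwise empty and offers the same number of slots (64 or 8, as quoted above).

```shell
#!/bin/sh
# min_hosts SLOTS PER_HOST
# Smallest number of hosts that can hold SLOTS when each host offers
# PER_HOST free slots (ceiling division).
min_hosts() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# fits_on_one_host SLOTS PER_HOST
# Succeeds if the request fits on a single host, which is what a PE
# with allocation_rule $pe_slots requires.
fits_on_one_host() {
    [ "$1" -le "$2" ]
}

min_hosts 100 64   # prints 2
min_hosts 100 8    # prints 13

if ! fits_on_one_host 100 64; then
    echo "a 100-slot request cannot be placed on a single 64-slot node"
fi
```

Under these assumptions, a 100-slot job could never be scheduled in a PE that uses $pe_slots, whereas $fill_up would place it on as few as 2 of the largest nodes.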
When control_slaves is set to TRUE, the PE is tightly integrated with the parallel application: SGE itself starts the application's slave tasks on the allocated hosts (through qrsh -inherit) and so keeps full control over them, which allows correct accounting and resource limits. More formal definitions of this property exist elsewhere, but in short, certain applications are built to run in a cluster environment (in our case SGE) and can therefore hand the startup and management of their processes over to it. One such example is Open MPI.
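A job script for a tightly integrated PE might look like the sketch below. It assumes an Open MPI installation built with SGE support (so that mpirun picks up the allocation on its own) and uses the orte PE; the job name, the slot count, the helper `slots_per_host`, and the program `./my_mpi_app` are placeholders rather than anything defined on the cluster. `NSLOTS` and `PE_HOSTFILE` are set by SGE for parallel jobs.

```shell
#!/bin/bash
#$ -N pe_demo
#$ -cwd
#$ -pe orte 8

# slots_per_host FILE (hypothetical helper)
# Print "hostname slots" for each line of an SGE PE hostfile, whose lines
# have the form: hostname slots queue processor_range
slots_per_host() {
    awk '{ print $1, $2 }' "$1"
}

# Only act when SGE has actually set up a parallel job.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "Granted $NSLOTS slots on:"
    slots_per_host "$PE_HOSTFILE"
    # No hostfile is passed here: with tight integration, mpirun is
    # expected to read the allocation from SGE itself.
    mpirun -np "$NSLOTS" ./my_mpi_app
fi
```

The script would be submitted with `qsub`, and the number after `-pe orte` can be overridden on the command line, for example `qsub -pe orte 16 job.sh`.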
user$ qconf -spl
mpi
orte
mpich
denovo

user$ qconf -sp mpi
pe_name            mpi
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpi/startmpi.sh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
allocation_rule    $fill_up
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary TRUE

user$ qconf -sp orte
pe_name            orte
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE

user$ qconf -sp mpich
pe_name            mpich
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE

user$ qconf -sp denovo
pe_name            denovo
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpi/startmpi.sh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE