[padb] Patch of support of Slurm + Openmpi Orte manager

thipadin.seng-long at bull.net
Mon Nov 30 14:36:00 GMT 2009


Hi Ashley,

May I introduce my patch against padb r341, which adds support for Slurm 
combined with the Open MPI ORTE manager.
The key point is that we use salloc to get resources from Slurm and then run 
Open MPI's mpirun inside that allocation to start jobs.
This combination is not supported by the current padb. When we start padb with
rmgr=slurm in such a job environment, we see only the stack of the Open MPI 
launcher, not the stacks of the application ranks (see below).
My patch aims to remedy this.

Here is what happens:
salloc -p tsl -w machu139,machu140,machu141
[thipa at machu0 padb_open]$ 
salloc: Granted job allocation 8324
[thipa at machu0 padb_open]$ 
[thipa at machu0 padb_open]$ srun -n1 mpirun -n 9 pp_sndrcv_spbl
srun: Warning: can't run 1 processes on 3 nodes, setting nnodes to 1
I am, process 0 starting on machu139, total by srun  9
I am, process 3 starting on machu139, total by srun  9
I am, process 6 starting on machu139, total by srun  9
I am, process 1 starting on machu140, total by srun  9
I am, process 8 starting on machu141, total by srun  9
I am, process 4 starting on machu140, total by srun  9
I am, process 2 starting on machu141, total by srun  9
I am, process 7 starting on machu140, total by srun  9
I am, process 5 starting on machu141, total by srun  9
Me, process 0, send  1000 to process 2

...........

[thipa at machu0 padb_open]$ squeue
  JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
   8324       tsl     bash    thipa   R      36:33      3 machu[139-141]
[thipa at machu0 padb_open]$ 


padb with rmgr=slurm

[thipa at machu0 padb_open]$ ./padb -O rmgr="slurm" -O stack-shows-locals=no 
-O stack-shows-params=no --debug=verbose=all -tx 8324
DEBUG (verbose):   0: There are 1 processes over 3 hosts
-----------------
[0] (1 processes)
-----------------
main() at ?:?
  orterun() at ?:?
    opal_event_dispatch() at event.c:682
      opal_event_loop() at event.c:746
        poll_dispatch() at poll.c:167
          poll() at ?:?
DEBUG (verbose):   0: Completed command
[thipa at machu0 padb_open]$ 

padb with rmgr="sl-orte"  (my patch)


[thipa at machu0 padb_open]$ ./padb -O rmgr="sl-orte" -O 
stack-shows-locals=no  -O stack-shows-params=no --debug=verbose=all -tx 
8324
DEBUG (verbose):   0: There are 1 processes over 3 hosts
Warning, remote process state differs across ranks
state : ranks
R (running) : [2]
S (sleeping) : [0-1,3-8]
-----------------
[0-8] (9 processes)
-----------------
ThreadId: 1
  -----------------
  [0-1,3-8] (8 processes)
  -----------------
  main() at pp_sndrcv_spbl.c:55
    PMPI_Finalize() at pfinalize.c:46
      ompi_mpi_finalize() at runtime/ompi_mpi_finalize.c:224
        barrier() at grpcomm_bad_module.c:277
          opal_progress() at runtime/opal_progress.c:189
            ThreadId: 2
              start_thread() at ?:?
                btl_openib_async_thread() at btl_openib_async.c:346
                  poll() at ?:?
                    ThreadId: 3
                      start_thread() at ?:?
                        service_thread_start() at btl_openib_fd.c:427
                          select() at ?:?
  -----------------
  [2] (1 processes)
  -----------------
  main() at pp_sndrcv_spbl.c:50
    PMPI_Recv() at precv.c:78
      mca_pml_ob1_recv() at pml_ob1_irecv.c:104
        opal_progress() at runtime/opal_progress.c:207
          ThreadId: 2
            start_thread() at ?:?
              btl_openib_async_thread() at btl_openib_async.c:346
                poll() at ?:?
                  ThreadId: 3
                    start_thread() at ?:?
                      service_thread_start() at btl_openib_fd.c:427
                        select() at ?:?
DEBUG (verbose):   1: Completed command
[thipa at machu0 padb_open]$


Jobs can be started in any of the following ways:

1-salloc  ... mpirun  -n 6  openmpi_appli
2-salloc ....
    bash:  mpirun  -n 6  openmpi_appli
3-salloc ...
    bash:   srun -n 1 mpirun  -n 6  openmpi_appli

Here is the patch, which you may commit as is or rework. It supports all of 
the launch methods above.
I do not use scontrol listpids, because I found that this command is not 
universally available (some Slurm versions do not have it),
and it may issue an error message such as:
slurmd[machu139]: proctrack/pgid does not implement slurm_container_get_pids
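
One way to locate the application processes without scontrol listpids is to 
read the process table on each node and keep the children of the Open MPI 
launcher processes (mpirun/orterun on the node where it runs, orted on the 
others). The sketch below only illustrates that idea; it is not taken from 
slorte.patch, and the patch may well do it differently. The function name 
find_mpi_ranks, the two name sets, and the sample process table are 
hypothetical.

```python
# Hypothetical sketch: pick out MPI application processes from "ps" output
# by looking for children of Open MPI launcher processes.

# Processes that fork the application ranks.
LAUNCHERS = {"orted", "mpirun", "orterun"}
# Assumption: children of mpirun with these names are only used to start
# remote daemons, so they are not application ranks.
LAUNCH_HELPERS = {"srun", "ssh", "rsh"}


def find_mpi_ranks(ps_output):
    """Return the sorted PIDs whose parent is an Open MPI launcher.

    ps_output is text with one "PID PPID COMMAND" entry per line,
    for example the output of: ps -e -o pid,ppid,comm
    """
    procs = {}
    for line in ps_output.strip().splitlines():
        fields = line.split(None, 2)
        # Skip the header line and anything that is not "int int name".
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        procs[int(fields[0])] = (int(fields[1]), fields[2].strip())

    ranks = []
    for pid, (ppid, comm) in procs.items():
        parent = procs.get(ppid)
        if parent is None or parent[1] not in LAUNCHERS:
            continue  # parent is unknown or is not a launcher
        if comm in LAUNCHERS or comm in LAUNCH_HELPERS:
            continue  # a daemon or launch helper, not an application rank
        ranks.append(pid)
    return sorted(ranks)


if __name__ == "__main__":
    # Made-up process table for illustration only.
    sample = """\
      PID  PPID COMMAND
     4000     1 slurmstepd
     4010  4000 mpirun
     4011  4010 srun
     4020  4010 pp_sndrcv_spbl
     5000     1 orted
     5001  5000 pp_sndrcv_spbl
    """
    print(find_mpi_ranks(sample))
```

On a real node the input would come from something like 
"ps -e -o pid,ppid,comm", restricted to the job owner's processes. Matching on 
command names is fragile, so this should be read as a starting point rather 
than a robust method.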

Thipadin.
More later.





-------------- next part --------------
A non-text attachment was scrubbed...
Name: slorte.patch
Type: application/octet-stream
Size: 4057 bytes
Desc: not available
URL: <http://pittman.org.uk/pipermail/padb-devel_pittman.org.uk/attachments/20091130/e7bf1d14/attachment.obj>

