[padb-users] Trouble running padb with intelmpi

Duncan .H harris.duncan at gmail.com
Mon May 12 11:55:09 BST 2014


Hi,
We're having problems running padb with Intel MPI (4.x) on our systems
and are hoping for some advice on tracking down the cause.

We keep getting these errors:

--------
host3:~> padb --show-jobs
33224
host3:~> padb -tx 33224
No MPIR_proctable_size symbol found, cannot continue
No suitable backend found (perhaps try installing pdsh or clush ?)!
Fatal problem setting up the resource manager: mpirun
--------

pdsh -is- installed and available to the user.
We have Hydra as our underlying process manager.
We have gdb version 7.2-48.e16
We've tried Intel MPI 4.0.3 and 4.1.1 with the same results.


Explicitly setting the resource manager to mpirun doesn't help:

host3:~> padb  --list-rmgrs
local: 33064 33135 33171 33172 33224 33229 33230 33234 33235 33236
33237 33238 33239 33240 33241 33242 33243 33244 33245 33246 33247
33248 33249 33350 33351 33797
local-fd: No active jobs.
local-qsnet: Not detected on system.
lsf: Not detected on system.
lsf-rms: Not detected on system.
mpd: No active jobs.
mpirun: 33224
orte: Not detected on system.
pbs: Warning, job is listed with unexpected server
Warning, job is listed with unexpected server
794585 794586
rms: Not detected on system.
slurm: Not detected on system.

host3:~> export PADB_RMGR=mpirun
host3:~> padb -tx 33224
No MPIR_proctable_size symbol found, cannot continue
No suitable backend found (perhaps try installing pdsh or clush ?)!
Fatal problem setting up the resource manager: mpirun
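
Since the first error names a missing MPIR_proctable_size symbol, one
check that may help narrow this down is whether the executable behind
the job's PID exposes that symbol at all. The following is a minimal
sketch, assuming a Linux /proc filesystem and binutils nm are available.
The function names are hypothetical and are not part of padb, and a
negative result only means nm could not see the symbol (the binary may
be stripped, or the PID may belong to a wrapper script rather than the
real launcher).

```shell
#!/bin/sh
# Hypothetical helpers (not part of padb): check whether the executable
# behind a PID exposes the MPIR_proctable_size symbol.

# Succeeds if the symbol listing read from stdin contains the given
# symbol name as a whole word.
listing_has_symbol() {
    grep -qw -- "$1"
}

# Resolve a PID to its executable via /proc (Linux only) and inspect
# its symbol table with nm.
check_pid_for_mpir() {
    pid=$1
    exe=$(readlink "/proc/$pid/exe") || {
        echo "cannot resolve executable for pid $pid"
        return 2
    }
    if nm "$exe" 2>/dev/null | listing_has_symbol MPIR_proctable_size; then
        echo "$exe: MPIR_proctable_size present"
    else
        echo "$exe: MPIR_proctable_size not visible to nm"
        return 1
    fi
}
```

After sourcing the file, it could be run as `check_pid_for_mpir 33224`.
If the reported executable is a shell interpreter, the PID probably
belongs to a wrapper script and not to the launcher binary itself.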


Any suggestions?

Thanks,
Duncan



