[padb-users] running with SGE/OMPI

Dave Love d.love at liverpool.ac.uk
Fri Jul 9 15:32:18 BST 2010


Ashley Pittman <ashley at pittman.co.uk> writes:

> On 8 Jul 2010, at 16:38, Dave Love wrote:
>
>> I'd like to use padb with OpenMPI jobs under Gridengine.
>
> In padb-parlance Gridengine would be the scheduler which padb is
> un-interested in and given you mention ompi-ps presumably the resource
> manager is orte.  Padb only interfaces with the resource manager of
> these two.

I assumed Gridengine is relevant (a) in referring to `jobs', and (b) in
that I think the OpenMPI tight integration matters, at least because
ompi-ps appears to be looking in the wrong place for files.

> You have two choices here, you can either use "orte" as the resource
> manager in which case a working ompi-ps is required or you can use
> "mpirun" as the resource manager in which case it'll attach to the
> orterun (or mpirun) process with gdb and read the data it needs
> directly.  In both of these cases the data is only available, and
> hence you'll need to run padb, on the node where the orterun process
> is running, given you are using Gridengine finding this node could be
> a non-trivial problem but it depends on your setup.

Finding the node is easy, but neither mpirun nor orte works.  With mpirun I get

Error, resource manager "mpirun" not supported

and orte doesn't find any jobs because ompi-ps doesn't find any either.
I'll try to figure out what's going on when I get some time.

> be aware that using an incorrect ompi-ps version can cause orted to
> crash and running jobs to fail so tread carefully.

Thanks for the warning and the rest.



