[padb] Re: Bull changes (with LSF -mpich2 wrapper and -openmpi_wrapper combined)
thipadin.seng-long at bull.net
Tue Feb 16 16:08:55 GMT 2010
On 02/15/2010 19:17 Ashley Pittman <ashley at pittman.co.uk> wrote:
>On 9 Feb 2010, at 16:03, thipadin.seng-long at bull.net wrote:
>> I've finally combined my previous code for the mpich2 and openmpi
>> wrappers on LSF, as we discussed.
>> I hope you haven't yet committed the previous submission.
>> On the "outer" side we can store different combined jobs (whether
>> mpich2 or openmpi) in the table.
>> Each job is tagged with jobid{lsf_mpi} = 1 for mpich2 and 2 for openmpi.
>> The flag is passed through inner_conf{lsf_mpi} to the inner processes
>> so they can apply different handling for each wrapper to find the
>> processes.
>> The RMGR is 'lsf-mpiwr' (for "MPI wrapper"), as the job must be launched
>> by a wrapper. So it can be used for further MPI wrappers.
>
>I've renamed the rmgr as lsf rather than lsf-mpiwr, as the -mpiwr only
>serves to add confusion. If and when better LSF support comes along it
>can share the same rmgr setting. I also changed lsf_mpi to lsf_mode and
>gave it string values instead of int values as well, as this should make
>the code easier to read.
>
>> I've enjoyed meeting you. Hoping you can come often to CEA.
>> I hope you'll commit it soon, as we expect to deliver to CEA soon.
>
>Thank you very much for the patch. I'm back from holiday now, so I have
>some time to look at this again.
>
>I've committed a variant as r388. I hope I haven't broken anything, but
>can you test it please? I'm interested to see the output if a valid LSF
>job is specified but it doesn't use a wrapper of the correct style: is a
>correct and clear error message given in this case? As I said, I don't
>have access to LSF myself, so I've tried to keep any changes to a minimum.
>Ashley,
I tested the 3.2 beta0 release. There is one leftover call to slurm_cmd at
line 919, as shown below:
[senglont@artemis1 lsf-ompi]$ ./padb -O rmgr=lsf -atx
Undefined subroutine &main::slurm_cmd called at ./padb line 919.
[senglont@artemis1 lsf-ompi]$ ./padb -V
padb version 3.2 (Revision 389)
Written by Ashley Pittman
http://padb.pittman.org.uk
[senglont@artemis1 lsf-ompi]$
The source is:
sub slurp_remote_cmd {
    my ( $host, $cmd ) = @_;
    return slurm_cmd("ssh $host $cmd");
}
I guess it should have been 'slurp_cmd' instead of 'slurm_cmd'.
I'll fix it myself and retry.
Thipadin.