[padb] Re: Bullchanges (with LSF -mpich2 wrapper and -openmpi_wrapper combined)

Ashley Pittman ashley at pittman.co.uk
Mon Feb 15 18:17:10 GMT 2010


On 9 Feb 2010, at 16:03, thipadin.seng-long at bull.net wrote:

> I've finally combined my previous code for the mpich2 and openmpi wrappers on LSF, as we discussed.
> I hope you haven't yet committed the previous submission.
> On the "outer" side we can store different combined jobs (whether mpich2 or openmpi) in the table.
> Each job is tagged with jobid{lsf_mpi} = 1 for mpich2 and 2 for openmpi.
> The flag is passed through inner_conf{lsf_mpi} to the inner processes so they can handle each wrapper differently when finding the processes.
> The RMGR is 'lsf-mpiwr' (for "mpi wrapper"), as the job must be launched by a wrapper, so it can be used for further MPI wrappers.

I've renamed the rmgr to lsf rather than lsf-mpiwr, as the -mpiwr suffix only serves to add confusion.  If and when better LSF support comes along, it can share the same rmgr setting.  I also changed lsf_mpi to lsf_mode and gave it string values instead of int values, as this should make the code easier to read.
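
As a rough illustration of why string values read better than integer flags, here is a minimal sketch.  It is written in Python purely for illustration (padb itself is Perl), and the names legacy_flag_to_mode and describe_job, the job ids, and the exact strings "mpich2" and "openmpi" are assumptions rather than what was committed.

```python
# Hypothetical illustration only: padb is written in Perl and its real
# data structures and lsf_mode values may differ from what is shown here.

# Old scheme: integer flags whose meaning the reader has to remember.
LSF_MPI_MPICH2 = 1
LSF_MPI_OPENMPI = 2

# New scheme: self-describing string values (assumed spellings).
VALID_LSF_MODES = ("mpich2", "openmpi")


def legacy_flag_to_mode(flag):
    """Map an old integer lsf_mpi flag to a string lsf_mode value."""
    mapping = {LSF_MPI_MPICH2: "mpich2", LSF_MPI_OPENMPI: "openmpi"}
    if flag not in mapping:
        raise ValueError(f"unknown lsf_mpi flag: {flag!r}")
    return mapping[flag]


def describe_job(job):
    """Describe a job record tagged with lsf_mode, rejecting unknown modes."""
    mode = job["lsf_mode"]
    if mode not in VALID_LSF_MODES:
        raise ValueError(f"unsupported lsf_mode: {mode!r}")
    return f"job {job['jobid']} uses the {mode} wrapper"


# Made-up job ids, one per wrapper style.
jobs = {
    101: {"jobid": 101, "lsf_mode": legacy_flag_to_mode(1)},
    102: {"jobid": 102, "lsf_mode": legacy_flag_to_mode(2)},
}

for job in jobs.values():
    print(describe_job(job))
```

With strings, a comparison such as mode == "openmpi" explains itself at the call site, whereas lsf_mpi == 2 needs a comment or a lookup to be understood.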

> I enjoyed meeting you and hope you can come to CEA often.
> I hope you'll commit it soon, as we expect to deliver to CEA soon.

Thank you very much for the patch.  I'm back from holiday now, so I have some time to look at this again.

I've committed a variant as r388.  I hope I haven't broken anything, but can you test it please?  I'm interested to see the output when a valid LSF job is specified but it doesn't use a wrapper of the correct style: is a correct and clear error message given in this case?  As I said, I don't have access to LSF myself, so I've tried to keep changes to a minimum.

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk




