[padb] Re: [padb-devel] Patch for Support of PBS Pro resource manager

Ashley Pittman ashley at pittman.co.uk
Wed Nov 18 17:10:10 GMT 2009


On Wed, 2009-11-18 at 16:48 +0100, thipadin.seng-long at bull.net wrote:

> 1- Path of remote padb: 
> 
> [thipa at xn5 padb_open]$ ./padb -O rmgr=pbs -tx 27611 
> einner: xn20: bash: ./padb: No such file or directory 

> As a consequence the path is not found,
> so the path to padb on the remote host must be a full path.

I know about this one and don't have a generic solution. Other resource
managers suffer from it as well; mpd springs to mind. It should only
occur when developing padb: if you aren't running it as ./padb then it's
probably installed somewhere and will also be installed on the remote nodes.

As a workaround I often type `pwd`/padb, which makes it work. It's not
ideal, however.
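
The `pwd`/padb workaround can be wrapped in a small shell helper. This is
only a sketch: the function name abs_path is made up for illustration and
is not part of padb. It assumes a POSIX shell and that the directory
holding the script exists on the local host.

```shell
#!/bin/sh
# Hypothetical helper (not part of padb): print an absolute path for a
# program given as a relative or absolute path.
abs_path() {
    case "$1" in
        /*)
            # Already absolute, pass it through unchanged.
            printf '%s\n' "$1"
            ;;
        *)
            # Resolve the directory part in a subshell so the caller's
            # working directory is left alone.
            abs_path_dir=$(cd "$(dirname -- "$1")" 2>/dev/null && pwd) || return 1
            # Strip a trailing slash so a script in / does not become //name.
            printf '%s/%s\n' "${abs_path_dir%/}" "$(basename -- "$1")"
            ;;
    esac
}
```

It would be used as "$(abs_path ./padb)" -O rmgr=pbs -tx 27611, which is
the same as typing `pwd`/padb by hand. It does nothing for the case where
the remote nodes do not share the same filesystem layout.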

> I did the patch as follows: 
>  [snip]

> If you have another idea, I'll take it.

How does this work if you run, say, ./src/padb -axt?  If it works in that
case then I'm happy with the code and I'll commit it. I've not added
anything before because I couldn't think of a generalised solution.

> 2- Use of uninitialized value in subtraction (-) at ./padb line 4077 
> 
>  
> 4077     foreach my $proc ( 0 .. $comm_data->{nprocesses} - 1 ) { 

Are you able to extract the process count from the job id and return it
as "nprocesses" in the hash returned by pbs_setup_job()?  I'm not
familiar with qstat so I don't know how to find this information.
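
One possible starting point, sketched under assumptions rather than
tested against a real PBS Pro installation: `qstat -f <jobid>` prints job
attributes as "Resource_List.<name> = <value>" lines, and one of those
values could be picked out as below. The function name pbs_resource is
made up for illustration, and whether ncpus, mpiprocs or some other
attribute matches the real process count depends on how the job was
submitted, so that choice is an assumption too.

```shell
#!/bin/sh
# Hypothetical helper (not part of padb): read "qstat -f" style output on
# stdin and print the value of a single Resource_List attribute.
#   $1 - attribute name, e.g. ncpus (which attribute to use is an assumption)
pbs_resource() {
    # Keep only the text after "Resource_List.<name> = " and print the
    # first match; print nothing if the attribute is absent.
    sed -n "s/^[[:space:]]*Resource_List\.$1[[:space:]]*=[[:space:]]*//p" | head -n 1
}
```

It would be called as qstat -f 27611.xn0 | pbs_resource ncpus. An empty
result means the attribute was not found, in which case "nprocesses"
would still be undefined and the warning at line 4077 would remain.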

> 3- Question about starting inner padb: 
> 
> How can I start an inner padb by hand on a remote host to debug it,
> such as:
> perl -d ./padb --inner --jobid=27611.xn0 --stack-trace -O rmgr="pbs"
> --line-formatted 
> as I did before? This command doesn't work anymore.
> You have changed it to use callbacks and communication over ports.

You're right that debugging padb in the new model is a lot more
difficult. --debug full_duplex=all will show all comms between the
inner and the outer process, or use --debug all=all and padb will spit
out as much as it can.

I'm not familiar with perl -d so can't help you on that front.

> Here is the diff again r311 (diff r311 newone). 
> 
> So you can integrate my new patch and try to correct point 2,
> then send me back the new one and I will test it again.

I'll be able to take a closer look when I'm back from SC. I only have my
netbook with me and am not able to test anything from here. The patch
looks good so far, however.

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk




