[padb] Réf. : Re: [padb-devel] Patch for Support of PBS Pro resource manager

Ashley Pittman ashley at pittman.co.uk
Wed Nov 18 08:00:54 GMT 2009


On Tue, 2009-11-17 at 17:17 +0100, thipadin.seng-long at bull.net wrote:
>  I have to break on the screen to get prompt: 
> So I guess it is a infinite loop. 
> I have changed  'while(@output)'  for 'foreach(@output)', to correct
> this probleme.

That looks like a simple mistake on my part.  I prefer not to use $_ in
my code (either implicitly or explicitly) as I think it makes it less
readable.  Once the code works it's easy enough to make the variables
explicit, however; I just didn't do it in the patch because I prefer to
change as little as I can possibly get away with unless I can test it
immediately.

> 2- Job is not found: 
> 
> So when the loop is disappeared  i can go further: 
> 
> ./padb -O rmgr=pbs -tx 27611.xn0 
> Job 27611.xn0 is not active 
> [thipa at xn5]$ qstat 
> Job id            Name             User              Time Use S Queue 
> ----------------  ---------------- ----------------  -------- - ----- 
> 27611.xn0         STDIN            thipa             00:00:06 R workq

>            
> [thipa at xn5]$ 
> 
> The jobs that are display by qstat have the suffice with .xn0 (which
> is the server), 
> so we used to pick up the whole job id as input jobid. 
> So something have to be changed (code or synopsis).

I guess the job id as you requested it (27611.xn0) does not match what
is returned by pbs_get_jobs(), so there is a problem here to do with the
server.  In the past all job ids have been numeric.  This hasn't been a
problem but isn't something that I've strived for; it's just that so far
all resource managers have worked that way so that's how I think of it.
There is no technical reason for this to be true, however, so how about
we just say that in the future job ids have to be alphanumeric strings?
This would work in this case, although it would have the downside that
you couldn't specify the job as 27611 in the case above.

padb --show-jobs should show you what padb thinks the job ids are, and
of course using -a rather than specifying a job tells it to use all
jobs, so it'll just attempt to target the one in the case above,
regardless of what it thinks it's called.

I'd be happy for a patch supporting either implementation, i.e. I don't
have a strong preference either way.   You can either have the jobid
encompass both the number and the server or you could continue with what
I attempted to encode in the patch I sent you where the job id is the
number and the server becomes a configuration option.
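
As a rough illustration of how the two options could coexist, here is a
minimal sketch.  It is written in Python rather than padb's Perl, it is
untested against a real PBS system, and split_job_id, job_matches and
default_server are hypothetical names, not anything in padb.  It assumes
a job id is either a bare number ("27611") or number.server
("27611.xn0").

```python
def split_job_id(job_id):
    """Split '27611.xn0' into ('27611', 'xn0').

    A bare '27611' gives ('27611', None).
    """
    number, sep, server = job_id.partition(".")
    return number, (server if sep else None)


def job_matches(requested, reported, default_server=None):
    """Return True if the id the user typed refers to the reported job.

    default_server stands in for a (hypothetical) configuration option
    that is used when the user gives only the number.
    """
    req_num, req_server = split_job_id(requested)
    rep_num, rep_server = split_job_id(reported)
    if req_num != rep_num:
        return False
    if req_server is None:
        req_server = default_server
    # If either side has no server part, compare on the number alone.
    if req_server is None or rep_server is None:
        return True
    return req_server == rep_server
```

With this, both "27611" and "27611.xn0" would select the job that qstat
lists as 27611.xn0, while "27611.xn1" would not.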

Actually this could make life easier for slurm and the way it handles
job steps: it effectively appends ".0" to the padb job id before handing
it over to slurm, so this could probably be simplified if the .0 became
an optional part of the job id itself rather than a separate
configuration option.
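
The same idea for slurm might look something like the sketch below,
again in Python purely for illustration.  slurm_step_id and default_step
are hypothetical names, and the assumption is that a job id without a
step part simply gets step 0.

```python
def slurm_step_id(job_id, default_step="0"):
    """Return 'job.step', adding the default step only when none is given."""
    job, sep, step = job_id.partition(".")
    if not sep or step == "":
        # No step in the id the user supplied, so fall back to the default.
        step = default_step
    return f"{job}.{step}"
```

A job id that already carries a step, such as "1234.2", is passed
through unchanged.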

> I am waiting for your patch (or reply) to continue.

I hope this helps you along the way.  I can't really code anything from
here as I don't have access to a pbs system.

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk




