[padb] [padb-devel] Simple Makefile patch

Ashley Pittman ashley at pittman.co.uk
Mon Nov 2 17:09:25 GMT 2009


On Mon, 2009-11-02 at 11:38 -0500, Ethan Mallove wrote:
> > 
> > http://code.google.com/p/padb/source/detail?r=303
> > 
> 
> Looks like it's failing before minfo.x gets a chance to run:

>   einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8087.
>   einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8087.
>   einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8087.
>   DEBUG (full_duplex):   1: Reply from inner, 84 bytes
>   Warning, failed to locate any ranks

This is the problem: in this case it's failing to find any ranks.  When
we had it running before there wasn't any Solaris code in
get_extended_process_list(), which meant it called ps many times, at
least once for each process.  On Linux (which can be tricked into
running in Solaris mode) this slowed it down, so I added code to run
ps just once for all processes.
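
To illustrate the single-ps idea, here is a minimal standalone sketch. It
is not the code in padb: the helper name parse_ps_output and the PID/PPID
column layout are assumptions made for the example, and it parses a
captured listing rather than running ps itself.

```perl
use strict;
use warnings;

# Hypothetical helper, not the padb implementation: parse one "ps" listing
# (a header line plus one row per process) into a hash keyed by pid.
# The PID/PPID column layout is an assumption for illustration.
sub parse_ps_output {
    my ($text) = @_;
    my %procs;
    my @lines = split /\n/, $text;
    shift @lines;    # drop the header line
    foreach my $line (@lines) {

        # split ' ' skips leading whitespace, so left-padded columns
        # parse the same way as unpadded ones.
        my ( $pid, $ppid ) = split ' ', $line;
        next unless defined $ppid;    # skip blank or short lines
        $procs{$pid} = { ppid => $ppid };
    }
    return \%procs;
}

# Made-up sample listing with differing amounts of padding.
my $sample = "  PID  PPID\n    1     0\n  842     1\n12345   842\n";
my $procs  = parse_ps_output($sample);
foreach my $pid ( sort { $a <=> $b } keys %$procs ) {
    print "$pid -> $procs->{$pid}{ppid}\n";
}
```

Splitting on whitespace rather than on fixed column offsets is one way to
avoid depending on how a given ps pads its output.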

Can you look at the comment at line 863, and in particular the bit about
needing to check that the Solaris and Linux ps commands pad in the same
way?  This patch should show you what you need to know: there should be
no undefined or empty string values in the %process_data hash.

Index: padb
===================================================================
--- padb	(revision 311)
+++ padb	(working copy)
@@ -8074,6 +8074,9 @@
 
     my $ipids = $inner_conf{rmpids};
 
+    print Dumper \%process_data;
+    print Dumper $ipids;
+
     foreach my $pid ( keys %process_data ) {
 
         # The resource manager pid this pid is associated with.

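If the Dumper output is long, a check along these lines could pick out the
bad entries directly. This is only a sketch: find_empty_values is a
hypothetical helper, not something in padb, and it assumes a flat hash of
scalar values, which may not match the real layout of %process_data.

```perl
use strict;
use warnings;

# Hypothetical check, not part of padb: return the keys of a hash whose
# values are undefined or the empty string.
sub find_empty_values {
    my ($data) = @_;
    my @bad = grep { !defined $data->{$_} || $data->{$_} eq '' }
      sort keys %$data;
    return @bad;
}

# Made-up sample data; the real %process_data layout may differ.
my %sample = ( 100 => 42, 101 => '', 102 => undef, 103 => 0 );
my @bad = find_empty_values( \%sample );
print "Suspect pids: @bad\n";
```

Note that a value of 0 is deliberately not flagged, since it is a valid
number and would not trigger the "isn't numeric" warning.
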

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk




