[padb] [padb-devel] Simple Makefile patch

Ethan Mallove ethan.mallove at sun.com
Mon Nov 2 17:24:26 GMT 2009


On Mon, Nov/02/2009 05:09:25PM, Ashley Pittman wrote:
> On Mon, 2009-11-02 at 11:38 -0500, Ethan Mallove wrote:
> > > 
> > > http://code.google.com/p/padb/source/detail?r=303
> > > 
> > 
> > Looks like it's failing before minfo.x gets a chance to run:
> 
> >   einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8087.
> >   einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8087.
> >   einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8087.
> >   DEBUG (full_duplex):   1: Reply from inner, 84 bytes
> >   Warning, failed to locate any ranks
> 
> This is the problem: in this case it's failing to find any ranks.  When
> we had it running before, there wasn't any Solaris code in
> get_extended_process_list(), which meant it called ps many times, at
> least once for each process.  On Linux (which can be tricked into
> running in Solaris mode) this slowed it down, so I added code to run
> ps just once for all processes.
> 
> Can you look at the comment at line 863, in particular the bit about
> needing to check that the Solaris and Linux ps commands pad in the same
> way?  This patch should show you what you need to know; there should be
> no undefined or empty string values in the %process_data hash.
> 
> Index: padb
> ===================================================================
> --- padb	(revision 311)
> +++ padb	(working copy)
> @@ -8074,6 +8074,9 @@
>  
>      my $ipids = $inner_conf{rmpids};
>  
> +    print Dumper \%process_data;
> +    print Dumper $ipids;
> +
>      foreach my $pid ( keys %process_data ) {
>  
>          # The resource manager pid this pid is associated with.
> 

$ padb --debug=all --config-option rmgr=mpirun --full-report=27047
DEBUG (config):   0: Finished setting configuration options
padb version 3.n (Revision 311)
full job report for job 27047

DEBUG (pcmd):   0: Loaded pcmd data
DEBUG (verbose):   0: There are 1 processes over 1 hosts
DEBUG (verbose):   0: Remote process data available on frontend
DEBUG (show_cmd):   0:  /home/em162155/software/SunOS/sparc/padb/bin/padb --inner
DEBUG (signon):   1: Received last signon, connecting to inner
DEBUG (ctree):   1: connection tree
DEBUG (full_duplex):   1: Sending command to inner, 368 bytes
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
einner: Argument "" isn't numeric in numeric ne (!=) at /home/em162155/software/SunOS/sparc/padb/bin/padb line 8090.
DEBUG (full_duplex):   1: Reply from inner, 84 bytes
Warning, failed to locate any ranks
DEBUG (full_duplex):   1: Sending command to inner, 28 bytes
DEBUG (full_duplex):   1: Reply from inner, 84 bytes
inner: $VAR1 = {
inner:           '1' => '',
inner:           '11888' => '',
inner:           '11895' => '',
inner:           '1746' => '1692',
inner:           '1752' => '1746',
inner:           '23506' => '',
inner:           '23513' => '',
inner:           '23970' => '',
inner:           '23977' => '',
inner:           '23979' => '',
inner:           '24858' => '',
inner:           '24865' => '',
inner:           '24867' => '',
inner:           '2595' => '1',
inner:           '26726' => '',
inner:           '26733' => '',
inner:           '27047' => '',
inner:           '27381' => '',
inner:           '27383' => '',
inner:           '27384' => '',
inner:           '27696' => '',
inner:           '27703' => '',
inner:           '28208' => '',
inner:           '28215' => '',
inner:           '29242' => '',
inner:           '29249' => '',
inner:           '3538' => '1',
inner:           '3540' => '1',
inner:           '3544' => '1',
inner:           '3548' => '1',
inner:           '3550' => '1',
inner:           '5113' => '5106',
inner:           '5115' => '5113',
inner:           '6764' => '6757',
inner:           '6766' => '6764',
inner:           '948' => '1',
inner:           '950' => '1',
inner:           '955' => '1',
inner:           '959' => '1'
inner:         };
inner: $VAR1 = {
inner:           '27049' => {
inner:                        rank => '0'
inner:                      }
inner:         };
DEBUG (verbose):   1: Completed command
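
There are 23 "isn't numeric" warnings and 23 empty values in the first dump: pid 1 and every five-digit pid. A minimal sketch for listing only the suspect entries instead of dumping the whole hash follows. The helper name bad_entries is hypothetical and not part of padb; the sample values are copied from the dump above.

```perl
use strict;
use warnings;

# Hypothetical helper (not padb code): return the pids whose recorded
# parent value is undefined, empty, or otherwise not a plain integer.
sub bad_entries {
    my ($process_data) = @_;
    return sort { $a <=> $b }
      grep {
        !defined $process_data->{$_}
          || $process_data->{$_} !~ m{\A\d+\z}
      } keys %{$process_data};
}

# A few entries taken from the Dumper output in this message.
my %sample = ( 1 => '', 948 => '1', 1746 => '1692', 27047 => '' );
my @bad = bad_entries( \%sample );
print "suspect pids: @bad\n";
```

Each pid this reports would trigger one warning if its value were then compared with a numeric operator such as != while warnings are enabled.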

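On the padding question, here is a sketch of parsing two-column ps output so that leading spaces do not matter. It assumes the input is "pid ppid" lines with a header row. The function name parse_ps_lines and the sample lines are made up for illustration; this is not the code in get_extended_process_list().

```perl
use strict;
use warnings;

# Hypothetical helper (not padb code): build a pid => ppid hash from
# lines of two-column ps output, ignoring any column padding.
sub parse_ps_lines {
    my (@lines) = @_;
    my %process_data;
    foreach my $line (@lines) {

        # Optional leading padding, two integers, optional trailing
        # whitespace.  The header row and malformed lines are skipped.
        next unless $line =~ m{\A\s*(\d+)\s+(\d+)\s*\z};
        $process_data{$1} = $2;
    }
    return \%process_data;
}

# Made-up sample input: a padded short pid and an unpadded five-digit pid.
my $data = parse_ps_lines(
    '  PID  PPID',
    ' 1746  1692',
    '27047 27040',
    "    1     0\n",
);
printf "%s => %s\n", $_, $data->{$_} for sort { $a <=> $b } keys %{$data};
```

Matching on the digits rather than splitting on a fixed field position means a line that starts in column 0 is handled the same way as one with leading padding. Whether that is the cause of the empty values here is an assumption I have not confirmed.
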
-Ethan


> 
> -- 
> 
> Ashley Pittman, Bath, UK.
> 
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk
> 



