[padb] Re: Bullchanges (with LSF -mpich2 wrapper and -openmpi_wrapper combined)

thipadin.seng-long at bull.net thipadin.seng-long at bull.net
Wed Feb 17 09:12:43 GMT 2010


Hi,
On reflection, I guess you did not want the header line of the ps command.
However, I think renaming several headers within a single -o option does
not work, as shown here:
[senglont at artemis1 lsf-ompi]$ ps -o ppid=X,comm=Y -u senglont
X,comm=Y
    4080
    4082
    4162
    4164
    4165
   21488
   21491
   21492
    4083

This is because every character after the first '=' is taken as the header
text, so I think you should use a separate -o option for each renamed header:

[senglont at artemis1 lsf-ompi]$ ps -o pid= -o ppid= -o comm= -u senglont
 4082  4080 sshd
 4083  4082 bash
 4164  4162 sshd
 4165  4164 bash
21488  4165 man
21491 21488 sh
21492 21491 sh
21497 21492 less
23548  4083 ps
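
The header-less "pid ppid comm" lines produced by the command above are easy to consume one field at a time. The following is only a minimal sketch in POSIX shell, not code from padb; the function name children_of is hypothetical and was chosen for illustration.

```shell
# Hypothetical helper (not part of padb): print "pid comm" for every
# process whose parent PID equals $1.
# Input on stdin: header-less "pid ppid comm" lines, as produced by
#   ps -o pid= -o ppid= -o comm= -u USER
children_of() {
    parent=$1
    while read -r pid ppid comm; do
        # Ignore blank lines.
        [ -n "$pid" ] || continue
        if [ "$ppid" = "$parent" ]; then
            printf '%s %s\n' "$pid" "$comm"
        fi
    done
}
```

It could be used as: ps -o pid= -o ppid= -o comm= -u senglont | children_of 4082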

I'll modify and re-try.

Thipadin

Thipadin Seng-Long
02/17/2010 12:29 AM


        To:      Ashley Pittman <ashley at pittman.co.uk>
        cc:      Andry.Razafinjatovo at bull.net, florence.vallee at bull.net,
                 padb-devel at pittman.org.uk, Sylvain Jeaugey <sylvain.jeaugey at bull.net>
        Subject: Re: [padb] Bullchanges (with LSF -mpich2 wrapper and -openmpi_wrapper combined)

On 16 Feb 2010, at 23:16 Ashley Pittman <ashley at pittman.co.uk> wrote:

>On 16 Feb 2010, at 16:08, thipadin.seng-long at bull.net wrote:
>> I guess it should have been 'slurp_cmd' instead of 'slurm_cmd'. 
>> I'll modify myself and re-try. 
>
>Fixed.  It shouldn't affect anyone other than you so I won't make another
>beta release at this stage if you're happy to make the change locally
>yourself.
>

I tested further and there is still another problem; I guess it comes
from the ps command you changed.

[senglont at artemis1 lsf-ompi]$ ./padb -O rmgr=lsf -tx 1516
Use of uninitialized value in numeric eq (==) at ./padb line 2896.
Use of uninitialized value in numeric eq (==) at ./padb line 2896.
Use of uninitialized value in numeric eq (==) at ./padb line 2896.
Use of uninitialized value in numeric eq (==) at ./padb line 2896.

Here is what the debugger shows at breakpoints set around the call to slurp_remote_cmd:

[senglont at artemis1 lsf-ompi]$ perl -d ./padb -O rmgr=lsf -tx 1516

Loading DB routines from perl5db.pl version 1.28
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.
main::(./padb:345):     my $svn_revision_string = '$Revision: 389 $';
  DB<1> b 2939
  DB<2> b 2942
  DB<3> c
main::lsfmpi_get_mpiproc(./padb:2939):
2939:       my @handle =
2940:         slurp_remote_cmd( $host, "ps -o pid=,ppid=,cmd= -u $target_user" );
  DB<3> c
main::lsfmpi_get_mpiproc(./padb:2942):
2942:       $count_line = @handle;
  DB<3> p @handle
,ppid=,cmd=
      16179
      16180
      16184
      16185
      16187
      16191
      16193
      16194
      16195
      16196
      16201
      16202
      16203
      16207
      16208
      16210
      16214
      16215
      16216
      16217
      16218
      16221
      21554
      21555
  DB<4> 
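
The listing above contains a stray header line and rows with a single field, which is consistent with the "uninitialized value" warnings seen when the parent PID is compared numerically. As a hedged illustration only (this is not padb code, and the names is_number and filter_ps_lines are hypothetical), malformed rows could be rejected before any comparison takes place:

```shell
# Hypothetical helpers (not part of padb).
# Succeed only if $1 is a non-empty string of decimal digits.
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Copy well-formed "pid ppid cmd..." lines from stdin to stdout and
# report every other line on stderr instead of passing it on.
filter_ps_lines() {
    while read -r pid ppid cmd; do
        if is_number "$pid" && is_number "$ppid" && [ -n "$cmd" ]; then
            printf '%s %s %s\n' "$pid" "$ppid" "$cmd"
        else
            printf 'skipping malformed ps line: %s %s %s\n' \
                "$pid" "$ppid" "$cmd" >&2
        fi
    done
}
```

With the output shown in the debugger, every row would be reported as malformed, which makes the problem visible at the point where the ps output is read rather than later in the comparison.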


In my version the ps command was:

my $cmd = "ssh $host ps -o pid,ppid,cmd -u $target_user ";
which displays the following on host 'artemis4':

[senglont at artemis1 lsf-ompi]$ ssh artemis4 ps -o pid,ppid,cmd -u senglont
  PID  PPID CMD
16179  2787 /usr/share/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/res -d /usr/share/lsf/conf -m artemis1 /home_nfs/senglont/.lsbatch/1266322840.1516
16180 16179 /bin/sh /home_nfs/senglont/.lsbatch/1266322840.1516
16184 16180 /bin/bash /home_nfs/senglont/.lsbatch/1266322840.1516.shell
16185 16184 pam -g /usr/share/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/openmpi_wrapper --prefix /home_nfs/senglont/ompi_inst/1.3.3/ ./pp_sndrcv_spbl
16187 16185 /bin/sh /usr/share/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/openmpi_wrapper --prefix /home_nfs/senglont/ompi_inst/1.3.3/ ./pp_sndrcv_spbl
16191 16187 mpirun --app /home_nfs/senglont/.openmpi_appfile_1516 
...................................;
.........................................................................
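
In the listing above each process is the child of the one before it (16179, 16180, 16184, 16185, 16187, 16191). Assuming the goal is to follow that parent chain from the LSF res process down to mpirun, the idea can be sketched as below. This is an illustration under that assumption, not the logic padb actually uses, and the function name descendants_of is hypothetical.

```shell
# Hypothetical helper (not part of padb): print the PID of every
# descendant of the process whose PID is $1.
# Input on stdin: "pid ppid cmd..." lines; a "PID PPID CMD" header line
# is harmless because it never matches a known parent.
descendants_of() {
    awk -v root="$1" '
        { pid[NR] = $1; ppid[NR] = $2 }
        END {
            seen[root] = 1
            # Repeat until a full pass adds no new descendant, so the
            # order of the input lines does not matter.
            do {
                changed = 0
                for (i = 1; i <= NR; i++)
                    if ((ppid[i] in seen) && !(pid[i] in seen)) {
                        seen[pid[i]] = 1
                        changed = 1
                    }
            } while (changed)
            for (i = 1; i <= NR; i++)
                if ((pid[i] in seen) && pid[i] != root)
                    print pid[i]
        }'
}
```

It could be used as: ssh artemis4 ps -o pid,ppid,cmd -u senglont | descendants_of 16179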


So can you tell me what you intended the new ps command to do?

Thipadin.

