[padb] Re: Bull changes (with LSF -mpich2 wrapper and -openmpi_wrapper combined)
thipadin.seng-long at bull.net
Tue Feb 16 23:29:35 GMT 2010
On 16 Feb 2010, at 23:16 Ashley Pittman <ashley at pittman.co.uk> wrote:
>On 16 Feb 2010, at 16:08, thipadin.seng-long at bull.net wrote:
>> I guess it should have been 'slurp_cmd' instead of 'slurm_cmd'.
>> I'll modify myself and re-try.
>
>Fixed. It shouldn't affect anyone other than you so I won't make another
>beta release at this stage if you're happy to make the change locally
>yourself.
>
I tested further and there is still another problem; I guess it comes
from the ps command you changed.
[senglont at artemis1 lsf-ompi]$ ./padb -O rmgr=lsf -tx 1516
Use of uninitialized value in numeric eq (==) at ./padb line 2896.
Use of uninitialized value in numeric eq (==) at ./padb line 2896.
Use of uninitialized value in numeric eq (==) at ./padb line 2896.
Use of uninitialized value in numeric eq (==) at ./padb line 2896.
Here's the result of the break point after the call to slurp_cmd:
[senglont at artemis1 lsf-ompi]$ perl -d ./padb -O rmgr=lsf -tx 1516
Loading DB routines from perl5db.pl version 1.28
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.
main::(./padb:345): my $svn_revision_string = '$Revision: 389 $';
DB<1> b 2939
DB<2> b 2942
DB<3> c
main::lsfmpi_get_mpiproc(./padb:2939):
2939: my @handle =
2940: slurp_remote_cmd( $host, "ps -o pid=,ppid=,cmd= -u $target_user" );
DB<3> c
main::lsfmpi_get_mpiproc(./padb:2942):
2942: $count_line = @handle;
DB<3> p @handle
,ppid=,cmd=
16179
16180
16184
16185
16187
16191
16193
16194
16195
16196
16201
16202
16203
16207
16208
16210
16214
16215
16216
16217
16218
16221
21554
21555
DB<4>
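A possible explanation for the output above, offered as an assumption rather than a confirmed diagnosis: the procps ps man page notes that the behaviour of a form like "-o pid=X,comm=Y" varies, and the result may be a single column whose header is everything after the first "=". That would match the ",ppid=,cmd=" header followed by PIDs only. The man page suggests using multiple -o options when in doubt. The sketch below shows that form together with a defensive parser; build_ps_cmd and parse_ps_lines are hypothetical helpers for illustration, not functions that exist in padb.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of padb): build a ps command that gives
# each column its own -o option, so every header is empty and ps prints
# no header line at all.
sub build_ps_cmd {
    my ($user) = @_;
    return "ps -o pid= -o ppid= -o cmd= -u $user";
}

# Hypothetical helper: turn "pid ppid cmd" lines into hash references.
# Lines that do not start with two integers (a header line, or a line
# holding only a pid) are skipped instead of yielding undefined fields.
sub parse_ps_lines {
    my @lines = @_;
    my @procs;
    foreach my $line (@lines) {
        next unless $line =~ /^\s*(\d+)\s+(\d+)\s+(.*?)\s*$/;
        push @procs, { pid => $1, ppid => $2, cmd => $3 };
    }
    return @procs;
}

# Sample input taken from the ps output quoted in this message.
my @sample = (
    "  PID  PPID CMD\n",
    "16180 16179 /bin/sh /home_nfs/senglont/.lsbatch/1266322840.1516\n",
    "16184 16180 /bin/bash /home_nfs/senglont/.lsbatch/1266322840.1516.shell\n",
);

foreach my $proc ( parse_ps_lines(@sample) ) {
    printf "%d %d %s\n", $proc->{pid}, $proc->{ppid}, $proc->{cmd};
}
```

Skipping malformed lines in the parser is one way to avoid comparing an undefined ppid with "==", which is the kind of comparison that would produce "Use of uninitialized value in numeric eq" warnings.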
In my version the ps command was:
my $cmd = "ssh $host ps -o pid,ppid,cmd -u $target_user ";
which displays this on host 'artemis4':
[senglont at artemis1 lsf-ompi]$ ssh artemis4 ps -o pid,ppid,cmd -u senglont
PID PPID CMD
16179 2787 /usr/share/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/res -d /usr/share/lsf/conf -m artemis1 /home_nfs/senglont/.lsbatch/1266322840.1516
16180 16179 /bin/sh /home_nfs/senglont/.lsbatch/1266322840.1516
16184 16180 /bin/bash /home_nfs/senglont/.lsbatch/1266322840.1516.shell
16185 16184 pam -g /usr/share/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/openmpi_wrapper --prefix /home_nfs/senglont/ompi_inst/1.3.3/ ./pp_sndrcv_spbl
16187 16185 /bin/sh /usr/share/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/openmpi_wrapper --prefix /home_nfs/senglont/ompi_inst/1.3.3/ ./pp_sndrcv_spbl
16191 16187 mpirun --app /home_nfs/senglont/.openmpi_appfile_1516
...................................;
.........................................................................
So can you tell me what you intended the new ps command to do?
Thipadin.