[padb-users] Error message from /opt/sbin/libexec/minfo: No DLL to load

Rahul Nabar rpnabar at gmail.com
Thu Aug 19 00:20:42 BST 2010


On Wed, Aug 18, 2010 at 6:03 PM, Ashley Pittman <ashley at pittman.co.uk> wrote:

Thanks again very much Ashley!

> That's odd, I have 158 files in that directory including the libompi_dbg_msgq.la but not libompi_dbg_msgq.a, are you using a static build of OpenMPI or building it with --disable-dlopen?  It looks like you've found a bug with the OpenMPI build though, this .so should exist even for static builds.

I wasn't the one who built our original MPI installation, so
unfortunately I can't answer your question definitively. Is there a
way to query the build options from the MPI executable? Likewise, can
I tell from the executable whether it was compiled statically or
dynamically?

Alternatively, some snooping in the sources does reveal that
config.log has this:

 $ ./configure --prefix=/opt/ompi_new --with-tm=/opt/torque FC=ifort
CC=icc F77=ifort CXX=icpc CFLAGS=-g -O3 -mp FFLAGS=-mp -recursive -O3
CXXFLAGS=-g CPPFLAGS=-DPgiFortran --disable-shared --enable-static
--with-memory-manager --disable-dlopen --enable-openib-rdmacm
--with-openib=/usr

So it could be that we did use a static build with --disable-dlopen.
But I'd take that with some skepticism, since I cannot be 100% sure
that this was indeed the source tree that was used. Sorry. :(
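
For anyone wanting to repeat this check on another source tree, here is
a minimal sketch in Python. It assumes the config.log records the
configure invocation on a line starting with "$ " (as in the excerpt
above) and that the flags appear as separate whitespace-delimited
tokens. The function names and the abbreviated sample text are
hypothetical, for illustration only. As noted above, the result only
describes the tree the config.log came from, not necessarily the
installed binaries.

```python
# Flags of interest, taken from the configure line quoted above.
FLAGS_OF_INTEREST = ("--disable-shared", "--enable-static", "--disable-dlopen")


def find_configure_line(log_text):
    """Return the first '$ <...configure> ...' line of a config.log, or None."""
    for line in log_text.splitlines():
        stripped = line.strip()
        # The invocation is assumed to look like: "  $ ./configure --prefix=..."
        if stripped.startswith("$ ") and "configure" in stripped.split()[1]:
            return stripped[2:]
    return None


def report_flags(log_text):
    """Map each flag of interest to True/False, or return None if no
    configure invocation line was found in the text."""
    line = find_configure_line(log_text)
    if line is None:
        return None
    tokens = set(line.split())
    return {flag: flag in tokens for flag in FLAGS_OF_INTEREST}


# Abbreviated, hypothetical sample standing in for a real config.log.
sample = (
    "  $ ./configure --prefix=/opt/ompi_new --disable-shared "
    "--enable-static --disable-dlopen\n"
)

for flag, present in report_flags(sample).items():
    print(flag, "present" if present else "absent")
```

To run it against a real file, read the file contents into a string and
pass that string to report_flags() instead of the sample.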

> I would send him what you sent to me in your previous mail.  It's a lot of information and it can be hard to parse because of the long lines so it's best to re-direct it to a file and attach it to avoid line-wrap.

Yup. I already used a tee to a file.


> Yes it's still relevant, it's the complete stack trace of the application, it's just missing some information that could be included.  What is interesting and potentially important is that the stack trace for six processes isn't present, it appears the processes were found or padb would have complained and they give warnings about the DLL but they have no stack trace showing, did you truncate the output in the previous email?

Yes. I had. Here's the full output. I wasn't sure if the list accepts
attachments so I posted it online.

http://dl.dropbox.com/u/118481/padb.log.new.new.txt

> As a final point debugging collectives can be hard, in a deadlock situation it can be hard to tell if all ranks are on the same iteration or if some are ahead of others and some are behind, I have a patch to Open-MPI to add a counter to all collective calls to allow this situation to be detected and reported correctly, if you're still stuck even with the stack trace then you might find this of use.  It'll mean patching your MPI build and fixing the above problem with the DLL.

That would be my next line of attack, thanks! :)

BTW, out of curiosity, is padb an alternative to things like vampir,
totalview etc. or are those a different niche with a different goal?

-- 
Rahul



