[padb] Ref.: Re: tiny bug with --proc-summary

thipadin.seng-long at bull.net thipadin.seng-long at bull.net
Wed Dec 9 14:28:07 GMT 2009


Hi,
I have tested the patch and it is OK.
However, the main thread could be the second one in the LWP list
(as in my previous example).
So it's hard to say. Consider that it works.
Thipadin.





Ashley Pittman <ashley at pittman.co.uk>
12/09/2009 12:41 PM

 
        To:      thipadin.seng-long at bull.net
        cc:      florence.vallee at bull.net, francois.wellenreiter at bull.net, 
padb-devel at pittman.org.uk, Sylvain.JEAUGEY at bull.net
        Subject: Re: tiny bug with --proc-summary

On Wed, 2009-12-09 at 11:29 +0100, thipadin.seng-long at bull.net wrote:
> 
> Hi, 
> With the --proc-summary option, padb displays a pid which is in fact a
> thread PID (LWP)
> for a process that has several threads, as shown: 

> What do you think?

I can confirm there's a bug here, I can see it locally when I target a
multi-threaded application on my laptop.

What is happening is that the show_proc function reports data for
all tasks in the program. This is probably the right thing for
--proc-info, but for --proc-summary it is incorrect: many entries are
recorded more than once for the same process, pid being one of them.
This duplicate data is then passed back through the network to the
outer process.

At this point the tree_from_namespace function re-assembles the data
on the assumption that each key has only one value from a given rank. In
the case here, where this isn't true, it picks one at random and
reports that, which is what you see.
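
The failure mode can be sketched in a few lines of Python. This is not padb's own code: the (rank, key, value) record layout and the names collapse_by_rank and find_duplicates are hypothetical, chosen only for illustration. The point is that a merge which assumes one value per key per rank silently keeps a single value when a rank reports the same key several times. In this sketch the last record wins; in the real case the surviving value was effectively arbitrary.

```python
def collapse_by_rank(records):
    """Merge (rank, key, value) records assuming one value per (rank, key).

    Hypothetical stand-in for the re-assembly step: a later record
    overwrites an earlier one, so duplicates are lost without warning.
    """
    merged = {}
    for rank, key, value in records:
        merged.setdefault(rank, {})[key] = value
    return merged


def find_duplicates(records):
    """Return the set of (rank, key) pairs that were reported more than once."""
    seen = set()
    dups = set()
    for rank, key, _value in records:
        if (rank, key) in seen:
            dups.add((rank, key))
        seen.add((rank, key))
    return dups


# Rank 0 is a two-thread process: both tasks report a "pid" entry.
records = [
    (0, "pid", 4242),  # main thread
    (0, "pid", 4243),  # second thread (LWP id)
    (1, "pid", 5000),  # single-threaded rank
]

merged = collapse_by_rank(records)
duplicates = find_duplicates(records)
```

A check like find_duplicates would at least make the broken assumption visible instead of letting one value quietly replace another.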

Attached is a basic patch which fixes the issue by ensuring that only
data from the first thread is forwarded back. This makes padb
deterministic and causes it to show the pid you'd expect.
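
The idea behind the fix, as described here, can be sketched as follows. This is an illustration in Python under assumed data structures, not the contents of the attached patch: the (rank, tid, key, value) tuples and the name first_thread_only are hypothetical.

```python
def first_thread_only(task_records):
    """Keep only the records of the first task enumerated for each rank.

    task_records is an iterable of (rank, tid, key, value) tuples in the
    order the tasks were enumerated (hypothetical layout, for illustration).
    Returns a list of (rank, key, value) tuples.
    """
    first_tid = {}
    kept = []
    for rank, tid, key, value in task_records:
        # Remember the first tid seen for this rank; ignore all others.
        chosen = first_tid.setdefault(rank, tid)
        if tid == chosen:
            kept.append((rank, key, value))
    return kept


task_records = [
    (0, 4242, "pid", 4242),
    (0, 4242, "state", "R"),
    (0, 4243, "pid", 4243),
    (0, 4243, "state", "S"),
    (1, 5000, "pid", 5000),
]

summary = first_thread_only(task_records)
```

The result depends on the order in which tasks are enumerated. On Linux the thread group leader has a thread ID equal to the process PID, so selecting the task whose tid matches the pid would be one order-independent alternative.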

The wider issue here is how to handle multi-threaded programs. For
example, I don't know how to calculate memory usage across threads. I'd
assume they all have the same memory maps, with the possible exception of
TLS, which means the value is probably both common to all threads and
correct for the process as a whole. The percent cpu usage calculation,
however, is almost certainly wrong: it would need to be calculated
for each thread and summed across threads to get the true value.
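
As a rough sketch of the per-thread summation, the Python below adds up utime and stime (fields 14 and 15 of the Linux procfs stat format) over one stat line per thread, such as the lines found in /proc/<pid>/task/<tid>/stat. The function names and the sample lines are made up for illustration, and this is not how padb computes the figure.

```python
def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (in clock ticks) from one procfs stat line."""
    # comm (field 2) is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before tokenising the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]) + int(rest[12])


def process_cpu_ticks(stat_lines):
    """Sum CPU ticks over one stat line per thread."""
    return sum(cpu_ticks_from_stat(line) for line in stat_lines)


def percent_cpu(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU usage over an interval, as a percentage of one CPU."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


# Made-up stat lines for a two-thread process (truncated after field 22).
sample = [
    "4242 (my app) R 1 4242 4242 0 -1 4194304 100 0 0 0 150 50 0 0 20 0 2 0 1000",
    "4243 (my app) S 1 4242 4242 0 -1 4194304 10 0 0 0 300 25 0 0 20 0 2 0 1005",
]

total_ticks = process_cpu_ticks(sample)
```

Two samples of total_ticks taken a known interval apart, together with the system clock tick rate (os.sysconf("SC_CLK_TCK") on Linux), give a percentage through percent_cpu. For a multi-threaded process that figure can legitimately exceed 100.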

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk


-------------- next part --------------
A non-text attachment was scrubbed...
Name: padb-proc-format-threads.patch
Type: application/octet-stream
Size: 892 bytes
Desc: not available
URL: <http://pittman.org.uk/pipermail/padb-devel_pittman.org.uk/attachments/20091209/c98ab63f/attachment.obj>

