From Jie.Cai at anu.edu.au Thu Dec 2 05:31:44 2010
From: Jie.Cai at anu.edu.au (Jie Cai)
Date: Thu, 02 Dec 2010 16:31:44 +1100
Subject: [padb-users] Intel Compiler support on PADB?
Message-ID: <4CF72F40.8010609@anu.edu.au>

Dear Ashley,

PADB currently uses gdb to get stack traces, but we have quite a few programs pre-compiled with Intel compilers. At level 2 optimization (the default) of the Intel compilers, %ebp is often cleared and stored elsewhere. In this case, gdb has difficulty tracing the stack properly.

I am wondering whether PADB will support idb as a command-line option in the near future?

Kind Regards,
Jie

--
Jie Cai                          Jie.Cai at anu.edu.au
ANU Supercomputer Facility       NCI National Facility
Leonard Huxley, Mills Road       Ph:  +61 2 6125 7965
Australian National University   Fax: +61 2 6125 8199
Canberra, ACT 0200, Australia    http://nf.nci.org.au
-----------------------------------------------------

From ashley at pittman.co.uk Thu Dec 2 18:10:17 2010
From: ashley at pittman.co.uk (Ashley Pittman)
Date: Thu, 2 Dec 2010 18:10:17 +0000
Subject: [padb-users] Intel Compiler support on PADB?
In-Reply-To: <4CF72F40.8010609@anu.edu.au>
References: <4CF72F40.8010609@anu.edu.au>
Message-ID:

On 2 Dec 2010, at 05:31, Jie Cai wrote:

> Dear Ashley,
>
> PADB currently uses gdb to get stack traces, but we have quite a few programs pre-compiled with Intel compilers. At level 2 optimization (the default) of the Intel compilers, %ebp is often cleared and stored elsewhere. In this case, gdb has difficulty tracing the stack properly.
>
> I am wondering whether PADB will support idb as a command-line option in the near future?

No, padb integrates quite closely with gdb, so this would be a significant amount of work. In particular, padb uses the "MI" interface to gdb, which I don't believe idb emulates?
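
For readers unfamiliar with it, the "MI" interface mentioned above is gdb's machine-oriented protocol: commands such as -stack-list-frames return structured records rather than human-formatted text. A minimal Python sketch of pulling frames out of one such reply is shown here. The sample reply string and the helper name parse_frames are illustrative assumptions, not padb's own code, and the regular expressions only handle the flat frame tuples in the sample.

```python
import re

# Hypothetical reply to gdb's MI command `-stack-list-frames`; the exact
# fields present depend on the debugger and on available debug info.
SAMPLE_REPLY = (
    '^done,stack=['
    'frame={level="0",addr="0x0000000000400586",func="compute",file="app.c",line="12"},'
    'frame={level="1",addr="0x00000000004005c1",func="main",file="app.c",line="30"}'
    ']'
)

# One frame tuple: frame={ ... } with no nested braces (assumption).
FRAME_RE = re.compile(r'frame=\{([^{}]*)\}')
# One key="value" pair, allowing backslash escapes inside the value.
FIELD_RE = re.compile(r'(\w[\w-]*)="((?:[^"\\]|\\.)*)"')


def parse_frames(reply):
    """Return one dict per frame tuple found in an MI result record."""
    if not reply.startswith('^done'):
        raise ValueError('not a successful MI result record')
    return [dict(FIELD_RE.findall(body)) for body in FRAME_RE.findall(reply)]


if __name__ == '__main__':
    for frame in parse_frames(SAMPLE_REPLY):
        print(frame['level'], frame['func'], frame.get('file'), frame.get('line'))
```

Any debugger that emits records of this shape could in principle be parsed the same way, which is why MI emulation matters for the idb question.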

What would be possible is to add a hook to allow idb to be used to query individual processes for their stack trace, like the -X option (show_full_stack, line 8875), which would let you see the information for individual processes. padb would not be able to parse this output, though, so you'd have to do it on a per-process basis after discovering the interesting processes with the standard stack trace viewer.

Ashley.

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk

From ashley at pittman.co.uk Fri Dec 3 18:26:19 2010
From: ashley at pittman.co.uk (Ashley Pittman)
Date: Fri, 3 Dec 2010 18:26:19 +0000
Subject: [padb-users] Intel Compiler support on PADB?
In-Reply-To:
References: <4CF72F40.8010609@anu.edu.au>
Message-ID:

On 2 Dec 2010, at 18:10, Ashley Pittman wrote:

> On 2 Dec 2010, at 05:31, Jie Cai wrote:
>
>> Dear Ashley,
>>
>> PADB currently uses gdb to get stack traces, but we have quite a few programs pre-compiled with Intel compilers. At level 2 optimization (the default) of the Intel compilers, %ebp is often cleared and stored elsewhere. In this case, gdb has difficulty tracing the stack properly.
>>
>> I am wondering whether PADB will support idb as a command-line option in the near future?
>
> No, padb integrates quite closely with gdb, so this would be a significant amount of work. In particular, padb uses the "MI" interface to gdb, which I don't believe idb emulates?

It turns out that idb does emulate the MI interface, so there is every chance this should work.

http://software.intel.com/en-us/forums/showthread.php?t=67877

I don't have the licences to test this, but it should be fairly easy to test. You'll need to ensure that your shell sets the required PATH and LD_LIBRARY_PATH for idb, but other than that it should be a simple change. If you do experiment with this, let me know how you get on.

Ashley,

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk

From ashley at pittman.co.uk Fri Dec 3 18:35:25 2010
From: ashley at pittman.co.uk (Ashley Pittman)
Date: Fri, 3 Dec 2010 18:35:25 +0000
Subject: [padb-users] Upcoming release.
In-Reply-To:
References:
Message-ID: <19A63DCB-946D-417A-B5EE-0C5820676D9A@pittman.co.uk>

On 1 Nov 2010, at 19:57, Ashley Pittman wrote:

> I'd like to make a formal release in the coming weeks based on the current SVN code. The 3.2 beta has been through an extended testing period and I'm happy that it's ready to move to formal release status. On this basis I propose making a 3.3 release in the next two weeks, probably on Monday the 8th.
>
> Please test the latest 3.2 beta or trunk and let me know of any problems you have; unless any new issues are reported by the 5th I'll go ahead as planned.

A number of issues did come up, but I believe these have all been resolved now, so the trunk is in a good state for release. I propose a new release date of Wed 8th Dec unless anything further comes up. Issues like idb support and changes to the message queue code are liable to be destabilising, so they are best left until after the branch has been made.

Ashley.

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk

From ashley at pittman.co.uk Fri Dec 3 19:34:14 2010
From: ashley at pittman.co.uk (Ashley Pittman)
Date: Fri, 3 Dec 2010 19:34:14 +0000
Subject: [padb-users] configuration file format.
In-Reply-To: <0448FA5B-8F7D-47DD-852B-549F17541347@pittman.co.uk>
References: <30161_1290477207_4CEB1E96_30161_184309_1_4CEB1E48.20607@anu.edu.au>
	<4CEB202B.30301@anu.edu.au>
	<7ACC4803-3BEC-471D-9A8F-E410BC156F89@pittman.co.uk>
	<4CEB752C.3020309@anu.edu.au>
	<8A1FC921-C460-40E1-8004-8475E5C9F58F@pittman.co.uk>
	<0448FA5B-8F7D-47DD-852B-549F17541347@pittman.co.uk>
Message-ID:

On 23 Nov 2010, at 20:32, Ashley Pittman wrote:

> On 23 Nov 2010, at 15:38, Daniel Kidger wrote:
>> and how does it compare / contrast with LLNL's pdsh? perhaps on extreme scalability?
>
> Mainly the scalability, I believe. pdsh has a "sliding window" of hosts it targets at any one time, and as one completes it moves on to the next; the size of this window is the "fanout" parameter. This worked well for padb in the 2.x days, but now that there is full-duplex communication between the inner and outer processes, all inner processes have to run simultaneously.
> I'm told that clustershell works by calling itself recursively on remote nodes, so it can scale to much larger host counts and still run all commands simultaneously.

I've added support for clush to padb in r421. It doesn't seem any different in design to pdsh, so it probably has the same scalability limits and performance characteristics.

Ashley,

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk

From ashley at pittman.co.uk Thu Dec 9 00:22:55 2010
From: ashley at pittman.co.uk (Ashley Pittman)
Date: Thu, 9 Dec 2010 00:22:55 +0000
Subject: [padb-users] Announcing padb version 3.3.
Message-ID: <97F1007F-B1FF-44E7-9725-B5DAA4C3B937@pittman.co.uk>

I am pleased to announce that version 3.3 of padb, the first official release of padb in over a year, is now ready for use and has been uploaded to the website this evening.

Release 3.3 represents a major step forward in terms of functionality, usability and stability since 3.0 and is a recommended upgrade for all users.

Major changes of note are:

- The ability to display variables in tree-based stack traces.
- Proper support for threaded applications, in particular the tree-based stack trace mode now reports each thread in a rank individually and makes a number of trees, one for each target thread-id.
- Significantly better command line parsing, resulting in better error messages and easier configuration.
- Miscellaneous performance improvements, both for absolute job size and for larger process counts within individual nodes.
- Selection of back-end launch mode: it is now possible to target jobs without having to rely on the resource manager to launch in many cases.
- "MPIR" interface support to enable padb to work on many more resource managers which support this standard.
- Solaris port.
- PBS/PBS Pro/Torque support.
- Limited LFS support.

For a full list of changes see the "Revision history" in the source.

Many of these changes were already present in the 3.2 beta releases. However a number of improvements have been made since the last beta on this branch so existing 3.2 users should also consider upgrading.

The source tarball can be downloaded from the usual downloads page on Google code or directly via:
http://padb.googlecode.com/files/padb-3.3.tar.gz

SHA 1 Checksum: e2ec75f0d78cfff7df1a97f29dab00ddfa24f501

Work has already started on future developments. As well as supporting an ever increasing number of resource managers, the focus is moving to new modes of operation and better ways of reporting collected information to the user.

My thanks to everyone who has helped make this release what it is; I appreciate all user reports, both good and bad, and hope to be able to continue bringing you improvements to padb in the future.

Ashley Pittman.

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
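
The SHA-1 checksum quoted in the 3.3 announcement can be checked before unpacking the tarball. A minimal Python sketch follows, assuming the file has already been downloaded into the current directory as padb-3.3.tar.gz; the helper name sha1_of_file and the default local path are illustrative assumptions, not part of padb.

```python
import hashlib
import os
import sys

# Checksum quoted in the 3.3 release announcement.
EXPECTED_SHA1 = "e2ec75f0d78cfff7df1a97f29dab00ddfa24f501"


def sha1_of_file(path, chunk_size=1 << 16):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed local file name; pass another path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "padb-3.3.tar.gz"
    if os.path.exists(path):
        actual = sha1_of_file(path)
        print("OK" if actual == EXPECTED_SHA1 else "MISMATCH: " + actual)
    else:
        print("file not found: " + path)
```

Reading in chunks keeps memory use flat regardless of file size; a mismatch most likely indicates a truncated or corrupted download.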