[padb-users] vmrss and vmlck question

Duncan Harris harris.duncan at gmail.com
Tue Jul 12 12:12:49 BST 2011


Here's an example of vmlck > vmrss.
We've since had one where vmlck was 4 GB for a core and vmrss was still
only around 1.3 GB.

rank  hostname  pid   vmsize      vmrss       vmlck       S  uptime  %cpu  lcore  command
...
...
3924  node822   8265  3125968 kB  1281236 kB  2183152 kB  R  13.14    100      0  ./a.out
3925  node822   8266  3171040 kB  1328928 kB   485320 kB  R  13.14    100      1  ./a.out
3926  node822   8267  3144016 kB  1300564 kB   692028 kB  R  13.14    100      2  ./a.out
3927  node822   8268  3127740 kB  1283676 kB   580352 kB  R  13.14    100      3  ./a.out
3928  node822   8269  3108208 kB  1263544 kB   418840 kB  R  13.14    100      4  ./a.out
3929  node822   8270  3079540 kB  1235072 kB   779184 kB  R  13.14     98      5  ./a.out
3930  node822   8271  3062328 kB  1217008 kB   184380 kB  R  13.14    100      6  ./a.out
3931  node822   8272  3056056 kB  1212048 kB   324308 kB  R  13.14    100      7  ./a.out
3932  node822   8273  3047864 kB  1202824 kB   190376 kB  R  13.14    100      8  ./a.out
3933  node822   8274  3046032 kB  1200984 kB    65964 kB  R  13.14    100      9  ./a.out
3934  node822   8275  3046028 kB  1200880 kB    51928 kB  R  13.14    100     10  ./a.out
3935  node822   8276  3047324 kB  1202960 kB    87784 kB  R  13.14    100     11  ./a.out
...
...
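
For anyone who wants to cross-check these numbers without padb, here is a minimal Python sketch that pulls the Vm* fields out of /proc/<pid>/status text. It is not how padb gathers the values; the function names (parse_vm_fields, read_vm_fields) and the SAMPLE string are hypothetical, and SAMPLE is simply rank 3924 from the listing above rewritten in the /proc status layout.

```python
import re

# Matches lines such as "VmRSS:\t 1281236 kB" from /proc/<pid>/status.
_FIELD_RE = re.compile(r"^(Vm\w+):\s+(\d+)\s+kB$", re.MULTILINE)


def parse_vm_fields(status_text):
    """Return {field_name: size_in_kB} for every Vm* line in the text."""
    return {name: int(kb) for name, kb in _FIELD_RE.findall(status_text)}


def read_vm_fields(pid="self"):
    """Read and parse /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vm_fields(fh.read())


# Assumption: rank 3924 from the listing, laid out like /proc status output.
SAMPLE = (
    "Name:\ta.out\n"
    "VmSize:\t 3125968 kB\n"
    "VmLck:\t 2183152 kB\n"
    "VmRSS:\t 1281236 kB\n"
)

fields = parse_vm_fields(SAMPLE)
# Flag the case under discussion: locked memory larger than resident memory.
print(fields["VmLck"] > fields["VmRSS"])
```

On a Linux node, read_vm_fields(8265) would read the live values for that pid, assuming the process is still running and readable by the caller.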

On Mon, Jul 11, 2011 at 3:41 PM, Daniel Kidger
<daniel.kidger at googlemail.com> wrote:
> Duncan,
>
> I tried a short piece of C code that calls mlock() with a user-adjustable
> size, then sleeps.
> Then I queried it using cat /proc/<pid>/status
>
> In this case, it does appear that VmLck is, say, a subset of VmRSS (and
> VmSize for that matter).
> If I increase the value in mlock() by, say, 128 MB then VmLck and VmRSS
> both increase by this amount.
>
> Can you post an example where vmlck exceeds vmrss?
>
> Daniel
>
>
>
> On 11 July 2011 14:06, Duncan Harris <harris.duncan at gmail.com> wrote:
>>
>> Hi.
>> I have a question related to vmrss and vmlck (not technically a padb
>> question I realise).
>>
>> We've modified our padb --proc-summary command to also print out the
>> vmlck value. We did this as we were having a situation where one of
>> our jobs was hitting the vmlck hardlimit on our machine and hanging as
>> a result.
>>
>> However, we have a different code hanging now. Running padb shows that
>> for one of the nodes (which has 12 cores) none of the vmlck values are
>> hitting the limit. However, if we sum the vmlck and vmrss values for
>> the whole node, we do exceed the total memory available on the node
>> (49 GB vs 48 GB). Running a stack trace shows that 3 cores are stuck in
>> an MPI_Waitsome and 1 in a later MPI_Waitall. From the code, all of
>> the sends have been sent, so we've lost a message somewhere.
>>
>> My question is, how do the vmlck and vmrss values relate to each
>> other? Should we be adding them together, or is the vmlck included in
>> the vmrss value? We're assuming that they are separate as we have some
>> cores where vmlck > vmrss.
>>
>> Thanks,
>> Duncan
>>
>> _______________________________________________
>> padb-users mailing list
>> padb-users at pittman.org.uk
>> http://pittman.org.uk/mailman/listinfo/padb-users_pittman.org.uk