[padb-users] vmrss and vmlck question
Duncan Harris
harris.duncan at gmail.com
Tue Jul 12 12:12:49 BST 2011
Here's an example of vmlck > vmrss (rank 3924 below).
We've since had one where vmlck was 4 GB for a core and vmrss was still
only around 1.3 GB.
rank  hostname  pid   vmsize      vmrss       vmlck       S  uptime  %cpu  lcore  command
...
3924  node822   8265  3125968 kB  1281236 kB  2183152 kB  R  13.14   100   0      ./a.out
3925  node822   8266  3171040 kB  1328928 kB   485320 kB  R  13.14   100   1      ./a.out
3926  node822   8267  3144016 kB  1300564 kB   692028 kB  R  13.14   100   2      ./a.out
3927  node822   8268  3127740 kB  1283676 kB   580352 kB  R  13.14   100   3      ./a.out
3928  node822   8269  3108208 kB  1263544 kB   418840 kB  R  13.14   100   4      ./a.out
3929  node822   8270  3079540 kB  1235072 kB   779184 kB  R  13.14    98   5      ./a.out
3930  node822   8271  3062328 kB  1217008 kB   184380 kB  R  13.14   100   6      ./a.out
3931  node822   8272  3056056 kB  1212048 kB   324308 kB  R  13.14   100   7      ./a.out
3932  node822   8273  3047864 kB  1202824 kB   190376 kB  R  13.14   100   8      ./a.out
3933  node822   8274  3046032 kB  1200984 kB    65964 kB  R  13.14   100   9      ./a.out
3934  node822   8275  3046028 kB  1200880 kB    51928 kB  R  13.14   100  10      ./a.out
3935  node822   8276  3047324 kB  1202960 kB    87784 kB  R  13.14   100  11      ./a.out
...
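
For anyone wanting to total these columns per node rather than by hand,
here is a minimal Python sketch. It assumes each process row has been
rejoined onto one line and that the columns are in the order shown in the
header (rank, hostname, pid, vmsize, vmrss, vmlck, ...). The function names
are made up for illustration and are not part of padb.

```python
from collections import defaultdict


def parse_row(tokens):
    """Return (rank, hostname, vmrss_kb, vmlck_kb) from one split row.

    Assumed token layout (from the header in the listing):
    rank hostname pid vmsize kB vmrss kB vmlck kB S uptime %cpu lcore command
    """
    return int(tokens[0]), tokens[1], int(tokens[5]), int(tokens[7])


def node_totals(rows):
    """Sum vmrss and vmlck per hostname and note ranks with vmlck > vmrss."""
    totals = defaultdict(
        lambda: {"vmrss_kb": 0, "vmlck_kb": 0, "lck_gt_rss_ranks": []}
    )
    for row in rows:
        tokens = row.split()
        # Skip the header, "..." elisions and blank lines.
        if len(tokens) < 14 or not tokens[0].isdigit():
            continue
        rank, host, vmrss_kb, vmlck_kb = parse_row(tokens)
        entry = totals[host]
        entry["vmrss_kb"] += vmrss_kb
        entry["vmlck_kb"] += vmlck_kb
        if vmlck_kb > vmrss_kb:
            entry["lck_gt_rss_ranks"].append(rank)
    return dict(totals)


# Two rows copied from the listing, plus lines that should be ignored.
SAMPLE_ROWS = [
    "rank hostname pid vmsize vmrss vmlck S uptime %cpu lcore command",
    "...",
    "3924 node822 8265 3125968 kB 1281236 kB 2183152 kB R 13.14 100 0 ./a.out",
    "3925 node822 8266 3171040 kB 1328928 kB 485320 kB R 13.14 100 1 ./a.out",
]
```

Calling node_totals(SAMPLE_ROWS) gives the per-node vmrss and vmlck sums and
lists rank 3924 as a process where vmlck > vmrss. Whether the two sums should
be added together is exactly the open question; this only produces the figures.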
On Mon, Jul 11, 2011 at 3:41 PM, Daniel Kidger
<daniel.kidger at googlemail.com> wrote:
> Duncan,
>
> I tried a short piece of C code that calls mlock() with a user-adjustable
> size, then sleeps.
> Then I queried it using cat /proc/<pid>/status
>
> In this case, it does appear that VmLck is, say, a subset of VmRSS (and
> VmSize for that matter).
> If I increase the value in mlock() by, say, 128 MB, then VmLck and VmRSS
> both increase by this amount.
>
> Can you post an example where vmlck exceeds vmrss?
>
> Daniel
>
>
>
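
The /proc/<pid>/status check described above can also be scripted. The
following is a rough Python sketch, assuming the usual Linux layout of that
file, where memory lines look like "VmRSS:   1281236 kB". The function names
are hypothetical, and this is not the C test program mentioned above.

```python
def parse_status(text, fields=("VmSize", "VmRSS", "VmLck")):
    """Extract selected kB-valued fields from /proc/<pid>/status text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or name not in fields:
            continue
        parts = rest.split()
        # Memory lines carry a number followed by the unit "kB".
        if len(parts) == 2 and parts[1] == "kB":
            values[name] = int(parts[0])
    return values


def read_status(pid="self"):
    """Read and parse /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status(handle.read())


def locked_exceeds_resident(values):
    """True when the reported VmLck is larger than the reported VmRSS."""
    return values.get("VmLck", 0) > values.get("VmRSS", 0)
```

On a Linux node, locked_exceeds_resident(read_status(8265)) would flag a
process like rank 3924 in the listing (pid 8265 is taken from that row).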
> On 11 July 2011 14:06, Duncan Harris <harris.duncan at gmail.com> wrote:
>>
>> Hi.
>> I have a question related to vmrss and vmlck (not technically a padb
>> question I realise).
>>
>> We've modified our padb --proc-summary command to also print out the
>> vmlck value. We did this as we were having a situation where one of
>> our jobs was hitting the vmlck hardlimit on our machine and hanging as
>> a result.
>>
>> However, we have a different code hanging now. Running padb shows that
>> for one of the nodes (which has 12 cores) none of the vmlck values are
>> hitting the limit. However, if we sum the vmlck and vmrss values for
>> the whole node, we do exceed the total memory available on the node
>> (49 GB vs 48 GB). A stack trace shows that 3 cores are stuck in
>> an MPI_Waitsome and 1 in a later MPI_Waitall. From the code, all of
>> the sends have been sent, so we've lost a message somewhere.
>>
>> My question is, how do the vmlck and vmrss values relate to each
>> other? Should we be adding them together, or is the vmlck included in
>> the vmrss value? We're assuming that they are separate as we have some
>> cores where vmlck > vmrss.
>>
>> Thanks,
>> Duncan
>>