Monday, November 15, 2010

Understanding memory usage in Linux

Linux optimizes memory management in several ways, which makes it hard to predict a process's real memory usage:
(1) Binaries (including libraries) are demand paged: only the parts of a library that a process actually uses are loaded from disk.
(2) Two different processes may share the same loaded library.
(3) Writable pages use COW (copy on write). When you spawn a process, no memory is allocated for the child's copy of a page until one of the processes writes to that page. Only then is a private copy of the page allocated.

If you run top in Linux:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                     
 3528 xxx      15   0  584m 233m 8752 S  0.7 11.6  32:40.23 gnome-terminal  

Here the virtual size (VIRT) is 584M, which covers all code, data, shared libraries and swapped-out pages, and the resident set size (RES) is 233M, the non-swapped physical memory in use. That figure also includes some shared libraries. You can factor them out by printing the memory map:
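The two numbers top reports can also be read straight from /proc. The following is a minimal Python sketch, assuming a Linux system with /proc mounted; the function names parse_status_kb and read_status_kb are made up for this illustration. VmSize and VmRSS in /proc/<pid>/status should correspond to the VIRT and RES columns.

```python
import os

def parse_status_kb(text):
    """Return the Vm* fields (in kB) found in /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        # Memory lines look like "VmRSS:    238592 kB"; skip everything else.
        if key.startswith("Vm") and len(parts) == 2 and parts[1] == "kB":
            fields[key] = int(parts[0])
    return fields

def read_status_kb(pid="self"):
    """Read and parse /proc/<pid>/status (hypothetical helper)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status_kb(f.read())

if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    mem = read_status_kb()
    print("VIRT (VmSize): %d kB" % mem.get("VmSize", 0))
    print("RES  (VmRSS):  %d kB" % mem.get("VmRSS", 0))
```

Pass a PID such as read_status_kb(3528) to inspect another process, provided you have permission to read its /proc entry.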
%pmap -d 3528
... ...
000000000d046000  272004 rw--- 000000000d046000 000:00000   [ anon ]
... ....
mapped: 606660K    writeable/private: 284196K    shared: 620K
The biggest mapping is 272M of anonymous memory. What on earth is anonymous memory?

Let's look at smaps instead.
% cat /proc/3528/smaps
...
0d046000-1d9e7000 rw-p 0d046000 00:00 0                                  [heap]
Size:            272004 kB
Rss:             229712 kB
Shared_Clean:         0 kB
Shared_Dirty:         0 kB
Private_Clean:     4928 kB
Private_Dirty:   224784 kB
Swap:             40520 kB

Well, the heap is using about 270M. If you are running kernel 2.6.25 or later, you can also use /proc/$PID/pagemap for a more detailed view.

A few other things can contribute to anonymous memory, such as thread stacks and mmap regions.
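To see how the anonymous memory splits between the heap, stacks and other regions, one option is to group the mappings in /proc/<pid>/maps that have no backing file. This is only a sketch: anon_sizes_kb is a hypothetical helper, and it treats every mapping with inode 0 as anonymous, which also picks up kernel-provided regions such as [vdso].

```python
def anon_sizes_kb(maps_text):
    """Sum the virtual size (kB) of non-file-backed mappings, by label."""
    sizes = {}
    for line in maps_text.splitlines():
        # Columns: address range, perms, offset, device, inode, [path]
        parts = line.split(None, 5)
        if len(parts) < 5:
            continue
        start, end = (int(x, 16) for x in parts[0].split("-"))
        inode = parts[4]
        label = parts[5].strip() if len(parts) > 5 else ""
        if inode != "0":
            continue  # file-backed mapping, not anonymous
        if not label:
            label = "[anon]"  # unnamed: mmap regions, thread stacks, etc.
        sizes[label] = sizes.get(label, 0) + (end - start) // 1024
    return sizes

if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as f:
            sizes = anon_sizes_kb(f.read())
    except OSError:
        sizes = {}  # not on Linux, or /proc is not readable
    for label, kb in sorted(sizes.items(), key=lambda kv: -kv[1]):
        print("%10d kB  %s" % (kb, label))
```

These are virtual sizes, like the 272004 kB that pmap reports for the heap, not resident sizes. Use the Rss fields in smaps to see how much of each region is actually in physical memory.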
