Monday, November 15, 2010

understand memory usage in Linux

Linux tries to optimize the memory management in different ways, which make it is hard to predicate the real memory usage:
(1) Binaries (including the libraries) are "demanded paged", only part of the library actually used by a process will be loaded from disk.
(2) Two different processes may share the same loaded library.
(3) The writable page will use COW(copy on write). So if you spawn a process, it will not allocate the memory for the sub-process until you try to write that page. Then a private copy of that page will be allocated.

If you run top in Linux:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                     
 3528 xxx      15   0  584m 233m 8752 S  0.7 11.6  32:40.23 gnome-terminal  

Here the virtual set size is 584M (which include all code, data, shared library and swap out pages) and resident set size is 233M (the non-swap physical memory used). Well, we know that also includes some libraries. You can factor them out by print out memory map:
%pmap -d 3528
... ...
000000000d046000  272004 rw--- 000000000d046000 000:00000   [ anon ]
... ....
mapped: 606660K    writeable/private: 284196K    shared: 620K
The biggest one is 272M which is anonymous memory. What is the hell of anonymous memory:

Let's look at smap instead.
% cat /proc/3528/smaps
...
0d046000-1d9e7000 rw-p 0d046000 00:00 0                                  [heap]
Size:            272004 kB
Rss:             229712 kB
Shared_Clean:         0 kB
Shared_Dirty:         0 kB
Private_Clean:     4928 kB
Private_Dirty:   224784 kB
Swap:    40520 kB

Well, the heap is using 270M ... if you are using 2.6.25 kernel, you can use better /proc/$PID/pagemaps as well.

There are a few other things can contribute to the anonymous memory, like thread stack and mmap.

linux file soft link vs. hard link

Hard link uses different inode and cannot cross devices. Soft link store the file name as its data.

So if you have two files hard linked, even if you delete the origianl file, you can still access it using linked file name.

On the other hand, if you have a soft link to another file. If you deleted the origianl file, you can not access it even if the origianl file has some hard link to other files.

One more thing: If the file is opened by other process and you deleted the file, the file will be still kept in the hard driver until you close it.

Thursday, November 11, 2010

jump through jump box

We had a network settings. You always have to jump to the jump box and then logon other machine(tmachine). Here is the trick to do the job:
ssh tmachine -o ProxyCommand="netcat-proxy-command jumpbox tmachine"


Copy folder via jump box:
tar cvzf - . |  ssh tmachine -o
ProxyCommand="netcat-proxy-command jumpbox tmachine"
cat ">" folder.tar.gz

The netcat-proxy-command:

#!/bin/sh
#http://www.hackinglinuxexposed.com/articles/20040830.html
bouncehost=$1
target=$2
ssh $bouncehost   nc -w 1 $target 22