Software and Design

Tuesday, December 7, 2010

virtual memory vs. RSS in Linux

I wrote a small program to understand the difference between VSZ and RSS. A successful malloc only means that you can use the returned address range. No physical memory is used until a page is actually accessed.

First I find out my page size using:
%getconf PAGE_SIZE
4096

(1)
size_t length=0x10000000; //256M
const size_t pagesize=4096;
char* x=(char*)malloc(length);

After starting the program:
%ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user     25125 0.0 0.0 273552   912 pts/1    S+   10:54   0:00 a.out

Here the VSZ is just the size of the address space the process has mapped. The resident memory (RSS) is still 912K although the VSZ is about 267M.

(2) Then execute:
for (int i=0; i<1000; i++)
      x[i]='a';

%ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user     25395 0.0 0.0 273552   916 pts/1    S+   10:56   0:00 a.out

Only one page is written (1000 bytes fit inside a single 4096-byte page), so the RSS grows by just one page (916K now).

(3) Then write one character in each of 1000 pages:
   for (int i=0; i<1000; i++)
      x[i*pagesize]='a';

%ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user     25285 0.0 0.2 273552 4912 pts/1    S+   10:55   0:00 a.out

Now every page that was written becomes part of the RSS (1000*4K+912K=4912K).
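The three steps above can also be checked from inside the process, without switching to ps. The following is a minimal sketch, assuming Linux with /proc mounted; the helper names rss_kb and rss_growth_kb are made up for illustration, and it reads /proc/self/statm rather than the ps output used in the post.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: resident set size of this process in kB,
   read from the second field of /proc/self/statm (Linux only).
   Returns -1 on failure. */
long rss_kb(void)
{
    long size_pages, resident_pages;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f)
        return -1;
    int n = fscanf(f, "%ld %ld", &size_pages, &resident_pages);
    fclose(f);
    if (n != 2)
        return -1;
    return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

/* Hypothetical helper: malloc `length` bytes, write one byte into each of
   the first `pages` pages, and return how much the RSS grew in kB.
   Returns -1 on error. */
long rss_growth_kb(size_t length, size_t pages)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    assert(pagesize > 0);
    if (pages * (size_t)pagesize > length)
        return -1;

    /* volatile keeps the compiler from dropping the stores before free() */
    volatile char *x = malloc(length);
    if (!x)
        return -1;

    long before = rss_kb();
    for (size_t i = 0; i < pages; i++)
        x[i * (size_t)pagesize] = 'a';   /* first write makes the page resident */
    long after = rss_kb();

    free((void *)x);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling rss_growth_kb(0x10000000, 1000) mirrors step (3). With 4K pages the growth should be roughly 4000 kB, though the exact number can differ depending on kernel accounting and transparent huge pages.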

Monday, November 15, 2010

understand memory usage in Linux

Linux optimizes memory management in several ways, which makes it hard to predict the real memory usage:
(1) Binaries (including libraries) are demand paged: only the parts actually used by a process are loaded from disk.
(2) Two different processes may share the same loaded library.
(3) Writable pages use COW (copy on write). If you fork a process, no memory is allocated for the child's copy of a page until one of the processes writes to that page. Only then is a private copy of that page allocated.
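Point (3) can be observed from user space. The sketch below, with the made-up function name cow_demo, forks a child that overwrites a buffer inherited from the parent. It only shows the visible effect (each process ends up with its own copy); it does not measure when the kernel actually copies the pages.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the child saw its own write while the
   parent's copy stayed untouched, 0 if not, -1 on error. */
int cow_demo(void)
{
    size_t len = 1 << 20;               /* 1M buffer, shared after fork */
    char *buf = malloc(len);
    if (!buf)
        return -1;
    memset(buf, 'A', len);              /* pages are resident in the parent */

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: the first write to each page gives it a private copy. */
        memset(buf, 'B', len);
        _exit(buf[0] == 'B' && buf[len - 1] == 'B' ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int parent_ok = buf[0] == 'A' && buf[len - 1] == 'A';
    free(buf);
    return child_ok && parent_ok;
}
```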

If you run top in Linux:
PID USER      PR NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
3528 xxx      15   0 584m 233m 8752 S 0.7 11.6 32:40.23 gnome-terminal

Here the virtual size is 584M (which includes all code, data, shared libraries and swapped-out pages) and the resident set size is 233M (the non-swapped physical memory used). The resident figure also includes some shared libraries. You can break it down by printing the memory map:
%pmap -d 3528
... ...
000000000d046000 272004 rw--- 000000000d046000 000:00000   [ anon ]
... ....
mapped: 606660K    writeable/private: 284196K    shared: 620K
The biggest mapping is 272004K of anonymous memory. What the hell is anonymous memory?

Let's look at smaps instead.
% cat /proc/3528/smaps
...
0d046000-1d9e7000 rw-p 0d046000 00:00 0                                  [heap]
Size:            272004 kB
Rss:             229712 kB
Shared_Clean:         0 kB
Shared_Dirty:         0 kB
Private_Clean:     4928 kB
Private_Dirty:   224784 kB
Swap:    40520 kB

Well, the heap mapping is about 270M, of which 229712 kB is resident ... If you are using kernel 2.6.25 or later, you can also use /proc/$PID/pagemap for a more detailed, per-page view.

A few other things can contribute to anonymous memory, such as thread stacks and anonymous mmap regions.
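Reading smaps by eye gets tedious once a process has hundreds of mappings. As a rough sketch, the function below (smaps_sum_kb is a made-up name) adds up one field, such as Rss or Swap, across all mappings in a smaps file. It assumes the "Field:   N kB" line layout shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: sum the values of `field` (e.g. "Rss", "Swap",
   "Private_Dirty") over every mapping listed in a smaps-style file.
   Returns the total in kB, or -1 if the file cannot be opened. */
long smaps_sum_kb(const char *path, const char *field)
{
    assert(path && field);

    char line[256];
    size_t flen = strlen(field);
    long total = 0;

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    while (fgets(line, sizeof line, f)) {
        long kb;
        /* Match "Field:" at the start of the line only, so that "Swap"
           does not also match "SwapPss" and "Size" does not match
           "KernelPageSize". */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':' &&
            sscanf(line + flen + 1, "%ld", &kb) == 1)
            total += kb;
    }
    fclose(f);
    return total;
}
```

For example, smaps_sum_kb("/proc/3528/smaps", "Private_Dirty") would give the total memory that belongs to that process alone and has been written to.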

linux file soft link vs. hard link

A hard link is another name for the same inode and cannot cross devices (file systems). A soft link has its own inode and stores the target file name as its data.

So if you have two hard-linked file names, you can still access the file through the linked name even after you delete the original name.

On the other hand, if you have a soft link to a file and you delete the original file name, you cannot access the file through the soft link, even if the data is still reachable through some other hard link.

One more thing: if a file is open in another process when you delete it, the data is kept on disk until that process closes it.
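The behavior described above can be reproduced with a few system calls. This is a small sketch under the assumption that /tmp is writable and supports hard links; link_demo and can_open are made-up names, and the file names inside the temporary directory are arbitrary.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if `path` can be opened for reading, 0 otherwise. */
static int can_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}

/* Hypothetical demo. Returns -1 on setup failure, otherwise a bitmask:
   bit 0: the hard link had the same inode as the original
   bit 1: the hard link is still readable after the original is deleted
   bit 2: the soft link is still readable after the original is deleted */
int link_demo(void)
{
    char dir[] = "/tmp/linkdemoXXXXXX";
    char orig[64], hard[64], soft[64];
    struct stat so, sh;
    int result = -1;

    if (!mkdtemp(dir))
        return -1;
    snprintf(orig, sizeof orig, "%s/original", dir);
    snprintf(hard, sizeof hard, "%s/hardlink", dir);
    snprintf(soft, sizeof soft, "%s/softlink", dir);

    int fd = open(orig, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd >= 0) {
        close(fd);
        if (link(orig, hard) == 0 && symlink(orig, soft) == 0 &&
            stat(orig, &so) == 0 && stat(hard, &sh) == 0) {
            result = (so.st_ino == sh.st_ino) ? 1 : 0;
            unlink(orig);               /* delete the original name */
            if (can_open(hard))
                result |= 2;
            if (can_open(soft))
                result |= 4;
        }
    }

    /* Clean up; errors are ignored because some names may not exist. */
    unlink(orig);
    unlink(hard);
    unlink(soft);
    rmdir(dir);
    return result;
}
```

On a typical Linux file system the result should be 3: same inode, hard link still readable, soft link dangling.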

Thursday, November 11, 2010

jump through jump box

Our network is set up so that you always have to log on to a jump box first and then log on to the other machine (tmachine) from there. Here is a trick to do it in one step:
ssh tmachine -o ProxyCommand="netcat-proxy-command jumpbox tmachine"

Copy folder via jump box:
tar cvzf - . | ssh tmachine -o ProxyCommand="netcat-proxy-command jumpbox tmachine" cat ">" folder.tar.gz

The netcat-proxy-command:

#!/bin/sh
#http://www.hackinglinuxexposed.com/articles/20040830.html
bouncehost=$1
target=$2
# Open a TCP connection from the jump box to the target's SSH port
ssh "$bouncehost" nc -w 1 "$target" 22

Monday, October 25, 2010

firefox 3.6.x proxy ntlm authentication issue

If you are using NTLM proxy authentication, Firefox 3.6.x keeps popping up the password dialog box. This is because Firefox switched from its internal NTLM implementation to the native Windows NTLM API. The workaround is to set network.auth.force-generic-ntlm to true in about:config to fall back to the old behavior.

Friday, October 22, 2010

Create a vista Gadget

I got Windows 7 boxes at both home and office. It is time to play with the Gadget sidebar!

As a starter project, I want to create a gadget to monitor a web site. Whenever the site changes, the gadget displays a different icon. For this you need an HTML file, an XML configuration file, and a JavaScript file.

(1) How to access the Internet?
There are two ways: use XMLHttpRequest, or create a DLL that wraps the logic. Here I use the first approach.

(2) The regular request never seems to be sent out
This is because of the cache. Add a header like:
xmlhttp.setRequestHeader("If-Modified-Since", "Sat, 01 Jan 2000 00:00:00 GMT");

(3) How to detect the network connection error
You can hook a timeout. Or, if the web server simply is not running, you will get the status code 12029 (the WinInet error meaning that the attempt to connect to the server failed).

(4) I use:
<body onload="start();window.setInterval('refresh()',5000)" ....>
to request the web site every 5 seconds.

Saturday, October 9, 2010

decrypt the web page encrypted using HTML Guardian

After opening the page, instead of using View Source, type the following in the address bar:
javascript:var sorc=document.documentElement.outerHTML;document.open("text/plain");document.write(sorc);