I wrote a small program to understand the difference between VSZ and RSS. A successful malloc only means that you may use the address range; no real memory is used until a page is accessed.
First, I found my page size using:
%getconf PAGE_SIZE
4096
(1)
size_t length=0x10000000; //256M
const size_t pagesize=4096;
char* x=(char*)malloc(length);
After starting the program:
%ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 25125 0.0 0.0 273552 912 pts/1 S+ 10:54 0:00 a.out
Here VSZ just means the total address space the process may use; the real memory in use (RSS) is still 912K even though VSZ is about 270M.
(2) Then execute:
for (int i=0; i<1000; i++)
x[i]='a';
%ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 25395 0.0 0.0 273552 916 pts/1 S+ 10:56 0:00 a.out
Only one page is written, so RSS grows by just one extra page (916K now).
(3) Then write one character into each of 1000 pages:
for (int i=0; i<1000; i++)
x[i*pagesize]='a';
%ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 25285 0.0 0.2 273552 4912 pts/1 S+ 10:55 0:00 a.out
Now every page that was touched becomes part of RSS (1000*4K+912K=4912K).
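To repeat the experiment without running ps by hand, here is a rough sketch in Python rather than C. It is not the program above: it assumes a Linux /proc filesystem, uses a private anonymous mmap as a stand-in for the large malloc, and the names vsz_rss_kb and run_experiment are my own. It reads VSZ and RSS from /proc/self/statm before the mapping, after the mapping, and after writing one byte into each of 1000 pages.

```python
import mmap
import os

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")  # same value as `getconf PAGE_SIZE`
LENGTH = 0x10000000                      # 256M, as in the C snippet


def vsz_rss_kb():
    """Return (VSZ, RSS) of this process in KB, read from /proc/self/statm."""
    with open("/proc/self/statm") as f:
        size_pages, resident_pages = f.read().split()[:2]
    kb_per_page = PAGE_SIZE // 1024
    return int(size_pages) * kb_per_page, int(resident_pages) * kb_per_page


def run_experiment(pages_to_touch=1000):
    """Map LENGTH bytes, touch some pages, and report (VSZ, RSS) at each step."""
    before = vsz_rss_kb()
    # Assumption: a private anonymous mapping stands in for malloc(length).
    buf = mmap.mmap(-1, LENGTH, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    mapped = vsz_rss_kb()
    for i in range(pages_to_touch):
        buf[i * PAGE_SIZE] = ord("a")    # one write per page
    touched = vsz_rss_kb()
    buf.close()
    return before, mapped, touched


if __name__ == "__main__":
    for label, (vsz, rss) in zip(("before", "mapped", "touched"), run_experiment()):
        print(f"{label:8s} VSZ={vsz}K RSS={rss}K")
```

The exact numbers depend on the kernel and on settings such as transparent huge pages, but VSZ should jump by about 256M as soon as the mapping exists, while RSS should only grow once the pages are written.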
Tuesday, December 7, 2010
Monday, November 15, 2010
understand memory usage in Linux
Linux optimizes memory management in several ways, which makes it hard to predict the real memory usage:
(1) Binaries (including libraries) are demand paged: only the parts of a library actually used by a process are loaded from disk.
(2) Two different processes may share the same loaded library.
(3) Writable pages use COW (copy on write). So if you spawn a process, no memory is allocated for the sub-process until it tries to write to a page; only then is a private copy of that page allocated.
If you run top in Linux:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3528 xxx 15 0 584m 233m 8752 S 0.7 11.6 32:40.23 gnome-terminal
Here the virtual size is 584M (which includes all code, data, shared libraries and swapped-out pages) and the resident set size is 233M (the non-swapped physical memory in use). That figure also includes some libraries. You can factor them out by printing the memory map:
%pmap -d 3528
... ...
000000000d046000 272004 rw--- 000000000d046000 000:00000 [ anon ]
... ....
mapped: 606660K writeable/private: 284196K shared: 620K
The biggest mapping is 272M of anonymous memory. What on earth is anonymous memory?
Let's look at smaps instead.
% cat /proc/3528/smaps
...
0d046000-1d9e7000 rw-p 0d046000 00:00 0 [heap]
Size: 272004 kB
Rss: 229712 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 4928 kB
Private_Dirty: 224784 kB
Swap: 40520 kB
So the heap is what is using those 270M. If you are on a 2.6.25 or later kernel, you can also use /proc/$PID/pagemap for a more detailed view.
A few other things can contribute to anonymous memory, such as thread stacks and mmap.
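To total these per-mapping counters instead of reading them by eye, a small parser helps. This is only a sketch in Python, assuming the smaps layout shown above (one header line per mapping followed by "Name: value kB" lines). The names summarize_smaps and SAMPLE are made up for illustration, and SAMPLE is just the [heap] entry from above.

```python
import re

HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")   # e.g. "0d046000-1d9e7000 rw-p ..."
FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")       # e.g. "Rss: 229712 kB"

SAMPLE = """\
0d046000-1d9e7000 rw-p 0d046000 00:00 0 [heap]
Size: 272004 kB
Rss: 229712 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 4928 kB
Private_Dirty: 224784 kB
Swap: 40520 kB
"""


def summarize_smaps(text):
    """Sum the kB counters per mapping name: {name: {field: total_kB}}."""
    totals = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if HEADER.match(line):
            # Columns: range, perms, offset, dev, inode, optional path/name.
            cols = line.split(None, 5)
            name = cols[5] if len(cols) > 5 else "[anon]"
            continue
        m = FIELD.match(line)
        if m and name is not None:
            fields = totals.setdefault(name, {})
            fields[m.group(1)] = fields.get(m.group(1), 0) + int(m.group(2))
    return totals


if __name__ == "__main__":
    for mapping, fields in summarize_smaps(SAMPLE).items():
        print(mapping, fields.get("Rss", 0), "kB resident,",
              fields.get("Swap", 0), "kB swapped")
```

For a live process, pass the contents of /proc/$PID/smaps instead of SAMPLE. In the sample, Private_Clean plus Private_Dirty adds up to Rss, which says the whole resident heap is private to the process.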
linux file soft link vs. hard link
A hard link is another name for the same inode, which is why it cannot cross devices. A soft link is a separate file that stores the target file name as its data.
So if you have two hard-linked names, you can still access the file through the second name even after you delete the original one.
On the other hand, if you have a soft link to a file and you delete the original name, you cannot access the file through the soft link any more, even if the original file still has other hard links.
One more thing: if a file is open in another process when you delete it, the data stays on disk until that process closes it.
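The three behaviours above can be checked in a throwaway directory. Below is a small sketch in Python, assuming a POSIX filesystem that supports both hard links and symlinks; link_demo and the file names are invented for the illustration.

```python
import os
import tempfile


def link_demo():
    """Create a file, a hard link and a soft link, then delete the original."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(original, "w") as f:
            f.write("hello")

        os.link(original, hard)      # second name for the same inode
        os.symlink(original, soft)   # separate file that stores the path

        results["same_inode"] = os.stat(original).st_ino == os.stat(hard).st_ino
        results["link_count"] = os.stat(original).st_nlink
        results["symlink_target"] = os.readlink(soft)

        handle = open(original)      # keep the file open across the deletes
        os.remove(original)

        with open(hard) as f:        # data is still reachable via the hard link
            results["hard_after_delete"] = f.read()
        results["soft_resolves"] = os.path.exists(soft)   # follows the link
        results["soft_is_link"] = os.path.islink(soft)    # the link file itself

        os.remove(hard)              # no names left, but the handle is open
        results["open_handle_after_unlink"] = handle.read()
        handle.close()
    return results


if __name__ == "__main__":
    for key, value in link_demo().items():
        print(key, "=", value)
```

The soft link survives as a file but dangles once the name it points to is gone, while the open handle keeps reading the data even after every name has been removed.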
Thursday, November 11, 2010
jump through jump box
Our network is set up so that you always have to log on to the jump box first and then log on to the other machine (tmachine) from there. Here is the trick to do it in one step:
ssh tmachine -o ProxyCommand="netcat-proxy-command jumpbox tmachine"
Copy folder via jump box:
tar cvzf - . | ssh tmachine -o ProxyCommand="netcat-proxy-command jumpbox tmachine" cat ">" folder.tar.gz
The netcat-proxy-command:
#!/bin/sh
#http://www.hackinglinuxexposed.com/articles/20040830.html
bouncehost=$1
target=$2
ssh $bouncehost nc -w 1 $target 22
Monday, October 25, 2010
firefox 3.6.x proxy ntlm authentication issue
If you are using NTLM proxy authentication, Firefox 3.6.x keeps popping up the password dialog box. This is because Firefox switched from its internal NTLM implementation to the native Windows NTLM API. The workaround is to set network.auth.force-generic-ntlm to true to fall back to the old behavior.
Friday, October 22, 2010
Create a vista Gadget
I got Windows 7 boxes at both home and the office. It is time to play with the gadget sidebar!
As a starter project, I want to create a gadget that monitors a web site and displays a different icon whenever the site changes. For that you need an HTML file, an XML configuration file, and some JavaScript.
(1) How to access the Internet?
There are two ways: use XMLHttpRequest, or create a DLL that wraps the logic. Here I use the first approach.
(2) The regular request never seems to be sent out
This is because of the cache. Add a header like:
xmlhttp.setRequestHeader("If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT");
(3) How to detect a network connection error
You can hook a timeout. Or, if the web server simply has not started up, you will get the status code 12029 (the WinInet error for a failed attempt to connect to the server).
(4) I use:
<body onload="start();window.setInterval('refresh()',5000)" ....>
to request the web site every 5 seconds.
Saturday, October 9, 2010
decrypt the web page encrypted using HTML Guardian
After opening the page, instead of using "view source", type this into the address bar:
javascript:var sorc=document.documentElement.outerHTML;document.open("text/plain");document.write(sorc);