With today’s big disks and extremely powerful CPUs, memory is probably the most expensive resource. On virtual hosts, CPU power can also be a problem. I have been running this site at Linode for a year now with no big problems; the only ones came from a lack of memory, which drove CPU usage up until I had to restart the server.

Because of that experience I started looking into vmstat. You can read the vmstat man page at any time.

First let’s check how memory works in Linux. Memory is made of two components: physical memory and swap (on disk). As you may know, the disk is much slower than the physical memory of your server, so we all want to avoid memory paging or swapping.

So, we will use vmstat to check the status of the memory in our server and determine if we need to add more memory to it.

Syntax

   vmstat [-a] [-n] [delay [ count]]
   vmstat [-f] [-s] [-m]
   vmstat [-S unit]
   vmstat [-d]
   vmstat [-D]
   vmstat [-p disk partition]
   vmstat [-V]

We will focus on memory and CPU, so the syntax we will use most is:

vmstat [delay [count]]

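Where delay is the time in seconds between samples and count is the number of samples. If you give a delay but omit count, vmstat runs continuously until you stop it. Keep in mind that the first line of a report shows averages since the last reboot; the following lines cover each sampling interval.

This is a sample of the output of this command: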

vmstat 5 10

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  1      0 986900 103532 492096    0    0    24    31  320  943 10  2 87  1
 0  0      0 986760 103536 492212    0    0    21     0  714 1706  7  2 91  0
 0  0      0 986760 103544 492212    0    0     0     4  877 3647 13  3 85  0
 0  0      0 985884 103552 492232    0    0     0    32  913 3435 11  3 86  0
 0  0      0 984644 103564 492232    0    0     0    16  752 2002 12  3 85  0
 1  0      0 983528 103576 492232    0    0     0    64  667 1816  6  1 93  0
 1  0      0 984768 103588 492232    0    0     0     9  767 1979 12  3 85  0
 1  0      0 984892 103592 492236    0    0     0    34  773 2670  8  3 89  0
 1  0      0 979064 103604 492392    0    0    32    10  901 3133 14  4 82  0
 0  0      0 984396 103612 492388    0    0     0    90  942 3215 19  3 77  0
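
The columns of this output are not always in the same place: the -a option replaces buff and cache with inact and active, and some newer versions of vmstat add more columns at the end. So if you want to pull one column out in a script, it may be safer to look it up by its header name. This is only a minimal sketch; vmstat_col is a helper name I made up, not part of vmstat.

```shell
# vmstat_col NAME: read vmstat output on stdin and print one column,
# chosen by its header name (for example free, si or id).
# vmstat_col is a hypothetical helper, not a vmstat feature.
vmstat_col() {
    awk -v want="$1" '
        $1 == "procs" { next }            # banner line above the field names
        $1 == "r" {                       # field-name line: locate the wanted column
            col = 0
            for (i = 1; i <= NF; i++) if ($i == want) col = i
            next
        }
        col > 0 && $1 ~ /^[0-9]+$/ { print $col }   # data rows only
    '
}

# Demo on two rows copied from the sample output above.
vmstat_col free <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  1      0 986900 103532 492096    0    0    24    31  320  943 10  2 87  1
 0  0      0 986760 103536 492212    0    0    21     0  714 1706  7  2 91  0
EOF
```

On a live system the same function would be used as "vmstat 5 10 | vmstat_col free".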

Now let’s describe the fields under memory, swap and CPU.

Memory

  • swpd: the amount of virtual memory used.
  • This is the portion of the swap partition or swap file in use. Having some swap in use is not bad at all, and it is quite normal.
  • free: the amount of idle memory.
  • Your free physical memory, the same figure the free command reports. A value close to 0 is not a problem by itself, because Linux fills otherwise unused memory with buffers and cache; it becomes a bad sign when buff and cache are small too.
  • buff: the amount of memory used as buffers.
  • Memory the kernel uses to buffer raw disk blocks, such as file system metadata. It is only used while physical memory is available, and it is given back when programs need it.
  • cache: the amount of memory used as cache.
  • The page cache, where Linux keeps the contents of files. When data is read from the disk, the kernel tries to keep it in memory in case of future need (that is why, if you open OpenOffice, close it and open it again, the second time it starts faster). Data that is intended to go to the disk is also put in this cache first and then flushed to the disk in the background, because memory is faster than the disk; if the process that wrote the data needs it again, it is read from the cache.
  • inact: the amount of inactive memory. (-a option)
  • active: the amount of active memory. (-a option)

Swap

  • si: Amount of memory swapped in from disk (/s).
  • so: Amount of memory swapped to disk (/s).

So, where should you look to detect memory issues in your Linux system?

A good place to start is free memory, but keep in mind that buffered and cached memory is also “free” in a sense, because the kernel gives it back when programs need it, so do not rush to buy more memory yet. The most important columns to check are probably si and so: they show the data moved in from swap and out to swap per second. If these counters are constantly above zero, your system is constantly paging memory to the disk, which of course means that you need more memory, or fewer processes running at a time.
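
If you would rather not watch the numbers scroll by, the check described above can be sketched with awk. This is only a rough sketch: mem_pressure is a name I made up, it assumes the default column layout shown in this article (free, buff, cache, si and so in columns 4 to 8), and it skips the first row because its si and so values are averages since boot.

```shell
# mem_pressure: summarise vmstat output read from stdin.
# Hypothetical helper; assumes the default layout from this article:
#   $4=free $5=buff $6=cache $7=si $8=so
mem_pressure() {
    awk '
        $1 !~ /^[0-9]+$/ { next }     # skip the header lines
        !seen++ { next }              # skip first row (averages since boot)
        {
            n++
            if ($7 + $8 > 0) swapping++          # any swap traffic in this sample
            avail = $4 + $5 + $6                 # free plus reclaimable memory
            if (n == 1 || avail < min) min = avail
        }
        END {
            printf "samples=%d swapping=%d min_free_buff_cache=%d\n", n, swapping, min
        }
    '
}

# Demo on the first three rows of the sample output in this article.
mem_pressure <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  1      0 986900 103532 492096    0    0    24    31  320  943 10  2 87  1
 0  0      0 986760 103536 492212    0    0    21     0  714 1706  7  2 91  0
 0  0      0 986760 103544 492212    0    0     0     4  877 3647 13  3 85  0
EOF
```

For the demo rows this reports two samples, none of them swapping. On a live system you would run something like "vmstat 5 10 | mem_pressure"; a swapping count close to the number of samples is the pattern to worry about.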

You can find the culprit of this paging with this command, which lists the processes sorted by memory usage.

ps -eo pmem,pcpu,args --sort=-pmem | less

This will have an output like this.

%MEM %CPU COMMAND
10.3  9.9 /usr/bin/firefox
 4.2  2.8 X :0
 3.8  3.6 kwin -session 1061726368000124320935300000029140000_1264439629_828678
 2.8  0.1 kdeinit4: plasma-desktop [kdeinit]
 2.2  0.4 kdeinit4: konsole [kdeinit]
 2.0  0.1 /usr/lib/chromium/chromium
 2.0  0.0 kdeinit4: krunner [kdeinit]
 1.8  2.9 /usr/bin/knotify4
 1.7  0.0 kdeinit4: kmix [kdeinit] -session 1061726368000124011843900000031580010_126443962
 1.6  0.0 /usr/bin/virtuoso-t +foreground +config /tmp/virtuoso_hX1798.ini +wait
 1.6  0.0 kdeinit4: kded4 [kdeinit]
 1.5  0.0 /usr/lib/chromium/chromium --channel=13877.b2454868.257731315 --type=renderer --lang=en-US --force-fieldtest=AsyncSlowStart/_AsyncSlowStart/CacheSize/CacheSizeGroup_0/DnsImpact/_disabled_prefetch/GlobalSdch/_global_enable_sdch/SocketLateBinding/_disable_late_binding/
 1.4  0.0 okular file:///tmp/redp4285.pdf -caption
 1.4  0.0 kdeinit4: ksmserver [kdeinit]
 1.4  0.0 kdeinit4: klipper [kdeinit]
 1.4  0.0 kdeinit4: kglobalaccel [kdeinit]
 1.3  0.0 kdeinit4: kdeinit4 Running...
 1.3  0.0 kdeinit4: kaccess [kdeinit]
 1.2  0.0 kdeinit4: nepomukserver [kdeinit]
 1.1  0.0 /usr/bin/nepomukservicestub nepomukstrigiservice
 1.1  0.0 kdeinit4: klauncher [kdeinit] --fd=10
 0.8  0.0 /usr/bin/nepomukservicestub nepomukontologyloader
 0.7  0.0 /usr/bin/nepomukservicestub nepomukstorage
 0.7  0.0 /usr/bin/korgac -icon korgac
 0.6  0.0 /usr/lib/chromium/chromium --type=zygote
 0.6  0.0 /usr/bin/nepomukservicestub nepomukremovablestorageservice
 0.6  0.0 /usr/bin/nepomukservicestub nepomukqueryservice
 0.6  0.0 /usr/bin/nepomukservicestub nepomukfilewatch
 0.3  0.0 /usr/bin/kwrited
 0.2  0.0 /usr/sbin/hald
 0.1  0.0 /usr/sbin/console-kit-daemon --no-daemon
 0.1  0.0 /usr/lib/polkit-1/polkitd
 0.1  0.0 /usr/lib/gvfs/gvfs-gphoto2-volume-monitor
 0.1  0.0 /usr/lib/gvfs/gvfs-gdu-volume-monitor
 0.1  0.0 /usr/lib/gvfs//gvfs-fuse-daemon /home/ggarron/.gvfs
 0.1  0.0 /usr/lib/GConf/gconfd-2
 0.1  0.0 /usr/lib/DeviceKit/devkit-disks-daemon
 0.1  0.0 /usr/lib/chromium/chromium
 0.1  0.0 ssh -l root linux.alketech.com
 0.1  0.0 ssh 69.164.209.24 -l root
 0.0  0.0 xinit
 0.0  0.0 xclip
 0.0  0.0 [watchdog/1]
 0.0  0.0 [watchdog/0]
 0.0  0.0 /usr/sbin/syslog-ng
 0.0  0.0 /usr/sbin/sshd
 0.0  0.0 /usr/sbin/crond -S -l info
 0.0  0.0 /usr/lib/kde4/libexec/start_kdeinit +kcminit_startup
 0.0  0.0 /usr/lib/gvfs/gvfsd
 0.0  0.0 /usr/bin/ssh-agent -s
 0.0  0.0 /usr/bin/gpg-agent --daemon --pinentry-program /usr/bin/pinentry-qt4
 0.0  0.0 /usr/bin/dbus-daemon --system
 0.0  0.0 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
 0.0  0.0 [sync_supers]
 0.0  0.0 supervising syslog-ng
 0.0  0.0 sort -k 1 -r
 0.0  0.0 [scsi_eh_7]
 0.0  0.0 [scsi_eh_6]
 0.0  0.0 [scsi_eh_5]
 0.0  0.0 [scsi_eh_4]
 0.0  0.0 [scsi_eh_3]
 0.0  0.0 [scsi_eh_2]
 0.0  0.0 [scsi_eh_1]
 0.0  0.0 [scsi_eh_0]
 0.0  0.0 /sbin/udevd --daemon
 0.0  0.0 /sbin/udevd --daemon
 0.0  0.0 /sbin/udevd --daemon
 0.0  0.0 /sbin/agetty -8 38400 tty6 linux
 0.0  0.0 /sbin/agetty -8 38400 tty5 linux
 0.0  0.0 /sbin/agetty -8 38400 tty4 linux
 0.0  0.0 /sbin/agetty -8 38400 tty3 linux
 0.0  0.0 /sbin/agetty -8 38400 tty2 linux
 0.0  0.0 ps -eo pmem,pcpu,args
 0.0  0.0 [pm]
 0.0  0.0 [migration/1]
 0.0  0.0 [migration/0]
 0.0  0.0 kwrapper4 ksmserver
 0.0  0.0 [kthreadd]
 0.0  0.0 [kswapd0]
 0.0  0.0 [ksuspend_usbd]
 0.0  0.0 [ksoftirqd/1]
 0.0  0.0 [ksoftirqd/0]
 0.0  0.0 [ksmd]
 0.0  0.0 [kseriod]
 0.0  0.0 [kpsmoused]
 0.0  0.0 [kjournald]
 0.0  0.0 [khungtaskd]
 0.0  0.0 [khubd]
 0.0  0.0 [khelper]
 0.0  0.0 [kblockd/1]
 0.0  0.0 [kblockd/0]
 0.0  0.0 [kacpi_notify]
 0.0  0.0 [kacpi_hotplug]
 0.0  0.0 [kacpid]
 0.0  0.0 ini 
 0.0  0.0 [hd-audio0]
 0.0  0.0 hald-runner
 0.0  0.0 hald-addon-storage: polling /dev/sr0 (every 2 sec)
 0.0  0.0 hald-addon-input: Listening on /dev/input/event1 /dev/input/event3 /dev/input/event4 /dev/input/event13
 0.0  0.0 hald-addon-acpi: listening on acpi kernel interface /proc/acpi/event
 0.0  0.0 [flush-8:32]
 0.0  0.0 [events/1]
 0.0  0.0 [events/0]
 0.0  0.0 devkit-disks-daemon: polling /dev/sr0 
 0.0  0.0 dbus-launch --sh-syntax --exit-with-session
 0.0  0.0 [crypto/1]
 0.0  0.0 [crypto/0]
 0.0  0.0 /bin/sh /usr/bin/startkde
 0.0  0.0 /bin/login --        
 0.0  0.0 /bin/bash
 0.0  0.0 /bin/bash
 0.0  0.0 /bin/bash
 0.0  0.0 /bin/bash
 0.0  0.0 /bin/bash
 0.0  0.0 /bin/bash
 0.0  0.0 /bin/bash
 0.0  0.0 [bdi-default]
 0.0  0.0 -bash
 0.0  0.0 [ata_aux]
 0.0  0.0 [ata/1]
 0.0  0.0 [ata/0]
 0.0  0.0 [async/mgr]
 0.0  0.0 [aio/1]
 0.0  0.0 [aio/0]

So you can see which process is “eating” your memory. You can also use htop: once it is on your screen, press F6 and select MEM% as the sort column. If you prefer top, once it is running press “F” or “O” (without the quotes) and then “n” to sort by the memory column (in most versions of top, Shift+M does the same). It will have an output like this.

Tasks: 127 total,   2 running, 125 sleeping,   0 stopped,   0 zombie
Cpu(s):  7.9%us,  1.9%sy,  0.7%ni, 89.1%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2064156k total,  1408676k used,   655480k free,   122628k buffers
Swap:  2650684k total,        0k used,  2650684k free,   681644k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1916 ggarron   20   0  540m 211m  40m R   22 10.5  37:48.52 firefox
 1773 ggarron   20   0  351m  78m  32m S    8  3.9  14:05.72 kwin
 1695 root      19  -1  246m  85m 9752 S    6  4.2  11:15.78 X
    1 root      20   0  1688  612  544 S    0  0.0   0:00.78 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
    4 root      20   0     0    0    0 S    0  0.0   0:00.36 ksoftirqd/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
    6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    7 root      20   0     0    0    0 S    0  0.0   0:00.06 ksoftirqd/1
    8 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    9 root      20   0     0    0    0 S    0  0.0   0:00.20 events/0
   10 root      20   0     0    0    0 S    0  0.0   0:00.04 events/1
   11 root      20   0     0    0    0 S    0  0.0   0:00.00 khelper
   12 root      20   0     0    0    0 S    0  0.0   0:00.00 async/mgr
   13 root      20   0     0    0    0 S    0  0.0   0:00.00 pm
   14 root      20   0     0    0    0 S    0  0.0   0:00.00 sync_supers
   15 root      20   0     0    0    0 S    0  0.0   0:00.00 bdi-default
   16 root      20   0     0    0    0 S    0  0.0   0:00.00 kblockd/0
   17 root      20   0     0    0    0 S    0  0.0   0:00.00 kblockd/1
   18 root      20   0     0    0    0 S    0  0.0   0:00.00 kacpid
   19 root      20   0     0    0    0 S    0  0.0   0:00.00 kacpi_notify
   20 root      20   0     0    0    0 S    0  0.0   0:00.00 kacpi_hotplug
   21 root      20   0     0    0    0 S    0  0.0   0:00.02 kseriod
   24 root      20   0     0    0    0 S    0  0.0   0:00.00 khungtaskd
   25 root      20   0     0    0    0 S    0  0.0   0:00.00 kswapd0
   26 root      25   5     0    0    0 S    0  0.0   0:00.00 ksmd
   27 root      20   0     0    0    0 S    0  0.0   0:00.00 aio/0
   28 root      20   0     0    0    0 S    0  0.0   0:00.00 aio/1
   29 root      20   0     0    0    0 S    0  0.0   0:00.00 crypto/0
   30 root      20   0     0    0    0 S    0  0.0   0:00.00 crypto/1
  504 root      20   0     0    0    0 S    0  0.0   0:03.27 ata/0
  510 root      20   0     0    0    0 S    0  0.0   0:00.06 ata/1
  519 root      20   0     0    0    0 S    0  0.0   0:00.00 ata_aux
  543 root      20   0     0    0    0 S    0  0.0   0:04.38 scsi_eh_0
  560 root      20   0     0    0    0 S    0  0.0   0:00.00 scsi_eh_1
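
Going back to the ps listing, one thing it hides is that some programs run as several processes (chromium appears four times there), so the real total per program is spread over several lines. A rough way to add them up is sketched below. mem_by_program is a name I made up, and it simply groups by the first word of the command line, which is a crude assumption: every kdeinit4 child, for example, ends up under the same name.

```shell
# mem_by_program: read "ps -eo pmem,pcpu,args" output on stdin and print
# the summed %MEM per program, biggest first.
# Hypothetical helper; groups by the first word of the command line.
mem_by_program() {
    LC_ALL=C awk '
        $1 == "%MEM" { next }                      # skip the ps header
        NF < 3 { next }                            # ignore malformed lines
        {
            name = $3
            if (name ~ /^\//) sub(/.*\//, "", name)   # /usr/bin/firefox -> firefox
            sub(/:$/, "", name)                       # kdeinit4: -> kdeinit4
            total[name] += $1
        }
        END { for (n in total) printf "%.1f %s\n", total[n], n }
    ' | LC_ALL=C sort -k1,1nr -k2,2
}

# Demo on a few lines taken from the ps output in this article.
mem_by_program <<'EOF'
%MEM %CPU COMMAND
 2.0  0.1 /usr/lib/chromium/chromium
 1.5  0.0 /usr/lib/chromium/chromium --type=renderer
10.3  9.9 /usr/bin/firefox
 2.8  0.1 kdeinit4: plasma-desktop [kdeinit]
EOF
```

On a live system it would be fed directly from ps, as in "ps -eo pmem,pcpu,args | mem_by_program | less".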

Now for the CPU, these are the fields in vmstat.

All fields are percentages of total CPU time.

  • us: Time spent running non-kernel code. (user time, including nice time)
  • sy: Time spent running kernel code. (system time)
  • id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.

If, while running vmstat with a delay, you see that the idle time is constantly near 0, you may have a problem. Look at the other columns to see what kind: if wa is high and si and so are above zero, a program is probably eating too much memory and the system is busy paging from memory to swap and back again; if us or sy is high instead, some process is simply keeping the CPU busy. Either way, you can once again use ps (as shown above), htop or top to find it (remember to sort by CPU usage). The ps command to sort by CPU looks like this:

ps -eo pcpu,pmem,args --sort=-pcpu | less
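
To check the idle column without watching the screen, the same awk approach can be used. Again this is only a sketch under assumptions: cpu_idle_summary is a name I made up, the 10 percent default threshold is an arbitrary choice of mine, and it expects the default layout shown in this article, where id is column 15.

```shell
# cpu_idle_summary [THRESHOLD]: read vmstat output on stdin and report how
# many samples had idle time below THRESHOLD percent (default 10).
# Hypothetical helper; assumes the default layout from this article ($15=id).
cpu_idle_summary() {
    awk -v limit="${1:-10}" '
        $1 !~ /^[0-9]+$/ { next }     # skip the header lines
        !seen++ { next }              # skip first row (averages since boot)
        {
            n++
            idle = $15 + 0
            if (idle < limit + 0) low++
            if (n == 1 || idle < min) min = idle
        }
        END { printf "samples=%d low_idle=%d min_idle=%d\n", n, low, min }
    '
}

# Demo on the first three rows of the sample output in this article.
cpu_idle_summary <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  1      0 986900 103532 492096    0    0    24    31  320  943 10  2 87  1
 0  0      0 986760 103536 492212    0    0    21     0  714 1706  7  2 91  0
 0  0      0 986760 103544 492212    0    0     0     4  877 3647 13  3 85  0
EOF
```

On a live system it would be used as "vmstat 5 10 | cpu_idle_summary", or with another threshold as "vmstat 5 10 | cpu_idle_summary 20".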