Linux Tuning The VM (memory) Subsystem

October 16, 2009 · last updated October 16, 2009

I have a fast RAID-10 disk subsystem with multiple SCSI disks. Applications running under a modern Linux kernel don't write directly to the disk; they write to the file system cache, which is managed by the Linux kernel's virtual memory manager. Since I have a high-performance RAID controller, I want to decrease the number of flushes. How do I tune the virtual memory subsystem under Linux for better performance?

Linux allows you to tune the VM subsystem. However, tuning the memory subsystem is a challenging task, and wrong settings can hurt the overall performance of your system. I suggest you modify one setting at a time and monitor your system for a while. If performance improves, keep the setting; otherwise revert it.

Say Hello To /proc/sys/vm

The files in this directory can be used to tune the operation of the virtual memory (VM) subsystem of the Linux kernel:
ls -l /proc/sys/vm

Sample outputs:

total 0
-rw-r--r-- 1 root root 0 Oct 16 04:21 block_dump
-rw-r--r-- 1 root root 0 Oct 16 04:21 dirty_background_ratio
-rw-r--r-- 1 root root 0 Oct 16 04:21 dirty_expire_centisecs
-rw-r--r-- 1 root root 0 Oct 16 04:21 dirty_ratio
-rw-r--r-- 1 root root 0 Oct 16 04:21 dirty_writeback_centisecs
-rw-r--r-- 1 root root 0 Oct 16 04:21 drop_caches
-rw-r--r-- 1 root root 0 Oct 16 04:21 flush_mmap_pages
-rw-r--r-- 1 root root 0 Oct 16 04:21 hugetlb_shm_group
-rw-r--r-- 1 root root 0 Oct 16 04:21 laptop_mode
-rw-r--r-- 1 root root 0 Oct 16 04:21 legacy_va_layout
-rw-r--r-- 1 root root 0 Oct 16 04:21 lowmem_reserve_ratio
-rw-r--r-- 1 root root 0 Oct 16 04:21 max_map_count
-rw-r--r-- 1 root root 0 Oct 16 04:21 max_writeback_pages
-rw-r--r-- 1 root root 0 Oct 16 04:21 min_free_kbytes
-rw-r--r-- 1 root root 0 Oct 16 04:21 min_slab_ratio
-rw-r--r-- 1 root root 0 Oct 16 04:21 min_unmapped_ratio
-rw-r--r-- 1 root root 0 Oct 16 04:21 mmap_min_addr
-rw-r--r-- 1 root root 0 Oct 16 04:21 nr_hugepages
-r--r--r-- 1 root root 0 Oct 16 04:21 nr_pdflush_threads
-rw-r--r-- 1 root root 0 Oct 16 04:21 overcommit_memory
-rw-r--r-- 1 root root 0 Oct 16 04:21 overcommit_ratio
-rw-r--r-- 1 root root 0 Oct 16 04:21 pagecache
-rw-r--r-- 1 root root 0 Oct 16 04:21 page-cluster
-rw-r--r-- 1 root root 0 Oct 16 04:21 panic_on_oom
-rw-r--r-- 1 root root 0 Oct 16 04:21 percpu_pagelist_fraction
-rw-r--r-- 1 root root 0 Oct 16 04:21 swappiness
-rw-r--r-- 1 root root 0 Oct 16 04:21 swap_token_timeout
-rw-r--r-- 1 root root 0 Oct 16 04:21 vfs_cache_pressure
-rw-r--r-- 1 root root 0 Oct 16 04:21 zone_reclaim_mode
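Before changing any of these knobs, it helps to see how much dirty (not yet written back) data the kernel is currently holding. A quick way to check, using only /proc:

```shell
# Show the current amount of dirty data and data under writeback (in kB).
# Re-run this while your workload runs to see how the cache behaves.
grep -E '^(Dirty|Writeback):' /proc/meminfo
```

Watching these counters before and after a tuning change is the simplest way to verify the change actually altered flushing behavior.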

pdflush

Type the following command to see the current background writeback threshold used by pdflush:
# sysctl vm.dirty_background_ratio
Sample outputs:

vm.dirty_background_ratio = 10

vm.dirty_background_ratio contains 10, a percentage of total system memory; once dirty pages exceed this percentage, the pdflush background writeback daemon starts writing out dirty data. However, on a fast RAID-based disk system this may cause large flushes of dirty memory pages. Increasing this value from 10 to 20 (a larger value) results in less frequent flushes:
# sysctl -w vm.dirty_background_ratio=20
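Since the ratio is a percentage of total memory, you can translate it into an approximate byte threshold for your own box. A small sketch (read-only, no root needed) that computes where background writeback kicks in:

```shell
#!/bin/sh
# Sketch: estimate the dirty-data threshold (in kB) at which background
# writeback starts, from vm.dirty_background_ratio and MemTotal.
ratio=$(cat /proc/sys/vm/dirty_background_ratio)
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "Background writeback starts near $(( total_kb * ratio / 100 )) kB of dirty data"
```

This is an approximation; the kernel's exact accounting differs slightly, but it gives a useful sense of scale when choosing a ratio.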

swappiness

Type the following command to see the current default value:
# sysctl vm.swappiness
Sample outputs:

vm.swappiness = 60

The value 60 defines how aggressively memory pages are swapped to disk. If you want to avoid swapping, lower this value. However, if your system's processes sleep for long periods, you may benefit from more aggressive swapping by increasing this value. For example, you can change the swapping behavior by increasing or decreasing the value:

# sysctl -w vm.swappiness=100
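The same value is exposed directly in /proc, so you can inspect it without root. A minimal sketch that reads it and reports what the current setting implies (the 60 cutoff below is just the kernel's default, used here as an illustrative dividing line):

```shell
#!/bin/sh
# Read vm.swappiness straight from /proc (equivalent to `sysctl -n vm.swappiness`)
# and summarize what the value means.
s=$(cat /proc/sys/vm/swappiness)
if [ "$s" -ge 60 ]; then
  echo "swappiness=$s: kernel will swap fairly aggressively"
else
  echo "swappiness=$s: kernel prefers dropping page cache over swapping"
fi
```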

dirty_ratio

Type the following command:
# sysctl vm.dirty_ratio
Sample outputs:

vm.dirty_ratio = 40

The value 40 is a percentage of total system memory; once a process generating disk writes has dirtied that many pages, it will itself start writing out dirty data. In other words, this is the ratio at which dirty pages created by application disk writes are flushed out to disk. A value of 40 means that data will accumulate in system memory until the file system cache reaches 40% of the server's RAM. So if you have 12GB of RAM, data will accumulate until the file system cache reaches about 4.8GB. You can change the dirty ratio as follows:
# sysctl -w vm.dirty_ratio=25
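The 12GB example above is easy to verify with shell arithmetic (POSIX shell arithmetic is integer-only, so the sketch works in MB):

```shell
#!/bin/sh
# Worked example: dirty_ratio=40 on a 12 GB machine allows roughly
# 4.8 GB of dirty file system cache before writers are throttled.
ram_gb=12
ratio=40
echo "$(( ram_gb * 1024 * ratio / 100 )) MB"   # prints: 4915 MB (about 4.8 GB)
```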

Making Changes To VM Permanently

You need to add the settings to /etc/sysctl.conf. See our previous FAQ on making changes to the /proc filesystem permanent.
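For example, a sketch of persisting the settings used in this article (run as root; the specific values are the article's examples, not universal recommendations):

```shell
# Append the tuned values to /etc/sysctl.conf, then reload the settings.
cat >> /etc/sysctl.conf <<'EOF'
vm.dirty_background_ratio = 20
vm.dirty_ratio = 25
EOF
sysctl -p
```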

References:

  1. The /proc filesystem
  2. The sysctl(8) man page

Comments

1 Philippe Petrinko October 16, 2009 at 11:03 am

In “Say Hello” chapter: Typo: may be: “ls -al /proc/sys/vm” instead of just: “ls -l” ?


2 nixCraft October 16, 2009 at 11:43 am

Thanks for the heads up!


3 Philippe Petrinko October 16, 2009 at 11:54 am

Fine. But concerning your “cd” and then “ls”, I am afraid I disagree: it is better to keep it short and simple. That said, “ls -al /proc/sys/vm” is better than “cd” and then “ls”, as long as none of your later commands need to be run in that directory.
[Your site is read by many, so please see my concern as an expression of my admiration for your work. Teach them the KISS principle ;-) ]


4 Sheldon October 18, 2009 at 10:44 pm

You need to be careful with increasing these values. You may flush less frequently, but when you do you’ll have more data to write to disk and it will take longer, which could impact system latency and interactivity. Also, if your system crashes while more unflushed data is sitting in RAM, you risk losing that data.


5 Alex April 23, 2010 at 4:18 pm

Some of these changes made some absolutely remarkable improvements in performance on a few heavily-loaded mail servers and large berkeley db’s storing bayes data. Well worth experimenting. I lowered cache pressure to 40 and set fs.aio-max-nr = 1048576.


6 Petre January 6, 2014 at 12:37 am

Alex, I am a student at Columbia University doing research on the Linux kernel that involves trying to find the best values for the tuning parameters of the VM subsystem. I would like to ask you for more details about the increase in performance after you changed the two parameters you mentioned. I would really appreciate it!
