Linux HugeTLBfs: Improve MySQL Database Application Performance

Posted in categories CentOS, Hardware, High performance computing, Howto, MySQL, RedHat/Fedora Linux; last updated May 20, 2009

Applications that access a lot of memory (several GB) may perform better with large pages, because larger pages reduce Translation Lookaside Buffer (TLB) misses. HugeTLBfs is a memory management feature of the Linux kernel that is valuable for applications that use a large virtual address space. It is especially useful for database applications such as MySQL, Oracle and others. Other server software that uses a prefork or similar model (e.g. the Apache web server) will also benefit.

The CPU’s Translation Lookaside Buffer (TLB) is a small cache used for storing virtual-to-physical mapping information. By using the TLB, a translation can be performed without referencing the in-memory page table entry that maps the virtual address. However, to keep translations as fast as possible, the TLB is usually small. It is not uncommon for large memory applications to exceed the mapping capacity of the TLB. Applications can use the huge page support in the Linux kernel through either the mmap system call or the standard SysV shared memory system calls (shmget, shmat).
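
To get a feel for why page size matters, here is a rough sketch of "TLB reach", the amount of memory a TLB can map at once (entries multiplied by page size). The function name tlb_reach_kb and the figure of 64 entries are made up for illustration; real TLB sizes vary by CPU.

```shell
# TLB reach in kB = number of TLB entries * page size in kB.
# The entry count used below (64) is an illustrative assumption,
# not a measured value for any particular CPU.
tlb_reach_kb() {
    entries="$1"
    page_kb="$2"
    echo $(( entries * page_kb ))
}

tlb_reach_kb 64 4      # 4 kB pages: prints 256 (kB)
tlb_reach_kb 64 2048   # 2 MB hugepages: prints 131072 (kB, i.e. 128 MB)
```

With the same number of entries, 2 MB pages cover 512 times more memory than 4 kB pages before a TLB miss is forced.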

Only selected hardware and operating systems support memory pages larger than the default 4KB. The following configuration was tested on 64-bit RHEL 5.3 using a stock kernel, with plenty of RAM and multiple CPUs.

How do I verify that my kernel supports hugepage?

Type the following command:
$ grep -i huge /proc/meminfo
Sample output:

HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

A kernel built with hugepage support shows the number of configured hugepages in the system. If these lines are missing, you need to build the Linux kernel with the CONFIG_HUGETLBFS option (which also selects CONFIG_HUGETLB_PAGE).
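
If you want to script this check, a minimal sketch could look like the following. The function name has_hugepage_support is my own, and the optional file argument exists only so you can try it against saved output; by default it reads /proc/meminfo.

```shell
# Report whether a meminfo-style file shows hugepage support.
# The optional argument is a path to such a file (defaults to /proc/meminfo).
has_hugepage_support() {
    meminfo="${1:-/proc/meminfo}"
    # A kernel with hugepage support exports a Hugepagesize line.
    if grep -q '^Hugepagesize:' "$meminfo" 2>/dev/null; then
        echo "supported"
    else
        echo "not supported"
    fi
}
```

Run has_hugepage_support with no arguments on the server you want to check.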

How do I configure HugeTLBfs?

The HugeTLBfs feature permits an application to use a much larger page size than normal, so that a single TLB entry can map a larger address space. A HugeTLB entry can vary in size. For example, the i386 architecture supports 4K and 4M (2M in PAE mode) page sizes, the ia64 architecture supports multiple page sizes (4K, 8K, 64K, 256K, 1M, 4M, 16M, 256M) and ppc64 supports 4K and 16M. To allocate hugepages, set the number of hugepages in /proc/sys/vm/nr_hugepages, enter:
# sysctl -w vm.nr_hugepages=40
The above command tries to configure 40 hugepages in the system. Now, run the following again:
# grep -i huge /proc/meminfo
Sample output:

HugePages_Total:    40
HugePages_Free:     40
HugePages_Rsvd:      0
Hugepagesize:     2048 kB


  • HugePages_Total: 40 – The size of the pool of hugepages. On a busy server with 16/32GB RAM, you can set this to 512 or higher.
  • HugePages_Free: 40 – The number of hugepages in the pool that are not yet allocated.
  • HugePages_Rsvd: 0 – The number of hugepages for which a commitment to allocate from the pool has been made, but no allocation has yet been made.
  • Hugepagesize: 2048 kB – The size of each hugepage.
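
The fields above can be combined to work out how much memory the pool holds: HugePages_Total multiplied by Hugepagesize. Here is a small sketch; the function name hugepage_pool_mb is hypothetical, and the optional file argument is just for trying it on saved output.

```shell
# Print the total size of the hugepage pool in MB:
# HugePages_Total * Hugepagesize (kB) / 1024.
# The optional argument is a meminfo-style file (defaults to /proc/meminfo).
hugepage_pool_mb() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^HugePages_Total:/ {total = $2}
         /^Hugepagesize:/    {size_kb = $2}
         END {print total * size_kb / 1024}' "$meminfo"
}
```

For the sample output above, 40 pages of 2048 kB give a pool of 80 MB.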

Configure MySQL to use HugeTLBfs

In MySQL, large pages can be used by InnoDB to allocate memory for its buffer pool and additional memory pool. Find the mysql user and group IDs:
# id mysql
Sample output:

uid=27(mysql) gid=27(mysql) groups=27(mysql)
Open /etc/sysctl.conf:
# vi /etc/sysctl.conf
Add the following configuration:

# Set the number of pages to be used.
# Each page is normally 2MB, so a value of 40 = 80MB.
# Set it 512 or higher if you have lots of memory
vm.nr_hugepages = 40

# Set the group number (mysql group number is 27) that is allowed to access this memory. The mysql user must be a member of this group.
vm.hugetlb_shm_group = 27

# Increase the amount of shmem allowed per segment
# This depends upon your memory, remember your
kernel.shmmax = 68719476736

# Increase total amount of shared memory.
kernel.shmall = 4294967296

Save and close the file. Reload settings:
# sysctl -p
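
Rather than picking vm.nr_hugepages arbitrarily, you can derive it from the amount of memory you want backed by hugepages, for example the InnoDB buffer pool size. The following is only a sketch: the function name hugepages_for_mb is made up, and the 2048 kB default assumes the Hugepagesize shown earlier, so check your own /proc/meminfo.

```shell
# Number of hugepages needed to cover a given number of megabytes,
# rounded up to a whole page.
#   $1 = memory to cover, in MB
#   $2 = hugepage size in kB (assumed default: 2048, as in the sample output)
hugepages_for_mb() {
    want_mb="$1"
    page_kb="${2:-2048}"
    echo $(( (want_mb * 1024 + page_kb - 1) / page_kb ))
}

hugepages_for_mb 80     # prints 40
hugepages_for_mb 1024   # prints 512
```

The first call matches the comment in the configuration (40 pages = 80MB); the second shows that 512 pages cover 1 GB.
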
Open /etc/my.cnf:
# vi /etc/my.cnf
Add the large-pages option to the [mysqld] section:

[mysqld]
large-pages
# rest of config...

Save and close the file. Open /etc/security/limits.conf, enter:
# vi /etc/security/limits.conf
Append the following lines to set the max locked-in-memory address space to unlimited for the mysql group:

@mysql      soft    memlock         unlimited
@mysql      hard    memlock         unlimited

Save and close the file. Finally, restart the mysql server:
# /etc/init.d/mysqld restart
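
After the restart it is worth confirming that MySQL actually got its huge pages. When the allocation fails, MySQL 5.0 logs a warning and falls back to normal memory ("Warning: Using conventional memory pool", as a reader's log in the comments shows). Here is a minimal sketch of a check; the function name check_hugetlb_log is my own, and the default log path /var/log/mysqld.log is an assumption taken from that reader's RHEL-style setup.

```shell
# Look for the large-page fallback warning in the MySQL error log.
# Prints "fallback" if the warning is present, "ok" if not,
# and "unreadable" if the log cannot be read.
# The default path is an assumption; pass your own log file as $1.
check_hugetlb_log() {
    log="${1:-/var/log/mysqld.log}"
    if [ ! -r "$log" ]; then
        echo "unreadable"
    elif grep -q 'Using conventional memory pool' "$log"; then
        echo "fallback"
    else
        echo "ok"
    fi
}
```

Keep in mind the log may still contain warnings from earlier starts, so look at the timestamps too. You can also run grep -i huge /proc/meminfo again: HugePages_Free should drop, or HugePages_Rsvd should rise, once mysqld holds the pages.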

A note about mount command option

If your application uses huge pages through the mmap() system call, you have to mount a file system of type hugetlbfs like this:
# mount -t hugetlbfs none /myapp
Another example, with more control over uid, gid and other options:

# mount -t hugetlbfs -o uid={value},gid={value},mode={value},size={value},nr_inodes={value} none /myapp
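
To make the placeholders concrete, here is a small sketch that prints a filled-in mount command instead of running it. The function name hugetlbfs_mount_cmd is hypothetical. The uid and gid of 27 come from the id mysql output earlier; mode 0770 and size 80M are example values I picked, not recommendations.

```shell
# Build (but do not run) a hugetlbfs mount command.
#   $1 = mount point, $2 = uid, $3 = gid, $4 = mode, $5 = size
hugetlbfs_mount_cmd() {
    mountpoint="$1"
    uid="$2"
    gid="$3"
    mode="$4"
    size="$5"
    echo "mount -t hugetlbfs -o uid=${uid},gid=${gid},mode=${mode},size=${size} none ${mountpoint}"
}

# Example values (assumptions): mysql uid/gid 27, mode 0770, 80M limit.
hugetlbfs_mount_cmd /myapp 27 27 0770 80M
```

Review the printed command, then run it as root once the values suit your application.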

Further readings:

  1. Please refer to the kernel documentation in Documentation/vm/hugetlbpage.txt for more information.
  2. MySQL large memory support help page.
  3. man page - mount

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

11 comments

  1. I am implementing this under Ubuntu 9.04 with MySQL 5.0.75-0ubuntu10; a few points I would like to clarify:

    1. the command of “grep -i huge /proc/meminfo” returns the following:

    HugePages_Total: 0
    HugePages_Free: 0
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    Hugepagesize: 4096 kB

    2. my.cnf should be located under /etc/mysql/my.cnf

    3. the command to restart mysql should be # /etc/init.d/mysql restart

    1. The FreeBSD virtual memory subsystem now supports fully transparent use of superpages for application memory; application memory pages are dynamically promoted to or demoted from superpages without any modification to application code. This change offers the benefit of large page sizes such as improved virtual memory efficiency and reduced TLB (translation lookaside buffer) misses without downsides like application changes and virtual memory inflexibility. This is disabled by default and can be enabled by setting a loader tunable vm.pmap.pg_ps_enabled to 1. Add vm.pmap.pg_ps_enabled=1 to /boot/loader.conf.

      However, I’ve not tested it with MySQL or anything else…


  2. I got an Innodb Error message:
    [[email protected] ~]# tail -n 10 /var/log/mysqld.log
    090523 01:58:57 mysqld started
    InnoDB: HugeTLB: Warning: Failed to allocate 8404992 bytes. errno 12
    InnoDB HugeTLB: Warning: Using conventional memory pool
    090523 1:58:57 InnoDB: Started; log sequence number 0 43655
    Warning: Failed to allocate 8388608 bytes from HugeTLB memory. errno 12
    Warning: Using conventional memory pool
    090523 1:58:57 [Note] /usr/libexec/mysqld: ready for connections.
    Version: ‘5.0.45’ socket: ‘/var/lib/mysql/mysql.sock’ port: 3306 Source distribution

  3. nonsense…

    have you noticed 68719476736 translates to 65gb?

    huge pages are only supported by the innodb engine , and vm.nr_hugepages should be set to match the memory used (i.e. buffer pool size) by it accordingly , not some arbitrary value as you’re suggesting.

    And why raise the shared memory limit to 16gb if you’re going to use only 1 gb (512 pages) (8gb is the default in linux 2.6).

    your post does not provide any useful information over what is already in the mysql documentation, and by omitting critical details it is actually harmful. If you’re going to blindly copy+paste existing documentation as if it was new, at least get it right.

    The mysql docs are at, for those interested

  4. Hi , thanks for sharing the config guide on large page size for mysql . I followed your guide and had a few questions and was wondering if you would be kind enough to help me understand better .this is how my kernel parm. look like
    [[email protected] /]# sysctl -p
    net.ipv4.ip_forward = 0
    net.ipv4.conf.default.rp_filter = 1
    net.ipv4.conf.default.accept_source_route = 0
    kernel.sysrq = 0
    kernel.core_uses_pid = 1
    net.ipv4.tcp_syncookies = 1
    kernel.msgmnb = 65536
    kernel.msgmax = 65536
    kernel.shmmax = 68719476736
    kernel.shmall = 4294967296
    vm.nr_hugepages = 512
    vm.hugetlb_shm_group = 27
    vm.nr_hugepages = 1024

    i have set vm.nr_hugepages value to 512 and 1024 MB and added mysql group 27. Now i am not able to figure out what the values of kernel.shmmax and kernel.shmall should be. also, this is what the mysql variables show:

    mysql> show variables like ‘%large%’;
    | Variable_name | Value |
    | large_files_support | ON |
    | large_page_size | 2097152 |
    | large_pages | ON |
    what does 2097152 indicate ? please help me understand . thank you in advance 🙂

  5. @ Taylan Develioglu

    Right comment however I think these articles are supposed to be hints and then people should understand what and why and tune it to reflect their needs. Copying any configuration file/how to with no understanding (in general) is just the worst.

  6. I really do have a question.
    Can I control the size of the TLB from Linux? I understand that it is like a page table cache and is controlled entirely by the CPU.
    But I really want to play around with the TLB and try making some changes to it. Does anybody here have any idea how to do that? It does not have to be on Linux; it could be on one of the new OSes for multi/many-core processors.
