Kernel Samepage Merging

In the past, I sometimes noticed that memory use on my cloud server, which runs around 12 virtual machines, would decrease a few hours after an update and reboot. The reason is that the kernel looks for duplicate pages and combines them. And when running 12 virtual machines, most with the same operating system and applications, a lot of optimization is possible. That’s about all I knew about it until now. Recently, however, I stumbled across the kernel feature that performs this optimization and reports interesting details to userspace upon request: Kernel Samepage Merging (KSM).

The mechanism is described on kernel.org, and here’s what I found on my cloud server with the aforementioned 12 VMs:

  • cat /sys/kernel/mm/ksm/pages_shared
    256448
  • cat /sys/kernel/mm/ksm/pages_sharing
    1249897
  • cat /sys/kernel/mm/ksm/pages_unshared
    2143748
  • cat /sys/kernel/mm/ksm/full_scans
    143
  • cat /sys/kernel/mm/ksm/pages_to_scan
    100
  • cat /sys/kernel/mm/ksm/sleep_millisecs
    200
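
To avoid typing six cat commands, the same counters can be collected in one go. The following is a minimal Python sketch, assuming the counters live under /sys/kernel/mm/ksm as shown above; the function name read_ksm_counters and the FIELDS tuple are illustrative names of my own, not part of any kernel tooling.

```python
from pathlib import Path

# Standard sysfs location of the KSM counters, as used in the cat commands above.
KSM_DIR = Path("/sys/kernel/mm/ksm")

# The counters looked at in this post (illustrative selection).
FIELDS = (
    "pages_shared",
    "pages_sharing",
    "pages_unshared",
    "full_scans",
    "pages_to_scan",
    "sleep_millisecs",
)


def read_ksm_counters(base_dir=KSM_DIR, fields=FIELDS):
    """Return a dict mapping each counter name to its integer value.

    Counters that are missing or do not contain an integer are skipped,
    so the function also works on kernels that lack some of the files.
    """
    counters = {}
    for name in fields:
        try:
            counters[name] = int((Path(base_dir) / name).read_text().strip())
        except (OSError, ValueError):
            continue
    return counters


if __name__ == "__main__":
    for name, value in read_ksm_counters().items():
        print(f"{name}: {value}")
```

On a machine without KSM support, the directory does not exist and the function simply returns an empty dict.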

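The values above indicate that KSM is currently using 256448 shared pages and that another 1249897 page references point to those shared pages instead of occupying memory of their own. According to the kernel documentation, pages_sharing is therefore the number of pages saved. As my server uses a page size of 4 KiB, that’s close to 6 GB of RAM that has been compacted into around 1 GB of RAM, i.e. a ratio of roughly 1:6 and almost 5 GB less RAM used. Very nice!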
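
The arithmetic behind these numbers can be sketched in a few lines of Python. The sketch follows the kernel documentation's reading of the counters, where pages_shared counts the pages that are actually kept in memory and pages_sharing counts the additional references to them, i.e. the pages saved. The function name ksm_savings and the 4 KiB default page size are assumptions for illustration.

```python
GIB = 2**30


def ksm_savings(pages_shared, pages_sharing, page_size=4096):
    """Translate the two KSM counters into memory figures (in GiB).

    pages_shared:  pages KSM keeps in memory as the single shared copy
    pages_sharing: additional references to those pages, i.e. pages saved
    """
    backing_bytes = pages_shared * page_size
    saved_bytes = pages_sharing * page_size
    return {
        # memory the merged pages would need without KSM
        "without_ksm_gib": (backing_bytes + saved_bytes) / GIB,
        # memory they actually occupy with KSM
        "with_ksm_gib": backing_bytes / GIB,
        # the difference, i.e. RAM freed up by merging
        "saved_gib": saved_bytes / GIB,
        # references saved per shared page; higher means better sharing
        "sharing_ratio": pages_sharing / pages_shared,
    }


if __name__ == "__main__":
    # Values from my server, as shown above
    for key, value in ksm_savings(256448, 1249897).items():
        print(f"{key}: {value:.2f}")
```

The kernel documentation notes that a high ratio of pages_sharing to pages_shared indicates good sharing, which is why the sketch reports that ratio as well.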

The pages_to_scan and sleep_millisecs values explain why it takes a few hours after starting all VMs at once after a reboot before one sees a significant reduction in memory use: Around 16 GB of RAM are used on my server, i.e. around 4 million pages (a very rough approximation, agreed). With 100 pages scanned every 200 milliseconds, i.e. a scan rate of 500 pages a second, it takes around 2.3 hours to go through memory once. This is confirmed by the full_scans value, which increased by 3 after around 7 hours. It all adds up nicely.
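
For anyone who wants to plug in their own numbers, here is the same back-of-the-envelope estimate as a small Python sketch. It assumes that all of the memory in use is scanned and that the time spent on the scanning itself is negligible compared to the sleep interval, so the result is a lower bound rather than a prediction. The function name full_scan_hours is illustrative.

```python
def full_scan_hours(memory_bytes, pages_to_scan, sleep_millisecs, page_size=4096):
    """Estimate how long one full KSM scan pass takes, in hours.

    Assumes KSM scans pages_to_scan pages, then sleeps for sleep_millisecs,
    and that the scanning itself takes no time (a simplification).
    """
    total_pages = memory_bytes / page_size
    pages_per_second = pages_to_scan / (sleep_millisecs / 1000)
    return total_pages / pages_per_second / 3600


if __name__ == "__main__":
    # 16 GiB of RAM in use, 100 pages per batch, 200 ms sleep between batches
    hours = full_scan_hours(16 * 2**30, pages_to_scan=100, sleep_millisecs=200)
    print(f"one full scan takes about {hours:.1f} hours")
```

Doubling pages_to_scan or halving sleep_millisecs halves the estimate, at the cost of more CPU time spent on scanning.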