Re: Vmware ESXi (3.5) and RHEL (5.1) : Timekeeping Woes

Dirk Heindel Dirk.Heindel at
Mon Sep 28 04:55:48 CDT 2009

Sorry for the late response....

About memory on ESX servers...

VMkernel swapping occurs only if there is no other way to obtain physical memory. The other way is the vmmemctl driver, also known as the ballooning driver, which is delivered with the VMware Tools. If there is not enough physical memory available on the ESX host, the vmkernel tries to reclaim some by requesting memory from the guest OS through the ballooning driver inside the VMs. To put it very simply: the ballooning driver behaves like a normal application that requests memory from the OS. The difference from a normal application is that the ballooning driver requests non-pageable memory from the guest OS. The memory it receives is then "forwarded" to the vmkernel and used for other purposes (e.g. other VMs).
If the guest OS does not have the requested amount of memory available, it will usually swap some pages out to its own page file, just as it would if you started a lot of big applications.
The ballooning driver can request up to 65% of a virtual machine's memory. This is the default. You can increase this to as much as 75% by changing the advanced settings.
This limit exists because an OS cannot swap itself out.
If the ESX host still needs more memory, vmkernel swapping takes place. This swapping is done by the vmkernel, not by the guest OS. The vmkernel picks some pages and writes them to the vmkernel swap file (*.vswp), which was created and allocated when the VM was started.
The size of the swap file is the difference between the memory configured for the VM and the memory reserved for it.
If you configure a memory reservation for a VM, the VM is guaranteed that amount of physical memory. That memory will never be "ballooned" or paged.
Using vmkernel swap is much slower than ballooning (even when ballooning leads to swapping to the guest OS page file or swap partition), because the guest OS knows better which pages it can swap out and which pages it should keep in memory.
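
To make the order of reclamation concrete, here is a minimal Python sketch of the arithmetic described above. It is a simplified model and not the vmkernel's actual algorithm: the function name `plan_reclaim`, its parameters, and the way the reservation caps the reclaimable amount are my own assumptions for illustration. Only the 65% default comes from the text.

```python
def plan_reclaim(configured_mb, reservation_mb, needed_mb,
                 tools_installed=True, balloon_limit_pct=65):
    """Split a reclaim request into (ballooned_mb, swapped_mb).

    Simplified model of the description above, not VMware's real algorithm.
    All names here are hypothetical.
    """
    # Reserved memory is never ballooned or swapped by the vmkernel.
    reclaimable = max(configured_mb - reservation_mb, 0)
    needed = min(needed_mb, reclaimable)

    # Ballooning needs the vmmemctl driver and is capped at a percentage
    # of the VM's configured memory (65% by default).
    balloon_cap = configured_mb * balloon_limit_pct // 100 if tools_installed else 0
    ballooned = min(needed, balloon_cap)

    # Whatever ballooning cannot cover falls through to vmkernel swap.
    swapped = needed - ballooned
    return ballooned, swapped


if __name__ == "__main__":
    print(plan_reclaim(4096, 0, 1024))                         # tools running
    print(plan_reclaim(4096, 0, 1024, tools_installed=False))  # no tools
```

In this model a VM without the VMware Tools sends the whole request to vmkernel swap, which is the slow path the paragraph above warns about.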

If there is no ballooning driver installed in the VM, vmkernel swapping takes place instead of ballooning. So be sure you have the VMware Tools, including vmmemctl, installed in every guest OS.

To see how much ballooning and swapping is going on, you can use the performance charts in the vSphere Client (VI3: VI Client). Select memory and, under "Chart Options", select "Ballooned", "Ballooned Target", "Swapped" and "Swapped Target".
If a VM shows a nonzero "Swapped Target" and a "Ballooned" value of zero, the VMware Tools are probably not installed or not running on that VM.
In normal circumstances a VMkernel-Swap should not happen (except for VMs, where no VMware-Tools are available).
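
The rule of thumb above can be written down as a small check. This is only a sketch: the function `diagnose` and its return strings are hypothetical, and the four arguments are assumed to be values you read off the performance charts yourself (no VMware API is called here).

```python
def diagnose(ballooned, ballooned_target, swapped, swapped_target):
    """Classify a VM's memory state from the four chart counters.

    Counter values are assumed to be in the same unit as shown in the
    charts; only zero versus nonzero matters for this check.
    """
    # Swap is being targeted although nothing has been ballooned:
    # the balloon driver is probably missing or not running.
    if swapped_target > 0 and ballooned == 0:
        return "vmkernel swap without ballooning: check VMware Tools"
    # Ballooning is active but was not enough on its own.
    if swapped > 0 or swapped_target > 0:
        return "ballooning exhausted: vmkernel swap in use"
    if ballooned > 0 or ballooned_target > 0:
        return "ballooning only"
    return "no memory pressure"


if __name__ == "__main__":
    print(diagnose(ballooned=0, ballooned_target=0,
                   swapped=20480, swapped_target=51200))
```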


Dirk Heindel,

From: linux-poweredge-bounces at [mailto:linux-poweredge-bounces at] On Behalf Of Roehrig, Jack (John)
Sent: Wednesday, 16 September 2009 17:05
To: Brian O'Mahony; linux-poweredge at
Subject: RE: Vmware ESXi (3.5) and RHEL (5.1) : Timekeeping Woes

Without knowing how the swapping algorithm works, it's difficult to be certain under what conditions the swap will be used. The conditions look correct though. I have seen many 2, 4, 6, and 8GB RHEL5.1 guests whose swap file is utilized by their hosts. These machines usually use between 100MB and 2GB of actual RAM. Certain conditions seem to exacerbate the problem as well. For example, a cluster of 45 machines all with the same load, purpose, and memory configuration will not experience synchronized global swapping issues. Instead, combinations of guests seem to create more of an issue. However, the problem still occurs even when the sum of total allocated guest memory is less than the available-to-guests physical ESX memory. Perhaps this is caused by unused portions of RAM being paged out to disk while a VM resides on an over-committed ESX host, followed by a migration to an ESX host which is not over-committed, and a later page fault that results in the swap being accessed.

In any case, convincing many developers that despite all of their mlocking, ESX will still swap out their LRU junk to the SAN, is much more difficult than setting resource allocations for each guest. Hard-allocating the total requested guest memory to a guest will cause the swap file to be created with a zero length, and thus I assume disabling its use.
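
The zero-length swap file follows from the sizing rule for the *.vswp file (configured memory minus reservation). A minimal sketch of that arithmetic, with a hypothetical helper name `vswp_size_mb`:

```python
def vswp_size_mb(configured_mb, reservation_mb):
    """Size of the vmkernel swap file: configured memory minus reservation."""
    if not 0 <= reservation_mb <= configured_mb:
        raise ValueError("reservation must be between 0 and configured memory")
    return configured_mb - reservation_mb


if __name__ == "__main__":
    print(vswp_size_mb(4096, 1024))  # partial reservation
    print(vswp_size_mb(4096, 4096))  # full reservation
```

With a full reservation the result is 0, which matches the zero-length swap file described above.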

In any event, it's worth a shot. Try setting a resource allocation for a troublesome VM and see if your time stops skewing. If it doesn't work, experiment with combinations of other methods discussed in this thread. An associated, but much more devastating manifestation of the swapping problem is horrible VM performance. During periods of bad performance, most system reporting data will report normal conditions (memory, CPU, disk I/O, etc), but load average will skyrocket. This problem may be more noticeable on VMs whose swap resides on an oversaturated, slow SAN. If swapping is the issue, resource allocations may improve overall application response time and provide better service to your customers.

-Jack Roehrig
