Do not over-commit memory or CPU resources in your virtualized environment. As Rhapsody is an enterprise-level Java application, performance will be severely impacted if the virtual machine hosting Rhapsody does not have sufficient dedicated resources at all times.
Performance profiling should always be conducted with production-level loads prior to deploying Rhapsody configurations to a production environment in order to allocate adequate resources. This is even more crucial when hosting Rhapsody in a virtualized environment.
If you intend to set up Rhapsody on a virtual machine, you must consider the following configuration recommendations in order to optimize performance:
- Do not assign more virtual CPUs (vCPUs) to one virtual machine than you have physical CPUs available
- Determine the optimum number of vCPUs by testing performance of virtual machines with different numbers of vCPUs using the expected Rhapsody load
- Do not overcommit CPUs to your virtual machine
- Ensure CPU ready time metric is low (below 20 per cent)
- Allocate at least enough RAM to cover the JVM memory allocation
- Do not over-commit JVM memory
- Ensure you have enough memory on the physical machine to accommodate the active memory of all virtual machines
- Host cluster settings
- Prevent memory ballooning and memory resource contention
- The number of Garbage Collection threads should be less than or equal to the number of vCPUs
- General virtualization best practices
The information in this section generally applies to VMware. For specific details on running applications on virtual machines, refer to the virtual machine applications' documentation. General literature on virtual machines is available on the internet.
This section is intended for the VMware and system administrators in charge of managing the Rhapsody server.
Do not assign more virtual CPUs (vCPUs) to one virtual machine than you have physical CPUs available
This best practice is dictated by the physical resource constraints of your host, because each vCPU maps to one core of a physical processor. The maximum number of vCPUs that can be assigned to a virtual machine is therefore equal to the number of cores on the host. Assigning six vCPUs to your virtual machine when only four physical CPUs are available does not increase performance; you are still limited to the performance that the four physical CPUs can provide.
If you have multiple virtual machines on one host, the total number of vCPUs allocated across all virtual machines can exceed the number of physical CPUs, because the host schedules virtual machine processes on any idle physical CPU. Allocating more vCPUs in this case reduces physical CPU idle time. However, allocating significantly more vCPUs than physical CPUs (roughly more than four or five vCPUs per physical CPU) can load the physical CPUs to the point that they all run at a high utilization (90-100%). This reduces performance across all virtual machines, including the one running Rhapsody.
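The two limits described above can be sketched as a small check. This is only an illustration: the function name, the virtual machine names, and the default ceiling of four vCPUs per physical core (taken from the rough figure quoted above) are assumptions, not part of any VMware or Rhapsody tooling.

```python
def check_vcpu_allocation(physical_cores, vcpus_per_vm, max_ratio=4.0):
    """Return a list of warnings for a host's vCPU layout.

    physical_cores: number of physical cores on the host.
    vcpus_per_vm:   mapping of VM name -> vCPUs assigned (hypothetical names).
    max_ratio:      assumed ceiling for total vCPUs per physical core.
    """
    warnings = []
    for name, vcpus in vcpus_per_vm.items():
        # A single VM cannot benefit from more vCPUs than the host has cores.
        if vcpus > physical_cores:
            warnings.append(
                f"{name}: {vcpus} vCPUs exceeds {physical_cores} physical cores"
            )
    # Across all VMs, some over-allocation is fine; too much saturates the host.
    ratio = sum(vcpus_per_vm.values()) / physical_cores
    if ratio > max_ratio:
        warnings.append(
            f"host: {ratio:.1f} vCPUs per physical core exceeds {max_ratio}"
        )
    return warnings


# Hypothetical host with four cores and two virtual machines.
print(check_vcpu_allocation(4, {"rhapsody": 6, "db": 2}))
```

The appropriate ratio depends on the workload, so treat the default as a starting point for testing rather than a fixed rule.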
Determine the optimum number of vCPUs by testing performance of virtual machines with different numbers of vCPUs using the expected Rhapsody load
The optimal vCPU configuration for your virtual machine depends on the expected Rhapsody load. We recommend allocating at least two vCPUs, but for larger Rhapsody loads, more vCPUs may give better performance. In the past, problems have been identified that were resolved by adding processors. However, allocating more vCPUs to a Rhapsody configuration with a low load is not guaranteed to increase performance and may actually reduce it, as described in the following best practice. If the CPU usage of your virtual machine is consistently at or above about 80%, the load is too high for the current CPU configuration (it would struggle to handle bursts of messages), and you should assign additional CPUs.
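As a rough illustration of the 80% guideline, the sketch below flags a virtual machine whose CPU usage samples are mostly at or above the threshold. The function name and the interpretation of "consistently" as 90% of samples are assumptions made for this example.

```python
def needs_more_vcpus(cpu_samples, threshold=80.0, fraction=0.9):
    """Return True if at least `fraction` of the CPU usage samples (in
    percent) are at or above `threshold`.

    The 0.9 default is an assumed definition of "consistently".
    """
    if not cpu_samples:
        return False
    high = sum(1 for sample in cpu_samples if sample >= threshold)
    return high / len(cpu_samples) >= fraction


# Hypothetical samples collected during a production-level load test.
samples = [85, 91, 88, 93, 87, 90, 84, 95, 89, 86]
print(needs_more_vcpus(samples))
```

Samples should come from a test with the expected Rhapsody load; an idle system says little about how bursts of messages will be handled.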
Do not overcommit CPUs to your virtual machine
Assigning more CPUs than are required to your virtual machine gives you no added performance benefits. Running a small Rhapsody load that could be handled easily by two vCPUs on a virtual machine with four vCPUs means that the virtual machine is asking the physical device for four physical CPU cores. The virtual machine will have to wait longer for four physical CPU cores to become available than for two to become available. This increases the wait time for each multi-threaded action the virtual machine tries to do and so overall, the performance will drop. If the exact workload is not known, start with a smaller number of vCPUs and increase later as necessary.
Ensure CPU ready time metric is low (below 20 per cent)
CPU ready time is the time a virtual machine spends ready to run but waiting because no physical CPU is available; in other words, the time spent waiting for an available core to perform the work. A high CPU Ready metric means your virtual machine is struggling to obtain physical CPU resources. There are several possible causes, for example over-committed physical CPUs. To mitigate this, either reduce the CPU resources allocated to the virtual machine or add hardware resources to the cluster.
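vSphere performance charts report CPU ready as a summation in milliseconds per sampling interval, so it usually has to be converted to a percentage before it can be compared with the 20 per cent limit above. The sketch below assumes the commonly used conversion (ready time divided by the interval length) and the 20-second interval of real-time charts; the function name and the per-vCPU averaging are assumptions for illustration.

```python
def cpu_ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU ready summation (milliseconds) to a percentage.

    interval_s: chart sampling interval in seconds (20 s is assumed here,
                as used by real-time charts).
    vcpus:      divide by the vCPU count to get a per-vCPU average when the
                summation covers the whole virtual machine.
    """
    return ready_ms * 100 / (interval_s * 1000 * vcpus)


ready = cpu_ready_percent(1000)   # 1000 ms ready in a 20 s interval
print(ready, ready < 20)          # prints: 5.0 True
```

Check which interval your chart or monitoring tool uses before applying the formula, since historical rollups use longer intervals.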
Allocate at least enough RAM to cover the JVM memory allocation
The JVM has a memory heap that is accessed frequently and so should be resident in physical memory. If the virtual machine does not have enough memory to cover the heap space, frequent page swapping results and performance decreases. To be confident that you have enough RAM, allocate only 50% of the virtual machine’s memory to the Java heap. Total JVM memory allocation can be significantly larger than the heap, especially on 64-bit operating systems and when hyperthreading is a feature of the physical CPUs. With hyperthreading, allocating 50% of the virtual machine’s RAM to the Java heap can equate to the JVM taking 70% of the virtual machine’s RAM. Refer to Allocating Memory to Rhapsody for details.
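The 50% guideline can be turned into a quick calculation. The sketch below is illustrative only: the function name is hypothetical, and -Xmx is the standard JVM maximum heap option; how the heap size is actually set for Rhapsody is covered in Allocating Memory to Rhapsody.

```python
def recommended_max_heap_mb(vm_ram_mb, heap_fraction=0.5):
    """Return the maximum Java heap size in MB for a virtual machine with
    `vm_ram_mb` of RAM, using the 50% guideline by default."""
    if vm_ram_mb <= 0:
        raise ValueError("vm_ram_mb must be positive")
    return int(vm_ram_mb * heap_fraction)


heap_mb = recommended_max_heap_mb(16384)   # 16 GB virtual machine
print(f"-Xmx{heap_mb}m")                   # prints: -Xmx8192m
```

The remaining memory has to cover the non-heap JVM allocation as well as the operating system and any other running applications.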
Do not over-commit JVM memory
JVM memory is an active space that is constantly accessed and garbage collected, so the memory must be available at all times. Assigning more memory to the JVM than is required, specifically more than is allocated to the virtual machine or physically present on the host, may lead to memory ballooning or swapping, which reduces performance.
Ensure you have enough memory on the physical machine to accommodate the active memory of all virtual machines
If the physical memory is too small to assign specific memory space to the active memory of all virtual machines, then the virtual machines experience ballooning or swapping which decreases performance on these machines. If the physical machine has less than 6% free memory, this indicates it is struggling to meet all the memory requirements.
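A simplified version of the 6% check is sketched below. It assumes free memory can be approximated as total physical memory minus the active memory of all virtual machines, which ignores hypervisor overhead; the function name and the figures are hypothetical.

```python
def host_memory_pressure(total_mb, active_mb_per_vm, min_free_fraction=0.06):
    """Return True if the host's free memory falls below `min_free_fraction`
    of its physical memory.

    Free memory is approximated as total minus the active memory of every
    virtual machine (hypervisor overhead is ignored in this sketch).
    """
    free_mb = total_mb - sum(active_mb_per_vm)
    return free_mb / total_mb < min_free_fraction


# Hypothetical 64 GB host running three virtual machines.
print(host_memory_pressure(65536, [16384, 16384, 30000]))
```

In practice, take the free memory figure directly from your monitoring tool where it is available, as it accounts for overhead that this approximation leaves out.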
Host cluster settings
It is a VMware best practice to load balance the virtual machines in your vCenter environment by grouping your hosts into a cluster and enabling the Distributed Resource Scheduler (DRS). DRS both determines the best host for a virtual machine to run on at boot time and uses vMotion to move virtual machines between hosts to allow additional servers to run. We recommend setting DRS to automatic mode and the migration threshold to the second level (one level removed from the most conservative setting). This initial setting suits most instances, but it can be changed based on the load and performance in your environment.
vMotion can be executed manually on a running or powered-off Rhapsody virtual machine when the VMware administrator deems it necessary. Avoid doing so during periods of long-running garbage collections.
If you are installing Rhapsody in an existing VMware environment, follow the guidance of your local VMware administrator on the best settings for your virtualization usage. Optionally, for additional control over migrating virtual machines, you can configure resource pools. Refer to the VMware documentation for all of the available options and settings for resource pools.
Prevent memory ballooning and memory resource contention
Memory ballooning allows the virtualization host to redistribute memory between active virtualized environments sharing the same physical memory pool. If you have memory ballooning enabled, then it is possible that another virtual machine could request more memory which would result in memory being re-allocated away from the environment hosting Rhapsody. This ultimately results from over-committing memory to your virtual machines.
Because of the memory management and garbage collection techniques that Java employs, Java application performance is sensitive to changes in available memory. Ballooning can take memory that has not yet been committed to the Java heap away from the system hosting Rhapsody and reallocate it to other virtual machines.
The operating system may then page memory out to disk/swap space, usually according to an LRU (least recently used) algorithm, because there is insufficient physical memory for its needs. If ballooning occurs, this would likely trigger an exhaustion of physical memory on the system hosting Rhapsody, which would then lead to memory in the old/tenured space being swapped out to disk. A full garbage collection would then need to page this back into memory in order to perform garbage collection. The resulting extra I/O makes a full garbage collection slower by an order of magnitude. Since locking techniques in the garbage collector are optimized for physical memory, this can cause a long garbage collection pause in application processing in the Rhapsody engine.
Simply setting a memory reservation on the virtual machine equal to the entire allocated memory of the virtual machine can mitigate this issue because ballooning will never be used if all of the virtual machine’s memory is reserved. The benefit of this method is that all the memory the virtual machine needs is reserved and therefore the hypervisor does not consider that virtual machine for ballooning. This method continues to allow the hypervisor to use memory ballooning for the other running virtual machines if needed.
The number of Garbage Collection threads should be less than or equal to the number of vCPUs
If you have more garbage collection threads than vCPUs, the operating system must context switch between the garbage collection threads, and some threads are held up waiting for others to complete. Since context switching incurs a time penalty, garbage collection runs take longer to complete. To set the number of garbage collection threads on the JVM, use -XX:ParallelGCThreads=<n>. On a virtual machine with only one JVM, this parameter is generally configured automatically, but it can be specified explicitly to fine-tune your environment. Where there are multiple JVMs on the virtual machine, you must adjust this parameter on all of your JVMs to optimize CPU usage for each one.
If a full garbage collection takes more than five minutes, Rhapsody assumes that the garbage collection has hung and restarts the Rhapsody engine. While a full garbage collection is underway, all other threads are paused, in effect pausing Rhapsody and stopping any messages from being processed. The duration of garbage collection can be reduced significantly by increasing the number of vCPUs assigned to the virtual machine and adjusting the JVM -XX:ParallelGCThreads=<n> parameter accordingly. This is especially useful in high-throughput environments and where the JVM is allocated larger amounts of RAM.
Remember that CPU processing in a virtualized environment only occurs when all of the cores (vCPUs) allocated to the virtual machine are available simultaneously, so the more vCPUs allocated to the virtual machine, the longer it may have to wait to run on the physical hardware. When hyper-threading is available on the physical CPU, the number of cores does not equal the number of hardware threads; when allocating computing resources to the JVM, however, treat the number of available threads as equal to the number of vCPUs in the virtual machine. You should conduct performance testing to determine the right settings for your environment. The wait described above can be further mitigated by configuring CPU reservations and shares.
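The rule that garbage collection threads should not exceed the vCPU count can be sketched as a helper that builds the JVM flag. The function name is hypothetical, and dividing the vCPUs evenly between multiple JVMs is an assumption made for this example rather than a Rhapsody recommendation.

```python
def gc_threads_flag(vcpus, jvms=1):
    """Build a -XX:ParallelGCThreads flag that never exceeds the vCPU count.

    With several JVMs on one virtual machine, the vCPUs are split evenly
    between them (an assumed policy for this sketch), with a minimum of
    one thread per JVM.
    """
    if vcpus < 1 or jvms < 1:
        raise ValueError("vcpus and jvms must be at least 1")
    threads = max(1, vcpus // jvms)
    return f"-XX:ParallelGCThreads={threads}"


print(gc_threads_flag(8))           # prints: -XX:ParallelGCThreads=8
print(gc_threads_flag(8, jvms=3))   # prints: -XX:ParallelGCThreads=2
```

Treat the result as an upper bound to start from and confirm the final value through performance testing with the expected load.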
General virtualization best practices
- Only run on versions of VMware products that are actively supported.
- Install VMware Tools.
- Set all of your ESXi hosts to use the same internal NTP server for time synchronization.
- Guest virtual machines should be configured to use the same NTP source, or alternatively have VMware Tools synchronize the time with the host.
- Configure DNS servers for short and fully qualified domain names for your ESXi hosts.
- Configure DNS servers for your Rhapsody virtual machines for both short and fully qualified domain names.
- When allocating virtual machine memory resources, allocate enough for the operating system and all running applications, including the Rhapsody JVM.
- When using virtualized disks on supported physical architecture for the Rhapsody datastore, provision them as thick provision with the eager zero option for maximum performance.
- Limit the use of snapshots in production as they can negatively impact disk I/O performance.
- Utilize a system monitoring product to proactively monitor the health of your hosts, storage, virtual machines, and applications.
- Keep all drivers, firmware, and BIOS up to date on your ESXi hosts.
- Test and apply all VMware software patches in a timely manner.
- Unless you plan to move a virtual machine to older versions of ESXi hosts, always use the highest version of virtual hardware compatible with all of your hosts, and keep VMware Tools installed at the version applicable to your firmware level.
Refer to Best practices for running Java in a virtual machine (1008480) for additional information.