Persistent memory supports a variety of usage and operating modes. This post focuses on how using persistent memory, specifically Intel Optane DC persistent memory, as a block device affects the performance of Hazelcast’s persistence engine, Hot Restart Store.
Hazelcast In-Memory Data Grid (IMDG) keeps its data in RAM and runs critical data processing workloads with microsecond latencies. Because Hazelcast is a clustering technology, the amount of data held in RAM can get quite large, ranging from tens of gigabytes up to terabytes. The data itself is pulled from systems of record (databases) into the IMDG, which performs high-speed data processing for business-critical applications. If a restart is required and the data has to be recomputed, that could take days; even when the data can be reloaded from a data store, loading hundreds of gigabytes might take hours.
Because this downtime is typically unacceptable for high-performance workloads, Hazelcast designed Hot Restart Store, a log-structured storage engine that minimizes the time for Hazelcast IMDG to restart.
Hazelcast compared the restart times of a single-node cluster using a modern SSD versus Intel Optane DC persistent memory. We used a single-node cluster so that the measurements would not include any network coordination between nodes. Hot Restart can be configured using the “parallelism” configuration parameter, which determines the number of threads concurrently writing to or reading from persistent storage.
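To make the idea behind the parallelism setting concrete, here is a minimal Python sketch of a fixed number of worker threads reading files from storage concurrently. It is not Hazelcast's implementation: the `segment-*.bin` files, `read_file`, and `parallel_load` are hypothetical names made up for illustration, and a real storage engine does far more than read raw bytes.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Read one file fully and return the number of bytes read."""
    with open(path, "rb") as f:
        return len(f.read())


def parallel_load(paths, parallelism):
    """Read all files with `parallelism` worker threads; return total bytes."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return sum(pool.map(read_file, paths))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins for on-disk data files.
        paths = []
        for i in range(16):
            path = os.path.join(tmp, f"segment-{i}.bin")
            with open(path, "wb") as f:
                f.write(os.urandom(64 * 1024))
            paths.append(path)

        for parallelism in (1, 4, 8):
            total = parallel_load(paths, parallelism)
            print(f"parallelism={parallelism}: loaded {total} bytes")
```

Whether more threads actually help depends on the device: the point of tuning the value is to find where the storage, rather than the thread count, becomes the bottleneck.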
SSD test system:
- 2x Intel® Xeon® E5-2687W v3 @ 3.10GHz
- HP Fusion ioMemory™ 1.0TB HH/HL Light Endurance (LE) PCIe Workload Accelerator
Persistent memory test system:
- 2x 2nd Generation Intel® Xeon® Scalable processors (56 cores total)
- 6x Intel® Optane™ DC Persistent Memory (128GB) connected to a single CPU in AppDirect Interleaved mode
We start by inserting 5.5 million entries with values of 16k bytes each, resulting in a total data size of about 87 GB for this single node. This is the per-node share of a 12-node cluster holding about 1 TB in total. We then take measurements by shutting down the instance and observing the time it takes to reload this data into memory and become operational, changing the parallelism property each time. We repeated the measurement four times for each parallelism value and report the average restart time for each. The results for both SSD and persistent memory are shown in the chart below.
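The measurement procedure can be sketched as a small harness. This is an illustrative outline only, not the tooling used for the test: `restart` is a hypothetical callable that would shut the node down, start it again with the given parallelism, and return once the node is operational. Here it is replaced by a stand-in that just sleeps.

```python
import time
from statistics import mean


def average_restart_times(restart, parallelism_values, repeats=4):
    """Return {parallelism: mean restart time in seconds} over `repeats` runs.

    `restart` is a hypothetical callable that performs one full restart with
    the given parallelism and returns when the node is operational again.
    """
    results = {}
    for parallelism in parallelism_values:
        samples = []
        for _ in range(repeats):
            started = time.perf_counter()
            restart(parallelism)
            samples.append(time.perf_counter() - started)
        results[parallelism] = mean(samples)
    return results


if __name__ == "__main__":
    def fake_restart(parallelism):
        # Stand-in for a real restart; it only sleeps for a short time.
        time.sleep(0.01 / parallelism)

    averages = average_restart_times(fake_restart, [1, 2, 4])
    for parallelism, seconds in averages.items():
        print(f"parallelism={parallelism}: {seconds:.4f}s on average")
```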
From our measurements, the best restart time for the SSD was 47 seconds, with parallelism set to 8 read threads, while Intel Optane DC persistent memory achieved a restart time of 19 seconds with a parallelism of 12, which is about 2.5 times faster.
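The headline figures can be rechecked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this post (87 GB per node, 12 nodes, 47 and 19 seconds). The effective load rate is a derived figure, not a measured one, and it assumes the whole 87 GB is read during the restart.

```python
NODE_DATA_GB = 87     # per-node data size quoted above
NODES = 12            # cluster size used for the 1 TB figure
SSD_SECONDS = 47      # best SSD restart time
PMEM_SECONDS = 19     # best persistent memory restart time


def speedup(baseline_seconds, improved_seconds):
    """How many times faster the improved run is than the baseline."""
    return baseline_seconds / improved_seconds


def load_rate_gb_per_s(data_gb, seconds):
    """Effective load rate, assuming all of `data_gb` is read in `seconds`."""
    return data_gb / seconds


if __name__ == "__main__":
    print(f"cluster size: {NODE_DATA_GB * NODES} GB")  # 1044 GB, roughly 1 TB
    print(f"speedup: {speedup(SSD_SECONDS, PMEM_SECONDS):.2f}x")
    print(f"SSD load rate: {load_rate_gb_per_s(NODE_DATA_GB, SSD_SECONDS):.2f} GB/s")
    print(f"PMEM load rate: {load_rate_gb_per_s(NODE_DATA_GB, PMEM_SECONDS):.2f} GB/s")
```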
Faster restarts reduce downtime, whether from a cluster-wide crash or a planned shutdown. Intel Optane DC persistent memory clearly enables Hazelcast Hot Restart to achieve dramatically better restart times, enabling substantial business benefit for time-critical applications.
Implications for Operations Departments
Because Hazelcast runs in RAM all the time for maximum performance, on restart it will not become available until all data is loaded. Ops Departments have Time To Recovery (“TTR”) SLAs. With Intel Optane DC Persistent Memory, restart times are significantly faster. Restart times for the cluster as a whole can now be brought down to meet even the most stringent TTR SLAs.
Hot Restart using Intel Optane DC persistent memory is now supported on Hazelcast clusters.
Where to Next
Using this technology as block storage may not unlock the full potential of persistent memory. Another way to use Intel Optane DC persistent memory is Direct Access (DAX) mode. DAX mode is available on both Linux and Windows and lets you benefit fully from the hardware's performance by adopting a different programming model (as described in the Storage Networking Industry Association’s (SNIA) Non-Volatile Memory (NVM) Programming Model specification). In this mode, I/O bypasses the storage stack of the kernel, and therefore many Device Mapper drivers are unavailable.
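To give a feel for that programming model, the following Python sketch stores and reads data through a memory-mapped file instead of `read()`/`write()` calls. It is only an illustration under stated assumptions: the function name is hypothetical, the demo runs against an ordinary temporary file (where the mapping still goes through the page cache), and the direct load/store path to the media would only apply if the file lived on a file system mounted with DAX enabled.

```python
import mmap
import os
import tempfile


def write_and_read_mapped(path, payload, size=4096):
    """Store `payload` in a memory-mapped file and read it back."""
    if len(payload) > size:
        raise ValueError("payload does not fit in the mapped region")
    with open(path, "w+b") as f:
        f.truncate(size)  # the file must have a size before it can be mapped
        with mmap.mmap(f.fileno(), size) as region:
            region[:len(payload)] = payload  # a memory store, not a write() call
            region.flush()                   # ask the OS to persist the mapping
            return bytes(region[:len(payload)])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # On a real system this path would sit on a DAX-mounted file system.
        path = os.path.join(tmp, "region.bin")
        print(write_and_read_mapped(path, b"hello persistent memory"))
```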
We have also done comprehensive testing in this mode and found very high throughput. Consequently, we are in the process of prototyping a new storage layer in Hazelcast that uses DAX mode. Keep an eye out for more updates.