Writing data to storage can be tricky, and keeping the system both fast and reliable takes extra effort. A cache is an efficient way to speed up storage, but what happens to the cached data if power is lost? We tested three ways of handling the write cache to find out whether each is as reliable and as fast as it needs to be.
High-end storage systems use proprietary technologies, based on non-volatile memory or UPS devices, to protect the cache in case of power loss. Without such protection, data can be corrupted when cached segments held in RAM are lost, or through the RAID Write Hole. A key part of the solution is the non-volatile memory modules installed in storage controllers: essentially supercapacitors plus flash, to which the cache contents are written when power is lost.
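To make the RAID Write Hole concrete, here is a minimal sketch in Python. It assumes a toy RAID 5 stripe of three data blocks plus XOR parity; the block contents and the helper name `xor_blocks` are purely illustrative and have nothing to do with any real RAID implementation.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks byte by byte (how RAID 5 parity is computed)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# A consistent stripe: three data blocks plus their parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Power is lost after the new d0 reaches its drive but before the
# matching parity update does, so the stripe is now inconsistent.
d0 = b"ZZZZ"            # new data made it to the drive
stale_parity = parity   # the parity update was lost

# Later the drive holding d1 fails and is rebuilt from the survivors.
rebuilt_d1 = xor_blocks(d0, d2, stale_parity)
print(rebuilt_d1 == b"BBBB")  # False: d1 is silently corrupted
```

The block that was never touched (d1) is the one that ends up wrong, which is why the problem can go unnoticed until a drive fails.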
Non-volatile memory and UPS devices work well for classic storage systems but are a poor fit for SDS: developers of software-defined storage cannot rely on proprietary components, because their software runs on standard x86 hardware. As a result, data loss on power failure has to be prevented in software, which significantly reduces overall system performance, and sometimes by turning off write-back caching altogether (Write Through, or simply WT).
Non-volatile memory is becoming more affordable, which is good. But data volumes and the caches that serve them keep growing, and this demands additional capacity, which is worse. And once classic RAM exceeds several TB, it requires costly hardware, which is worst of all.
This is where Intel Optane persistent memory comes into play. Positioned between high-performance NVMe drives and RAM, this technology allows a major increase in memory capacity and makes that memory non-volatile at the same time.
What is Persistent Memory
Persistent memory (PMEM) is solid-state, high-performance, byte-addressable memory. It offers speed and latency close to those of DRAM together with the non-volatility of NAND flash. Intel Optane persistent memory, a DIMM form factor module based on 3D XPoint memory, is one of the best-known examples of persistent memory technology.
PMEM takes less time to access than flash, which makes access to large datasets very fast. It is cheaper than DRAM and can still be used as cache. As for the power loss challenge, data persists in PMEM even after a power interruption.
About Intel Optane Persistent Memory
The key to Intel Optane memory and storage media is a memory architecture that stacks memory grids in a three-dimensional matrix to improve density, increase performance, and provide persistence. Memory cells can be accessed and updated individually, so there is no need to collect excess data. As a result, Intel Optane memory and storage media offer speeds close to DRAM with the persistence of traditional SSDs.
Simplifying the data path significantly lowers latency.
Now for a bit of theory: Intel Optane persistent memory supports two modes, Memory Mode and App Direct Mode.
In Memory Mode, Intel Optane persistent memory serves as inexpensive memory capacity that can improve the performance of selected applications, which do not need to be rewritten. The DRAM acts as a cache in front of it: when data is requested, the memory controller checks the DRAM cache first. If the data is there, the response latency is the same as for DRAM. If it is not, the data is read from Intel Optane persistent memory with slightly higher latency.
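The lookup order in Memory Mode can be illustrated with a toy model. This is only a sketch: the class name, the LRU eviction policy, and the cache size are assumptions made for illustration and do not describe how the real memory controller is built.

```python
from collections import OrderedDict

class MemoryModeModel:
    """Toy model: a small DRAM cache in front of a large PMem capacity tier."""

    def __init__(self, dram_lines: int):
        self.dram = OrderedDict()   # address -> value, kept in LRU order
        self.dram_lines = dram_lines
        self.pmem = {}              # capacity tier that holds all data

    def store(self, addr: int, value: int) -> None:
        self.pmem[addr] = value     # simplification: write straight through
        self._fill(addr, value)

    def load(self, addr: int):
        """Return (value, tier), where tier names the memory that served it."""
        if addr in self.dram:                 # DRAM cache hit
            self.dram.move_to_end(addr)
            return self.dram[addr], "dram"
        value = self.pmem[addr]               # miss: slower read from PMem
        self._fill(addr, value)
        return value, "pmem"

    def _fill(self, addr: int, value: int) -> None:
        self.dram[addr] = value
        self.dram.move_to_end(addr)
        if len(self.dram) > self.dram_lines:
            self.dram.popitem(last=False)     # evict the least recently used line

mem = MemoryModeModel(dram_lines=2)
for address in range(3):
    mem.store(address, address * 10)

print(mem.load(2))  # (20, 'dram'): recently used, still cached
print(mem.load(0))  # (0, 'pmem'): evicted earlier, read from the capacity tier
print(mem.load(0))  # (0, 'dram'): now cached again
```

The application only ever calls `load` and `store`; which tier answers is invisible to it, which is why existing software runs unchanged in this mode.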
In App Direct Mode, the operating system and applications are aware that the platform has two types of memory with direct load/store access, and they can choose which reads and writes go to DRAM and which to Intel Optane persistent memory. Operations that need the lowest latency and do not need persistent storage can run in DRAM. Data that must be persistent, or structures that are very large, can be placed in Intel Optane persistent memory.
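As a rough illustration of the App Direct idea, the sketch below keeps scratch data in an ordinary in-memory buffer and places data that must persist in a memory-mapped file. On a real system the file would live on a DAX-mounted file system backed by persistent memory; here a temporary directory is used so the example runs anywhere. The function names and the record size are hypothetical.

```python
import mmap
import os
import tempfile

def persist_record(path: str, payload: bytes, size: int = 4096) -> None:
    """Store payload through a memory mapping instead of write() calls."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)
        with mmap.mmap(fd, size) as mm:
            mm[:len(payload)] = payload   # plain memory store into the mapping
            mm.flush()                    # ask the OS to make the store durable
    finally:
        os.close(fd)

def read_record(path: str, length: int) -> bytes:
    with open(path, "rb") as f:
        return f.read(length)

# Hypothetical location: with real PMem this would be a file on a DAX mount.
demo_path = os.path.join(tempfile.mkdtemp(), "record.bin")

scratch = bytearray(b"temporary working data")        # lives in DRAM only
persist_record(demo_path, b"must survive power loss")  # routed to the persistent tier
print(read_record(demo_path, 23))
```

The split is made by the application itself: it decides per data structure whether low latency or persistence matters more.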
At RAIDIX we were amazed by the App Direct features, so we decided to support this mode in our SDS and see what happens.
Intel Optane Persistent Memory and RAIDIX SDS
We consider non-volatility the key feature of Intel Optane persistent memory, so we use it in App Direct mode.
We have:
- Developed modules that let Intel Optane persistent memory operate as a caching device
- Developed modules that provide functions similar to those of the persistent memory library that runs in user space. We had to do this because RAIDIX performs caching in the Linux kernel
- Changed the logic of the caching module so that it saves operations to non-volatile memory and recovers the data from it after a power loss
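The last point, saving operations to non-volatile memory and replaying them after a power loss, can be sketched as a small journal. This is not the RAIDIX implementation: the record layout (address, length, CRC32), the function names, and the use of a `bytearray` as a stand-in for the NV region are all assumptions made for illustration.

```python
import struct
import zlib

# Hypothetical record header: block address, payload length, CRC32 of payload.
HEADER = struct.Struct("<QII")

def append_op(log: bytearray, addr: int, data: bytes) -> None:
    """Append one write operation to the journal (stand-in for NV memory)."""
    log.extend(HEADER.pack(addr, len(data), zlib.crc32(data)))
    log.extend(data)

def recover(log: bytes) -> dict:
    """Replay intact records in order; stop at the first torn or corrupt one."""
    blocks, pos = {}, 0
    while pos + HEADER.size <= len(log):
        addr, length, crc = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        data = bytes(log[start:start + length])
        if len(data) < length or zlib.crc32(data) != crc:
            break                     # record was cut off by the power loss
        blocks[addr] = data           # later records overwrite earlier ones
        pos = start + length
    return blocks

journal = bytearray()
append_op(journal, 0, b"first")
append_op(journal, 8, b"second")
append_op(journal, 0, b"third")

torn = journal[:-2]                   # simulate power loss in mid-record
print(recover(journal))               # {0: b'third', 8: b'second'}
print(recover(torn))                  # {0: b'first', 8: b'second'}
```

The checksum lets recovery tell a complete record from one that was interrupted, so a half-written operation is discarded rather than applied.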
Here is our technology preview of Intel Optane persistent memory in use.
Let's return to the beginning of this post: remember the three ways of handling the write cache? Here they are:
- RAID with RAM cache: very fast, but unreliable in case of power loss
- RAID with NV cache: very reliable, but a bit slower than RAM
- RAID with Write Through cache: reliable too, but MUCH slower than the other options
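The reliability side of these three options can be modeled in a few lines. The sketch below is a deliberately simplified toy: the class name, the policy labels `ram`, `nv`, and `wt`, and the single-block workload are assumptions for illustration, and it says nothing about speed.

```python
class CachedVolume:
    """Toy volume with one of three write policies: 'ram', 'nv' or 'wt'."""

    def __init__(self, policy: str):
        if policy not in ("ram", "nv", "wt"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.drives = {}   # data that has reached the RAID drives
        self.cache = {}    # blocks acknowledged but not yet on the drives

    def write(self, addr: int, data: bytes) -> None:
        if self.policy == "wt":
            self.drives[addr] = data   # acknowledged only after the drive write
        else:
            self.cache[addr] = data    # acknowledged from cache (write-back)

    def power_loss(self) -> None:
        if self.policy == "ram":
            self.cache.clear()         # volatile cache contents are gone
        # 'nv' keeps its dirty blocks; 'wt' never had any

    def recover_and_flush(self) -> None:
        self.drives.update(self.cache)
        self.cache.clear()

def survives(policy: str) -> bool:
    """Does an acknowledged write survive a power loss under this policy?"""
    volume = CachedVolume(policy)
    volume.write(7, b"acknowledged")
    volume.power_loss()
    volume.recover_and_flush()
    return volume.drives.get(7) == b"acknowledged"

results = {policy: survives(policy) for policy in ("ram", "nv", "wt")}
print(results)  # {'ram': False, 'nv': True, 'wt': True}
```

Write Through stays safe by giving up the write cache entirely, while the NV cache keeps write-back behavior and still recovers the dirty blocks.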
We decided to measure the actual speed of each option. The server systems used are described in the solution brief we published with Intel (https://builders.intel.com/docs/datacenterbuilders/high-performance-raid-software-for-fast-nvme-storage-systems-with-intel-optane-dc-storage.pdf).
We ran three sets of tests:
- A performance test of the non-volatile memory, to decide whether it can be used.
- A performance test of our cache with Intel Optane persistent memory DIMMs, compared with the other options.
- A consistency and non-volatility test.
First, configure the memory:
[root@raidix ~]# ipmctl create -goal PersistentMemoryType=AppDirect
Reboot the server, then check how the system handles the memory in this mode. Create the namespaces (two in our case, as we have two NUMA nodes).
"size":"756.00 GiB (811.75 GB)",
[root@raidix ~]# ndctl create-namespace -f --type=pmem --map=mem -m fsdax
"size":"756.00 GiB (811.75 GB)",
For the tests we chose XFS, a file system that supports DAX.
[root@raidix ~]# mount | grep dax
/dev/pmem1 on /data2 type xfs (rw,relatime,attr2,dax,inode64,noquota)
/dev/pmem0 on /data type xfs (rw,relatime,attr2,dax,inode64,noquota)
And here are the test results:
Now let's see how RAIDIX 5.X works with a non-volatile cache based on Intel Optane persistent memory.
First, remove the namespaces created in the previous step, as RAIDIX does all the necessary work itself during module initialization.
On the RAID management page, click the "Initialize NVDIMM" button to confirm: the necessary namespace will be created, and the metadata will be prepared and written during initialization. When creating storage resources, we specify that NVDIMM should be used to protect against power loss. Important note: RAIDIX allows this function to be enabled for each RAID separately.
Keep in mind that purely random-write workloads do not occur in the real world; the purpose of the test is to show peak performance in the worst case. In practice, the cache can be made larger at the same cost (Intel Optane persistent memory is cheaper than RAM), and total performance will rise.
Let's run performance tests with 4k random writes on RAID 5 and compare RAID performance with RAM cache, with no cache (Write Through mode), and with Intel Optane persistent memory as NV cache:
RAIDIX with Intel Optane persistent memory delivers 18% lower performance than RAID with a RAM-based cache and 78% higher performance than RAID with WT cache. In short, Intel Optane persistent memory protects the data in case of power loss, while expanding the cache increases overall system performance and lowers latency.
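To see how the two percentages relate, the short calculation below normalizes the RAM-cache result to 1.0. It assumes that "18% less" means the NV result is 0.82 of the RAM result and that "78% better" means the NV result is 1.78 times the WT result; these are relative figures derived from the two percentages only, not measured values.

```python
ram = 1.0                   # RAM-cache result, normalized to 1.0
nv = ram * (1 - 0.18)       # NV cache: 18% lower than the RAM cache
wt = nv / (1 + 0.78)        # NV cache is 78% higher than Write Through

print(f"NV cache: {nv:.2f}, WT: {wt:.2f}, RAM/WT: {ram / wt:.2f}x")
# NV cache: 0.82, WT: 0.46, RAM/WT: 2.17x
```

Under these assumptions, Write Through lands at roughly 46% of the RAM-cache result, so the NV cache recovers most of the gap between the two.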
Now the third test. We emulated a controller power loss by switching the power off during intensive writes. After restoring power, we checked the system logs, data integrity, and checksums. We repeated the test more than 300 times, and every time the data was consistent, with zero losses.
The tests proved to us that, performance-wise, NV cache on Intel Optane persistent memory is a great fit for storage. RAID with NV cache is almost as fast as RAID with RAM cache and almost as reliable as WT, and we are confident that it will keep the data intact during a power outage.