In the era of Big Data, speed is critical for all applications and use cases: building private and public clouds, running databases, VDI, hosting services, websites, and e-commerce. Storage is the underlying layer and one of the most critical components of any business. By using best-of-breed software-defined storage (SDS), enterprises can gain a powerful competitive advantage for their services and business.
One of the capabilities for which StorPool is recognized globally is its ability to deliver unmatched storage performance and reliability. Starting from a small scale and growing into the petabyte range, it delivers astonishing performance, scaling linearly as the storage system and your business needs grow. Blazing-fast storage and a rich feature set make StorPool a preferred choice for companies aiming to achieve great performance and high availability at a competitive price point. Even a small StorPool setup can replace a multi-million-dollar all-flash storage array or SAN.
There are two important metrics businesses should pay attention to when choosing a storage solution: IOPS and latency. The number of IOPS a storage system can sustain is often used as a differentiator. However, low latency is the more important metric, because real production workloads typically run at low queue depth (i.e. they are latency sensitive), while also experiencing occasional bursts in which many operations are submitted at the same time. For production workloads, both latency under light load and burst-handling capability are therefore very important.
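As a rough illustration of why latency dominates at low queue depth, Little's Law ties the three quantities together: sustained IOPS is at most the number of outstanding operations divided by the average latency per operation. The short Python sketch below uses illustrative numbers only; they are not results from the test described later.

# Illustrative only: Little's Law for storage, IOPS <= queue_depth / latency.
def max_iops(queue_depth, latency_seconds):
    """Upper bound on sustained IOPS for a given concurrency and latency."""
    return queue_depth / latency_seconds

# A latency-sensitive workload with a single outstanding operation:
print(max_iops(queue_depth=1, latency_seconds=0.0001))   # 0.1 ms -> 10,000 IOPS
print(max_iops(queue_depth=1, latency_seconds=0.001))    # 1.0 ms ->  1,000 IOPS

# Higher IOPS requires either lower latency or more operations in flight:
print(max_iops(queue_depth=32, latency_seconds=0.0001))  # 0.1 ms -> 320,000 IOPS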
During performance tests StorPool regularly demonstrates its ability to achieve astonishing results. The latest performance tests by the leader in block storage SDS show what NVMe drives can do.
The results from the performance tests show latency as low as 0.06 milliseconds and nearly 1 million IOPS from a small test setup. While the IOPS numbers are impressive for a setup of this size, it is the latency that is mesmerizing. All this from a feature-rich shared storage system that delivers enterprise features such as high availability and live migration.
The aggregated results of all initiators are as follows:
Even more impressive are the graphs showing IOPS vs. latency from an individual initiator:
IOPS vs. Latency: 4KB random read/write 50/50:
IOPS vs. Latency: 4KB random write:
IOPS vs. Latency: 4KB random read:
This SDS-powered shared storage system delivers extremely low latency under heavy workloads.
Really impressive is the fact that as the load skyrockets, the latency of the system stays practically unchanged up to the point where the system becomes saturated. Beyond that point the system delivers more IOPS, but naturally at higher latency. Even then, this small system is able to deliver 350,000 IOPS to a single VM!
The storage system under test consisted of three very lean, standard servers acting as storage nodes.
Specs of the tested storage system:
3 storage servers (nodes), each with:
– 1RU chassis with 4x hot-swap NVMe drive bays
– CPU: 1x Intel® Xeon® W-2123 4-core CPU
– Memory: 16 GB RAM
– Boot drive: Intel® S3520 150GB
– NIC: Mellanox ConnectX-5 dual-port 100G QSFP28, PCIe x16 – used as dual-port 40G
– Pool drives: 4x Intel P4500 1TB (SSDPE2KX010T7)
– Network/Switches: Dual Mellanox SX1012 with QSFP+ DAC cables
The total raw capacity of the system is 12 TB, which translates to 3.6 TB usable: data is stored with three-way replication, and about 10% is reserved for the CoW (copy-on-write) on-disk format and safety checksums.
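The arithmetic behind the raw vs. usable figures can be sketched as follows, using only the drive counts, replication factor and overhead quoted above:

# Sketch of the raw-to-usable capacity calculation for this setup.
nodes = 3
drives_per_node = 4
drive_size_tb = 1            # Intel P4500 1TB pool drives
replication_factor = 3       # three-way replication across nodes
overhead = 0.10              # CoW on-disk format and safety checksums

raw_tb = nodes * drives_per_node * drive_size_tb          # 12 TB raw
usable_tb = raw_tb / replication_factor * (1 - overhead)  # 3.6 TB usable
print(raw_tb, usable_tb)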
The hardware used in this specific performance test is sized for a small lab setup. Production systems usually have more, and much larger, NVMe drives in order to improve density and optimize $/IOPS and $/GB.
To run the storage performance test, StorPool created 3x StorPool volumes (LUNs) as follows:
– volume size: 100 GB (larger than the available RAM, so results are not skewed by caching); total active set for the tests: 300 GB
– total cache across the 3 StorPool servers: 6.3 GB (about 2% of the active set), to minimize cache effects
– each volume with 3 copies on drives in 3 different storage servers
– each volume is attached to a different initiator host
Then StorPool generated load with 3x FIO workloads:
– FIO workloads, initiators and storage servers running on the same physical servers (hyper-converged deployment)
– FIO parameters: ioengine=aio, direct=1, randrepeat=0
– rw parameter = read, write, randread, randwrite, randrw – depending on the test
– Block size = 4k, 1M – depending on the test
– Queue depth = 1, 32, 128 – depending on the test
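For reference, one combination from the matrix above (4 KB random read at queue depth 32) could be reproduced along the lines of the sketch below. The volume device path is a placeholder for wherever the attached StorPool volume appears on the initiator, the runtime settings are assumptions for a bounded run rather than parameters quoted in the test, and "libaio" is presumed to be the Linux AIO engine referred to as "aio" above.

# Sketch: build and run one fio test case from the matrix above.
import subprocess

cmd = [
    "fio",
    "--name=randread-qd32",
    "--filename=/dev/storpool/testvol1",  # placeholder device path
    "--ioengine=libaio",   # Linux AIO, presumed meaning of "aio" above
    "--direct=1",
    "--randrepeat=0",
    "--rw=randread",       # or read / write / randwrite / randrw per test
    "--bs=4k",             # or 1M, depending on the test
    "--iodepth=32",        # or 1 / 128, depending on the test
    "--runtime=60",        # assumed bounded run length
    "--time_based",
    "--group_reporting",
]
subprocess.run(cmd, check=True)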
Some tests were limited by the number of initiators and by the CPU resources allocated to the workloads, initiators and storage software, as this system was originally designed as a storage-only lab but was reused as a converged (storage + compute) system to run these benchmarks.
The same system would be able to deliver higher throughput to a larger number of external compute servers/workloads.
For more information or any questions contact StorPool at firstname.lastname@example.org or visit www.storpool.com.
The best storage solution when building a cloud.