
All Flash, NVMe and the Real Economics of Modern Storage

All flash is now the default for primary storage, and the argument has moved on from whether to which flash, where, and at what real cost. The datasheets quote headline numbers and guaranteed data reduction ratios. Here is the honest version, from people who have architected and sold these arrays, on where the economics genuinely stack up and where disk still earns its place.

A few years ago, going all flash for primary workloads was a premium choice you justified case by case. That argument is over. For most primary and mixed workloads the all flash array is now the sensible default, and the interesting questions are the ones underneath: what NVMe actually buys you, how much usable capacity you really get after data reduction, which flash type belongs where, and where spinning disk still quietly wins on cost. This guide answers those, vendor neutral, without the datasheet gloss.

The question is no longer whether to go flash, but which flash and where

The cost gap between flash and disk has closed far enough that, once you account for data reduction, power, cooling, rack space and the operational simplicity of a single tier, all flash wins the total cost argument for most primary workloads outright. That does not make it the right answer for every terabyte you own. The skill now is matching the right media and the right architecture to each class of data, rather than buying one headline number for everything.

What NVMe actually changes, and what it does not

NVMe is a protocol designed for flash, whereas the older SAS and SATA protocols were designed for disk and then inherited by flash. The practical effect is lower latency, far deeper parallelism, and much less CPU overhead per I/O operation. On a busy array that matters, because the bottleneck has moved off the media. The flash itself has been fast for years. What has been catching up is the controller, the protocol and the network in front of it.
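To put a rough number on why queue depth matters, here is a minimal sketch using Little's Law (sustained throughput equals outstanding operations divided by latency). The 100 microsecond latency and the NVMe queue layout are illustrative assumptions, not measurements. The 32 command depth is SATA native command queuing, and the NVMe specification permits far more than the modest layout assumed here (up to roughly 64K queues, each up to roughly 64K commands deep).

```python
def iops_ceiling(outstanding_ios: int, latency_seconds: float) -> float:
    """Little's Law: sustained throughput = concurrency / latency."""
    return outstanding_ios / latency_seconds


LATENCY_S = 100e-6  # hypothetical 100 microsecond round trip per I/O

# SATA native command queuing: one queue, 32 commands deep.
sata_ceiling = iops_ceiling(1 * 32, LATENCY_S)

# Hypothetical NVMe host setup: 8 queues, 128 commands deep each.
nvme_ceiling = iops_ceiling(8 * 128, LATENCY_S)

print(f"SATA-style ceiling: {sata_ceiling:,.0f} IOPS")
print(f"NVMe-style ceiling: {nvme_ceiling:,.0f} IOPS")
```

This is only the ceiling the queueing model allows at a given latency. What a real array delivers still depends on the controller, the media and the network in front of it.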

That is the honest framing of NVMe over Fabrics too. Extending NVMe across the network, whether over TCP, over Fibre Channel or with a remote direct memory access (RDMA) transport, stops the storage network becoming the next bottleneck and lets a host talk to a shared array at close to local latency. It is genuinely useful for latency sensitive workloads such as large databases and dense virtualisation. It is not magic, and most estates do not need to rush to it. If your current fabric is not saturated and your latency is fine, NVMe over Fabrics is a roadmap item, not an emergency.

The honest version

NVMe and NVMe over Fabrics move the bottleneck; they do not remove it. The win is real for dense, latency sensitive workloads. For a general purpose estate that is comfortably served today, the urgency is low. Buy it because a workload needs it, not because it is on the cover of the brochure.

Effective capacity, and how the ratio is really claimed

This is where most storage quotes quietly mislead, and where knowing the kit pays for itself. Vendors price and position on effective capacity, meaning the usable terabytes after deduplication and compression, rather than on raw capacity. A guaranteed ratio such as four to one or five to one sounds like a discount, but the ratio you actually achieve depends entirely on your data, not on the marketing.

The pattern is consistent once you have seen enough estates. Virtual desktop and virtual server estates deduplicate well, because there is genuine repetition across similar machine images. General file data compresses moderately. Databases vary, and anything already compressed or encrypted upstream, such as photo and video files, or a database doing its own compression, will barely reduce at all, because you cannot squeeze data twice. So a guaranteed four to one can be perfectly real on one workload and pure fiction on another.

The trap in a quote is buying capacity on the vendor's assumed ratio and discovering your real ratio is lower, which means buying more sooner. Read every effective capacity number as a question, not a promise: what ratio is assumed, on what data, and what happens commercially if you do not hit it. A good guarantee will hold the vendor to the number on your actual data, not theirs.
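The arithmetic behind that trap can be sketched in a few lines. The workload mix and per workload ratios below are invented for illustration, and `Workload`, `blended_ratio` and `usable_tb_needed` are hypothetical names, not any vendor's sizing tool. The point is that the estate-wide ratio is logical capacity divided by physical capacity, which is dragged down hard by data that does not reduce.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    logical_tb: float       # data as the hosts see it, before reduction
    reduction_ratio: float  # assumed achieved ratio, e.g. 5.0 means 5:1


def usable_tb_needed(workloads):
    """Physical usable capacity needed to hold the estate after reduction."""
    return sum(w.logical_tb / w.reduction_ratio for w in workloads)


def blended_ratio(workloads):
    """Estate-wide reduction ratio: total logical TB / total physical TB."""
    logical = sum(w.logical_tb for w in workloads)
    return logical / usable_tb_needed(workloads)


# Hypothetical estate: the ratios are illustrative, not measurements.
estate = [
    Workload("virtual desktops", 100.0, 5.0),
    Workload("file shares", 100.0, 2.0),
    Workload("encrypted database", 100.0, 1.0),
]

quoted_ratio = 4.0  # the ratio assumed in the (hypothetical) quote
logical_total = sum(w.logical_tb for w in estate)
sized_on_quote = logical_total / quoted_ratio
actually_needed = usable_tb_needed(estate)

print(f"blended ratio: {blended_ratio(estate):.2f}:1")
print(f"usable TB if sized at {quoted_ratio:.0f}:1: {sized_on_quote:.0f}")
print(f"usable TB actually needed: {actually_needed:.0f}")
```

With these made-up inputs the blended ratio comes out at about 1.76 to one, so an array sized at four to one (75 TB usable) would hold well under half of the 170 TB the estate actually needs. Note that the blended figure is not the average of the three ratios. It is weighted by how much physical space each workload ends up consuming.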

QLC, TLC and where each belongs

Not all flash is the same, and the difference is a cost and endurance decision, not a brand one. The two types you will be quoted are TLC, which stores three bits per cell, and QLC, which stores four. QLC packs more capacity into the same physical space and costs less per terabyte, but it is slower on writes and wears out faster under heavy write workloads. TLC is the workhorse for mixed and write heavy primary workloads. QLC has become genuinely useful for read heavy and capacity oriented workloads, where its lower cost per terabyte starts to challenge disk on a footprint that is far denser and far less power hungry.

The mistake is treating QLC as a cheaper drop in for everything, or dismissing it as not enterprise grade. The right answer is workload led. Put write heavy, latency sensitive data on TLC, and use QLC to attack the bulk, read mostly capacity that used to justify disk.
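One way to make the endurance side of that decision concrete is a drive writes per day (DWPD) check: how many times per day the workload would overwrite the flash it sits on, compared with what the media is rated for. This is a rough sketch. The rated figures and the write amplification factor below are placeholder assumptions, so take the real numbers from the drive or array datasheet.

```python
def required_dwpd(daily_writes_tb: float, flash_capacity_tb: float,
                  write_amplification: float = 1.0) -> float:
    """Drive writes per day the workload would impose on the flash."""
    return daily_writes_tb * write_amplification / flash_capacity_tb


def fits_endurance(daily_writes_tb: float, flash_capacity_tb: float,
                   rated_dwpd: float, write_amplification: float = 1.0) -> bool:
    """True if the imposed DWPD stays within the media's rated DWPD."""
    imposed = required_dwpd(daily_writes_tb, flash_capacity_tb,
                            write_amplification)
    return imposed <= rated_dwpd


# Placeholder endurance ratings, for illustration only.
HYPOTHETICAL_QLC_DWPD = 0.3
HYPOTHETICAL_TLC_DWPD = 1.0

CAPACITY_TB = 500.0
WRITE_AMP = 2.0  # assumed overhead from protection and garbage collection

# Read mostly archive-style workload: 50 TB written per day.
print(fits_endurance(50.0, CAPACITY_TB, HYPOTHETICAL_QLC_DWPD, WRITE_AMP))

# Write heavy workload: 200 TB written per day.
print(fits_endurance(200.0, CAPACITY_TB, HYPOTHETICAL_QLC_DWPD, WRITE_AMP))
print(fits_endurance(200.0, CAPACITY_TB, HYPOTHETICAL_TLC_DWPD, WRITE_AMP))
```

Under these assumptions the read mostly workload sits comfortably inside the QLC rating, while the write heavy one exceeds it and belongs on TLC.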

Where all flash genuinely wins now

For the majority of primary storage the case is clear, and it is not only about speed. Consolidating several older arrays onto one dense all flash system reduces rack space, power and cooling, often substantially, which matters as much for a constrained data centre as the performance does. A single high performance tier removes the operational drag of tiering and the guesswork of deciding what lives where. Predictable low latency improves the workloads sitting on top, from databases to virtual estates. And once data reduction is working in your favour, the cost per effective terabyte is competitive with the hybrid systems it replaces.

Where disk and hybrid still earn their place

Flash has not killed disk. It has pushed it down the stack to where it still wins. For cold data, archive, backup repositories and very large bulk capacity that is rarely read, disk remains materially cheaper per raw terabyte, and that gap is real at petabyte scale. Object storage on disk is still the sensible home for huge, infrequently accessed datasets. The honest position is that all flash owns primary and most secondary storage, while disk holds the deep, cold, capacity heavy tail where cost per terabyte beats everything else and the performance simply is not needed.

A simple way to decide

Performance sensitive or mixed primary workload: all flash on TLC. Large read mostly capacity you want off disk: all flash on QLC. Cold, archive and bulk that is rarely touched: disk or object still wins on cost. Match the media to the data; do not buy one number for the whole estate.
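The same rule of thumb, written as a small function you could run over an inventory of datasets. The `Access` categories and the `choose_media` name are hypothetical labels for this sketch, and a real placement decision would weigh more than two inputs.

```python
from enum import Enum


class Access(Enum):
    MIXED = "mixed or write heavy"
    READ_MOSTLY = "read mostly"
    COLD = "cold, archive or bulk"


def choose_media(access: Access, latency_sensitive: bool = False) -> str:
    """Map a dataset's access pattern to the media suggested above."""
    # Cold data goes to disk regardless; "cold but latency sensitive"
    # is treated as contradictory input and resolved towards cost.
    if access is Access.COLD:
        return "disk or object"
    # Performance sensitive or mixed primary workloads belong on TLC.
    if latency_sensitive or access is Access.MIXED:
        return "all flash on TLC"
    # What remains is read mostly capacity, which suits QLC.
    return "all flash on QLC"


print(choose_media(Access.MIXED))
print(choose_media(Access.READ_MOSTLY))
print(choose_media(Access.READ_MOSTLY, latency_sensitive=True))
print(choose_media(Access.COLD))
```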

The total cost picture beyond the per terabyte sticker

If you compare flash and disk on raw cost per terabyte alone, disk still looks cheaper and you will reach the wrong conclusion. The real comparison has to include the data reduction that flash delivers and disk largely does not, the power and cooling a dense flash system saves, the rack space it frees, and the support and operational cost of running fewer, simpler systems. On that basis, the cost per effective terabyte for primary workloads usually favours all flash, sometimes by a clear margin. Where it does not, you are almost always looking at cold or bulk capacity, which is exactly where disk should be anyway.
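A minimal model of that comparison is sketched below. Every figure in it is invented purely to show the mechanics, including the prices, usable fractions, reduction ratios and running costs, and `cost_per_effective_tb` is a hypothetical helper rather than a standard formula. Substitute your own quotes and measured ratios before drawing any conclusion.

```python
def cost_per_effective_tb(purchase: float, raw_tb: float,
                          usable_fraction: float, reduction_ratio: float,
                          annual_running: float, years: int) -> float:
    """Lifetime cost divided by effective (post-reduction) capacity."""
    effective_tb = raw_tb * usable_fraction * reduction_ratio
    lifetime_cost = purchase + years * annual_running
    return lifetime_cost / effective_tb


YEARS = 5  # assumed refresh cycle

# Hypothetical all flash array: dearer to buy, reduces well, cheap to run.
flash_purchase, flash_raw_tb = 400_000.0, 500.0
flash_eff = cost_per_effective_tb(
    flash_purchase, flash_raw_tb, usable_fraction=0.80, reduction_ratio=3.0,
    annual_running=35_000.0,  # power, cooling, rack space, support
    years=YEARS,
)

# Hypothetical hybrid estate: cheaper to buy, reduces little, dearer to run.
hybrid_purchase, hybrid_raw_tb = 250_000.0, 800.0
hybrid_eff = cost_per_effective_tb(
    hybrid_purchase, hybrid_raw_tb, usable_fraction=0.75, reduction_ratio=1.2,
    annual_running=80_000.0,
    years=YEARS,
)

flash_raw = flash_purchase / flash_raw_tb
hybrid_raw = hybrid_purchase / hybrid_raw_tb

print(f"per raw TB:       flash {flash_raw:.0f}, hybrid {hybrid_raw:.0f}")
print(f"per effective TB: flash {flash_eff:.0f}, hybrid {hybrid_eff:.0f}")
```

With these illustrative inputs the hybrid estate looks far cheaper per raw terabyte, yet costs roughly twice as much per effective terabyte over five years. Drop the reduction ratio on the flash side towards one to one, as with cold or already compressed data, and the advantage disappears, which is the disk case described above.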

One more cost that hides in plain sight is maintenance over the life of the array. A consolidated all flash platform usually carries a lower running and support burden than the several hybrid systems it replaces, and that recurring saving compounds across the refresh cycle. It rarely shows up in the headline quote, which is precisely why it is worth modelling properly.

How C4C helps

This is our home ground. We spent years on the vendor side of the enterprise storage market, architecting and selling these platforms, so we know how the arrays really perform, how the effective capacity numbers are constructed, and where the cost case is genuine versus where it is optimistic. We will model the real economics for your data, not the datasheet, separate the flash you genuinely need from the capacity you should leave on disk, and make sure the effective capacity you are quoted is the effective capacity you will actually get. Independent, with no platform to defend.

Weighing an all flash refresh?

Send us your situation: your current arrays, the workloads, and what is prompting the review. We will give you an evidence based view of where all flash genuinely pays, what data reduction you can realistically expect on your data, and what to leave on disk. Independent, with no platform to sell. We have architected and sold these arrays from the inside.

Prefer email? Reach us directly at hello@c4cgroup.co.uk.

Frequently asked questions

Is all flash storage worth it?

For most primary and mixed workloads, yes. Once you account for data reduction, power, cooling, rack space and the operational simplicity of a single tier, all flash usually wins the total cost argument, not just the performance one. The exception is cold, archive and bulk capacity that is rarely read, where disk is still cheaper per terabyte and remains the right home.

What does NVMe actually improve?

NVMe is a protocol built for flash rather than inherited from disk, so it delivers lower latency, far more parallelism and less CPU overhead per operation. The win is real for dense, latency sensitive workloads such as large databases and heavy virtualisation. For an estate that is comfortably served today, it is a roadmap item rather than an urgent change.

Are vendor data reduction ratios realistic?

It depends entirely on your data. Virtual desktop and server estates deduplicate well, file data compresses moderately, and anything already compressed or encrypted will barely reduce at all. A guaranteed four to one can be genuine on one workload and fiction on another. Treat every effective capacity number as a question: what ratio is assumed, on what data, and what happens commercially if you do not hit it.

What is the difference between TLC and QLC flash?

TLC stores three bits per cell and is the workhorse for mixed and write heavy workloads. QLC stores four bits, costs less per terabyte and packs in more capacity, but is slower on writes and wears faster under heavy write load. Use TLC for performance sensitive data and QLC for read mostly, capacity oriented data that used to justify disk.

Is disk storage dead?

No. It has moved down the stack. Flash owns primary and most secondary storage, but disk is still materially cheaper per raw terabyte for cold data, archive, backup and very large bulk capacity, and that gap is real at petabyte scale. The right estate uses flash where performance and density matter and disk where cost per terabyte is all that counts.

What does all flash really cost compared with hybrid?

Compared on raw cost per terabyte, disk and hybrid look cheaper and lead you to the wrong answer. The real comparison includes the data reduction flash delivers, the power, cooling and rack space a dense system saves, and the lower support burden of running fewer arrays. On cost per effective terabyte, all flash usually wins for primary workloads, sometimes clearly. Where it does not, you are looking at cold or bulk capacity that belongs on disk anyway.