Block, File and Object Storage: Which You Actually Need

Block, file and object are not three products you pick between once and forget. They are three different ways of presenting storage, each built for different work. Match them to your workloads well and everything downstream is simpler and cheaper. Match them badly and you pay for it in performance, cost or complexity for years. Here is how to tell which your workloads genuinely need, from a team that has designed and run all three at scale.

One of the most common and most expensive mistakes in infrastructure is choosing a storage type by habit, by whatever the last array did, or by whatever a vendor happened to lead with. The three types are not interchangeable. They sit at different points on the trade-off between speed, sharing and scale, and the right one is decided by the workload, not the logo on the array.

So before any conversation about Dell, NetApp, Pure or anyone else, the first question is which of these three you actually need, and for which data. Get that right and the vendor choice becomes far easier. Get it wrong and no amount of premium hardware will rescue the design.

Start with the workload, not the protocol

Block, file and object are all just ways of presenting the same underlying capacity to whatever is consuming it. What separates them is the unit they hand out (a raw volume, a file in a shared tree, or an object with its own identifier) and the path the data takes to reach it. That unit decides the latency, the way data is shared, and how far the system can scale. So the honest way to choose is to start from the workload and let it pull you toward the right protocol, rather than starting from a protocol you already own and bending the workload to fit.

Block storage: the database and virtual machine workhorse

Block storage presents raw volumes to a host, which then formats them with its own file system and treats them as if they were local disks. There is very little between the application and the media, which is exactly why block delivers the lowest and most predictable latency of the three. This is the storage that databases, virtual machine datastores and any latency sensitive transactional system want underneath them.
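To make "raw volume" concrete, here is a minimal sketch of what a block device looks like to the host: a run of fixed size blocks addressed by number, with no names, folders or permissions. The ToyVolume class and the 4096 byte block size are assumptions for illustration, not any vendor's interface.

```python
BLOCK_SIZE = 4096  # bytes per block; an assumed size for illustration


class ToyVolume:
    """A raw volume: fixed-size blocks addressed by block number only."""

    def __init__(self, block_count: int) -> None:
        self.block_count = block_count
        self._media = bytearray(block_count * BLOCK_SIZE)

    def _check(self, block_number: int) -> int:
        # Return the byte offset of a block, or fail if it is off the volume.
        if not 0 <= block_number < self.block_count:
            raise IndexError(f"block {block_number} is outside the volume")
        return block_number * BLOCK_SIZE

    def write_block(self, block_number: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("a write must be exactly one block long")
        start = self._check(block_number)
        self._media[start:start + BLOCK_SIZE] = data

    def read_block(self, block_number: int) -> bytes:
        start = self._check(block_number)
        return bytes(self._media[start:start + BLOCK_SIZE])


volume = ToyVolume(block_count=8)
payload = b"row-42".ljust(BLOCK_SIZE, b"\0")  # pad the record to one block
volume.write_block(3, payload)
```

Everything above the block number, such as file names, directories and who may read what, is the job of the file system or database the host puts on top. That thin path is the reason block stays fast and predictable.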

Block is carried over a storage network, historically Fibre Channel and increasingly Ethernet using iSCSI or NVMe over Fabrics. That network is the storage area network, or SAN, and it is what people really mean when they say SAN storage. The strength of block is performance and control. The cost is that a block volume is typically owned by one host or cluster at a time, so block is not how you share files between many users.

Rule of thumb

If the workload is a database, a virtual machine estate or anything where a millisecond of latency matters, start with block. It is the default for primary, performance sensitive data, and the reason almost every enterprise array leads with it.

File storage: shared access for people and applications

File storage presents a shared file system over the network that many clients can mount and use at the same time. It speaks the protocols people already know, NFS in the Linux and virtualisation world and SMB in the Windows world. The storage itself manages the directory tree, the permissions and the locking, so clients simply read and write files as if the share were local.
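From the client's side, a file share is just paths. The sketch below uses a temporary local directory as a stand-in for a mounted NFS or SMB share, which is an assumption made so the example runs anywhere. The helper names are hypothetical, and real shares add the permissions and locking that this toy does not model.

```python
import tempfile
from pathlib import Path


def client_write(mount: Path, relative_path: str, text: str) -> None:
    """Write a file by path, creating parent folders as needed."""
    target = mount / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")


def client_read(mount: Path, relative_path: str) -> str:
    """Read a file by path, exactly as if it were on a local disk."""
    return (mount / relative_path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as share:
    mount_a = Path(share)  # where "client A" has the share mounted
    mount_b = Path(share)  # where "client B" has the same share mounted
    client_write(mount_a, "projects/plan.txt", "draft 1")
    seen_by_b = client_read(mount_b, "projects/plan.txt")
```

Neither client knows or cares which blocks hold the file. Both see the same tree, and that is the convenience file storage exists to provide.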

This is the natural home for anything genuinely shared: user home directories, departmental shares, application data that several servers read at once, media and engineering working files, and many virtualisation datastores presented over NFS. File storage trades a little of the raw performance of block for the enormous convenience of shared, concurrent access with familiar permissions. Network attached storage, or NAS, is simply the common name for storage that serves files this way.

Block or file for virtual machines?

Both work. Many platforms run happily on block datastores for performance, and many teams prefer file based NFS datastores because they are simpler to provision and grow. Neither is wrong. It is a genuine operating model choice, and a good array will let you do either.

Object storage: scale, durability and cheap capacity

Object storage takes a completely different shape. Instead of volumes or a file tree, it holds data as objects in a flat namespace, each with a unique identifier and its own metadata. Objects are reached over HTTP using the S3 protocol that has become the de facto standard. There is no traditional file system in the path, which is what lets object storage scale to billions of objects and petabytes of capacity on commodity hardware, with durability built in by spreading copies or erasure coded fragments across many nodes.
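The flat namespace is easier to see in code than in prose. This is a toy in-memory sketch, not the S3 API: ToyObjectStore and its method names are hypothetical, and a real store sits behind HTTP and spreads data across nodes. What it shows is that a key such as backups/2024/db.dump is a single string, and the "folders" are only a naming convention.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


class ToyObjectStore:
    """A flat key-to-object map; there is no directory tree underneath."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes,
            metadata: dict[str, str] | None = None) -> None:
        # Objects are written whole; an existing key is simply replaced.
        self._objects[key] = StoredObject(data, dict(metadata or {}))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list[str]:
        # "Folders" are only a prefix filter over the flat key space.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = ToyObjectStore()
store.put("backups/2024/db.dump", b"dump-bytes")
store.put("backups/2024/logs.tar", b"log-bytes", {"retention": "7y"})
store.put("media/intro.mp4", b"video-bytes")
backup_keys = store.list_keys("backups/")
```

Because every object is addressed independently by its key, the key space can be split across many nodes without a shared directory structure to keep consistent.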

The trade-off is latency. Object storage is built for throughput, scale and cost, not for fast random reads and writes. You would not put a transactional database on it. But for the right data it is unbeatable on cost per terabyte and on resilience. The classic fits are backup and archive targets, media and content libraries, log, sensor and analytics data, and cloud native applications that were written to speak S3 from the start.
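Part of the cost per terabyte comes from how durability is bought. A rough sketch of the arithmetic follows: keeping full copies costs one unit of raw capacity per copy, while erasure coding with k data fragments and m parity fragments costs (k + m) / k. The three copy and 8 + 3 figures are illustrative assumptions, not a recommendation or any product's default.

```python
def replication_overhead(copies: int) -> float:
    """Raw capacity needed per unit of usable capacity with full copies."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw capacity per unit of usable capacity with k + m erasure coding."""
    return (data_fragments + parity_fragments) / data_fragments


def raw_capacity_tb(usable_tb: float, overhead: float) -> float:
    """Raw terabytes to buy for a given usable capacity."""
    return usable_tb * overhead


three_copies = replication_overhead(3)      # survives the loss of 2 copies
eight_plus_three = erasure_overhead(8, 3)   # survives the loss of 3 fragments

raw_with_copies = raw_capacity_tb(100, three_copies)
raw_with_erasure = raw_capacity_tb(100, eight_plus_three)
```

On those assumed figures, 100 TB of usable capacity needs 300 TB raw with three copies and 137.5 TB raw with 8 + 3 erasure coding. The price of that saving is more work on every read and rebuild, which is one more reason object suits throughput rather than latency.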

Where object earns its place

Large volumes of unstructured data where scale, durability and cost matter more than latency. If you are sizing a backup target, an archive tier or a data lake, object is usually both the right answer and the cheapest one. If you are sizing a database, it is the wrong one.

Where people get it wrong

Most storage regret traces back to one of a few familiar mismatches. Each is easy to avoid once you have seen it.

  • Object for transactional work. Object storage is cheap and durable, which tempts teams to push latency sensitive applications onto it. The applications then crawl, because object was never built for fast random access. The saving evaporates into a performance problem.
  • File for absolutely everything. A NAS is convenient, so it quietly becomes the home for databases and virtual machines that would have been faster and more predictable on block. It works until it does not, usually under load at the worst possible moment.
  • Block where file would be simpler. Teams carve up block volumes and layer a clustered file system on top to share data between hosts, when a file share would have done the same job with a fraction of the complexity.
  • Buying the protocol you already own. Extending the incumbent platform because it is there, rather than because it fits the new workload. This is how estates drift into the wrong shape one project at a time.
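The first three mismatches are mechanical enough to check against an inventory. The sketch below is one hypothetical way to do it: the Placement fields, the workload names and the wording of the warnings are all assumptions. The fourth mismatch is a buying habit rather than a property of a workload, so it is not modelled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Placement:
    workload: str
    protocol: str                 # "block", "file" or "object"
    latency_sensitive: bool
    shared_by_many_clients: bool


def find_mismatches(placements: list[Placement]) -> list[str]:
    """Flag placements that match one of the familiar mismatches."""
    warnings = []
    for p in placements:
        if p.protocol == "object" and p.latency_sensitive:
            warnings.append(f"{p.workload}: object for transactional work")
        elif (p.protocol == "file" and p.latency_sensitive
              and not p.shared_by_many_clients):
            warnings.append(f"{p.workload}: on file, review against block")
        elif (p.protocol == "block" and p.shared_by_many_clients
              and not p.latency_sensitive):
            warnings.append(f"{p.workload}: block where file would be simpler")
    return warnings


estate = [
    Placement("orders-db", "object", True, False),
    Placement("billing-db", "file", True, False),
    Placement("team-share", "block", False, True),
    Placement("backup-target", "object", False, False),
]
warnings = find_mismatches(estate)
```

Treat the output as a list of placements to review, not a verdict. A latency sensitive workload on file may be a deliberate and sensible choice, for example virtual machines on NFS datastores.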

How modern arrays blur the lines

To complicate the neat picture, many modern platforms are unified. They present block and file from the same system, and a growing number add object too, so a single array can serve a database over block, a department share over SMB and a backup target over S3 at once. That is genuinely useful. It simplifies operations, consolidates the estate and reduces the number of things to manage and support.

But unified hardware does not repeal the rules. A single array that can speak object does not make object suitable for your transactional database. The protocol still has to match the workload. The right way to use a unified platform is to choose each protocol on its workload first, and only then ask whether one platform can serve several of them well. Consolidation is a benefit you collect after the design is right, not a substitute for getting it right.

The honest framing

Unified arrays are about operational simplicity and consolidation, not about making one protocol do another protocol's job. Decide the workload-to-protocol match on its own merits, then let consolidation be the bonus.

How to decide, in practice

You do not need a long study to get this right. For each significant workload or data set, walk through three questions.

  • How latency sensitive is it? Transactional and performance critical data points to block. Tolerant, throughput oriented data opens up file and object.
  • How is it shared? Owned by one host or cluster suits block. Read and written by many clients at once suits file. Reached by applications over HTTP at scale suits object.
  • How big does it get, and how cheap must it be? Modest and performance led stays on block or file. Very large, growing and cost sensitive moves toward object.

Run every workload through those three and the estate sorts itself into a sensible shape: primary databases and virtual machines on block, shared and collaborative data on file, and bulk unstructured data, backup and archive on object. Then, and only then, look at which platforms can serve those tiers well, and how few of them you can get away with.
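The three questions can be written down as a first pass rule. This is a deliberately simple sketch under stated assumptions: the Workload fields, the category names and the order of the checks are ours, and real workloads deserve more judgement than three flags.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    latency_sensitive: bool
    access: str  # "single_host", "many_clients" or "http_at_scale"
    very_large_and_cost_sensitive: bool


def suggest_protocol(w: Workload) -> str:
    """First-pass suggestion from latency, sharing, then size and cost."""
    if w.latency_sensitive:
        return "block"      # question 1: transactional data points to block
    if w.access == "http_at_scale" or w.very_large_and_cost_sensitive:
        return "object"     # questions 2 and 3: HTTP at scale, or bulk and cheap
    if w.access == "many_clients":
        return "file"       # question 2: concurrent shared access
    return "block"          # modest, single owner: block (file would also do)


estate = [
    Workload("orders-db", True, "single_host", False),
    Workload("home-dirs", False, "many_clients", False),
    Workload("backup-archive", False, "http_at_scale", True),
]
suggestions = {w.name: suggest_protocol(w) for w in estate}
```

On this sample estate the rule puts orders-db on block, home-dirs on file and backup-archive on object. Latency is checked first on purpose: a workload that needs fast random access belongs on block however large or widely shared it is.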

Not sure which storage your workloads actually need?

Send us your workload mix or your current situation and we will give you an independent, vendor neutral view: what belongs on block, file and object, where the estate has drifted out of shape, and how to consolidate it sensibly. We have designed and run all three at scale.

Prefer email? Reach us directly at hello@c4cgroup.co.uk.

Frequently asked questions

What is the difference between block, file and object storage?

They are three ways of presenting the same underlying capacity. Block storage hands the host raw volumes that it formats and controls, which gives the lowest latency. File storage presents a shared file system over the network that many clients mount at once. Object storage holds data as objects in a flat namespace reached over HTTP, built for massive scale rather than speed.

Which is fastest, block, file or object?

Block storage is normally the fastest and lowest latency, which is why databases and virtual machines sit on it. File storage is fast enough for shared access but carries more overhead. Object storage is the slowest to respond per request and is built for throughput and scale, not for low latency transactional work.

When should I use object storage?

Use object storage for large volumes of unstructured data where scale, durability and cost matter more than latency. Backup and archive, media libraries, log and sensor data, and cloud native applications that speak the S3 protocol are the classic fits. It is the wrong choice for databases or anything needing fast random writes.

Can one array do block, file and object?

Many modern arrays present block and file from the same platform, and some add object too. That unified approach simplifies operations, but the protocol still has to match the workload. A unified array does not make object storage suitable for a transactional database. Choose the protocol on the workload first, then decide whether one platform can serve several.

Is NAS the same as file storage?

In practice, yes. Network attached storage, or NAS, is the common name for file storage presented over the network using NFS or SMB. A SAN, by contrast, is the network that carries block storage. People often confuse the two, but a NAS serves files and a SAN serves blocks.

What storage do virtual machines and databases need?

Both generally want block storage for its low latency and predictable performance, presented as datastores or volumes. Virtual machine platforms can also use file based datastores over NFS, which many teams find simpler to manage. Databases with heavy transactional load almost always belong on block.