How Flash is Changing Data Storage

By Paul Rubens

Smarter flash drives will make storage simpler, faster and cheaper.

Moving storage services into storage devices may not be what storage software vendors want, but it will offer many benefits to enterprises.

Falling processor and memory prices mean it's economically feasible to beef up the computing power on storage hardware devices. That's opening up some exciting possibilities for smart flash drives.

To understand why, you need to consider what goes on inside a solid state drive. Unlike spinning hard drives, flash drives can't overwrite an arbitrary area of the storage medium. In particular, they can't write new data to a partially used block – they have to write to a previously erased (or never-used) block.

The upshot is that, to look like a traditional disk to the user, the operating system, the file system and any applications, the flash drive's firmware has to do some pretty clever work. It virtualizes physical blocks and keeps track of the mapping between these virtual blocks and their physical locations. This is the job of the Flash Translation Layer (FTL).
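The remapping the FTL performs can be sketched in a few lines. This is an illustrative toy, not any real drive's firmware: an "overwrite" of a logical block actually goes to a fresh physical page, and the old page is merely marked stale for later garbage collection.

```python
# Minimal sketch of a Flash Translation Layer (illustrative only):
# logical blocks presented to the host are remapped to physical pages,
# and an overwrite is handled by programming a fresh page and
# invalidating the old one, since flash cannot overwrite in place.

class SimpleFTL:
    def __init__(self, num_pages):
        self.mapping = {}                          # logical block -> physical page
        self.free_pages = list(range(num_pages))   # erased, programmable pages
        self.invalid = set()                       # stale pages awaiting garbage collection

    def write(self, logical_block, data, flash):
        page = self.free_pages.pop(0)              # flash can only program erased pages
        flash[page] = data
        old = self.mapping.get(logical_block)
        if old is not None:
            self.invalid.add(old)                  # old copy is now stale, not erased
        self.mapping[logical_block] = page

    def read(self, logical_block, flash):
        return flash[self.mapping[logical_block]]

flash = {}                                         # stand-in for the raw flash pages
ftl = SimpleFTL(num_pages=8)
ftl.write(0, b"v1", flash)
ftl.write(0, b"v2", flash)                         # "overwrite" lands on a new page
print(ftl.read(0, flash))                          # b'v2'
print(len(ftl.invalid))                            # 1 stale page for the garbage collector
```

A real FTL also handles wear leveling and garbage collection of those stale pages, which is part of why the firmware needs genuine computational power.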

"Because the FTL is like a file system, and because flash drives are getting increasingly clever software and more computational power than in the past, many good things can happen!" says Christos Karamanolis, the chief architect and principal engineer for storage and availability at VMware.

What kind of good things? One example is related to the atomicity of data writes, which guarantees the integrity of data written to a storage medium. Today, protocols like SCSI guarantee that either all or none of a data write will actually happen. Without that there might be a mixture of old and new data written to a disk, or data may be written but its metadata may not be updated. And that leads to data corruption.

File systems take advantage of atomicity to avoid corruption, but achieving it requires a good deal of computation. That creates processor overhead, and it adds a high level of complexity to the storage software stack.

"Many storage researchers are asking: What if the storage device could guarantee the atomicity of multiple physical blocks, so the file system could just say, Either update both the data and the metadata, or don't do anything?" says Karamanolis.

The benefit of that, he believes, would be a significant reduction in software complexity. This would result in more reliable software, and cheaper software and services (as vendors wouldn't have to invest in such long development cycles).

"At first these drives would be more expensive, but as they become commoditized  and everyone writes to the common interfaces, they would become cheaper," he adds.

In fact this move towards smarter drives has already started, Karamanolis points out, citing the example of Seagate's Kinetic disk drive, which offers an Ethernet interface and a key-value store. Other examples include solid state drives that offer built-in encryption and compression in the firmware of the devices themselves.
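The shift a key-value drive represents can be seen in the shape of its interface: the host addresses data by key rather than by block number, and the drive's firmware decides where the bytes actually live. The class below is a toy stand-in for that idea, not the actual Kinetic protocol.

```python
# Toy sketch of the key-value drive idea (not the real Kinetic API):
# the host speaks in keys and values, and placement on the medium is
# entirely the firmware's problem.

class KeyValueDrive:
    def __init__(self):
        self._store = {}

    def put(self, key: bytes, value: bytes):
        self._store[key] = value       # firmware chooses physical placement

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes):
        del self._store[key]

drive = KeyValueDrive()
drive.put(b"user:42:avatar", b"\x89PNG...")
print(drive.get(b"user:42:avatar"))
```

Compared with a block interface, this lets applications (object stores in particular) talk to the drive over the network in their own terms, with no block layer or file system in between.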

Another example, for non-volatile storage, is the development of the NVM Express (NVMe) interface for PCI Express (PCIe) storage devices. Because it was designed specifically for flash storage devices, it brings many advantages over SAS and SATA.

"This will introduce stronger semantics for the operating system and file system, like Atomic Test and Set, which is very helpful for scattered writes and gathered reads," says Karamanolis. "This is convenient for building less complex, more efficient software."

These are non-trivial features, which will require some serious software to run in each storage device. That means, inevitably, that the CPU and memory – and associated buses – will have to be upgraded in future solid state storage devices to support this.

"The drives of today are already mini computers in their own right, and what we will continue to see is processing power moving close to where the data is actually stored," said Karamanolis.

In fact this idea is not altogether new: engineers at Carnegie Mellon University looked at intelligent storage devices that implemented software features in hardware back in the nineties, but the scale and cost of the hardware at the time meant it simply wasn't feasible.

Looking a little way into the future, there is no technical reason why solid state drives won't be powerful enough to offer a range of storage services – such as snapshotting, cloning, and deduplication – within their own firmware.

And if storage devices are really doing all the work, then storage subsystems may not be needed at all. Servers would simply talk to direct attached storage or networked drives, tell them what storage services to carry out, and leave them to it.

"I certainly think it's possible," says Mark Peters, senior analyst at Enterprise Strategy Group. "After all, one of the main reasons we moved the storage stack "out" was simply pragmatic – there used not to be sufficient engine power on mainframes and then servers to do it more centrally. That has now changed and of course we want more storage functionality to be closer to the apps and processing."

But one reason why this is unlikely to happen in the immediate future is that these services need to be standardized, and that is a time-consuming task.

"If Intel or Seagate made devices that can take snapshots, it would be pretty useless to me as a software developer," says Karamanolis. "I don't want a single vendor API, I want to write software that works on all hardware. So it's going to take time before these features are supplied by all hardware vendors."

He adds that the choice of features that will be implemented in hardware will largely be driven by what software vendors need. From VMware's perspective, the company's Virtual SAN product is designed with flash caches for fast access, disks for mass storage, and software to carry out snapshots and other services.

"One day these could move from the virtual abstraction layer into hardware, but that would require that hardware vendors work with us and others, find a common denominator, and implement those requirements," he says. "And that could take some years."

Once again, the benefit of this would be reduced complexity in the storage software. If the devices themselves offer snapshot services, then all the storage software has to do is some coordination when a higher-level object (like a multi-disk volume) is being snapshotted. The CPU would consume fewer cycles on storage-related tasks, so the entire software stack would run faster.
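What remains for the host in that world is thin: for a multi-disk volume, it quiesces I/O and triggers each drive's snapshot at the same point in time, so the per-drive snapshots together form a consistent volume image. The per-drive `snapshot()` call below is a hypothetical firmware feature, sketched for illustration.

```python
# Sketch of the coordination left to storage software once drives can
# snapshot themselves. The per-drive snapshot() call is a hypothetical
# firmware feature; the host's job shrinks to quiescing writes and
# triggering every drive in the volume at the same moment.

class SmartDrive:
    def __init__(self):
        self.snapshots = []

    def snapshot(self, tag):
        self.snapshots.append(tag)     # firmware does the heavy lifting

def snapshot_volume(drives, tag):
    # In a real system the host would quiesce I/O here first, so that
    # the per-drive snapshots form one consistent volume image.
    for d in drives:
        d.snapshot(tag)

volume = [SmartDrive(), SmartDrive(), SmartDrive()]
snapshot_volume(volume, "nightly-backup")
assert all(d.snapshots == ["nightly-backup"] for d in volume)
```

The copy-on-write bookkeeping that makes each snapshot cheap – today a major piece of the storage software stack – lives entirely inside the drives.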

Inevitably there will be drawbacks to this approach, and the most obvious one has to do with the increased complexity of the storage device firmware. More complexity means more opportunities for bugs and security vulnerabilities in the drives themselves, and the concomitant need to ensure that drive firmware is updated to avoid these problems.

"Software engineering is an art, so you have to suspect that early on there will be issues with this type of firmware," says Karamanolis.

On the other hand, you could also argue that by removing functions like snapshotting from the storage system software and moving them to the storage device, you end up with two relatively simple pieces of software instead of one monolithic, complex, and difficult-to-manage application.

In the medium term we may well end up seeing more and more of the storage software stack ending up inside smart solid state drives. That sounds almost like a reversion to big proprietary storage systems, and away from the current trend for storage systems made up of commodity disks, commodity servers, and clever storage software.

Almost, but not quite. That's because the world has moved on from the proprietary storage approach, Karamanolis believes. "We are beyond the point that customers are stuck with one hardware vendor providing the solution. That won't fly any more," he says. "Customers will want the same features irrespective of the hardware vendor, and a commoditized interface. I don't think we will see customers locked in to hardware vendors."

Although some vendors will object, as they would rather offer their own additional value with "multi-million line" proprietary software, Karamanolis expects there to be more combinations of smart drives and open source storage software to control them.

Ultimately, the whole move toward smart storage devices has one simple root cause: at the moment, the software stack has to do all sorts of clever tricks and storage operations, so a single I/O operation is amplified into many, consuming memory, CPU cycles and bandwidth.

If storage services are moved out to the storage devices then this consumption of resources won't be needed, and storage operations will end up being faster, simpler, and cheaper.


This article was originally published on Tuesday, February 10, 2015.