The Future of Cold Storage

Monday Feb 24th 2014 by Paul Rubens

Several data storage technologies are competing to be the leader in the storage of infrequently accessed data.

When it comes to cold storage, Blu-ray discs are probably not the way forward.

That's despite Facebook demonstrating, at the Open Compute Project summit in January, a prototype Blu-ray cold storage system that uses 10,000 discs to hold around a million gigabytes of data. The company needs cold storage to store backups of users' videos and photos: vast amounts of cold data that will rarely, if ever, need to be accessed again.

Facebook's Blu-ray system yields some impressive figures: Facebook told the summit that its optical disc system is half the price of the disk-based cold storage it uses at the moment, and five times more energy efficient.

But is Blu-ray the future? Almost certainly not, according to Mark Peters, a senior analyst at Enterprise Strategy Group. "Some of what Facebook was doing was laying out a challenge to storage vendors," he says. "I think what they were really communicating was that there is a need for low cost reliable storage."

In other words, by setting a benchmark with its Blu-ray system, Facebook is really telling vendors to go and make something that offers some combination of lower cost per gigabyte, greater reliability, better energy efficiency and a smaller form factor than anything currently available.

So what are the options then? There are a number of possibilities, and each may be the best solution in certain use cases, according to Henry Baltazar, a senior analyst at Forrester. That's because not all cold storage needs are alike, he says: you may need to be able to store very large files, you may need very fast access to cold storage files on the occasions that you need them, or you may be happy to wait a considerable amount of time before files in cold storage are accessible.

One medium that Facebook mentioned it may consider for cold storage in the future is flash. On the face of it that sounds absurd, as flash is expensive and usually used for hot storage: for example, in caches where a changing pool of data is needed frequently and very quickly.

But the cost of flash is connected to its reliability and its ability to be written to without wearing out, so in theory it may be possible to make very cheap, very low quality storage that can only be written to once or twice. For normal flash storage uses that would be no use at all, but it might be well suited to cold storage duties.

There are two potential problems here. The first is that, at the moment, this type of cheap solid state storage product simply doesn't exist, although it may in the future, perhaps as a by-product of a manufacturing process that yields a very low proportion of high quality flash.

The other is that it would still likely be more expensive than many other storage media such as tape and optical drives.

But Baltazar points out that it would have benefits, too. "Even if it is still relatively expensive per gigabyte of storage, the advantage would be that it could be lit up in milliseconds, so data access times from 'cold' would be quick. There are use cases where you need this type of high end cold storage."

This would include cases where large quantities of data are held for Big Data analysis. "You don't need that data to be available all the time, but when you do want it you want it quickly. Ideally you need to light it up quickly for Big Data analysis, then let it go cold again," he says.

Using cheap flash in this way would have another advantage too, according to ESG's Mark Peters. "Flash is a mainstream storage medium, and we know how to integrate it, and we have software for it already. The only challenge is to get 'crappy flash' at a low enough price."

The next cold storage tier down is likely to be tape or disk, or a hybrid of both, Baltazar believes. "That's what I would use for large video files and HPC data," he says. "These may well be files that are too big to store on Blu-ray discs, which have a relatively low capacity. Blu-ray may suit Facebook (for backing up users' photos) but it certainly wouldn't suit everyone."

The benefit of tape is that it offers large storage capacities at low cost, and data can be streamed off it at a high rate. Baltazar also highlights hybrid storage devices that combine the benefits of both disk and tape. "You can use the disk space as a cache, with the main storage on tape, or you could use the disk space to store metadata like thumbnails on disk, while the high res video that the thumbnails represent is stored on tape," he adds.
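The disk-as-cache arrangement Baltazar describes can be illustrated with a toy model. The following is only a sketch under stated assumptions, not a description of any real hybrid product: the class name `HybridStore`, the least-recently-used eviction policy and the slot-based disk size are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class HybridStore:
    """Toy model: a small disk cache in front of a large, slow tape tier.

    Hypothetical illustration only; eviction policy (LRU) is an assumption.
    """

    def __init__(self, disk_slots: int):
        self.disk_slots = disk_slots   # how many files fit on the disk tier
        self.disk = OrderedDict()      # name -> data, most recently used last
        self.tape = {}                 # name -> data, holds every file
        self.tape_reads = 0            # count of slow tape fetches

    def write(self, name: str, data: bytes) -> None:
        # Main storage lives on tape; the disk only holds recently read files.
        self.tape[name] = data

    def read(self, name: str) -> bytes:
        if name in self.disk:
            # Cache hit: served from disk, no tape access needed.
            self.disk.move_to_end(name)
            return self.disk[name]
        # Cache miss: fetch from tape, then keep a copy on disk.
        data = self.tape[name]
        self.tape_reads += 1
        self.disk[name] = data
        if len(self.disk) > self.disk_slots:
            self.disk.popitem(last=False)  # evict the least recently used file
        return data


store = HybridStore(disk_slots=2)
for name in ("video_a", "video_b", "video_c"):
    store.write(name, name.encode())

store.read("video_a")          # miss: read from tape
store.read("video_a")          # hit: read from disk
reads_after_repeat = store.tape_reads
print(f"Tape reads after two requests for the same file: {reads_after_repeat}")
```

Repeated requests for the same file are served from disk, so only the first one pays the cost of going to tape.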

To many people, tape storage represents the past rather than the future, but Mark Peters believes that they don't realize just how far tape has come in the last twenty years or so. "Why would people not use tape for cold storage? I struggle with this question because everything I have seen seems to suggest that tape is between one half and one tenth the price as anything else I have seen that can store data for a long time," he says.

"Tape used to have a reputation for being unreliable but this is no longer true," he adds. "It is reliable, it is inexpensive, and you can get between 3 and 8 terabytes of data in a single cartridge."

Conventional disks can also be used for long term storage, although they are susceptible to mechanical failure and data decay or bit rot, and can be costly to run if kept in a spun-up state. But while hard drives will always have a high failure rate, their storage densities and energy consumption figures are improving all the time. "There is a lot of R&D going on in the hard drive business – particularly on the capacity side," says Peters.

Recent innovations include helium-filled drives, which consume far less energy than drives filled with air, which is considerably denser, and which can also store 50% more data per enclosure than conventional drives. HGST's Ultrastar He6 drives run 4-5 degrees cooler than air-filled drives, and because there is less drag in helium than in air they consume 23% less power. Taken together with the higher capacity, that results in a 49% improvement in power consumption per terabyte of storage, the company claims.
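The 49% figure is consistent with the other two numbers quoted. As a rough check, and assuming that the 23% power saving and the 50% capacity gain apply to the same drive compared with the same air-filled baseline (an assumption, since the article does not spell this out), the arithmetic looks like this. The function name is hypothetical.

```python
def relative_power_per_tb(power_saving: float, capacity_gain: float) -> float:
    """Return power-per-terabyte as a fraction of the air-filled baseline.

    Assumes both figures are measured against the same baseline drive.
    """
    return (1.0 - power_saving) / (1.0 + capacity_gain)


# Figures quoted in the article: 23% less power, 50% more capacity.
ratio = relative_power_per_tb(power_saving=0.23, capacity_gain=0.50)
improvement_pct = (1.0 - ratio) * 100

print(f"Power per terabyte: {ratio:.3f} of baseline ({improvement_pct:.1f}% lower)")
```

A drive drawing 77% of the power while holding 150% of the data needs roughly 51% of the power per terabyte, which rounds to the 49% improvement HGST claims.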

There have also been innovations in the way that data is written to drives, including Shingled Magnetic Recording (SMR) drives which manufacturers, including Western Digital and Seagate, already have in production. These use relatively wide data tracks that are written to disk in such a way that they partially overlap previous ones, in a manner similar to the way that roof shingles are laid down. (Normally there is a small gap between tracks.)

SMR is particularly suited to continuous writing or erasing rather than small random access reads and writes, making it well suited to archiving data. SMR is expected to enable data densities of as much as three trillion bits per square inch, according to Seagate.

Further out, it's likely that another way of laying down data, called heat-assisted magnetic recording (HAMR), will be adopted. HAMR records data on high-stability magnetic media such as iron-platinum alloy, using a laser to heat the material as it is written. HAMR allows much higher areal densities than conventional or SMR disks, and Seagate, Western Digital and TDK have all demonstrated the technology, although no HAMR disks are available yet. Seagate predicts that by 2020 it will be able to offer conventional 3.5" drives with a capacity of 20TB or more using HAMR.

The benefit of these technologies is that by increasing areal density they have the potential to change the economics of disk drives. Even if the price per gigabyte is still relatively high, such drives can or will be able to store large amounts of data using comparatively little rack space, and if kept spun down then they consume no energy. Although a large array of these disks may take several minutes to spin up, once running they offer fast access to any cold storage data using a technology that is well understood.

One further option is to use cloud services for cold storage – a market pioneered by Amazon with its Glacier storage service. Glacier is certainly cheap: from 1c per gigabyte per month, but its name – as well as hinting that it is aimed at the cold storage market – also intimates that it is slow. Once data is requested, in what Amazon calls initiating a retrieval job, it typically takes between 3 and 5 hours before it is available for download.

What the underlying storage for Glacier is remains a mystery that Amazon hasn't revealed, but Mark Peters suspects that the service may be tape-based. "Why the 3-5 hour delay? It may be that the system is not completely automated, and there are tape monkeys running around grabbing tapes off shelves. On the other hand, making customers wait may be a deliberate move."

Whatever the reason, it has given Seagate's EVault an opportunity to offer an alternative cloud-based cold storage service that it calls LTS2, or Long Term Storage Service. LTS2 is slightly more expensive than Glacier at 1.5c per gigabyte per month, but data is available for retrieval in a matter of seconds. The service uses spinning disk drives for storage, and there is some speculation that it actually uses Seagate's own SMR drives, though this has not been confirmed.
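To put the two headline prices in perspective, the sketch below compares the monthly storage bill for an archive of a million gigabytes, the size of Facebook's Blu-ray prototype. It is a back-of-the-envelope illustration only: the archive size is a hypothetical choice, the rates are the per-gigabyte figures quoted above, and retrieval, request and transfer charges are deliberately ignored.

```python
# Headline storage rates quoted in the article, in US cents per GB per month.
GLACIER_CENTS_PER_GB_MONTH = 1.0
LTS2_CENTS_PER_GB_MONTH = 1.5


def monthly_cost_usd(gigabytes: int, cents_per_gb_month: float) -> float:
    """Storage-only monthly cost in dollars (no retrieval or transfer fees)."""
    return gigabytes * cents_per_gb_month / 100


archive_gb = 1_000_000  # hypothetical archive: a million gigabytes

glacier_usd = monthly_cost_usd(archive_gb, GLACIER_CENTS_PER_GB_MONTH)
lts2_usd = monthly_cost_usd(archive_gb, LTS2_CENTS_PER_GB_MONTH)

print(f"Glacier: ${glacier_usd:,.0f} per month")
print(f"LTS2:    ${lts2_usd:,.0f} per month")
print(f"Premium for faster retrieval: ${lts2_usd - glacier_usd:,.0f} per month")
```

On these assumptions the choice comes down to whether retrieval in seconds rather than hours is worth a 50% higher storage bill.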

So when it comes to the future of cold storage, there's certainly no shortage of options: low quality flash; tape; helium, SMR and HAMR disks; tape/disk hybrids; and cloud-based cold storage services of differing speeds.

Each has its own distinct speed, power and space characteristics and cost per gigabyte, and that means there may be demand for them all. For the type of cold storage of relatively small files that Facebook has in mind, there may even be a place for Blu-ray based storage as well.

Photo courtesy of Shutterstock.

Copyright 2017 © QuinStreet Inc. All Rights Reserved