Data-Aware Storage: Taming Data Sprawl Using Real-Time Analytics

By Guest Author

New data-aware storage systems will help companies reduce storage management costs, improve business insights and reduce business risk.

by Jeff Kato, Senior Analyst, The Taneja Group

 

Let’s face it: Storage is dumb today. Mostly it is a dumping ground for data. As we produce more and more data, we simply buy more and more storage and fill it up. We don’t know who is using what storage at a given point in time, which applications are hogging storage or have gone rogue, what and how much sensitive information is stored, moved or accessed, and by whom, and so on. Basically, we are blind to whatever is happening inside that storage array.

Am I exaggerating? Of course, I am, but only to a degree.

Can we extract information from the storage array today? Yes, we can. But one has to use a myriad of tools from a variety of vendors and do a lot of heavy lifting to get some meaningful information out of storage. The information is buried deep inside and some external application has to work hard to expose it. This activity is generally so cumbersome that most users simply don’t use it, unless it is required by law. In such cases (compliance or governance, for instance), external software is used to pull relevant information at great expense and time.

Of course, over the past decade, technologies such as auto-tiering have helped in moving less active data to lower-cost storage, and one may even find software that automatically deletes files when their retention period has expired. But these are all one-off solutions, and the basic premise still stands: storage today is basically dumb.

What if storage were aware of the data it stored? What if all data were catalogued upon creation, indexed and analyzed? What if analytics were built-in and real-time? What if storage were aware of all activity taking place inside? What if data protection were an inherent part of storage and there was no need for media servers and tapes and separate disk systems? What if search and discovery were an integral part of the array?

Wouldn’t smart storage like this be a paradigm shift? Wouldn’t it fundamentally change how we manage, protect and use storage?

Of course, it would.

Welcome to the new era of data-aware storage.

The Need for Data-Aware Storage

This advance could not have come at a better time. Storage growth, as we all know, is out of control. Granted, the cost per gigabyte keeps falling at about 40 percent per year, but capacity keeps growing at about 60 percent per year. This causes both the cost and capacity to keep increasing every year.

While the cost increase is certainly an issue, the bigger issue is manageability. And not knowing what we have buried in those mounds of data, if anything, is an even bigger issue. Instead of data being an asset, it is a dead weight that keeps getting heavier. If we don't do something about it we will simply be overwhelmed, if we are not already.

Why is it possible to develop data-aware storage today when we couldn’t yesterday? Flash technology, virtualization and the availability of "free" CPU cycles make it possible for us to build storage today that can do a lot of heavy lifting from the inside. This was technically possible in the past, but implementing it would have slowed primary storage down to the point of being useless. Today we can build in a lot of intelligence without impacting performance or quality of service. We call this new type of storage data-aware storage.

When implemented correctly, data-aware storage can provide insights that were not possible yesterday. It can reduce the risk of non-compliance and improve governance. It can automate many of the storage management processes that are manual today. It can provide insights into how well the storage is being utilized. It can identify when a dangerous situation is about to occur, whether it concerns compliance, capacity, performance or an SLA.
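To make the performance case concrete, here is a minimal Python sketch of how a policy engine might flag an application that is consuming more IOPS than it should. Everything in it is an assumption made for illustration: the `IopsMonitor` class, the per-application limits and the sliding-window rule are hypothetical and do not describe any vendor's product.

```python
from collections import defaultdict, deque
from statistics import mean


class IopsMonitor:
    """Hypothetical monitor: keeps recent IOPS samples per application."""

    def __init__(self, window=5, limits=None, default_limit=1000):
        self.window = window                  # samples kept per application
        self.limits = limits or {}            # assumed per-application IOPS caps
        self.default_limit = default_limit    # cap for applications with no entry
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, app, iops):
        """Store one IOPS sample; the oldest sample falls out of the window."""
        self.samples[app].append(iops)

    def rogue_apps(self):
        """Return applications whose windowed average exceeds their cap."""
        rogues = []
        for app, window in self.samples.items():
            if len(window) < self.window:
                continue  # not enough history to judge yet
            if mean(window) > self.limits.get(app, self.default_limit):
                rogues.append(app)
        return sorted(rogues)
```

An administrator or automated policy could call `rogue_apps()` after each sampling interval and throttle whatever it returns. Averaging over a window rather than reacting to a single sample is a design choice that avoids punishing short, legitimate bursts.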

In this article we will define the attributes of data-aware storage, examine the business benefits of deploying these systems and provide an industry landscape of up-and-coming storage companies that are introducing these pioneering products.

Data-Aware Storage Defined

All storage systems are getting smarter with each new generation, but to be categorized as data-aware storage, Taneja Group believes they must meet most, if not all, the criteria described below:

  • Increased Awareness: The storage understands more about the content or attributes of the data stored on the device than traditional storage devices do. Examples include enhanced metadata about quality of service, file attributes and application-aware metrics, as well as actually scanning the data in real time, looking for contextual patterns or keywords for security and regulatory compliance.
  • Real-Time Analytics: It is not enough for these storage systems to gather enhanced metadata without making it useful in real time. Therefore these systems must provide instantaneous updates of the enhanced analytics so that administrators or policy engines can react before issues become critical. An example would be the detection and suppression of a rogue application before it can sap IOPS from a more important application. Another example would be understanding who is accessing which files and their relationship to others accessing the same files; this would help a business understand which types of data are more important and to which groups of people.
  • Advanced Data Services: In addition, the storage system should have additional data services that enable better business outcomes based on the increased awareness. Examples would be the availability of archiving functions for dormant data, bursting the application to cloud once a threshold has been met, or balancing QoS across different application workloads. Other examples could include triggering compliance workflows or alerts or even built-in intelligent data protection.
  • Open and Accessible APIs: In order for this new category of storage to flourish, all the capabilities of these new systems must be open and available to enable a rich ecosystem of integrated applications and tools to come alongside and complement the data-aware storage. There are far too many vertical application requirements that could take advantage of unique data-aware features for any one company to provide them all. Over time, de facto industry standard APIs will emerge for the most popular enhanced capabilities, similar to how the Amazon S3 data protocol became a standard.
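The "Increased Awareness" criterion above can be illustrated with a short Python sketch that catalogs a file at write time and tags it when its content matches a sensitive pattern. This is only a sketch under stated assumptions: the `catalog_on_write` function, the `CatalogEntry` record, the two patterns and the `/public/` location rule are all hypothetical, and a real system would rely on vetted detectors rather than two regular expressions.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detectors, chosen only to keep the example small.
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "confidential_keyword": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


@dataclass
class CatalogEntry:
    path: str
    owner: str
    size: int
    tags: set = field(default_factory=set)


def catalog_on_write(path, owner, data, open_prefixes=("/public/",)):
    """Index a newly written file and report whether it needs an alert.

    Returns (entry, alert). The alert is raised when tagged content lands
    under one of the assumed openly shared prefixes.
    """
    text = data.decode("utf-8", errors="ignore")
    tags = {name for name, rx in PATTERNS.items() if rx.search(text)}
    entry = CatalogEntry(path, owner, len(data), tags)
    alert = bool(tags) and path.startswith(open_prefixes)
    return entry, alert
```

For example, writing a file containing a Social Security number pattern to `/public/notes.txt` would produce an entry tagged `us_ssn` and an alert, while the same content under a restricted path would be tagged but raise no alert. The returned entry is the kind of enhanced metadata that the analytics and open APIs described above could then expose.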

Business Benefits

The key business benefits that come from these data-aware storage systems include the following:

  • Customized Business Outcomes: These advanced data-awareness features should be tailored to a business’s needs and customized through open APIs. Data-aware storage can provide this capability by enabling easier integration into a business’s unique process management, allowing a business to get more value out of the storage.
  • Mitigate Business Risk: Data-aware storage systems can provide compliance and risk mitigation features. Systems can provide alerts if the wrong type of information is stored in the wrong place. Other systems can provide fine-grained user access tracking. In addition, these systems can enforce retention policies based on compliance or security needs. Yet other systems can provide advanced data protection policies based on unique data awareness.
  • Bending the Storage Management Cost Curve: Data-aware storage provides the extra information in real-time such that storage issues can be mitigated immediately and other storage management tasks can be automated through open APIs. Having a real-time pulse on all storage activity along with advanced analytics provides a more proactive management approach to storage and actually increases the value of storage as it grows.
  • Cost-optimized Storage: This long overdue promise will now become increasingly true. The key to solving this problem is having real-time metrics on the entire storage environment, which a data-aware system provides. Knowing which storage is being consumed by which business applications and at what quality of service is critical. In addition, tools are needed to move data to the appropriate type of storage to optimize cost. The only way to solve this problem is to have a data storage system that provides additional storage metrics in real time, all the time. In addition, some systems can provide unique data optimization, archiving, compliance, and protection schemes.
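As a rough illustration of the cost-optimization point, the following Python sketch picks dormant files from access metadata and estimates what moving them to a cheaper tier might save. The `FileStat` record, the `tiering_plan` function, the 180-day dormancy threshold and the per-gigabyte prices are all assumptions invented for this example, not figures from any vendor.

```python
from dataclasses import dataclass


@dataclass
class FileStat:
    path: str
    size_gb: float
    days_since_access: int


def tiering_plan(files, dormant_after_days=180,
                 fast_cost_per_gb=0.50, archive_cost_per_gb=0.05):
    """Select dormant files and estimate the monthly saving from archiving them.

    The threshold and the two monthly per-gigabyte prices are placeholders.
    """
    dormant = [f for f in files if f.days_since_access >= dormant_after_days]
    moved_gb = sum(f.size_gb for f in dormant)
    monthly_saving = moved_gb * (fast_cost_per_gb - archive_cost_per_gb)
    return [f.path for f in dormant], monthly_saving
```

The point of the sketch is that the decision depends entirely on metadata the storage system already holds. If last-access times and sizes are tracked in real time, a plan like this can be recomputed continuously instead of being produced by a periodic external scan.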

[Figure: Data-Aware Storage Chart]

Data-Aware Versus Application-Aware Storage

We should differentiate between data-aware storage and application-aware storage. Both terms are used in the industry to position products with any level of intelligence. Application-aware storage has some similarities with data-aware storage but also some differences. Application-awareness implies that the storage array is aware of some application attributes and/or the application is aware of some storage attributes in such a way that a) makes the interaction more intelligent and b) triggers some actions automatically to optimize/improve application performance and/or storage performance/utilization. Note that while data-aware storage attributes apply to all applications that are being served by that storage, application-aware storage is application specific.

Perhaps the clearest example of application-aware storage in the industry is that of Oracle's FS1 and ZS3/4 product lines. These products are general purpose and offer standard interfaces to support all applications. However, for Oracle applications they invoke special features and procedures that make those applications perform better than they would without them. For example, Oracle implements a special protocol called Oracle Intelligent Storage Protocol (OISP) that enables Oracle Database-specific performance and capacity optimizations and assists with provisioning and management in the storage array. This allows Oracle to sell these storage products in the open market for all applications but provide enhanced performance, cost and manageability advantages for its own applications.

Data-awareness and application-awareness can, of course, coexist. A data-aware storage product can also be application-aware for certain applications. But an application-aware storage product may or may not be data-aware as we have defined it here.

Meet the Data-Aware Storage Players

Over time, we expect many more vendors to embrace data-aware product capabilities as they re-architect their products; however, at the time of writing, Taneja Group considers the following companies to be at the forefront of data-aware storage:

[Figure: Data-Aware Storage Vendors]

Each of these companies is taking a unique approach to where they want to apply data-aware methods to solve very real business storage issues. They do this while also creating business value through data analytics not previously available. For instance, Qumulo is focused on solving the problems for the largest media, life sciences and oil & gas companies (initial markets) with petabytes of data. They emphasize scalability into many billions of files. DataGravity, on the other hand, is more focused on the mid-market and perhaps solving a broader set of problems for such customers. Tarmin is focused on use case-specific capabilities, such as a data-aware storage platform focused on archiving or backup optimization that can simultaneously perform e-discovery, compliance and archive. Taneja Group fully expects that each will add more data-aware capabilities as they evolve their products to meet unique customer demands.

Summary

Storage has been dumb long enough.

The time is ripe for storage to become data-aware and thereby radically reduce administrative costs while unlocking the value of the data stored. All the right key technologies are now readily available to make storage smart. As exemplified by Qumulo, Tarmin and DataGravity, data-aware storage is not only possible but already delivering serious benefits to customers, especially those that otherwise would be buried under mountains of data and losing control fast.

The data-aware category of storage is in the early stage of development. These companies are all pioneers. They have put a stake in the ground, but a lot of learning is ahead of us. But the time to look seriously at data-aware storage is now. Waiting for perfection is a fool’s paradise, as we have learned again and again in this industry. These companies represent enough of a leapfrog that they are worthy of consideration. Go ahead, get started. We at Taneja Group think you will reduce management costs, improve business insights and reduce business risk, all at the same time. That’s more than you could say about any technology of the past two decades.

In the future, we believe that multiple players will emerge, clustered around the unique data-aware features that resonate most with customers. Look for 2015 to be the year that the data-aware storage category starts to take shape as a key emerging technology. We also expect most, if not all, existing storage vendors to embrace data-awareness; however, they will have to significantly re-architect their current products to create offerings equal to those from the pioneers mentioned above.

  This article was originally published on Tuesday Mar 31st 2015