
The New Server-Side Storage


The distinctions between servers and storage arrays are growing fuzzier.

By Mike Matchett, Sr. Industry Analyst, Taneja Group

At Taneja Group we are seeing a major trend within IT to leverage server and server-side resources to the maximum extent possible. Servers themselves have become commodities, and dense memory, server-side flash and even compute power continue to grow more powerful and more cost-friendly. Many datacenters already have a glut of CPU that will only increase with newer generations of faster, larger-cored chips, denser packaging and decreasing power requirements. Disparate solutions, from in-memory databases (e.g. SAP HANA) to VMware’s NSX (https://www.vmware.com/products/nsx), are taking advantage of this rich excess by separating out functionality that used to reside in external devices (i.e. SANs and switches) and moving it up onto the server.

Within storage we see two hot trends – hyperconvergence and software-defined storage – getting most of the attention lately. But when we peel back the hype, we find that both are really enabled by this vastly increasing server power: server resources like CPU, memory and flash are getting denser, cheaper and more powerful to the point where they can host sophisticated storage processing directly. Where traditional arrays built on fully centralized, fully shared hardware might struggle with advanced storage functions at scale, server-side storage tends to scale functionality naturally with co-hosted application workloads. The move towards “server-siding” everything is so widely discussed that traditional physical array architectures can seem doomed.

Yet moving all of an enterprise storage array’s functionality into a server essentially means adopting and migrating the whole IT stack to fully hyperconverged solutions (e.g. SimpliVity, GridStore, Nutanix, Scale Computing and various reference architectures) – or risking heavy storage workloads competing with and impacting production application performance. Hyperconverged appliances offer a great opportunity to simplify the whole stack and optimize TCO, although there can be challenges with wholesale migration, potential vendor lock-in and aligning everything to available appliance SKUs.

Short of hyperconvergence, virtualized storage solutions hosted within application servers can be locally efficient and convenient, but they can often hinder the optimal global sharing of persisted data, increase the total new infrastructure required and spawn islands of isolated capacity. And those solutions with a more distributed virtual storage grid design can overwhelm networks not designed for massive amounts of IO-heavy east-west traffic flowing between servers.

Overall, both hyperconverged solutions and virtualized storage have a big role to play in the future IoT, hybrid cloud and increasingly distributed/mobile/ROBO world (e.g. Riverbed’s edge hyperconverged SteelFusion). Still, they will not meet the needs of everyone. Hyperconvergence is about replacing the entire infrastructure, and there are indeed situations where this is not warranted, at least for specific applications.

The question is, in those situations, is there a better storage alternative?

Distributing Infrastructure Functions Intelligently

Some storage vendors are now exploring a new, optimally balanced approach, perhaps following the example of network function virtualization (NFV). With NFV, compute-intensive network “functions” are modularized, removed from their previously tight embedding in hardware (e.g. switches) and hosted virtually. This lets key network functions like security sit close to applications, become software upgradeable, offer cloud-like service and scale naturally. As a bonus, network hardware can then be built more simply and cheaply.

In a similar fashion, new array designs are emerging that first smartly modularize storage functions and then intelligently host those components in different layers of the infrastructure. These distributed array designs cleverly move only key “modules” of performance-enhancing storage functionality up into each server client while still maintaining data persistence in a central pool of capacity. In this way they leverage both scale-out commodity server resources and the shared access, optimized capacity and data protection of centralized storage. These new arrays achieve truly scalable performance at an effective price – all without having to re-envision or re-architect the array-centric data center.

As an example, turning on global inline deduplication can overwhelm the controller in a traditional array design. As with scale-out big data architectures, it now makes sense to farm out and “push” compute-intensive processing like deduplication up the stack into each client server. And by deduplicating upstream, near the consuming application, everything IO-related downstream becomes more efficient – including network transmission, data persistence and protection tasks. Likewise, it makes most sense to persist data in a centralized, shared pool of protected storage. This provides the easiest global access and shared data workflows, the most resiliency and the lowest TCO for a given capacity.
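To make the upstream-dedupe idea concrete, here is a minimal Python sketch – purely illustrative, not any vendor’s actual implementation – in which the server hashes fixed-size chunks locally and ships only content the shared pool has not already stored, so duplicate data never crosses the network:

```python
# Illustrative sketch of client-side inline deduplication (assumed 4 KB
# fixed chunks; real products vary chunking, hashing and transport).
import hashlib

CHUNK_SIZE = 4096

class SharedPool:
    """Stands in for the central capacity tier: a content-addressed store."""
    def __init__(self):
        self.chunks = {}                  # digest -> chunk bytes

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, data):
        self.chunks[digest] = data

def client_side_write(pool, data):
    """Dedupe upstream: hash locally, transmit only unknown chunks."""
    recipe = []                           # ordered digests rebuild the object
    wire_bytes = 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if not pool.has(digest):          # only new content hits the network
            pool.put(digest, chunk)
            wire_bytes += len(chunk)
        recipe.append(digest)
    return recipe, wire_bytes

pool = SharedPool()
payload = b"A" * 8192 + b"B" * 4096       # two duplicate "A" chunks, one "B"
_, sent = client_side_write(pool, payload)
print(f"{len(payload)} bytes written, {sent} bytes actually transmitted")
```

Everything downstream of that hash check – replication, snapshots, backup – then operates on an already-reduced data stream.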

Server-Side Resource Leverage

As an example, consider flash. Flash is a big resource investment for IT shops these days, deployed to pump up IO performance. But a big question is where the flash investment is best made – server cache, server storage, network cache, array cache or array tiers? There are arguments for each, but the key is to balance cost against performance benefit. For maximum performance one might deploy flash as cache as close to needy workloads as possible, while vendors of traditional, hybrid and all-flash arrays argue that flash at the shared pool level provides the most leverage for the investment.
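A back-of-the-envelope calculation shows why placement matters. The latency figures below are illustrative assumptions, not benchmarks, but the arithmetic is the same for real numbers:

```python
# Effective latency of a cache tier: hit_ratio * hit_cost + miss_ratio * miss_cost.
def effective_latency_us(hit_ratio, hit_us, miss_us):
    return hit_ratio * hit_us + (1 - hit_ratio) * miss_us

# Assumed round-trip latencies (microseconds), for illustration only.
SERVER_FLASH_HIT = 100    # local PCIe/SSD access, no network hop
ARRAY_FLASH_HIT = 500     # network hop plus array controller
DISK_MISS = 5000          # spinning disk at the capacity tier

for hit in (0.80, 0.90, 0.95):
    local = effective_latency_us(hit, SERVER_FLASH_HIT, DISK_MISS)
    shared = effective_latency_us(hit, ARRAY_FLASH_HIT, DISK_MISS)
    print(f"hit={hit:.0%}: server-side {local:.0f}us vs array-side {shared:.0f}us")
```

The same formula also shows the counterargument: a shared array cache pools its capacity across all hosts, so it may sustain a higher hit ratio than many small private caches holding duplicate hot data.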

We’ve noticed that server-side flash is available in many sizes, formats and options, and it is almost always cheaper (per GB) than costly array vendor SSDs. The trick is to be able to leverage commodity server-side resources like flash (and increasingly dense RAM) intelligently. For example, there are many focused vendors (e.g. Infinio, Pernix, SanDisk, PrimaryIO) offering server-side flash and/or RAM caching for IO-hungry applications. These server-side solutions avoid the need to invest in costly performance-oriented resources in the underlying shared array storage. By server-hosting key storage functionality like IO acceleration, IT can invest separately (due to specific needs, budgets or timing) in either more performance, by adding flash/RAM at the server, or more capacity, by adding cost-effective large disks to arrays.
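In principle, such server-side acceleration is just a local read cache sitting in front of the shared array. The sketch below – a generic LRU cache, not any of the above vendors’ actual products or APIs – shows the basic mechanism: hot blocks are served from local flash or RAM, and only misses travel to the array:

```python
# Generic server-side read cache sketch (hypothetical names and sizes).
from collections import OrderedDict

class ServerSideCache:
    def __init__(self, backend_read, capacity_blocks=1024):
        self.backend_read = backend_read      # function: block_id -> bytes
        self.capacity = capacity_blocks
        self.cache = OrderedDict()            # block_id -> bytes, LRU order

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)  # refresh LRU position
            return self.cache[block_id]       # hit: no array round trip
        data = self.backend_read(block_id)    # miss: fetch from shared array
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return data

# Usage: a toy "array" of four blocks behind a two-block local cache.
blocks = {i: bytes([i]) * 4096 for i in range(4)}
cache = ServerSideCache(lambda bid: blocks[bid], capacity_blocks=2)
cache.read(0); cache.read(1); cache.read(0)   # 0 and 1 cached, 0 hottest
cache.read(2)                                 # evicts block 1, the LRU entry
```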

However, there are other considerations to bear in mind. Because these solutions are not necessarily integrated with a persistent store, they don’t always allow end-to-end data services to be as rich as with a storage array. Deduplication, compression, snaps and clones, if they exist in both the server-side solution and an underlying array, don’t necessarily synchronize to share benefits. If dedupe is done on a cache, it usually has to be done separately on the array. That’s inefficient, and often these features don’t exist consistently across the two domains. As always, there are pluses and minuses to each approach.

Network Efficiency Is (Still) Key

One of the key enablers of the effective distribution of functionality is optimizing all the storage traffic across the network. Dedicated high-end SANs like FC (or InfiniBand) traditionally stitch together enterprise servers with shared storage, but fundamentally add significant cost and complexity (and often lower ultimate agility). iSCSI may be just fine for virtual clients accessing shared storage, but it falls down in these new intelligent designs where storage functionality is split between servers and centralized disks. There is room yet for a more highly optimized “inter-array” network protocol. Here is where innovative new storage array vendors like Datrium provide real differentiation. Between their server-side storage layer, which provides scalable performance using local flash and compute, and their cost-optimized shared capacity storage nodes (simplified two-controller, capacity-oriented array shelves), they have implemented a distributed filesystem design with a customized network protocol. This optimized “internal storage” data network is designed to increase IO performance, avoid many of the IO-impacting issues with standard Ethernet-based protocols, and still take advantage of commodity networking infrastructure (i.e. Ethernet).
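Datrium’s actual wire protocol is proprietary, so the following Python sketch is only a hypothetical illustration of one principle such split designs can exploit: servicing writes from the fast local layer, then coalescing many small IOs into large, sequential transfers to the capacity nodes, which keeps commodity Ethernet efficient:

```python
# Hypothetical write coalescer: batches small writes into large transfers.
class WriteCoalescer:
    def __init__(self, ship, batch_bytes=1 << 20):
        self.ship = ship                  # function sending one large buffer
        self.batch_bytes = batch_bytes    # target transfer size (assumed 1 MiB)
        self.pending = []
        self.pending_size = 0

    def write(self, data):
        # In a real design the block would first be persisted to local
        # flash/NVRAM so the application sees a durable, low-latency ack.
        self.pending.append(data)
        self.pending_size += len(data)
        if self.pending_size >= self.batch_bytes:
            self.flush()

    def flush(self):
        if self.pending:
            self.ship(b"".join(self.pending))   # one big wire transfer
            self.pending, self.pending_size = [], 0

shipped = []
c = WriteCoalescer(shipped.append, batch_bytes=16)
for _ in range(8):
    c.write(b"4kIO")                    # eight tiny writes...
c.flush()
print(len(shipped), "network transfers")  # ...become two batched transfers
```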

By smartly splitting array functionality across servers and storage sub-systems, IT architects are free to take advantage of and deploy existing infrastructure or new kinds and formats of servers (e.g. blades) and server resources (e.g. PCIe flash vs. SSD flash) when and where they wish, without impacting the underlying data store. This also decouples host-based performance provisioning from appliance-based data durability, so IT can dynamically manage hot spots in mixed-VM environments – a design element found in neither hyperconverged appliances nor traditional arrays.

We think of this new style of storage architecture as Server Powered Storage (SPS), and we expect that a number of startups are building products using these principles. But to our knowledge, Datrium is leading the pack.

Doing The Right Things at the Right Place and Time

In summary, we think the traditional monolithic storage array is doomed. The line between compute servers and storage nodes is getting fuzzier every day whether we are talking about the best infrastructure for big data, mission-critical (RDBMS-based) applications, virtual hosting or cloud building. Any technology development that enables hosting modular pieces of formerly monolithic functionality at the best places in the IO lifecycle and workflow path is worth evaluating.

With these new distributed function storage systems, IT can leverage expensive protected storage in a shared pool manner, while taking full advantage of relatively inexpensive server-side assets to really ramp up local performance.


This article was originally published on Thursday, May 5th, 2016.