I was fortunate enough to spend an hour or so with Amitabh Srivastava of EMC; Amitabh is responsible for EMC’s Advanced Software division and is one of the principal architects behind ViPR. It was an open discussion about the inspiration behind ViPR and where storage needs to go. And we certainly tried to avoid the ‘Software Defined’ meme.
Amitabh is not a storage guy; in fact his previous role with Microsoft puts him firmly in the compute/server camp, but it was his experience in building out the Azure Cloud offering which brought him an appreciation of the problems that storage and data face going forward. He has some pretty funny stories about how the Azure Cloud came about and the learning experience it was; how he came to realise that this storage stuff was pretty interesting and more complex than just allocating some space.
Building dynamic compute environments is pretty much a solved problem; you have a choice of solutions and fairly mature ones. Dynamic networks are well on the way to being solved.
But building a dynamic and agile storage environment is hard and it’s not a solved problem yet. Storage, and more importantly the data it holds, has gravity or, as I like to think of it, long-term persistence. Compute resource can be scaled up and down; data rarely scales down and generally hangs around. Data Analytics just means that our end-users are going to hug data for longer. So you’ve got this heavy and growing thing…it’s not agile but there needs to be some way of making it appear more agile.
You can easily move compute workloads and it’s relatively simple to change your network configuration to reflect these movements, but moving large quantities of data around is a non-trivial thing to do…well, at speed anyway.
Large Enterprise Storage environments are heterogeneous environments; dual-supplier strategies are common, sometimes to keep vendors honest but often because there is an acceptance that different arrays have different capabilities and use-cases. Three or four years ago, I thought we were heading towards general purpose storage arrays; we now have more niche and siloed capabilities than ever before. Driven by developments in all-flash arrays, commodity hardware and new business requirements, the environment is getting more complex and not simpler.
Storage teams need a way of managing these heterogeneous environments in a common and converged manner.
And everyone is trying to do things better, cheaper and faster; operational budgets remain pretty flat, headcounts are frozen or shrinking. Anecdotally, talking to my peers, arrays are hanging around longer and refresh cycles have lengthened somewhat.
EMC’s ViPR is an attempt to solve some of these problems.
Can you lay a new access protocol on top of already existing and persistent data? Can you make it so that you don’t have to migrate many petabytes of data to enable a new protocol? And can you ensure that your existing applications and new applications can use the same data without a massive rewrite? Can you enable your legacy infrastructure to support new technologies?
The access protocol in this case is Object; for some people Object Storage is religion…all storage should be object, so why the hell would you want some kind of translation layer? But unfortunately, life is never that simple; if you have a lot of legacy applications running and generating useful data, you probably want to protect your investment and continue to run those applications, but you might also want to mine that data using newer applications.
This is heresy to many but reflects today’s reality; if you were starting with a green-field, all your data might live in an object-store but migrating a large existing estate to an object-store is just not realistic as a short term proposition.
ViPR enables your existing file-storage to be accessible as both file and object. Amitabh also mentioned block but I struggle to see how you would be able to treat a raw block device as an object in any meaningful manner. Perhaps that’s a future conversation.
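To make that concrete, here’s a minimal sketch of what dual access might look like from an application’s point of view; the endpoint, bucket and key names are purely illustrative and I’m assuming an S3-compatible object interface rather than describing ViPR’s actual API.

```python
# A sketch only: assumes the existing NFS share is also exposed through an
# S3-compatible object gateway. Hostnames, credentials, bucket and key names
# are all hypothetical, not real ViPR interfaces.
import boto3

# Today: a legacy application reads the asset through the traditional filer mount.
with open("/mnt/filer01/projects/asset-0042.mxf", "rb") as f:
    file_header = f.read(128)

# Tomorrow: a newer application reads the *same* data as an object,
# without the file ever being migrated.
s3 = boto3.client(
    "s3",
    endpoint_url="https://object-gateway.example.internal",  # hypothetical endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)
obj = s3.get_object(Bucket="projects", Key="asset-0042.mxf")
object_header = obj["Body"].read(128)

# Same bytes, two protocols, no copy.
assert file_header == object_header
```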
But in the world of media and entertainment, I could see this capability being useful; in fact I can see it enabling some workflows to work more efficiently, so an asset can be acquired and edited in the traditional manner and then ‘move’ into play-out as an object with rich metadata, without ever moving around the storage environment.
Amitabh also discussed the possibility of being able to HDFS your existing storage, allowing analytics to be carried out on data-in-place without moving it. I can see this being appealing but challenges around performance, locking and the like remain.
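As a rough illustration of what analytics on data-in-place could look like, here’s a hedged sketch using Spark as a stand-in for whatever analytics engine you run; the HDFS gateway hostname and path are my assumptions, not anything EMC has published.

```python
# A sketch only: assumes the filer's existing share is also exposed through an
# HDFS-compatible endpoint, so the job reads data where it already lives
# instead of ingesting a copy into a separate analytics cluster.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("analytics-in-place").getOrCreate()

# Read the existing log data over the HDFS protocol; no copy, no migration.
events = spark.read.json("hdfs://hdfs-gateway.example.internal:8020/shares/filer01/logs/")
events.groupBy("status").count().show()

spark.stop()
```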
But ultimately moving to an era where data persists but is accessible in appropriate ways without copying, ingesting and simply buying more and more storage is very appealing. I don’t believe that there will ever be one true protocol; so multi-protocol access to your data is key. And even in a world where everything becomes objects, there will almost certainly be competing APIs and command-sets.
The more real part of ViPR, and when I say real, I mean it is the piece I can see huge need for today, is the abstraction of the control-plane: making it look and work the same for all the arrays that you manage. Yet after the abomination that is Control Center, can we trust EMC to make Storage Management easy, consistent and scalable? Amitabh has heard all the stories about Control Center, so let’s hope he’s learnt from our pain!
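To show the sort of thing an abstracted control-plane promises, here’s a sketch of a single, array-agnostic provisioning call; the URL, payload fields and the virtual-pool idea are my assumptions about a ViPR-style API, not its actual REST interface.

```python
# A hypothetical control-plane call: the caller asks for capacity against a
# policy ("virtual pool") and never needs to know which physical array,
# vendor or firmware release actually serves the volume.
import requests

payload = {
    "name": "oracle-logs-01",
    "size_gb": 500,
    "virtual_pool": "tier1-block",  # policy decides the backing array
    "project": "finance",
}
resp = requests.post(
    "https://controller.example.internal/api/volumes",  # hypothetical endpoint
    json=payload,
    headers={"Authorization": "Bearer <token>"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["id"])
```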
The jury doesn’t even really have any hard evidence to go on yet but the vision makes sense.
EMC have committed to openness around ViPR as well; I asked the question…what if someone implements your APIs and makes a better ViPR than ViPR? Amitabh was remarkably relaxed about that: they aren’t going to mess about with APIs for competitive advantage and if someone does a better job than them, then that someone deserves to win. They obviously believe that they are the best; if we move to a pluggable and modular storage architecture, where it is easy to drop in replacements without disruption, they had better be the best.
A whole ecosystem could be built around ViPR; EMC believe that if they get it right, it could be the on-ramp for many developers to build tools around it. They are actively looking for developers and start-ups to work with ViPR.
Instead of writing tools to manage a specific array, it should be possible to write tools that manage all of the storage in the data-centre. Obviously this is reliant on either EMC or the other storage vendors implementing the plug-ins that enable ViPR to manage a specific array.
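To give a flavour of what such a plug-in might look like, here’s a hypothetical driver interface; the class and method names are mine for illustration, not EMC’s or NetApp’s actual integration points.

```python
# A sketch of the plug-in shape a control plane like ViPR implies: each vendor
# (or EMC on their behalf) implements the same small interface, and every
# management tool above it only ever talks to that interface.
from abc import ABC, abstractmethod

class ArrayDriver(ABC):
    """Hypothetical contract between the control plane and a specific array."""

    @abstractmethod
    def discover_pools(self) -> list:
        """Return the array's storage pools and their capabilities."""

    @abstractmethod
    def create_volume(self, pool_id: str, name: str, size_gb: int) -> str:
        """Provision a volume in the given pool and return its identifier."""

    @abstractmethod
    def export_volume(self, volume_id: str, initiators: list) -> None:
        """Mask and map the volume to the given host initiators."""


class ExampleNetAppDriver(ArrayDriver):
    """Illustrative plug-in; a real one would translate these calls to ONTAP APIs."""

    def discover_pools(self) -> list:
        return []  # would query the array for its aggregates/pools

    def create_volume(self, pool_id: str, name: str, size_gb: int) -> str:
        return f"{pool_id}/{name}"  # would create the volume and return its id

    def export_volume(self, volume_id: str, initiators: list) -> None:
        pass  # would configure igroups and LUN mapping
```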
Will the other storage vendors enable ViPR to manage their arrays and hence increase the value of ViPR? Or will it be left to EMC to do it? Well, at launch, NetApp is already there. I didn’t have time to drill into which versions of ONTAP however, and this is where life could get tricky; the ViPR control layer will need to keep up with the releases from the various vendors. But as more and more storage vendors are looking at how their storage integrates with the various virtualisation stacks, consistent and early publication of their control functionality becomes key. EMC can use this as enablement for ViPR.
If I was a start-up, for example, ViPR could enable me to fast-track the management capability of my new device. I could concentrate on the storage functionality and capability of the device and not on the peripheral management functionality.
So it’s all pretty interesting stuff but it’s certainly not a foregone conclusion that this will succeed, and it relies on other vendors coming to play. It is something that we need; we need the tools that will enable us to manage at scale, keeping our operational costs down and not having to rip and replace.
How will the other vendors react? I have a horrible suspicion that we’ll just end up with a mess of competing attempts and it will come down to the vendor who ships the widest range of support for third-party devices. But before you dismiss this as just another attempt from EMC to own your storage infrastructure: if a software vendor had shipped/announced something similar, would you dismiss it quite so quickly? ViPR’s biggest strength and weakness is…EMC!
EMC have to prove their commitment to openness and that may mean that, in the short term, they do things that seriously assist their competitors at some cost to their own business. I think that they need to treat ViPR almost like they did VMware; at one point, it was almost more common to see a VMware and NetApp joint pitch than one involving EMC.
Oh, they also have to ship a GA product. And probably turn a tanker around. And win hearts and minds, show that they have changed…
Finally, let’s forget about Software Defined Anything; let’s forget about trying to redefine existing terms; it doesn’t have to be called anything…we are just looking for Better Storage Management and Capability. Hang your hat on that…