

#storagebeers – September 25th – London

So as the evenings draw in; what could be nicer than a decent pint of beer with great company?

Well, this isn’t that…it’s a #storagebeers to be held in London on September 25th. There are a few storage events around this date and we thought that it would be an ideal opportunity to bring the community together.

So if you are a storage admin, a vendor, a journo or perhaps you work for EMC Marketing and you want to come along and tell me why the megalaunch was awesome and not tacky…please come along.

We’ll be in the Lamb and Flag near Covent Garden from about 17:30, maybe earlier.

There is a rumour that Mr Knieriemen will be there and buying at least one drink…

Such Fun…

With EMC allegedly putting the VMAX into the capacity tier and suggesting that performance cannot be met by the traditional SAN; are we finally beginning to look at the death of the storage array?

The storage array as a shared monolithic device came about almost directly as the result of distributed computing; the necessity for a one-to-many device was not really there when the data-centre was dominated by the mainframe. And yet as computing has become ever more distributed; the storage array has begun to struggle more and more to keep up.

Magnetic spinning platters of rust have hardly increased in speed in a decade or more; their capacity has got ever bigger tho’; storage arrays have got denser and denser from a capacity point of view, yet real-world performance has just not kept pace. More and more cache has helped to hide some of this; SSDs have helped but to what degree?
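
To put a rough number on it (and these figures are illustrative assumptions, not vendor specs); a spinning drive gives you a more or less fixed number of random IOPS however big it gets, so the IOPS available per terabyte collapse as capacities grow. A quick back-of-envelope sketch in Python:

    # Illustrative assumptions only: random IOPS are set by rotational speed and
    # seek time, not capacity, so IOPS-per-TB falls off a cliff as drives grow.
    drives = {
        # name: (capacity_tb, assumed random IOPS)
        "146GB 15K FC (c. 2003)": (0.146, 180),
        "600GB 15K SAS":          (0.6,   180),
        "4TB 7.2K NL-SAS":        (4.0,    75),
    }

    for name, (capacity_tb, iops) in drives.items():
        print(f"{name}: {iops / capacity_tb:,.0f} IOPS per TB")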

It also has not helped that the plumbing for most SANs is Fibre Channel: esoteric, expensive and ornery. The image of the storage array is not good.

Throw in ever-increasing compute power and the incessant demand for more data processing, coupled with an attitude to data-hoarding at corporate scale which would make even the most OCD amongst us look relatively normal.

And add the potential for storage arrays to become less reliable and more vulnerable to real data-loss as RAID becomes less and less of a viable data-protection methodology at scale.
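
The arithmetic behind that is worth a look; a hedged sketch, assuming an unrecoverable read error (URE) rate of one error per 10^15 bits read (a common enterprise drive spec) and a RAID-5 rebuild which has to read every surviving drive end to end:

    import math

    URE_RATE = 1e-15   # assumed: one unrecoverable read error per 1e15 bits read

    def rebuild_ure_probability(drive_tb, drives_in_set):
        # a RAID-5 rebuild must read every surviving drive in full
        bits_to_read = (drives_in_set - 1) * drive_tb * 1e12 * 8
        return -math.expm1(bits_to_read * math.log1p(-URE_RATE))

    for drive_tb, width in [(1, 8), (4, 8), (8, 12)]:
        p = rebuild_ure_probability(drive_tb, width)
        print(f"{width} x {drive_tb}TB RAID-5: ~{p:.0%} chance of a URE during rebuild")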

Cost and complexity with a sense of unease about the future means that storage must change. So what are we seeing?

A rebirth in DAS? Or perhaps simply a new iteration of DAS?

From Pernix to ScaleIO to clustered filesystems such as GPFS; the heart of the new DAS is the Shared-Nothing Cluster. Ex-Fusion-IO’s David Flynn appears to be doing something to pool storage attached to servers; you can bet that there will be a Flash part to all this.
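
To be clear about what shared-nothing means here; every server contributes its local disk and placement is decided by an algorithm rather than a central controller. The toy sketch below hashes each block of a volume onto a couple of servers; purely illustrative and emphatically not how ScaleIO, Pernix or GPFS actually lay data out:

    import hashlib

    SERVERS = ["server-01", "server-02", "server-03", "server-04"]
    REPLICAS = 2

    def placement(volume, block_number):
        # hash the block address to pick a home server, replicate on the next one
        key = f"{volume}:{block_number}".encode()
        start = int(hashlib.md5(key).hexdigest(), 16) % len(SERVERS)
        return [SERVERS[(start + i) % len(SERVERS)] for i in range(REPLICAS)]

    print(placement("vol1", 42))   # e.g. ['server-03', 'server-04']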

We are going to have a multitude of products; interoperability issues like never before, implementation and management headaches…do you implement one of these products or many? What happens if you have to move data around between these various implementations? Will they present as a file-system today? Are they looking to replace current file-systems? I know many sys-admins who will cry if you try to take VxFS away from them.

What does data protection look like? I must say that the XIV data-protection methods which were scorned by many (me included) look very prescient at the moment (still no software XIV tho’? What gives IBM…).

And then there is the application-specific nature of much of this storage; so many start-ups are focused on VMware and providing storage in clever ways to vSphere…when VMware’s storage roadmap looks so rich and so clearly aimed at taking that market, is this wise?

The noise and clamour from the small and often quite frankly under-funded start-ups is becoming deafening…and I’ve yet to see a compelling product which I’d back my business on. The whole thing feels very much like the early days of the storage-array; it’s kind of fun really.

You Will be Assimilated.

So why are the small Flash vendors innovating and the big boys not? Why are they leaving them for dust? And do the big boys care?

Innovation in large companies is very hard; you have all the weight of history pressing down on you and few large companies are set up to allow their staff to really innovate. Even Google’s famous 20% time has probably not borne the fruit that one would expect.

Yet innovation does happen in large companies; they all spend a fortune on R&D; unfortunately most of that tends to go on making existing products better rather than coming up with new products.

Even when a new concept threatens to produce a new product, there is the problem of getting an existing sales-force to sell it…well, why would they? Why would I, as a big-tin sales-droid, try to push a new concept to my existing customer base? They probably don’t even want to talk about something new; it’s all about the incremental business.

I have seen plenty of concepts squashed which then pop up in new start-ups having totally failed to gain traction in the large company.

And then there are those genuinely new ideas that the large vendor has a go at implementing themselves; often with no intention of releasing their own product, they are just testing the validity of the concept.

Of course, then there is the angel funding that many larger vendors quietly carry out; if you follow the money it is not uncommon to find a large name sitting somewhere in the background.

So do the big boys really care about the innovation being driven by start-ups…I really don’t think so. Get someone else to take the risk and pick-up the ones which succeed at a later date.

Acquisition is a perfectly valid R&D and Innovation strategy. Once these smaller players start really taking chunks of revenue from the big boys…well, it’s a founder with real principles who won’t take a large exit.

Of course, seeing new companies IPO is cool but it’s rarely the end of the story.


The Landscape Is Changing

As the announcements and acquisitions which fall into the realm of Software Defined Storage (or Storage, as I like to call it) continue to come; one starts to ponder how this is all going to work, and work practically.

I think it is extremely important to remember that, firstly, you are going to need hardware to run this software on; and although this is trending towards a commodity model, there are going to be subtle differences that need accounting for. And as we move down this track, there is going to be a real focus on understanding workloads and the impact of different infrastructure and infrastructure patterns on them.

I am seeing more and more products which enable DAS to work as a shared-storage resource; removing the SAN from the infrastructure and reducing the complexity. I am going to argue that this does not necessarily remove complexity but it shifts it. In fact, it doesn’t remove the SAN at all; it just changes it.

It is not uncommon now to see storage vendor presentations that show Shared-Nothing-Cluster architectures in some form or another; often these are software and hardware ‘packaged’ solutions but as end-users start to demand the ability to deploy on their own hardware, this brings a whole new world of unknown behaviours into play.

Once vendors relinquish control of the underlying infrastructure; the software is going to have to be a lot more intelligent and the end-user implementation teams are going to have to start thinking more like the hardware teams in vendors.

For example, the East-West traffic models in your data-centre become even more important and here you might find yourself implementing low-latency storage networks; your new SAN is no longer a North-South model but Server-Server (East-West). This is something that the virtualisation guys have been dealing with for some time.

Then there is understanding performance and failure domains: do you protect the local DAS with RAID or move to a distributed RAIN model? If you do something like aggregating the storage on your compute farm into one big pool, what is the impact if one node in that farm comes under load? Can it impact the performance of the whole pool?

Anyone who has worked with any kind of distributed storage model will tell you that a slow-performing node or a failing node can have impacts which far exceed what you would believe possible. At times, it can feel like the good old days of Token Ring, where a single misconfigured interface could kill the performance for everyone. Forget about the impact of a duplicate IP address; that is nothing.
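
If you want a feel for why, a small simulation helps. Assume every I/O is striped across several nodes and only completes when the slowest fragment returns; a single node running ten times slower drags the whole pool’s latency up far more than the ‘one node in twenty’ intuition suggests. All the numbers are invented for illustration:

    import random

    NODES = 20
    STRIPE_WIDTH = 8          # each I/O touches this many nodes
    SLOW_NODE, SLOW_FACTOR = 0, 10.0

    def io_latency(slow_node_present):
        # latency of one I/O = latency of its slowest stripe fragment
        nodes = random.sample(range(NODES), STRIPE_WIDTH)
        latency = 0.0
        for n in nodes:
            t = random.expovariate(1 / 5.0)          # ~5 ms average per fragment
            if slow_node_present and n == SLOW_NODE:
                t *= SLOW_FACTOR
            latency = max(latency, t)
        return latency

    for label, slow in [("healthy pool", False), ("one slow node", True)]:
        samples = sorted(io_latency(slow) for _ in range(10000))
        print(f"{label}: median {samples[5000]:.1f} ms, p99 {samples[9900]:.1f} ms")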

What is the impact of the failure of a single compute/storage node? Multiple compute/storage nodes?

In the past, this has all been handled by the storage hardware vendor, pretty much invisibly to the local storage team at implementation phase. But you will now need to make decisions about how data is protected and understand the impact of replication.

In theory, you want your data as close to the processing as you can get it, but data has weight and persistence; it will have to move. Or do you come up with a method that, in a dynamic infrastructure, identifies where data is located and spins up or moves the compute to it?
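
A minimal sketch of the move-the-compute-to-the-data idea, with entirely hypothetical names; a scheduler which knows which nodes hold a replica of each dataset and prefers to run the job there, only shipping the data when no replica-holding node has capacity:

    # Hypothetical names and figures, purely to illustrate the placement decision.
    replica_map = {"sales_2013.db": ["node-a", "node-c"]}
    node_load = {"node-a": 0.9, "node-b": 0.2, "node-c": 0.4}

    def place_job(dataset):
        # prefer a node that already holds a replica and isn't overloaded
        local = [n for n in replica_map.get(dataset, []) if node_load[n] < 0.8]
        if local:
            return min(local, key=node_load.get), "compute moved to data"
        # otherwise fall back to the least busy node and ship the data to it
        return min(node_load, key=node_load.get), "data must move to compute"

    print(place_job("sales_2013.db"))   # ('node-c', 'compute moved to data')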

The vendors are going to have to improve their instrumentation as well; let me tell you from experience, at the moment understanding what is going on in such environments is deep magic. Also, the software’s ability to cope with the differing capabilities and vagaries of a large-scale commodity infrastructure is going to have to be a lot more robust than it is today.

Yet I see a lot of activity from vendors, open-source and closed-source; and I see a lot of interest from the large storage consumers; this all points to a large prize to be won. But I’m expecting to see a lot of people fall by the wayside.

It’s an interesting time…


From Servers to Service?

Should Enterprise Vendors consider becoming Service Providers? Rich Rogers of HDS tweeted this question, and my initial response was…

This got me thinking: why does everyone think that Enterprise Vendors shouldn’t become Service Providers? Is this a reasonable response or just a knee-jerk “get out of my space and stick to doing what you are ‘good’ at”?

It is often suggested that you should not compete with your customers; if Enterprise Vendors move into the Service Provider space, they compete with some of their largest customers, the Service Providers and potentially all of their customers; the Enterprise IT departments.

But the Service Providers are already beginning to compete with the Enterprise Vendors, more and more of them are looking at moving to a commodity model and not buying everything from the Enterprise Vendors; larger IT departments are thinking the same. Some of this is due to cost but much of it is that they feel that they can do a better job of meeting their business requirements by engineering solutions internally.

If the Enterprise Vendors find themselves squeezed by this; is it really fair that they should stay in their little box and watch their revenues dwindle away? They can compete in different ways, they can compete by moving their own products to more of a commodity model, many are already beginning to do so; they could compete by building a Service Provider model and move into that space.

Many of the Enterprise Vendors have substantial internal IT functions; some have large services organisations; some already play in the hosting/outsourcing space.  So why shouldn’t they move into the Service Provider space? Why not leverage the skills that they already have?

Yes, this changes their business model; they will have to ensure that they compete on a level playing field and watch very carefully that they are not utilising their internal influence on pricing and development to drive an unfair competitive advantage. But if they feel that they can do a better job than the existing Service Providers; driving down costs and improving capability in this space…more power to them.

If an online bookstore can do it; why shouldn’t they? I don’t fear their entry into the market, history suggests that they have made a bit of a hash of it so far…but guys fill your boots.

And potentially, it improves things for us all; as the vendors try to manage their kit at scale, as they try to maintain service availability, as they try to deploy and develop an agile service; we all get to benefit from the improvements…Service Providers, Enterprise Vendors, End-Users…everyone.


More Thoughts On Change…

This started as a response to comments on my previous blog but seemed to grow into something which felt like a blog entry in its own right. And it allowed me to rethink a few things and crystallise some ideas.

Enterprise Storage is done; that sounds like a rash statement, how can a technology ever be done? So I better explain what I mean. Pretty much all the functionality that you might expect to be put into a storage array has been done and it is now done by pretty much every vendor.

Data Protection – yep, all arrays have this.

Clones, Snaps – yep, all arrays have this and everyone has caught up with the market-leader.

Replication – yep, everyone does this but interestingly enough, I begin to see this abstracted away from the array.

Data Reduction – mostly, dedupe and compression are on almost every array; slightly differing implementations, some architectural limitations showing.

Tiering – mostly, yet again varying implementations but fairly comparable.

And of course, there is performance and capacity. This is good enough for most traditional Enterprise scenarios; if you find yourself requiring something more, you might be better off looking at non-traditional Enterprise storage: Scale-Out for capacity and All-Flash for performance. Now, the traditional Enterprise Vendors are having a good go at hacking in this functionality but there is a certain amount of round pegs, square holes and big hammers going on.

So the problem for the Enterprise Storage vendors, as their arrays head towards functionality completeness, is how they compete. Do we end up in a race to the bottom? And what is the impact of this? Although their technology still has value, its differentiation is very hard to quantify. It has become commodity.

And as we hit functionality completeness; it is more likely that open-source technologies will ‘catch-up’; then you end up competing with free. How does one compete with free?

You don’t ignore it for starters, and you don’t pretend that free can’t compete on quality; that did not work out so well for some of the major server vendors as Linux ate into their install base. But you can look at how Red Hat compete with free: they compete on service and support.

You no longer compete on functionality; CentOS pretty much has the same functionality as Red Hat. You have to compete differently.

But firstly you have to look at what you are selling; the Enterprise Storage vendors are selling software running on what is basically commodity hardware. Commodity should not be taken as some kind of second-rate thing; it really means that we have hit a point where it is pretty standard, there is little differentiation.

Yet this does not necessarily mean cheap; diamonds are a commodity. However, customers can see this and they can compare the price you charge for the commodity hardware that your software runs on against the spot-price of that hardware on the open market.

In fact, if you were open and honest, you might well split out the licensing cost of your software from the cost of the commodity hardware.

This is the very model that Nexenta use. Nexenta publish an HSL of components that they have tested NexentaStor on; there are individual components and also complete servers. This enables customers to white-box if they want or to leverage existing server support contracts. If you go off-piste, they won’t necessarily turn you away but there will be a discussion. The discussion may result in something new going onto the support list; it may end up establishing that something definitively does not work.

We also have VSAs popping up in one form or another; these piggy-back on the VMware HCL generally.

So is it really a stretch to suggest that the Enterprise Storage vendors might take it a stage further; a fairly loose hardware support list that allows you to run the storage personality of your choice on the hardware of your choice?

I suspect that there are a number of vendors who are already considering this; they might well be waiting for someone to break formation first. There’s quite a few of them who already have; they don’t talk about it but there are some hyper-scale customers who are already running storage personalities on their own hardware. If you’ve built a hyper-scale data-centre based around a standard build of rack, server etc; you might not want a non-standard bit of kit messing up your design.

If we get some kind of standardisation in the control-plane APIs; the real money to be made will be in the storage management and automation software. The technologies which will allow me to use a completely commoditised Enterprise Storage Stack are going to be the ones that are interesting.

Well, at least until we break away from an array-based storage paradigm; another change which will eventually come.


Change Coming?

Does your storage sales rep have a haunted look yet? Certainly if they work for one of the traditional vendors, they should be beginning to look hunted and concerned about their long-term prospects; not this year’s figures and probably not next year’s, but the year after? If I were working in storage sales, I’d be beginning to wonder about what my future holds. Of course, most sales look no further than the next quarter’s target but perhaps it’s time to worry them a bit.

Despite paying lip-service to storage as software, very few of the traditional vendors (and surprisingly few start-ups) have really embraced this and taken it to its logical conclusion: commoditisation of high-margin hardware sales is going to come and, despite all efforts of the vendors to hold it back, it is going to change their business.

Now I’m sure you’ve read many blogs predicting this and you’ve even read vendor blogs telling you how they are going to embrace this; they will change their market and their products to match this movement. And yet I am already seeing mealy-mouthed attempts to hold this back or slow it down.

Roadmaps are pushing commoditisation further off into the distance; rather than a whole-hearted endorsement, I am hearing about HCLs and limited support. Vendors are holding back on releasing virtual editions because they are worried that customers might put them into production. Is the worry that they won’t work, or perhaps that they might work too well?

Products which could be used to commoditise an environment are being hamstrung by only running on certified equipment, and for very poor reasons; unless the reason is to protect a hardware business. I can point to examples in every major vendor: from EMC to IBM to HDS to HP to NetApp to Oracle.

So what is going to change this? I suspect customer action is the most likely vector for change. Cheap and deep for starters: you’d probably be mad not to seriously consider looking at a commodity platform and open-source. Of course vendors are going to throw a certain amount of FUD but, like Linux before it, there is momentum beginning to grow, with lots of little POCs popping up.

And there are other things round the corner which may well kick this movement yet further along. 64-bit ARM processors have been taped out; we’ll begin to see servers based on those over the next couple of years. Low-power 64-bit servers running Linux and one of a multitude of open-source storage implementations will become two-a-penny; as we move to scale-out storage infrastructure, these will start to infiltrate larger data-centres and will rapidly move into the appliance space.

Headaches not just for the traditional storage vendors but also for Chipzilla; Chipzilla has had the storage market sewn-up for a few years but I expect ARM-based commodity hardware to push Chipzilla hard in this space.

Yet with all the focus on Flash-based storage arrays, hybrid arrays and the like; everyone is currently focusing on the high-margin hardware business. No vendor is really showing their hand in the cheap and deep space; they talk about big data, they talk about software defined storage…they all hug those hardware revenues.

No, many of us aren’t engineering companies like Google and Facebook but the economics are beginning to look very attractive to some of us. Data growth isn’t going away soon; the current crop of suppliers have little strategy apart from continuing to gouge…many of the start-ups want to carry on gouging whilst pretending to be different.

Things will change.

Storage is Interesting…

A fellow blogger has a habit of referring to storage as snorage and I suspect that is the attitude of many: what’s so interesting about storage, it’s just that place where you keep your stuff? Many years ago, as an entry-level systems programmer, there were two teams that I was never going to join…one being the test team and the other the storage team, because they were boring. Recently I have run both a test team and a storage team and enjoyed the experience immensely.

So why do I keep doing storage? Well, firstly I have little choice but to stick to infrastructure; I’m a pretty lousy programmer and it seems that I can do less damage in infrastructure. If you ever received more cheque-books in the post from a certain retail bank, I can only apologise.

But storage is cool; firstly it’s BIG and EXPENSIVE; who doesn’t like raising orders for millions? It is also so much more than that place where you store your stuff; you have to get it back for starters. I think that people are beginning to realise that storage might be a little more complex than first thought; a few years ago, the average home user only really worried about how much disk they had, but the introduction of SSDs into the consumer market has hammered home how much the type of storage matters and the impact it can have on the user experience.

Spinning rust platters keep getting bigger but for many, this just means that the amount of free disk keeps increasing; an increase in speed is what people really want. Instant On…it changes things.

So even in the consumer market; storage is taking on a multi-dimensional personality; it scales both in capacity but also in speed. In the Enterprise; things are more interesting.

Capacity is obvious; how much space do you need? Performance? Well, performance is more complex and has more facets than most realise. Are you interested in IOPS? Are you interested in throughput? Are you interested in aggregate throughput or single stream? Are you dealing with large or small files? Large or small blocks? Random or sequential?
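
A quick illustration of why the facets matter; throughput is simply IOPS multiplied by block size, so the same box can look like a few hundred MB/s or several GB/s depending on the workload you throw at it. The figures below are illustrative assumptions, not benchmarks:

    # Illustrative workloads: throughput (MB/s) = IOPS x block size
    workloads = [
        # (description, iops, block_size_kb)
        ("OLTP database, 4KB random",      100_000,    4),
        ("VM boot storm, 8KB random",       60_000,    8),
        ("Backup stream, 1MB sequential",    3_000, 1024),
    ]

    for desc, iops, block_kb in workloads:
        mb_per_sec = iops * block_kb / 1024
        print(f"{desc}: {iops:,} IOPS = {mb_per_sec:,.0f} MB/s")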

Now for 80% of use-cases; you can probably get away with taking a balanced approach and just allocating storage from a general purpose pool. But 20% of your applications are going to need something different and that is where it gets interesting.

Most of the time when I have conversations with application teams or vendors and ask what type of storage they require, the answer that comes back is generally ‘fast’. There then follows a conversation as to what fast means and whether the budget meets their desire to be fast.

If we move to ‘Software Defined Storage’, this could be a lot more complex than people think. Application developers may well have to really understand how their applications store data and how they interact with the infrastructure that they live on. Pick the wrong pool and your application performance could drop through the floor; pick the wrong availability level and you could experience a massive outage.
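
As a hedged sketch of what that choice might look like once storage is requested through software rather than over coffee; the pool names, figures and policy below are entirely hypothetical:

    # Hypothetical pool catalogue: each pool advertises what it can do,
    # the application has to state what it actually needs.
    pools = {
        "gold-flash": {"max_iops": 200_000, "availability": 0.99999, "cost_gb": 2.00},
        "general":    {"max_iops":  20_000, "availability": 0.9999,  "cost_gb": 0.60},
        "cheap-deep": {"max_iops":   2_000, "availability": 0.999,   "cost_gb": 0.10},
    }

    def choose_pool(required_iops, required_availability):
        # cheapest pool that still meets the stated requirements
        candidates = [
            (spec["cost_gb"], name) for name, spec in pools.items()
            if spec["max_iops"] >= required_iops
            and spec["availability"] >= required_availability
        ]
        return min(candidates)[1] if candidates else None

    print(choose_pool(15_000, 0.9999))   # 'general' -- not everything needs gold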

So if you thought storage was snorage (most developers still do), you might want to start taking an interest. If infrastructure becomes code, I may need to get better at coding but some of you are going to have to get better at infrastructure. Move beyond fast and large and understand the subtleties; it is interesting…I promise you!

/dev/null – The only truly Petascale Archive

As data volumes increase in all industries and the challenges of data management continue to grow; we look for places to store our increasing data hoard and inevitably the subject of archiving and tape comes up.

It is the cheapest place to archive data by some way; my calculations currently give it a four-year cost somewhere in the region of five to six times cheaper than the cheapest commercial disk alternative. However, tape’s biggest advantage is almost its biggest problem: it is considered to be cheap and hence, for some reason, no-one factors in the long-term costs.

Archives by their nature live for a long time; more and more companies are talking about archives which will grow and exist forever. And as companies no longer seem able to categorise data into data to keep and data not to keep, add exponential data-growth and generally bad data-management, and multi-year, multi-petabyte archives will eventually become the norm for many.

This could spell the death of the tape archive as it stands, or it will necessitate some significant changes in both user and vendor behaviour. A ten-year archive will see at least four refreshes of the LTO standard on average; since an LTO drive can generally only read tapes from two generations back, your latest tape technology will not be able to read your oldest tapes. It is also likely that you are looking at some kind of extended maintenance and associated costs for your oldest tape drives; they will certainly be End of Support Life. Media may be certified for 30 years; drives aren’t.

Migration will become a way of life for these archives and it is this that will be a major challenge for storage teams and anyone maintaining an archive at scale.

It currently takes 88 days to migrate a petabyte of data from LTO5 to LTO6; this assumes 24×7 operation, no drive issues, no media issues and a pair of drives dedicated to the migration. You will also be loading about 500 tapes and unloading about 500 tapes. You can cut this time by putting in more drives but your costs will soon start to escalate as SAN ports, servers and peripheral infrastructure mount up.
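
For the curious, here is roughly where a figure like 88 days comes from; a straight LTO5-to-LTO6 copy is gated by the native read speed of the LTO5 drive (140MB/s) before you allow for mounts, failed media or recalls. Everything apart from the drive speed is a simplifying assumption:

    PETABYTE = 1e15                 # bytes
    LTO5_READ = 140e6               # bytes/sec, LTO5 native transfer rate
    LOAD_UNLOAD_OVERHEAD = 0.05     # assume ~5% lost to mounts, positioning, etc.

    seconds = PETABYTE / LTO5_READ * (1 + LOAD_UNLOAD_OVERHEAD)
    print(f"~{seconds / 86400:.0f} days for one read/write drive pair, 24x7")
    # ~87 days -- and that is the optimistic, nothing-goes-wrong case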

And then all you need is for someone to recall the data whilst you are trying to migrate it; 88 days is extremely optimistic.

Of course a petabyte seems an awful lot of data but archives of a petabyte+ are becoming less uncommon. The vendors are pushing the value of data; so no-one wants to delete what is a potentially valuable asset. In fact, working out the value of an individual datum is extremely hard and hence we tend to place the same value on every byte archived.

So although tape might be the only economical place to store data today, as data volumes grow it becomes less viable as a long-term archive unless it is a write-once, read-never (and I mean never) archive…if that is the case then, in Unix parlance, perhaps /dev/null is the only sensible place for your data.

But if you think your data has value or more importantly your C-levels think that your data has value; there’s a serious discussion to be had…before the situation gets out of hand. Just remember, any data migration which takes longer than a year will most likely fail.

Service Power..

Getting IT departments to start thinking like service providers is an uphill struggle; getting beyond cost to value seems to be a leap too far for many. I wonder if it is a psychological thing driven by fear of change, but also a fear of assessing value.

How do you assess the value of a service? Well, arguably, it is quite simple…it is worth whatever someone is willing to pay for it. And with the increased prevalence of service providers vying with internal IT departments, it should be relatively simple. They’ve pretty much set the baseline.

And then there are the things that the internal IT department just should be able to do better; they should be able to assess Business need better than any external provider. They should know the Business and be listening to the ‘water cooler’ conversations.

They should become experts in what their company does; understand the frustrations and come up with ways of doing things better.

Yet there is often a fear of presenting the Business with innovative and better services. I think it is a fear of going to the Business and presenting a costed solution; there is a fear of asking for money. And there is certainly a fear of Finance but present the costs to the Business users first and get them to come to the table with you.

So we offer the same old services and wonder why the Business are going elsewhere to do the innovative stuff and, while they are at it, start procuring the services we used to provide. Quite frankly, many Corporate IT departments are in a death spiral, trying to hang on to things that they could let go.

Don’t think “I can’t ask the Business for this much money to provide this new service”…think “what if the Business wants this service and asks someone else?” At least you are going to be bidding on your own terms and not being forced into a competitive bid against an external service provider; when it comes down to it, the external provider almost certainly employs a better sales-team than you.

By proposing new services yourself or perhaps even taking existing ‘products’ and turning them into a service; you are choosing the battle-ground yourselves…you can find the high ground and fight from a position of power.