

Designed to Fail

Randy Bias has written an interesting piece here on the impact of complexity on reliability and availability; as you build more complex systems, it becomes harder and harder to engineer in multiple 9s of availability. I read the piece with a smile on my face, especially the references to storage; I was sitting with an array flat on its arse and already thinking about the DAS vs SAN argument for availability.

How many people design highly-available systems with no single points of failure until it hits the storage array? Multiple servers with fail-over capability, multiple network paths and multiple SAN connections; that’s pretty much standard but multiple arrays to support availability? It rarely happens. And to be honest, arrays don’t fall over that often, so people don’t tend to even consider it until it happens to them.

An array outage is a massive headache though; when an array goes bad, it is normally something fairly catastrophic and you are looking at a prolonged outage but often not so prolonged that anyone invokes DR. There are reasons for not invoking DR, most of them around the fact that few people have true confidence in their ability to run in DR and even fewer have confidence that they can get back out of DR, but that’s a subject for another blog.

I have sat in a number of discussions over the years about building a redundant array of storage arrays, i.e. striping at the array level as opposed to the disk level. Of course, rebuild times become interesting but it does remove the array as a single point of failure.
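
To make the idea concrete, here is a minimal sketch of what striping at array granularity could look like; it is just the familiar RAID-0 mapping applied one level up, with hypothetical array names and stripe size rather than any vendor's implementation.

```python
# Minimal sketch: RAID-0-style striping across whole arrays rather than disks.
# Array names and stripe size are hypothetical, purely for illustration.
STRIPE_SIZE = 1 * 1024 * 1024            # 1 MiB stripe unit
ARRAYS = ["array-a", "array-b", "array-c", "array-d"]

def locate(logical_offset: int) -> tuple[str, int]:
    """Map a logical byte offset to (array, offset within that array)."""
    stripe_number = logical_offset // STRIPE_SIZE
    within_stripe = logical_offset % STRIPE_SIZE
    array = ARRAYS[stripe_number % len(ARRAYS)]
    # Each array only sees every Nth stripe, so its local offset is compacted.
    local_offset = (stripe_number // len(ARRAYS)) * STRIPE_SIZE + within_stripe
    return array, local_offset

print(locate(5 * 1024 * 1024 + 42))      # ('array-b', 1048618)
```

The mapping is the easy part; the redundancy comes from adding mirroring or parity across the arrays in the same way RAID does across disks, and that is exactly where the rebuild times get interesting.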

But then there are the XIVs, Isilons and other clustered storage products which are arguably extremely similar to this concept; data is striped across multiple nodes. I won’t get into the argument about implementations but it does feel to me that this is really the way that storage arrays need to go. Scale-out ticks many boxes but does bring challenges with regards to metadata and the like.

Of course, you could just go down the route of running a clustered file-system on the servers and DAS but this does mean that they are going to have to cope with striping, parity and the likes. Still, with what I have seen in various roadmaps, I’m not betting against this as an approach either.

The monolithic storage array will continue for some time but ultimately, a more loosely coupled and more failure tolerant storage infrastructure will probably be in all our futures.

And I suppose I better find out if that engineer has resuscitated our array yet.

 

Reality for Scality

You know that I have somewhat mixed feelings about Object Storage; there is part of me which really believes that it is the future of scalable storage but there is another part of me that lives in the real world. This is the world where application vendors and developers are currently unwilling to rewrite their applications to support Object Storage; certainly whilst there is no shipping product supporting an agreed standard. And there are a whole bunch of applications which simply are not going to be re-written any time soon.

So, for all their disadvantages, we'll be stuck with POSIX filesystems for some time; developers understand how to code to them and applications retain a level of storage independence. You wouldn't write an application to be reliant on a particular POSIX filesystem implementation, so why would you tie yourself to an Object Store?

I was pleased to see this announcement from the guys at Scality; they are good guys and their product looks sound but no matter how sound your product is, you have to try to make it easier for yourself and your customers. Turning their product into a super-scalable filer is certainly a way to remove some of the barriers to adoption but will it be enough?

Of course, then there are the file-systems which are beginning to go the other way; a realisation that if you are already storing some kind of metadata, it might not be a huge stretch to turn it into a form of object store.

File-systems, especially those of the clustered kind and Object Storage seem to be converging rapidly. I think that this is only for the good.

 

Price is Right?

As the unit cost of storage continues to trend towards zero, even with the current premium being charged due to last year's floods, how do we truly measure the cost of storage and its provision?

Now, many of you are thinking: 'zero'? Really?

And my answer would be that many of the enterprise class arrays are down to a few dollars per gigabyte over five years; certainly for SATA and NL-SAS. So the cost of storing data on spinning rust is heading towards zero; well certainly the unit cost is trending this way.

Of course, as we store ever more data, the total cost of storage continues to rise because the increase in the amount of data we store outstrips the price decline. Dedupe and the likes are a temporary fix which may mean that you can stop buying for a period of time, but inevitably you will need to start increasing your storage estate at some point.
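
To put rough numbers on that (the rates below are illustrative assumptions, not market figures): even if the unit cost falls by a quarter every year, fifty per cent annual data growth still drives the total bill upwards.

```python
# Illustrative assumptions only -- not real market figures.
unit_cost_per_gb = 3.00        # $/GB for enterprise SATA/NL-SAS today (assumed)
capacity_gb = 500_000          # ~0.5 PB estate today (assumed)
price_decline = 0.25           # unit cost falls 25% per year (assumed)
data_growth = 0.50             # stored data grows 50% per year (assumed)

for year in range(6):
    spend = unit_cost_per_gb * capacity_gb
    print(f"year {year}: ${unit_cost_per_gb:.2f}/GB x {capacity_gb / 1e6:.2f} PB"
          f" = ${spend:,.0f}")
    unit_cost_per_gb *= 1 - price_decline
    capacity_gb *= 1 + data_growth
```

In that sketch the unit cost is heading towards zero, yet the annual spend still grows by around 12.5% a year; dedupe merely resets the capacity line for a while.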

So what are you going to do? I don’t think that we have a great solution at the moment and current technologies are sticking plasters with a limited life. Our storage estates are becoming like landfill; we can sift for value but ultimately it just hangs around and smells.

It is a fact of life that data management is a discipline much ignored; lots of people point at the Cloud as a solution but we are simply shifting the problem, and at some point that'll come unstuck as well. Cloud appliances will eventually be seen as a false economy, fixing the wrong problem.

Storage has simply become too cheap!

Which is a kind of odd statement for Storagebod to make….

 

Local Storage, Cloud Access

Just as we have seen a number of gateways that allow you to access public cloud storage in a more familiar way, making it appear local to your servers, we are beginning to see services and products which do the opposite.

To say that these turn your storage into cloud storage is probably a bit of a stretch, but what they do is allow your storage to be accessed by a multitude of devices wherever they happen to be. They bring the convenience of Dropbox but with a more comfortable feeling of security because the data is stored on your storage. Whether this is actually any more secure will be entirely down to your own security and access policies.

I’ve already blogged about Teamdrive and I’ll be blogging about it again and also the Storage Connector from Oxygen Cloud in the near future. I must say that some of the ideas and the support for Enterprise storage by the folks at Oxygen Cloud looks very interesting.

I do wonder when, or if, we'll see Dropbox offer something similar themselves; Dropbox, with its growing software ecosphere, would be very attractive with the ability to self-host. It would possibly give some of the larger storage vendors something to consider.

These new products do bring some interesting challenges which will need to be addressed; you can bet that your users will start to install these on their PCs, both at work and at home. The boundaries between corporate data and personal data will become ever more blurred; much as I hate it, the issue of rights management is going to become more important. Forget the issue of USB drives being lost, you could well find that entire corporate shares are exposed.

But your data any time, any place is going to become more and more important; convenience is going to trump security again and again. I am becoming more and more reliant on cloudy storage in my life, but for me it is a knowing transition; I suspect that many others are simply not aware of what they are doing.

This is not a reason simply to stop them, but a reason to look at offering these services to them and also to educate. The offerings are coming thick and fast; the options are getting more diverse and interesting. The transition to storage infrastructure as software has really opened things up. Smaller players can start to make an impact; let's hope that the elephants can dance.

Personal Cloud Storage

As long-term readers and followers of this blog will know, I really like Dropbox, but there are issues with it, especially around security and the potential access of others to my data, and I have stopped storing confidential data in it. What would be ideal would be for me to host my own Dropbox server but unfortunately, they've not gone down that route.

However, I have been introduced to a promising contender: Teamdrive, a German software company who have developed something similar to Dropbox but with the added advantage that you can host your own Teamdrive server on your own hardware.

My friend Rose is doing the PR for them and kindly got hold of a license for me to play with so that I could set up my own environment (note: there is a free server license which is limited to 10GB; the unlimited license appears to be €99 per year).

One of the nice things about the Teamdrive server is that they provide a version which will run on a Synology home NAS; so I downloaded and installed it, quickly vi-ed the configuration file and fired it up. The Windows and Mac versions of the server appear to have a nice GUI so that you don't have to edit configuration files, but there are few options and the lack of a GUI for the Linux version is no hardship.

I downloaded the latest Teamdrive Client for my MacBook; installed that and pointed it at the newly installed Teamdrive server. The process of getting it attached was painless and worked quickly and easily.

Teamdrive allows you to configure an existing directory as a Teamdrive share (in Teamdrive terminology, a 'Space'), or you can create a new 'Space' and start from that. Once you have created a 'Space', you can invite other users to the share. Please note, it appears that they need to have already registered with Teamdrive to be invited; I'm not entirely sure why this should be the case if you intend to run an entirely private service.

Running your own server is interesting because it allows you to see how the files are stored on the server; they are encrypted, so even if someone manages to get access to the server, the files should stay secure. I haven't looked too closely at the encryption yet, so I can't really vouch for how secure it is. However, storing the files like this does mean that they cannot be shared using another protocol such as NFS or SMB from my Synology.
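
I haven't dug into Teamdrive's actual scheme, but the general principle is worth illustrating. The sketch below uses Python's third-party cryptography library rather than anything Teamdrive ship, and simply shows that if the client encrypts before anything leaves the machine, the server only ever holds ciphertext.

```python
# Minimal sketch of client-side encryption -- NOT Teamdrive's actual scheme.
# Requires the third-party 'cryptography' package (pip install cryptography).
from cryptography.fernet import Fernet

key = Fernet.generate_key()              # stays on the client, never sent to the server
cipher = Fernet(key)

plaintext = b"confidential document contents"
ciphertext = cipher.encrypt(plaintext)   # this is all the server ever stores

# Anyone who compromises the server sees only ciphertext; only a client
# holding the key can recover the original.
assert cipher.decrypt(ciphertext) == plaintext
```

It is also why the same files can't usefully be re-exported over NFS or SMB from the Synology: without the client's key they are just opaque blobs.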

All in all, Teamdrive appears to be a solid shared storage implementation with the added attraction that you can run it privately. There are iOS and Android clients in development but I've not tried them; this is a bit of a hole in the Teamdrive story at present. The other advantage is that you can scale a lot more economically than with the hosted competitors.

P.S. Matthew Yeager has recommended a product called Appsense Datalocker which works with Dropbox to provide an encrypted solution. I've just started to have a play and it looks most promising.

Wobbles?

Okay, so once more I take my life into my own hands and post about NetApp! I look forward to the hordes descending and telling me how terribly wrong I am and how I don’t understand anything!

So let's get this straight: I think NetApp are a great company and they have done the storage world a great service in the way that they simplified administration and furthered the cause of IP storage, especially NAS. They led the way in taking commodity components and turning them into Enterprise-class arrays; we should thank them for all of this!

But are NetApp having a bit of a blip in growth? Chris Mellor certainly seems to think so in his interpretation of the latest IDC Storage Tracker figures, although there is a different take here; so it's not entirely clear. However, I have some thoughts on all of this.

A straw-poll of some of my peers in the industry suggests that large NetApp estates have such terrible utilisation figures that dedupe is pretty meaningless; 25-40% utilisation seems to be relatively common, with reporting based around 'effective capacity' sometimes making it worse.
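
A quick worked example makes the point; the numbers here are hypothetical, not from any real estate.

```python
# Hypothetical numbers, purely to illustrate the reporting gap.
usable_tb = 1000                 # usable capacity purchased
written_tb = 300                 # data actually written
expected_dedupe = 1.5            # optimistic savings ratio baked into the reporting

effective_capacity_tb = usable_tb * expected_dedupe        # 1500 TB 'effective'

print(f"utilisation vs usable:    {written_tb / usable_tb:.0%}")              # 30%
print(f"utilisation vs effective: {written_tb / effective_capacity_tb:.0%}")  # 20%
```

Measured against 'effective capacity', a 30% utilised estate suddenly reports as 20% utilised; one way reporting can make a bad number look even worse.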

But I am beginning to see utilisation figures creep up; there is an increasing challenge to use this unused capacity and push it up. So existing NetApp customers might not be buying quite so much and are under increased pressure to use what they’ve already got. Has this impacted NetApp’s run-rate business?

And if customers can't use the capacity that they've got, this is going to lead to some hard questions; and there is now a plethora of vendors with solid NAS products who can push NetApp very hard on capability and incredibly hard on price.

That, and the confusion that OnTap 8 brought with its choice of modes, has probably left NetApp a little wobbly.

Throw in the Engenio acquisition, which confused their story even more; and actually, it turns out that NetApp might not have purchased Engenio at the best possible time, with it moving into an architectural refresh all of its own.

I think NetApp may continue to have a wobbly few quarters but that might not be such a bad thing. Great companies do wobble and learn from it; let’s hope that NetApp are a great company, I think they are…

 

Desktop, Data, Devilry

In the post-PC era, the battle for the desktop has moved on to the battle for your data; Microsoft's leaked new features for SkyDrive demonstrate this nicely. SkyDrive joins Dropbox, iCloud, the soon-to-be-announced Google Drive and a myriad of others; where you store your data is becoming more and more of a battleground. The Battle of the Desktop has moved from the Battle of the Browser to the Battle for Your Data; throw Social Media such as Facebook, Twitter and sites such as Flickr into the mix and this is heading towards one hell of a mess and one hell of a fight.

Where on earth are you going to store your content? And once it is there, how do you get it out and, more importantly, will this drive stickiness? Apple seem to think so; they are building tighter integration between their operating systems and iCloud, and Mountain Lion and iOS 6 will see more features leveraging iCloud natively. Microsoft will do similar things with Windows 8 and SkyDrive. Yes, you will be able to access your data from other operating systems and devices, but it will not be the experience you will get from the native operating systems.

Native Operating Systems? Will we see even tighter integration with the operating systems? Will we see Cloud-Storage gateways built into the operating system? For example, as broadband gets faster, is there a need for large local storage devices? Could your desktop become a caching device with just local SSD storage, intelligently moving your data in and out of the Cloud? Mobile devices are pretty much there, but they deal with much smaller storage volumes; is the desktop the next frontier for this?

But could the battle for your data produce the next big monopoly opportunity for Microsoft and Apple? Building hooks in at the operating system level would seem to make technical sense but I can hear the cries from a multitude; service providers, content providers and the likes will have a massive amount to say about this.

For example, there are the media devices such as PVRs; with content providers and broadcasters increasingly providing non-linear access to their content, why is this not all on demand and why do we need a PVR any more? A smaller device with a local SSD cache would make considerably more sense; they'd be greener, and removing the spinning disk would probably reduce failures, but this would mean a pretty much wholesale move to IPTV, something which is a little way off.

But arguably, this is something that Apple are really moving towards: owning your content; your data and your life will be theirs. And where Apple go, expect Microsoft to be not far behind. You think the Desktop is irrelevant? I for one don't believe it; this story has a long way to run. It's still about the Desktop, it's just that the Desktop has changed.

Storage People Are Different

An oft-heard comment is that ‘Storage People are weird/odd/strange’; what people really mean is that ‘Storage People are different’; Chuck sums up many of the reasons for this in his blog ‘My Continuing Infatuation with Storage‘.

Your Storage Team (and I include the BURA teams) often see themselves as the keepers of the kingdom, for without them and the services that they provide, your business would probably fail. They look after that which is most important to any business: its knowledge and its information. The problem is, they know it and most other people forget it; this has left many storage teams and managers with the reputation of being surly, difficult and weird, but if you were carrying the responsibility for your company's key asset, you'd be a little stressed too. Especially if no-one acknowledged it.

The problem is that for many years, companies have been hoarding corporate gold in dusty vaults looked after by orcs and dragons who won't let anyone pass or access it; now people want to get at the gold and make use of it. So the storage team has to worry not only about ensuring that the information is secure and maintained; people actually want to use it and want ad-hoc access to it, almost on demand.

Problem is, the infrastructures that we have in place today are not architected to allow this, and the storage teams do not have the processes and procedures to support it. So today's 'Storage People may be different', but tomorrow's 'Storage People will be a different kind of different'. They will need to be a lot more business-focussed and more open; but the asset that they've been maintaining is growing pretty much exponentially in both size and value, so expect them to become even more stressed and maybe even more surly.

That is, unless you work closely with them to invest in and build a storage infrastructure which supports all your business aspirations; unless vendors invest in technologies which are manageable at scale; and unless businesses learn to appreciate value as opposed to sheer cost.

Open, accessible, available and secure: this is the future storage domain; let's hope that the storage teams who support it also have these qualities.

Don’t SNIA at #storagebeers

I have just noticed that the SNIA Data Centre Technology Academy London is this year at The Grange Tower Bridge Hotel, which is on the same road as my favourite posh curry restaurant; also, as is traditional for SNIA, it clashes with EMC-World and is on May 23rd.

Assuming that no-one decides to treat me and send me to EMC-World, I am probably going to organise #storagebeers followed by #storagecurry at Cafe Spice Namaste; I will post more nearer the date. But I know many attendees love a good curry and this is really good curry.

So this is an early warning and a good place to be if you can’t make it to EMC-World; you can console yourself in much better beer and a damn fine curry!!!

Soft Cache

VFCache is yet more evidence that storage vendors are simply becoming software suppliers*; although ostensibly a hardware product, the smarts are down in the software layer and that is where they are going to live. EMC are simply leveraging everything that they have learnt from PowerPath (Free PowerPath, Keep Fighting the Fight!) and building on it to introduce storage functionality on the server.

From the presentations we have seen so far, it seems that dedupe is going to run on the server and not the card; well, that's my interpretation. Obviously this is going to have some interesting impact on CPU and memory utilisation, meaning that EMC are going to have to get this right or they risk undermining the whole reason for putting cache closer to the server. Replication and cache consistency may also be server-level functions.
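
Sketched below with hypothetical structures, and not anything EMC have published, that interpretation would put the blocks on the card while the host software does the bookkeeping: hashing writes for dedupe, tracking what is cached and deciding what to evict; which is exactly where the CPU and memory cost lands.

```python
# Conceptual sketch of host-side flash caching with software dedupe.
# Hypothetical structure -- not EMC's VFCache implementation.
import hashlib
from collections import OrderedDict

class HostFlashCache:
    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.blocks = {}                  # fingerprint -> data (deduped block store)
        self.lru = OrderedDict()          # LBA -> fingerprint, in access order

    def write(self, lba: int, data: bytes) -> None:
        """Write-through: send to the array (not shown) and cache a single copy."""
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint not in self.blocks:        # dedupe check burns host CPU
            self.blocks[fingerprint] = data
        self.lru[lba] = fingerprint
        self.lru.move_to_end(lba)
        while len(self.lru) > self.capacity:      # evict least recently used mapping
            _, evicted = self.lru.popitem(last=False)
            if evicted not in self.lru.values():  # drop block if nothing refers to it
                self.blocks.pop(evicted, None)

    def read(self, lba: int) -> bytes | None:
        """Return cached data, or None to signal a read from the array."""
        fingerprint = self.lru.get(lba)
        if fingerprint is None:
            return None                           # cache miss: go to the array
        self.lru.move_to_end(lba)
        return self.blocks[fingerprint]
```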

This does have some interesting implications; although it appears to be a hardware product, how hard would it be for EMC to use any 'flash' technology which is installed in the server? Do we have to have EMC hardware, or could we use a Fusion-IO card or even a bog-standard SSD? And what happens to the pricing?

EMC are already talking about accelerating any array, although it'll be better with their own; but will they take this further and use anyone's cache hardware? We'll probably end up with another gargantuan certification matrix, because EMC like those, but it does seem possible. Perhaps a PowerPath/Cache which allows a number of different vendors' caching products to be used? That way EMC can monetise their software even further; or perhaps PowerPath/Cache comes free with VFCache but you need to pay to use it with third-party cache products?

And what about file-based acceleration? Where do NAS and indeed Object storage fit into this? Do they? One of the complaints most often heard about object storage is that it can be slow, so would it benefit from some kind of caching? Could cache hints be carried in the object metadata?
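
Purely as a thought experiment on that last question (the header name and values below are made up, not part of any standard), a hint could travel in the object's user metadata and a gateway or host cache could act on it:

```python
# Thought experiment: a made-up cache hint carried in object user metadata.
# The header name and values are hypothetical, not part of any standard.
object_metadata = {
    "content-length": "1048576",
    "x-meta-cache-hint": "pin",       # e.g. "pin", "bypass", "ttl=300"
}

def should_cache(metadata: dict) -> bool:
    """Decide whether a gateway or host cache should keep this object."""
    hint = metadata.get("x-meta-cache-hint", "")
    if hint == "bypass":              # e.g. large streaming objects: don't pollute the cache
        return False
    return True                       # default to caching; 'pin' might also defeat eviction

print(should_cache(object_metadata))  # True
```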

Also, it might be interesting to see how this could be integrated into the hypervisor stack; perhaps a PowerPath/VE which supports multiple vendors' caching products?

Now EMC have validated PCIe local flash-cache concepts; we can start to move on and see where this takes us. And yes, someone other than EMC could have validated the concept but they didn’t; so let’s move on from there.

What are the other big vendors going to react with? IBM, HP, perhaps NetApp with SPAM (Server Performance Acceleration Module), HDS?

[* I've been thinking a lot about my recent experience with storage and whether storage teams may have more in common with application teams than first thought; storage infrastructures tend to be more heterogeneous than other data centre infrastructures, and certainly there is more vendor differentiation. ]