
Start-Ups Galore

Recently there seem to be more storage start-ups than ever before: flash-based storage, object storage, storage aimed at virtual environments, cloud storage, storage as software, storage appliances; every day more press releases announcing yet another innovation in the storage space hit my inbox.

How many of these are truly innovative? Not that many, I guess, but the storage start-up industry is in rude health. The barrier to entry into the market seems to have dropped significantly; the introduction of commodity-based hardware and software has really changed things.

And yet we still see the doom merchants predicting the end of the storage administrator; to be fair, a few years ago I might have agreed, but the sheer diversity of storage infrastructures, big-data growth and just general growth leads me to feel that the storage administrator role still has life. Yes, the role will change and evolve much as storage has evolved, and it may become more virtualisation-focussed, but there will still be storage specialists and probably as many as ever.

I am going to do my bit to ensure that the role of the 'Storage Bod' continues and to encourage the diversity which will drive more complexity; I am a judge for the Tech Trailblazers awards, so if you are a new storage start-up and your product can drive yet more complexity into the storage environment, you should enter. But if your product is really simple, just works and makes lives easier, please don't bother… we want the environment to stay complex and a black art.

Of course I am probably in the minority and some of the judges will be looking for more sensible things, so I guess start-ups with products both complex and simple should probably enter. There are some good prizes, some great sponsors and excellent judges (well, better qualified than me anyway).

As I say, the barrier to entry to the market seems to have fallen somewhat, but some extra cash and help is always handy.

Amazon Goes Glacial

Amazon have announced a pretty interesting low-cost archive solution called Amazon Glacier; certainly the pricing, which works out at around $120 per terabyte per annum with eleven nines of durability, is competitive. For those of us working with large media archives, could this be a competitor to tape and all the management headaches that tape brings? I look after a fairly large media archive and there are times when I would gladly see the back of tape forever.

So is Amazon Glacier the solution we are looking for? Well, for the type of media archives that I manage, unfortunately the answer is not yet, or at least not for all use cases. The 3-4 hour latency that Glacier introduces on a recall does not fit many media organisations, especially those with a news component; at times even the minutes that a retrieval from tape takes seem unacceptable, especially to news editors and the like. And even at $120 per terabyte, when you are growing at multiple petabytes a year the costs fairly quickly add up.
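To make that concrete, here's a back-of-the-envelope sketch in Python; the growth figures are purely illustrative assumptions, and only the roughly $120 per terabyte per annum storage price comes from the announcement (retrieval and request fees are ignored):

```python
# Back-of-the-envelope Glacier storage cost; growth figures are hypothetical,
# only the ~$120 per terabyte per annum price comes from the announcement.
PRICE_PER_TB_YEAR = 120  # USD, storage only; retrieval and request fees ignored

def annual_storage_cost(start_pb, growth_pb_per_year, years):
    """Yield (year, archive size in PB, storage cost for that year in USD)."""
    size_pb = start_pb
    for year in range(1, years + 1):
        size_pb += growth_pb_per_year
        yield year, size_pb, size_pb * 1000 * PRICE_PER_TB_YEAR

for year, size_pb, cost in annual_storage_cost(start_pb=2, growth_pb_per_year=3, years=5):
    print(f"Year {year}: ~{size_pb:.0f} PB archived, ~${cost:,.0f} for the year")
```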

Yet this is the first storage product which has made me sit up and think that we could replace tape. If the access times were reduced substantially and it looked more like a large tape library, this would be an extremely interesting service. I just need the Glacier to move a bit faster.

Friday Quick Thought

Flash=Compute Storage, stuff you are processing

Disk=Data Storage, stuff you have processed or might want to process soon

Tape=Data Hoard, stuff you can't bear to get rid of because you might need it; realistically you only need <10% of it, but you don't know which 10%.

Super Glue Required

IBM's purchase of TMS was not the biggest surprise, especially for anyone who has been involved with IBM's HPC team. It's a good move for both companies and gives IBM a good, solid flash-based storage team. It does add yet another storage array product to IBM's ever-growing portfolio, and positioning it outside of the HPC world is going to be fun; IBM have multiple competing arrays, though arguably the TMS range does not overlap with any of them.

And so the move to flash continues, or more likely a move to a two-tier storage future, with active data sitting in a flash tier and resting data sitting on a SATA/SAS bulk storage tier. But with all these competing products and different vendors, the storage management and implementation headaches could be massive.

Now, we could move to hybrid arrays where both flash and traditional rotational storage live in the same array. The array itself can auto-tune what sits where, moving data according to temperature; we’ve seen this in the various auto-tiering technologies from EMC, IBM and HDS for example.

We could move to a world where the flash in an array simply works as an extended cache tier, augmenting the DRAM cache and speeding up reads; think NetApp's approach.

Both of these implementations allow existing storage architectures to be enhanced with flash and hence both are pretty popular with the existing vendors and customers. Nothing much changes and things feel very much Business As Usual but faster.

Then there are the new players on the block with their flash-only arrays, architected to make the most of flash. These tend to have really screaming performance, but can you afford to replace all of your storage with flash technology? If you can't, you need to think a lot harder about how best to utilise them; for example, if the data you are storing has any form of longevity and needs to be kept, you will need to come up with ways of moving it between tiers of storage which almost certainly come from different vendors. Experience suggests that this sort of tiering is very hard to do and that applications need to be designed with it in mind.

And then there are vendors who believe that you should implement flash as close to the server as possible. This is the approach of Fusion-io and the like; in many ways it could be very attractive, certainly if you can use it as a cache tier, but if you have very large server farms it could be expensive, and yet again you could run into design headaches where positioning workloads starts to get much harder. There are also potential issues with clustering, failure modes and the like. But it could allow you to leverage your existing storage estate and sweat the asset.

This introduction of a new tier of storage has re-opened the ILM/HSM box; the glue which moves data between different tiers of storage is going to be incredibly important, and this, more than the actual hardware, could well define the future of flash in the Enterprise and beyond.
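Conceptually the glue is simple, even if doing it well across vendors is anything but; a minimal sketch of the idea, using hypothetical mount points and age thresholds rather than any vendor's actual data mover, might look something like this:

```python
import os
import time

# Hypothetical tier mount points and last-access thresholds; a real data mover
# also has to cope with open files, vendor APIs, stubbing/recall and policies.
TIERS = [
    ("flash", "/mnt/flash", 7),       # touched within a week: keep on flash
    ("disk",  "/mnt/bulk", 90),       # touched within a quarter: bulk SATA/SAS
    ("tape",  "/mnt/archive", None),  # everything else: candidate for tape/archive
]

def target_tier(path, now=None):
    """Pick a tier for a file based on how recently it was accessed."""
    now = now or time.time()
    age_days = (now - os.stat(path).st_atime) / 86400
    for name, mount, max_age_days in TIERS:
        if max_age_days is None or age_days <= max_age_days:
            return name, mount

def plan_moves(scan_root):
    """Walk a tree and report files that should live on a different tier."""
    for dirpath, _, files in os.walk(scan_root):
        for filename in files:
            path = os.path.join(dirpath, filename)
            tier, mount = target_tier(path)
            if not path.startswith(mount):
                print(f"move {path} -> {tier} ({mount})")

if __name__ == "__main__":
    plan_moves("/mnt/flash")  # report flash-resident data that has gone cold
```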

We are seeing a rapidly evolving hardware market, but these technologies could manifestly increase the complexity of the storage environment. That might be good news for storage administrators eyeing the increasingly simplified administration tools that all of these arrays ship with, but to enable the dynamic environments that the business requires, the integration layer is going to have to start appearing.

And as IBM’s acquisition of TMS shows, acquiring hardware platforms and expertise seems to be the current focus…even if every array you purchased was IBM branded, data movement between the arrays would be hard; start adding other vendors into the mix and your problems are going to be interesting.

Edgey

A week or so ago, I spent an hour talking to Avere about their NAS products; this was not really long enough to do any kind of deep dive but what I heard, I liked. It could just be marketing fluff but what they are trying to do is clever and potentially very useful for a variety of reasons.

People love their first NetApp filer; they just don't like their next dozen. Filer proliferation (and to be fair, it's not just NetApp but all traditional NAS devices) is a major problem. Data mobility, name-space consolidation, making use of the capacity that you've bought: these are all problems.

Avere hope to let you use all that capacity you've got in your filers and to move data around between them easily: from EMC to NetApp to white-box NFS to BlueArc, and perhaps into a Cloud service.

Firstly, they accelerate things by putting an 'Edge filer' in front of your traditional NAS; this can be filled with SSD or spinning rust. It provides a cache tier which allows hot data to be stored on more performant 'disk'; for their benchmarking they use white-box servers full of disk and front them with their devices, demonstrating that they can take commodity disk and accelerate it into something that gives the big boys something to think about. This lets you get over some of the problems that large SATA drives are causing with the rapid decline in I/O per terabyte.
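As I understand it, the read side of that cache tier boils down to something like the toy sketch below; the paths are hypothetical and it is nothing like Avere's actual implementation, which also has to deal with writes, clustering, invalidation and the NFS/CIFS protocol work:

```python
import os
import shutil

# Toy read-through cache: a fast local tier in front of a slower core filer.
# Paths are hypothetical and this is nothing like Avere's real implementation,
# which also handles writes, clustering, invalidation and NFS/CIFS itself.
CACHE_ROOT = "/mnt/edge-cache"  # SSD or fast spinning rust on the Edge filer
CORE_ROOT = "/mnt/core-filer"   # the traditional NAS sitting behind it

def read_file(rel_path):
    """Serve a file from the cache tier, populating it from the core filer on a miss."""
    cached = os.path.join(CACHE_ROOT, rel_path)
    if not os.path.exists(cached):                      # cache miss: go back to the core filer
        source = os.path.join(CORE_ROOT, rel_path)
        os.makedirs(os.path.dirname(cached), exist_ok=True)
        shutil.copyfile(source, cached)                 # hot data now lives on the fast tier
    with open(cached, "rb") as f:                       # subsequent reads never touch the core filer
        return f.read()
```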

Secondly, and almost as a side-effect of the Edge filer, you get a more consolidated name-space; I hesitate to say Global Name Space. Your core filers export their file-systems to Avere's Edge filers, which then export them to the wider server infrastructure.

This allows NAS sprawl to be hidden from the users, but it does obviously come with some limitations inherited from the core filers; for example, if your core filer only supports a 16TB file-system, your export is going to be limited to that and you will need to monitor it carefully.

But where I really see a use for the Avere technology is in front of Scale-Out NAS devices, allowing the efficient use of cheaper SATA technologies to provide capacity whilst preserving performance. So it is no surprise to me that Avere are making some real inroads into the worlds of media and rendering.

Avere are an interesting proposition, and what I find especially appealing is that in many ways they don't compete with NetApp, EMC et al but provide a complementary product which allows end-users to make better use of their investment.

Still, I suspect that the big-boys might not like this; they don’t really want you to be too efficient…

Virtually Pragmatic?

So EMC have joined the storage virtualisation party; although they are calling it federation, it is what IBM, HDS and NetApp amongst others call storage virtualisation. So why do this now, after warning of the dire consequences of doing so in the past?

There are probably a number of reasons. There have certainly been commercial pressures: I know of a number of RFPs which have gone out from large corporates mandating this capability; money talks, and in an increasingly competitive market EMC probably have to tick this feature box.

The speed of change in the spinning-rust market appears to be slowing; certainly the incessant increase in the size of hard disks is slowing, which means there might be less pressure to technically refresh the spindles, so decoupling the disk from the controller makes sense. EMC can protect their regular upgrade revenues at the controller level and forgo some of the spinning-rust revenues; they can more than make up for this out of maintenance revenues on the software.

But I wonder if there is a more pressing technological reason and trend that makes this a good time to do it: the rapid progress of flash into the data-centre, and how EMC can work to accelerate its adoption. It is conceivable that EMC could be looking at shipping all-flash arrays which allow a customer to continue to enjoy their existing array infrastructure and realise the investment that they have made. It is also conceivable that EMC could use a VMAX-like appliance to integrate their flash-in-server more simply with a third-party infrastructure.

I know nothing for sure, but the size of this about-turn from EMC should not be understated; Barry Burke has railed against this approach to storage virtualisation for such a long time that there must be some solid reasoning to justify it in his mind.

Pragmatism or futurism, a bit of both I suspect.

The Last of the Dinosaurs?

Chris 'The Storage Architect' Evans and I were having a Twitter conversation during the EMC keynote where they announced the VMAX 40K; Chris was watching the live-stream and I was watching the Chelsea Flower Show. From Chris's comments, I think I got the better deal.

But we got to talking about the relevance of the VMAX and the whole bigger is better thing. Every refresh, the VMAX just gets bigger and bigger, more spindles and more capacity. Of course EMC are not the only company guilty of the bigger is better hubris.

VMAX and the like are the 'Big Iron' of the storage world; they are the choice of the lazy architect, and the infrastructure patterns that they support are incredibly well understood and textbook. But do they really support Cloud-like infrastructures going forward?

Now, there is no doubt in my mind that you could implement something which resembles a cloud, or let's say a virtual data-centre, based on VMAX and its competitors. Certainly if you were a Service Provider with aspirations to move into the space, it's an accelerated on-ramp to a new business model.

Yet just because you can, does that mean you should? EMC have done a huge amount of work to make it attractive; an API to enable you to programmatically deploy and manage storage allows portals to be built to encourage a self-service model. Perhaps you believe that this will allow light-touch administration and the end of the storage administrator.

And then Chris and I started to talk about some of the realities: change control on a box of this size is going to be horrendous; in your own data-centre, co-ordination is going to be horrible, but as a service provider? Well, that's going to make for some interesting terms and conditions.

Migration: in your own environment, migrating a petabyte array in a year means moving 20 terabytes a week, more or less. Depending on your workload (year-ends, quarter-ends and known peaks), your window for migrations could be quite small. And depending on how you do it, it is not necessarily non-service-impacting; mirroring at the host level means significantly increasing your host workload.
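The arithmetic only gets uglier once you knock out the windows you can't use; a quick sketch, with purely illustrative assumptions:

```python
# Rough migration arithmetic for draining a petabyte-class array within a year.
# All the numbers here are illustrative assumptions, not anybody's real change windows.
ARRAY_TB = 1000            # roughly a petabyte to move
WEEKS_AVAILABLE = 52 - 12  # assume quarter-ends, year-end and freezes cost ~12 weeks
HOURS_PER_WEEK = 60        # usable migration window per week, not the full 168

tb_per_week = ARRAY_TB / WEEKS_AVAILABLE
mb_per_sec = tb_per_week * 1_000_000 / (HOURS_PER_WEEK * 3600)

print(f"{tb_per_week:.1f} TB per week, ~{mb_per_sec:.0f} MB/s sustained in the window")
# With these assumptions: 25 TB per week and ~116 MB/s, on top of the normal workload.
```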

As a service provider, you have to know a lot about workloads that you don't really influence and don't necessarily understand; as a service provider customer, you have to have a lot of faith in your service provider. When you are talking about massively shared pieces of infrastructure, this becomes yet more problematic. You are going to have to reserve capacity and capability to support migration, and if you find yourself overcommitting on performance, i.e. assuming that peaks don't all happen at once, you have to understand the workload impact of migration.

I am just not convinced that these massively monolithic arrays are entirely sensible; you can certainly provide secure multi-tenancy, but can you prevent other tenants' behaviour impacting the availability and performance of your data? And can you do it in all circumstances, such as code-level changes and migrations?

And if you’ve ever seen the back-out plan for a failed Enginuity upgrade; well the last time I saw one, it was terrifying.

I guess the phrase ‘Eggs and Baskets’ comes to mind; yet we still believe that bigger is better when we talk about arrays.

I think we need to have some serious discussion about optimum array sizes to cope with exceptions and with things going wrong, and then some discussion about the migration conundrum. Currently I'm thinking that a petabyte is as large as I want to go; as for the number of hosts/virtual hosts attached, I'm not sure. It might be better to think about the number of services an array supports and what can co-exist, both performance-wise and availability-window-wise.

No, the role of the Storage Admin is far from dead; it’s just become about administering and managing services as opposed to LUNs. Yet, the long-term future of the Big Iron array is limited for most people.

If you as an architect continue to architect all your solutions around Big Iron storage…you could be limiting your own future and the future of your company.

And you know what? I think EMC know this…but they don’t want to scare the horses!

A New Symm?

So EMC World is here and the breathless hype begins all over again; amongst the shiny, shiny, shiny booths, the acolytes worship the monolith that is the new Symmetrix. Yet a question teases the doubters: do we need a new Symmetrix?

Okay, enough of the 'Venus in Furs'-inspired imagery, although it might be strangely appropriate for the Las Vegas setting; there is a question which needs to be asked: do we need a new Symmetrix?

Now I am probably far enough removed these days, but not so distant that I can't have a stab at an answer. And the answer is: no, I don't believe that we actually needed a new Symmetrix, but EMC needed to develop one anyway.

There are certainly lots of great improvements; a simpler management interface and bringing it into the Unisphere world are long overdue. It seems that many manufacturers are beginning to realise that customers want commonality and that shiny GUIs can help to sell a product.

Improvements to TimeFinder snaps are welcome; we've come a long way from BCVs and mirror positions, though there's still a long way to go to get customers to come along with you. Many cling to the complex rules with tenacity.

Certainly the mirroring of FAST-VP so that, in the event of a fail-over, a Performance Recovery Point of 0 is achievable is very nice; it's something I've blogged about before and is a weakness in many automated tiering solutions.

eMLC SSDs, bringing the cost of SSD down whilst maintaining performance, are another overdue capability, as is the support for 2.5″ SAS drives, improving density and, I suspect, the performance of spinning rust.

Physical dispersal of cabinets; you probably won’t believe how long this has been discussed and asked for. Long, long overdue but hey, EMC are not the only guilty parties.

And of course, Storage ‘Federation’ of 3rd party arrays; I’m sure HDS and IBM will welcome the vindication of their technology by EMC or at least have a good giggle.

But did we need a new Symmetrix to deliver all this? Or would the old one have done?

Probably, but where's the fun in that?

I don't know, but perhaps concentrating on delivery to the business before purchasing a new Big Iron array might be more fitting. I don't know about you, but I'm beginning to look at the Symmetrix and the like in the same way that I look at mainframes: with nostalgia and affection.

If you need one, you need one but ask yourself…do I really need one?

Flash Changed My Life

All the noise about all-flash arrays and acquisitions set me thinking a bit about SSDs and flash, and how they have changed things for me.

To be honest, the flash discussions haven't yet really impinged on my day-to-day job; we have the odd discussion about moving metadata onto flash but we don't need it quite yet. Most of what we do is large sequential I/O and spinning rust is mostly adequate; streaming rust, i.e. tape, is actually adequate for a great proportion of our workload. But we keep a watching eye on the market and on where the various manufacturers are going with flash.

But flash has made a big difference to the way that I use my personal machines and if I was going to deploy flash in a way that would make the largest material difference to my user-base, I would probably put it in their desktops.

Firstly, I now turn my desktop off; I never used to unless I really had to, because waiting for it to boot or even wake from sleep was at times painful, and sleep had a habit of not sleeping or flipping out on a restart. Turning the damn thing off is much better. This has had the consequence that I now have my desktops on an eco-plug which turns off all the peripherals as well; good for the planet and good for my bills.

Secondly, the fact that the SSD is smaller means that I keep less crap on it and am a bit more sensible about what I install. Much of my data is now stored on the home NAS environment which means I am reducing the number of copies I hold; I find myself storing less data. There is another contributing factor; fast Internet access means that I tend to keep less stuff backed-up and can stream a lot from the Cloud.

Although the SSD is smaller and probably needs a little more disciplined house-keeping, running a full virus check (which I do on occasion) is a damn sight quicker and there are no more lengthy defrags to worry about.

Thirdly, applications load a lot faster; although my desktop has lots of 'chuff' and can cope with lots of applications open, I am more disciplined about not keeping applications open because their loading times are that much shorter. This helps keep my system running snappily, as does shutting it down nightly, I guess.

I often find on my non-SSD work laptop that I have stupid numbers of documents open; some have been open for days and even weeks. This never happens on my desktop.

So all in all, I think if you really want bang for buck and to put smiles on many of your users' faces, the first thing you'll do is flash-enable the stuff that they do every day.

Long Term Review: Synology DS409

Over the past three years, my primary home NAS has been the Synology DS409; in that time I've built and trialled a number of home-build NAS solutions as well, but my core home NAS remains the DS409.

When I bought the DS409, I looked at and considered a number of competing solutions; Drobo and QNAP boxes came highly recommended and there are still plenty of people who swear by them.

The build quality of the DS409 is excellent and it still looks pretty much as good as new, but then again it is not as if I am kicking it across the room on a regular basis. I give it a regular clean-out with compressed air, just to blow the dust out of the fans; it still runs quiet and cool.

It currently has 4x1TB Western Digital drives in a RAID-5 configuration, with an eSATA drive attached to provide additional storage. These are carved up to provide NFS, SMB and iSCSI shares.

As well as providing traditional file-sharing capability, it is the print server for the house and also works as a DLNA and AirPlay server. If I didn't have a separate web-server, VPN server etc., it could do that for me too.

You can integrate it into an Active Directory domain if you so wish, and you have a variety of options for backing up: you can use an rsync-based back-up solution, back up into the S3 cloud or simply back up to a locally attached external disk.
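For what it's worth, the rsync-style approach is easy enough to reproduce by hand; a minimal sketch with a hypothetical share path and destination host (not the DS409's built-in backup jobs, which wrap the same idea in a scheduled GUI task):

```python
import subprocess

# Minimal rsync push of one NAS share to another box over SSH.
# The share path and destination are hypothetical; the DS409's own backup jobs
# wrap much the same idea in a GUI with scheduling.
SOURCE = "/volume1/photo/"                   # trailing slash: copy contents, not the directory
DEST = "backupuser@backupbox:/backups/photo/"

subprocess.run(
    ["rsync", "-a", "--delete", SOURCE, DEST],  # -a preserves attributes, --delete mirrors removals
    check=True,                                 # raise if rsync reports an error
)
```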

Synology continue to support and update the DS409 with firmware and features; the feature-set is constantly being improved, with features like Synology Hybrid RAID, which allows mixed-size drives to be used in a similar way to the Drobo, and CloudStation, which enables your Synology device to work as a private Cloud-storage device.

Synology are constantly improving their software and it is fairly admirable that they continue to update their software for products which they no longer sell. The user interface has improved significantly over time; it is simple and intuitive and if you need to, you can always drop back into the Linux command-line. Having access to the Linux command line means that there are a number of third party applications which can also be installed, it is a very hacker-friendly box.

The only thing it really lacks is significant integration with VMware, but most home users and probably most small businesses will not miss this at all.

When the time comes to replace my home NAS, Synology will be top of my list.

Highly recommended.