Storagebod


Monomyth

It really doesn’t matter which back-up technology you use; the myths are pretty much all the same, and unless you are aware of them, life will be more exciting for you than it should be…but perhaps that’s the point of myths…they bring excitement to a mundane existence.

1) 99% Back-Up Completion is Great. I’ve been guilty of this in the past when telling people how great my back-up team is…look, a 99% success rate; we’re awesome. Actually, it’s a good job that some of my customers in the past never realised what I was saying. Depending on what has failed, I might not be able to restore a critical service, yet I still have a great back-up completion rate.

2) Design Back-Up Policies. No, don’t do that; build restore policies and then work out what needs to be backed up to restore the service (see the sketch after this list).

3) Everything Needs to be Backed-Up. Closely related to the above; if you feel the need to back up an operating system several thousand times…feel free, I guess, but if you’ll never use it to restore a system, then in these days of automated build servers, Chef, Puppet and the like, you are probably wasting your time. Yes, those copies can probably be de-duped, but you are putting extra load on your back-up infrastructure for no reason.

4) Replication is a Back-Up. Nope, synchronous replication is not a back-up; if I delete a file, that change will be replicated in real time to the synchronous copy. It’s gone.

5) Snapshots are a Back-Up. Only if your snapshots are kept remotely; a snapshot on the same storage device can give you a fast recovery option, but if you have lost the array or even a RAID rank, you are screwed.

6) RAID is a Back-Up. Yes, people still believe this, and some people still believe that the world is flat.

7) Your Back-Up is Good. No back-up is good unless you have restored it; until you have, you potentially have no back-up.

8) Back-Up is IT’s Responsibility. No, it is a shared responsibility; it can only work well if the Business and IT work in partnership. The Business needs to work with IT to define data-protection and recovery targets; IT needs to provide a service that meets them, but IT does not know what your retention, legal and business objectives are.

9) Back-Up Teams are Not Important. Back-up teams are amongst the most important teams in your IT organisation. They can destroy your Business, steal your data and get access to almost any system they want…if they are smart and you are stupid!
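
On myth 2, here is a minimal sketch of what restore-first thinking can look like in practice, working backwards from recovery targets to a backup set. It is written in Python purely for illustration; the service names, RTO/RPO figures and components are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RestorePolicy:
    service: str            # the business service that must come back
    rto_hours: int          # how quickly it must be restored
    rpo_hours: int          # how much data loss is tolerable
    components: list        # the data sets needed to rebuild it

# Hypothetical restore policies, agreed with the Business rather than invented by IT.
policies = [
    RestorePolicy("online-orders", rto_hours=4, rpo_hours=1,
                  components=["orders-db", "payment-queue"]),
    RestorePolicy("reporting", rto_hours=48, rpo_hours=24,
                  components=["warehouse-db"]),
]

# Derive what must be backed up, and how often, from the restore policies.
# Anything not needed to restore a service (an OS image that Chef or Puppet
# can rebuild, for example) simply never makes the list.
backup_set = {}
for p in policies:
    for c in p.components:
        backup_set[c] = min(p.rpo_hours, backup_set.get(c, p.rpo_hours))

for dataset, interval in sorted(backup_set.items()):
    print(f"back up {dataset} at least every {interval} hour(s)")
```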


The Reptile House

I was fortunate enough to spend an hour or so with Amitabh Srivastava of EMC; Amitabh is responsible for the Advanced Software division at EMC and is one of the principal architects behind ViPR. It was an open discussion about the inspiration behind ViPR and where storage needs to go. And we certainly tried to avoid the ‘Software Defined’ meme.

Amitabh is not a storage guy; in fact, his previous role with Microsoft sticks him firmly in the compute/server camp, but it was his experience in building out the Azure Cloud offering which brought him an appreciation of the problems that storage and data face going forward. He has some pretty funny stories about how the Azure Cloud came about and the learning experience it was; how he came to realise that this storage stuff was pretty interesting and more complex than just allocating some space.

Building dynamic compute environments is pretty much a solved problem; you have a choice of solutions and fairly mature ones. Dynamic networks are well on the way to being solved.

But building a dynamic and agile storage environment is hard and it’s not a solved problem yet. Storage, and more importantly the data it holds, has gravity or, as I like to think of it, long-term persistence. Compute resource can be scaled up and down; data rarely scales down and generally hangs around. Data analytics just means that our end-users are going to hug data for longer. So you’ve got this heavy and growing thing…it’s not agile, but there needs to be some way of making it appear more agile.

You can easily move compute workloads and it’s relatively simple to change your network configuration to reflect these movements, but moving large quantities of data around is a non-trivial thing to do…well, at speed anyway.

Large Enterprise storage environments are heterogeneous environments and dual-supplier strategies are common; sometimes to keep vendors honest, but often there is an acceptance that different arrays have different capabilities and use-cases. Three or four years ago, I thought we were heading towards general-purpose storage arrays; we now have more niche and siloed capabilities than ever before. Driven by developments in all-flash arrays, commodity hardware and new business requirements, the environment is getting more complex, not simpler.

Storage teams need a way of managing these heterogeneous environments in a common and converged manner.

And everyone is trying to do things better, cheaper and faster; operational budgets remain pretty flat and headcounts are frozen or shrinking. Anecdotally, talking to my peers, arrays are hanging around longer and refresh cycles have lengthened somewhat.

EMC’s ViPR is an attempt to solve some of these problems.

Can you lay a new access protocol on top of already existing and persistent data? Can you make it so that you don’t have to migrate many petabytes of data to enable a new protocol? And can you ensure that your existing applications and new applications can use the same data without a massive rewrite? Can you enable your legacy infrastructure to support new technologies?

The access protocol in this case is Object; for some people, Object Storage is religion…all storage should be object, so why the hell would you want some kind of translation layer? But unfortunately, life is never that simple; if you have a lot of legacy applications running and generating useful data, you probably want to protect your investment and continue to run those applications, but you might also want to mine that data using newer applications.

This is heresy to many but reflects today’s reality; if you were starting with a green-field, all your data might live in an object-store but migrating a large existing estate to an object-store is just not realistic as a short term proposition.

ViPR enables your existing file-storage to be accessible as both file and object. Amitabh also mentioned block, but I struggle to see how you would be able to treat a raw block device as an object in any meaningful manner. Perhaps that’s a future conversation.

But in the world of media and entertainment, I could see this capability being useful; in fact, I can see it enabling some workflows to run more efficiently, so that an asset can be acquired and edited in the traditional manner and then ‘move’ into play-out as an object with rich metadata, without actually moving around the storage environment.
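
A rough sketch of what that dual access might look like from the application side: the edit suite keeps reading the asset as a file over NFS, while a play-out or archive tool reads the same asset as an object over an S3-compatible interface. The mount point, endpoint URL, bucket and key below are invented for illustration, and this is not the actual ViPR API; boto3 is used simply as a generic S3 client.

```python
import boto3

# The asset as the edit suite sees it: an ordinary file on an NFS mount
# (the path is hypothetical).
nfs_path = "/mnt/media/incoming/episode-042.mxf"
with open(nfs_path, "rb") as f:
    header = f.read(1024)            # traditional, POSIX-style access

# The same asset as the play-out system might see it: an object with metadata,
# fetched from an S3-compatible gateway (endpoint, bucket and key illustrative only).
s3 = boto3.client("s3", endpoint_url="https://object-gw.example.internal")
obj = s3.get_object(Bucket="media", Key="incoming/episode-042.mxf")
print(obj["Metadata"])               # the rich metadata travels with the object
payload = obj["Body"].read(1024)     # same bytes, different presentation layer
```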

Amitabh also discussed the possibility of being able to ‘HDFS’ your existing storage, allowing analytics to be carried out on data in place without moving it. I can see this being appealing, but performance, locking and the like become challenging.

But ultimately moving to an era where data persists but is accessible in appropriate ways without copying, ingesting and simply buying more and more storage is very appealing. I don’t believe that there will ever be one true protocol; so multi-protocol access to your data is key. And even in a world where everything becomes objects, there will almost certainly be competing APIs and command-sets.

The more real part of ViPR (when I say real, I mean the piece I can see a huge need for today) is the abstraction of the control-plane, making it look and work the same for all the arrays that you manage. Yet after the abomination that is Control Center, can we trust EMC to make storage management easy, consistent and scalable? Amitabh has heard all the stories about Control Center, so let’s hope he’s learnt from our pain!

The jury doesn’t even really have any hard evidence to go on yet but the vision makes sense.

EMC have committed to openness around ViPR as well; I asked the question…what if someone implements your APIs and makes a better ViPR than ViPR? Amitabh was remarkably relaxed about that; they aren’t going to mess about with the APIs for competitive advantage, and if someone does a better job than them, then that someone deserves to win. They obviously believe that they are the best; if we move to a pluggable and modular storage architecture where it is easy to drop in replacements without disruption, they had better be the best.

A whole ecosystem could be built around ViPR; EMC believe that if they get it right, it could be the on-ramp for many developers to build tools around it. They are actively looking for developers and start-ups to work with ViPR.

Instead of writing tools to manage a specific array, it should be possible to write tools that manage all of the storage in the data-centre. Obviously this is reliant on either EMC or other storage vendors implementing the plug-ins to enable ViPR to manage a specific array.

Will the other storage vendors enable ViPR to manage their arrays and hence increase the value of ViPR? Or will it be left to EMC to do it? Well, at launch, NetApp is already there. I didn’t have time to drill into which versions of ONTAP, however, and this is where life could get tricky; the ViPR control layer will need to keep up with the releases from the various vendors. But as more and more storage vendors look at how their storage integrates with the various virtualisation stacks, consistent and early publication of their control functionality becomes key. EMC can use this as enablement for ViPR.

If I were a start-up, for example, ViPR could enable me to fast-track the management capability of my new device. I could concentrate on the storage functionality and capability of the device and not on the peripheral management functionality.

So it’s all pretty interesting stuff, but it’s certainly not a foregone conclusion that this will succeed and it relies on other vendors coming to play. It is something that we need; we need the tools that will enable us to manage at scale, keeping our operational costs down and not having to rip and replace.

How will the other vendors react? I have a horrible suspicion that we’ll just end up with a mess of competing attempts and it will come down to the vendor who ships the widest range of support for third-party devices. But before you dismiss this as just another attempt from EMC to own your storage infrastructure: if a software vendor had shipped or announced something similar, would you dismiss it quite so quickly? ViPR’s biggest strength and weakness is…EMC!

EMC have to prove their commitment to openness and that may mean that, in the short term, they do things that seriously assist their competitors at some cost to their own business. I think they need to treat ViPR almost like they did VMware; at one point, it was almost more common to see a VMware and NetApp joint pitch than one involving EMC.

Oh, they also have to ship a GA product. And probably turn a tanker around. And win hearts and minds, show that they have changed…

Finally, let’s forget about Software Defined Anything; let’s forget about trying to redefine existing terms; it doesn’t have to be called anything…we are just looking for Better Storage Management and Capability. Hang your hat on that…


Apple Defaults to Windows Standard

I was looking at the Apple documentation around Mavericks as I was interested to see how they were intending to make more use of the extensive metadata they have available for stored files; the keynote made me wonder whether they were beginning to transition to something more object-like. That would make a lot of sense in my world; it’d certainly give some of the application vendors a decent kick in the right direction.

And I came across this snippet which will upset some die-hard Mac-fans but make some people who integrate Macs into corporate environments pretty happy.

SMB2

SMB2 is the new default protocol for sharing files in OS X Mavericks. SMB2 is superfast, increases security, and improves Windows compatibility.

It seems that Apple are finally beginning to deprecate AFP and wholeheartedly embrace SMB2; yes, I know some of us might have preferred NFS, but it is a step in the right direction. And Apple changing a default protocol to improve Windows compatibility; who’d have thunk it. Still, it appears that Apple are continuing with the horrible resource forks!!

And the big storage vendors will be happy…because they can finally say that they support the default network file system on OSX.

No, I can’t see evidence for a whole-hearted embracing of Object Storage yet..


Change Coming?

Does your storage sales rep have a haunted look yet? Certainly if they work for one of the traditional vendors, they should be beginning to look hunted and concerned about their long-term prospects; not this year’s figures and probably not next year’s, but the year after? If I were working in storage sales, I’d be beginning to wonder what my future holds. Of course, most sales people look no further than the next quarter’s target, but perhaps it’s time to worry them a bit.

Despite paying lip-service to storage as software, very few of the traditional vendors (and surprisingly few start-ups either) have really embraced this and taken it to its logical conclusion: commoditisation of high-margin hardware sales is going to come and, despite all the vendors’ efforts to hold it back, it is going to change their business.

Now I’m sure you’ve read many blogs predicting this and you’ve even read vendor blogs telling you how they are going to embrace this; they will change their market and their products to match this movement. And yet I am already seeing mealy-mouthed attempts to hold this back or slow it down.

Roadmaps are pushing commoditisation further off into the distance; rather than a whole-hearted endorsement, I am hearing HCLs and limited support. Vendors are holding back on releasing virtual editions because they are worried that customers might put them into production. Is the worry that they won’t work, or perhaps that they might work too well?

Products which could be used to commoditise an environment are being hamstrung by only running on certified equipment, and for what is very poor reasoning; unless the reasoning is to protect a hardware business. I can point to examples in every major vendor, from EMC to IBM to HDS to HP to NetApp to Oracle.

So what is going to change this? I suspect customer action is the most likely vector for change. Cheap and deep for starters; you’d probably be mad not to seriously consider looking at a commodity platform and open-source. Of course, vendors are going to throw a certain amount of FUD about, but as with Linux before it, momentum is beginning to grow, with lots of little POCs popping up.

And there are other things round the corner which may well kick this movement yet further along. 64-bit ARM processors have been taped out; we’ll begin to see servers based on those over the next couple of years. Low-power 64-bit servers running Linux and one of a multitude of open-source storage implementations will become two-a-penny; as we move to scale-out storage infrastructure, these will start to infiltrate larger data-centres and will rapidly move into the appliance space.

Headaches not just for the traditional storage vendors but also for Chipzilla; Chipzilla has had the storage market sewn-up for a few years but I expect ARM-based commodity hardware to push Chipzilla hard in this space.

Yet with all the focus on flash-based storage arrays, hybrid arrays and the like, everyone is currently concentrating on the high-margin hardware business. No vendor is really showing their hand in the cheap-and-deep space; they talk about big data, they talk about software-defined storage…they all hug those hardware revenues.

No, many of us aren’t engineering companies like Google and Facebook but the economics are beginning to look very attractive to some of us. Data growth isn’t going away soon; the current crop of suppliers have little strategy apart from continuing to gouge…many of the start-ups want to carry on gouging whilst pretending to be different.

Things will change.

Storage is Interesting…

A fellow blogger has a habit of referring to storage as ‘snorage’ and I suspect that is the attitude of many. What’s so interesting about storage; isn’t it just the place where you keep your stuff? Many years ago, as an entry-level systems programmer, there were two teams that I was never going to join…one being the test team and the other being the storage team, because they were boring. Recently I have run both a test team and a storage team and enjoyed the experience immensely.

So why do I keep doing storage? Well, firstly I have little choice but to stick to infrastructure; I’m a pretty lousy programmer and it seems that I can do less damage in infrastructure. If you ever received more cheque-books in the post from a certain retail bank, I can only apologise.

But storage is cool; firstly, it’s BIG and EXPENSIVE; who doesn’t like raising orders for millions? It is also so much more than that place where you store your stuff; you have to get it back, for starters. I think that people are beginning to realise that storage might be a little more complex than first thought; a few years ago, the average home user only really worried about how much disk they had, but the introduction of SSDs into the consumer market has hammered home that the type of storage matters and the impact it can have on the user experience.

Spinning rust platters keep getting bigger, but for many this just means that the amount of free disk keeps increasing; the increase in speed is what people really want. Instant-on…it changes things.

So even in the consumer market, storage is taking on a multi-dimensional personality; it scales both in capacity and in speed. In the Enterprise, things are more interesting still.

Capacity is obvious; how much space do you need? Performance? Well, performance is more complex and has more facets than most realise. Are you interested in IOPS? Are you interested in throughput? Are you interested in aggregate throughput or single stream? Are you dealing with large or small files? Large or small blocks? Random or sequential?
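
As a back-of-the-envelope illustration of why those facets matter, the same headline IOPS figure translates into wildly different throughput depending on I/O size. The numbers below are purely illustrative, not taken from any particular array.

```python
def throughput_mb_s(iops: int, io_size_kb: int) -> float:
    """Rough throughput implied by a given IOPS figure at a given I/O size."""
    return iops * io_size_kb / 1024.0

# Two workloads, both quoted as '50,000 IOPS' (illustrative numbers only):
print(throughput_mb_s(50_000, 4))      # ~195 MB/s of 4KB random I/O
print(throughput_mb_s(50_000, 1024))   # ~50,000 MB/s of 1MB I/O - which no single
                                       # array will actually sustain; the point is
                                       # that 'fast' means nothing without the profile
```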

Now, for 80% of use-cases you can probably get away with taking a balanced approach and just allocating storage from a general-purpose pool. But 20% of your applications are going to need something different, and that is where it gets interesting.

Most of the time, when I have conversations with application teams or vendors and ask what type of storage they require, the answer that comes back is generally ‘fast’. There then follows a conversation about what fast actually means and whether the budget meets their desire to be fast.

If we move to ‘Software Defined Storage’, this could be a lot more complex than people think. Application developers may well have to really understand how their applications store data and how they interact with the infrastructure that they live on. Pick the wrong pool and your application performance could drop through the floor; pick the wrong availability level and you could experience a massive outage.

So if you thought storage was snorage (most developers and plenty of other people still do), you might want to start taking an interest. If infrastructure becomes code, I may need to get better at coding, but some of you are going to have to get better at infrastructure. Move beyond fast and large and understand the subtleties; it is interesting…I promise you!

Snakebite….

So EMC have unveiled ViPR, their software-defined storage initiative; like many EMC World announcements, there’s not a huge amount of detail, especially if you aren’t at EMC World. It has left many of my blogger peers scratching their heads and wondering what the hell it is and whether it is something new.

Now, like them, I am in that very same camp, but unlike them I am foolish enough to have a bit of a guess and make myself look a fool when the EMCers descend on me and tell me how wrong I am.

Firstly, let me say what I think it isn’t; I really don’t believe that it is a storage virtualisation product in the same way that SVC and VSP are. The closest EMC have to a product like those is VPLEX, a product which sits in the data-path and virtualises the disk behind it; I don’t think ViPR is a product like that. Arguably these products are mis-named anyway; I think of them as storage federation products.

So that is what ViPR isn’t (and can I say that I really hate products with a mix of upper and lower case in their names!).

It is worth looking back in time to one of EMC’s most hated products (by me and many users): Control Center. I think ViPR might have some roots in ECC; to me it feels as though someone has taken Control Center and turned it into a web service, so instead of interacting via a GUI, you interact via an API.

And I wonder if that was how the control component of ViPR came about; when rewriting the core of ECC, I posit that it was abstracted away from the GUI component and perhaps some bright spark came along and thought…what if we exposed the core via an API?

Okay, it might not have been ECC, it could have been Unisphere, but this seems a fairly logical thing to do. So perhaps the core of ViPR is nothing really that new; it’s just a change in presentation layer.

[Update: So a lot of the code came from Project Orion, which Chad talks about here. It has been kicking around EMC for some time, and this kind of programmable interface was being discussed and asked for at various ECC user-groups/briefings prior to that.]

Then EMC have brought some additional third-party arrays into the mix; NetApp seems to be the first one, perhaps using IP that EMC picked up when they bought the UK company WysDM, who had both a very nice backup reporting tool and a NAS/fileserver management tool.

Building additional third-party support should be relatively simple using either a vendor’s CLI or, in some cases, an exposed API.

So there you go: ViPR is basically a storage management tool without a GUI, or at least one where the GUI is optional. And with its REST API, perhaps you could build your own GUI or your own CLI? Or perhaps your development teams can get on and generally consume all the storage you’ve got, but in a programmatic way.
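
To make ‘GUI optional’ concrete, this is roughly what programmatic consumption looks like: a handful of lines against a REST endpoint. The URL, payload and response fields below are invented for illustration; they are not the actual ViPR API, just a sketch of the pattern.

```python
import requests

BASE = "https://storage-controller.example.internal/api"    # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <token>"}                # auth hand-waved

# Ask the control plane for a volume from a named pool, without caring
# which vendor's array ends up providing it.
request_body = {"pool": "general-purpose", "size_gb": 500, "host": "app-server-01"}
resp = requests.post(f"{BASE}/volumes", json=request_body,
                     headers=HEADERS, timeout=30)
resp.raise_for_status()

volume = resp.json()                                         # illustrative response fields
print(volume["id"], volume["array"], volume["wwn"])
```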

It all seems pretty obvious and begs the question of why no-one did this before. I think it might have been arrogance and complacency; a tool like this should make it easier to plug anyone’s storage into your estate.

But if this was all ViPR was, it’d be pretty tedious. Still, EMC obviously read my blog and obviously read this and rapidly turned it into a product; or perhaps they simply talk to lots of people too. If I’ve thought it, plenty of others have.

Object Storage has struggled to find a place in many Enterprises; it doesn’t lend itself to many applications and many developers just don’t get it. But for some applications it is ideal; it seems that it would be better to have both object and file access to the same data, and you probably don’t want to store it twice either.

So yet again, it’s all about changing the presentation layer without impacting the underlying constructs. However, unlike the more traditional gateways into an object store, EMC are putting an object gateway onto an NFS/SMB share (note to Chuck: call it SMB, not CIFS). Now, this is almost certainly going to have to sit in the data-path for objects. There will be some interesting locking and security-model challenges and the like; simultaneous NFS/SMB and object access is going to be interesting.

It will also require the maintenance of a separate metadata-store, something with a fast database to get that metadata out of. And perhaps EMC own some technologies to do this as well. A loosely coupled metadata store does bring some problems but it allows EMC to leverage Isilon’s architecture and also grab hold of data sitting on 3rd party devices.

[Update: It seems that EMC are using Cassandra as their underlying database. Whether it is Object-on-File or File-on-Object, I’m not sure, but whatever happens, it is allowing you access via object or file.]

So ViPR is really at least two products; not one. So..perhaps it’s a Snakebite..

Question is…will it leave them and us lying in the gutter staring at the stars wondering why everyone is looking at us strangely?


Can Pachyderms Polka?

Chris’ pieces on IBM’s storage revenues here and here make for some interesting reading. Things are not looking great with the exception of XIV and Storwize products. I am not sure if Chris’ analysis is entirely correct as it is hard to get any granularity from IBM. But it doesn’t surprise me either; there are some serious weaknesses in IBM’s storage portfolio.

Firstly, there is still an awful lot of OEMed kit from NetApp in the portfolio; it certainly appears that this is not selling, or not being sold, as well as it was in the past. So IBM’s struggles have an interesting knock-on effect for NetApp.

IBM are certainly positioning the Storwize products in the space which was traditionally occupied by the OEMed LSI (now NetApp) arrays; pricing is pretty aggressive and places them firmly in the space occupied by other competing dual-head arrays. And they finally have a feature set to match their competitors, certainly in the block space.

XIV seems to compete pretty well when put up against the lower-end VMAX and HDS ‘enterprise-class’ arrays. It is incredibly easy to manage and performs well enough, but it is not the platform for the most demanding applications. But IBM have grasped one of the underlying issues with storage today; that is, it all needs to be simplified. I still have some doubts about the architecture, but XIV have tried to solve the spindle-to-gigabyte issue. There is no doubt in my mind that traditional RAID-5 and 6 are broken in the long term; if not today, then very soon. The introduction of SSDs appears to have removed some of the architecture’s more interesting performance characteristics. XIV is a great example of ‘good enough’.
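
On the ‘RAID-5 and 6 are broken’ point, a rough worked example shows why: as drives get bigger, the chance of hitting an unrecoverable read error somewhere during a rebuild climbs uncomfortably high. The drive size, group width and error rates below are assumed, typical published figures rather than measurements.

```python
import math

def rebuild_ure_probability(drive_tb: float, surviving_drives: int,
                            ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error while reading every
    surviving drive in full during a rebuild, assuming independent errors."""
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    return 1 - math.exp(-bits_read * ure_per_bit)

# A hypothetical 7+1 RAID-5 group of 4TB drives rated at 1 error in 10^14 bits:
print(rebuild_ure_probability(4, 7))           # ~0.89 - the rebuild will probably hit one
# The same group with drives rated at 1 in 10^15:
print(rebuild_ure_probability(4, 7, 1e-15))    # ~0.20 - better, but hardly comforting
```

RAID-6 adds a second parity to survive exactly this failure mode, at the cost of longer and heavier rebuilds, which is part of what distributed schemes such as XIV’s are trying to avoid.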

So IBM have some good products from the low-end to the lowish-enterprise block space. Of course, there is an issue in that they seriously overlap; nothing new there though, I’ve never known a company compete against itself so often.

DS8K only really survives for one reason: to support the mainframe. If IBM had been sensible and had the foresight, they would have looked at FICON connectivity for SVC and done it. Instead, IBM decided that mainframe customers were so conservative that they would never accept a new product, or at least that it would take ten years or so for them to do so. So now they are going to end up building and supporting the DS8K range for another ten years at least; if they’d invested the time earlier, they could be considering sunsetting the DS8K.

But where IBM really, really suffer and struggle is in the NAS space. They’ve had abortive attempts at building their own products; these days they re-sell NetApp in the form of the nSeries and also have SONAS/V7000-Unified. Well, the nSeries is NetApp; it gets all of the advantages and disadvantages that brings, i.e. a great product whose best days seem behind it at present.

SONAS/V7000-Unified are not really happening for IBM; although built on solid foundations, the delivery has not been there and IBM really have no idea how to market or sell the product. There have been some quality issues, and arguably the V7000-Unified was rushed and not thought all the way through. I mean, who thought a two-node GPFS cluster was ever a good idea for a production system?

And that brings me on to my favourite IBM storage product: GPFS. The one that I will laud to the hills; a howitzer of a product which will let you blow your feet off but which could also be IBM’s edge. Yet in the decade and a bit that I have been involved with it, IBM has almost never sold it. Customers buy it, but really you have to know about it; most IBM sales people would have no idea where to start or even when it might be appropriate.

At the GPFS User Group this week, I saw presentations on GPFS with OpenStack, Hadoop, hints of object-storage and more. But you will probably never hear an IBMer outside of a very select bunch talk about it. If IBM were EMC, you’d never hear them shut-up about it.

One of the funniest things I heard at the GPFS User Group was from the guys who repurposed an Isilon cluster as a GPFS cluster. It seems it might work very well.

I personally think it’s about time that IBM open-sourced GPFS and put it into the community. It’s too good not to, and perhaps the community could turn it into the core of a software-defined storage solution to shake a few people up. I could build half a dozen interesting appliances tomorrow.

Still I suspect like Cinderella, GPFS will be stuck in the kitchen waiting for an invite to the ball.

Object Paucity

Another year, another conference season sees me stuck on this side of the pond watching the press releases from afar, promising myself that I’ll watch the keynotes online or ‘on demand’, as people have it these days. I never find the time and have to catch up with the 140-character synopses that regularly appear on Twitter.

I can already see the storage vendors pimping their wares at NAB, especially the object storage vendors who want to push their stuff. Yet it still isn’t really happening…

I had a long chat recently with one of my peers who deals with the more usual side of IT; the IT world full of web-developers and the like. He’d spent many months investigating Object Storage, putting together a proposition firmly targeted at the development community: object APIs and the like, S3-compatible, storage-on-demand built on solid technology.

And what has he ended up implementing? A bloody NFS/CIFS gateway into their shiny new object storage, because it turns out that what the developers really want is a POSIX file-system.

Sitting here on the broadcast/media side of the fence, where we want gobs of storage provisioned quickly to store large objects with relatively intuitive metadata, we are finding the same thing. I’ve not gone down the route of putting in an object storage solution because finding one which is supported across all the tools in today’s workflows is near impossible. So it seems that we are looking more and more to NFS to provide us with the sort of transparency we need to support complex digital workflows.

I regularly suggest that we put in feature requests to the tools vendors to at least support S3; the looks I generally get are ones of quiet bemusement or outright hostility, with mutterings about Amazon and Cloud.

Then again, look how long it has taken for NFS to gain general acceptance and for vendors to not demand ‘proper’ local file-systems. So give it 20 years or so and we’ll be rocking.

If I were an object storage vendor and didn’t have my own gateway product, I’d be seriously considering buying or building one. I think it’s going to be a real struggle otherwise, and it’s not the Operations teams who are your problem.

Me, I’d love for someone to put an object-storage gateway into the base operating system; I’d love to be able to mount an object-store and have it appear on my desktop. At that point at least, I might be able to con some of the tools into working with an object-store. If anyone has a desktop gateway which I can point at my own S3-like store, I’d love to have a play.


/dev/null – The only truly Petascale Archive

As data volumes increase in all industries and the challenges of data management continue to grow, we look for places to store our growing data hoard, and inevitably the subject of archiving and tape comes up.

It is the cheapest place to archive data by some way; my calculations currently give it a four-year cost somewhere in the region of five to six times cheaper than the cheapest commercial disk alternative. However, tape’s biggest advantage is almost its biggest problem; it is considered to be cheap and hence, for some reason, no-one factors in the long-term costs.

Archives by their nature live for a long time; more and more companies are talking about archives which will grow and exist forever. And as companies no longer seem able to categorise data into data to keep and data not to keep, with exponential data growth and generally bad data management on top, multi-year, multi-petabyte archives will eventually become the norm for many.

This could spell the death of the tape archive as it stands, or it will necessitate some significant changes in both user and vendor behaviour. A ten-year archive will see at least four refreshes of the LTO standard on average; this means that your latest tape technology will not be able to read your oldest tapes. It is also likely that you are looking at some kind of extended maintenance and associated costs for your oldest tape drives; they will certainly be past End of Support Life. Media may be certified for 30 years; drives aren’t.

Migration will become a way of life for these archives and it is this that will be a major challenge for storage teams and anyone maintaining an archive at scale.

It currently takes 88 days to migrate a petabyte of data from LTO-5 to LTO-6; this assumes 24×7 operation, no drive issues, no media issues and a pair of drives dedicated to the migration. You will also be loading about 500 tapes and unloading about 500 tapes. You can cut this time by putting in more drives, but your costs will soon start to escalate as SAN ports, servers and peripheral infrastructure mount up.
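
For what it’s worth, that figure is easy enough to sanity-check from first principles. The sketch below uses the native LTO-5 transfer rate (the read side is the bottleneck) and an assumed per-tape handling time; both are round numbers for illustration rather than measured values.

```python
PB = 1e15                  # one petabyte in bytes
LTO5_RATE = 140e6          # ~140 MB/s native LTO-5 read rate, the limiting factor
TAPES = 500                # roughly the number of source tapes quoted above
HANDLING_S = 10 * 60       # assumed load/position/unload time per tape, in seconds

transfer_days = PB / LTO5_RATE / 86_400
handling_days = TAPES * HANDLING_S / 86_400
print(f"transfer: {transfer_days:.0f} days, "
      f"tape handling: {handling_days:.0f} days, "
      f"total: {transfer_days + handling_days:.0f} days")
# -> roughly 83 + 3 days with a single read/write drive pair running 24x7,
#    before a single recall, drive fault or media error gets in the way.
```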

And then all you need is for someone to recall the data whilst you are trying to migrate it; 88 days is extremely optimistic.

Of course, a petabyte seems an awful lot of data, but archives of a petabyte-plus are becoming less uncommon. The vendors are pushing the value of data, so no-one wants to delete what is a potentially valuable asset. In fact, working out the value of an individual datum is extremely hard and hence we tend to place the same value on every byte archived.

So although tape might be the only economical place to store data today, as data volumes grow it becomes less viable as a long-term archive, unless it is a write-once, read-never (and I mean never) archive…and if that is the case, then in Unix parlance, /dev/null is perhaps the only sensible place for your data.

But if you think your data has value, or more importantly your C-levels think that your data has value, there’s a serious discussion to be had…before the situation gets out of hand. Just remember, any data migration which takes longer than a year will most likely fail.

Service Power..

Getting IT departments to start thinking like service providers is an uphill struggle; getting beyond cost to value seems to be a leap too far for many. I wonder if it is a psychological thing, driven by a fear of change but also a fear of assessing value.

How do you assess the value of a service? Well, arguably, it is quite simple…it is worth whatever someone is willing to pay for it. And with the increased prevalence of service providers vying with internal IT departments, it should be relatively simple; they’ve pretty much set the baseline.

And then there are the things that the internal IT department should just be able to do better; they should be able to assess Business need better than an external provider can. They should know the Business and be listening to the ‘water cooler’ conversations.

They should become experts in what their company does; understand the frustrations and come up with ways of doing things better.

Yet there is often a fear of presenting the Business with innovative and better services. I think it is a fear of going to the Business and presenting a costed solution; there is a fear of asking for money. And there is certainly a fear of Finance but present the costs to the Business users first and get them to come to the table with you.

So we offer the same old services and wonder why the Business are going elsewhere to do the innovative stuff and, while they are at it, start procuring the services we used to provide. Quite frankly, many corporate IT departments are in a death spiral, trying to hang on to things that they could let go.

Don’t think ‘I can’t ask the Business for this much money to provide this new service’…think ‘what if the Business wants this service and asks someone else?’ At least you will be bidding on your own terms and not being forced into a competitive bid against an external service provider; when it comes down to it, the external provider almost certainly employs a better sales team than you.

By proposing new services yourself, or perhaps even taking existing ‘products’ and turning them into a service, you are choosing the battleground yourself…you can find the high ground and fight from a position of power.