So we can thin-provision, de-dupe and compress storage; we can automate the movement of data between tiers. Now, no single array may have all of these features today, but pretty much every vendor has them road-mapped in some form or another. Storage Efficiency has been the watchword and long may it continue to be so.
All of these features reduce the amount of money that we have to pay for our spinning rust; this is mostly a capital saving with a limited impact on operational expenditure. But there is more to life than money and capital expenditure; storage needs to become truly efficient throughout its life-cycle, from acquisition to operation to disposal. And although some operational efficiencies have been realised, we are still some distance from a storage infrastructure that is efficient and effective throughout its life-cycle.
Storage Management software is still arguably in its infancy (although some may claim certain vendors' tools are in their dotage); the tools are very much focused on the provisioning task. Improving initial provisioning has been the focus of many of the tools and it has got much better; certainly most provisioning tasks are point-and-click operations from the GUI, and with thin and wide provisioning, much of the complexity has gone away.
But provisioning is not the be-all and end-all of Storage Administration and Management, and it is certainly only one part of the life-cycle of a storage volume.
Once a volume has been provisioned, many things can happen to it:
i) it could stay the same
ii) it could grow
iii) it could shrink
iv) it could move within the array
v) it could change protection levels
vi) it could be decommissioned
vii) it could be replicated
viii) it could be snapped
ix) it could be cloned
x) it could be deleted
xi) it could be migrated
And it is that last one which is particularly time-consuming and generally painful; as has been pointed out a few times recently, there is no easy way to migrate a NetApp 32-bit aggregate to a 64-bit aggregate; there is currently no easy way to move from a traditional EMC LUN to a Virtual Provisioned one; and these are just examples within an array.
Seamlessly moving data between arrays with no outage to the service is currently time-consuming and hard. Yes, it can be done; I've migrated terabytes of data between EMC and IBM arrays with no outage using host-based volume management tools, but this was when large arrays were less than 50 TB.
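For what it's worth, that host-based approach looks roughly like the sketch below on a Linux server using LVM; the device and volume group names are illustrative, and the details will vary with your multipathing setup and volume manager of choice.

# Present a LUN from the new array alongside the old one, then let the
# volume manager move the extents underneath the running application.
pvcreate /dev/mapper/new_array_lun                           # initialise the new LUN for LVM
vgextend datavg /dev/mapper/new_array_lun                    # add it to the existing volume group
pvmove /dev/mapper/old_array_lun /dev/mapper/new_array_lun   # online, extent-by-extent migration
vgreduce datavg /dev/mapper/old_array_lun                    # drop the old LUN once it is empty
pvremove /dev/mapper/old_array_lun                           # clear the LVM label before unzoning

All of this happens with the filesystem still mounted; the cost is host CPU, I/O bandwidth and a great deal of elapsed time.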
We also have to consider things like moving replication configuration, snapped data, cloned data, de-duped data, compressed data; will the data rehydrate in the process of moving? Even within array families and even between code levels, I have to consider whether all the features at level X of the code are available at level Y of the code.
As arrays get bigger, I could easily find myself in a constant state of migration; we turn our noses up at arrays of less than 100 TB, which is understandable when we are talking about estates of several petabytes, but moving hundreds of TB around so that we can refresh an array is no mean feat and will be a continuous process. Pretty much as soon as I've migrated the data, it's going to be time to consider moving it again.
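As a back-of-envelope illustration (the throughput figure is an assumption, not a measurement from any particular array): 100 TB at a sustained 500 MB/s is more than two days of pure copy time, before any validation or cut-over, and production migrations rarely get to run flat out.

echo "scale=1; (100 * 10^12) / (500 * 10^6) / 3600" | bc    # 100 TB at 500 MB/s ≈ 55.5 hours of copying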
There are things which vendors could consider; architectural changes which might make the process easier. Designing arrays with migration and movement in mind; ensuring that I don't have to move data to upgrade code levels; perhaps modularising the array, so that I can upgrade the controllers without changing the disks. Data-in-place upgrades have been available even for hardware upgrades; this needs to become standard.
Ways to export the existing configuration of an array and import it onto a new one, perhaps even using performance data collected from the existing array to optimise the layout, and then replicating the existing array's data to enable a less cumbersome migration approach; these are the things which would make the job of migration simpler.
Of course, the big problem is… these features are not really sexy and don't sell arrays. Headline features like De-Dupe, Compression, Automated Tiering and Expensive Fast Disks; they sell arrays. But perhaps once all arrays have them, then we'll see the tools that will really drive operational efficiencies appear.
P.S. I know, a very poor attempt at a Tabloid Alliterative Headline.
Please understand that I do not wish to do a pitch, and feel free to call me out if you think this is one, but I'd like to address the closing statement you made. I won't mention my company.
While I’m proud of our feature set, which includes many of the gee whiz items you list, I think the most powerful capability we offer is modularity. Customers can – and do – upgrade storage array components as needed (controllers, drives, FE/BE IO, software) and this has been a very strong selling point for us. And if someone doesn’t find the idea of owning a system which can be completely refit over time yet never gets a new serial number sexy… well, they haven’t gone through the pain you describe above.
Point is there are vendors doing what you ask and these capabilities should be a requirement on any storage RFx.
A very good and important post. As you rightly point out, the act of migrating will be a major headache given the number of features each array has. I remember the enormous trouble with a particular VTL solution when you wanted to upgrade from a non-dedupe version to a deduped version!
Time and time again I fall back on Linux LVM to do this. I've been using OpenFiler for some (by your standards) small stores of 15-20 TB and it usually comes down to attaching a remote iSCSI share and migrating the data over there with lvconvert -m1.
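Roughly, that approach looks like the sketch below; the target address, IQN, device and volume names are all placeholders, and older LVM mirrors may want an explicit mirror log option as shown.

iscsiadm -m discovery -t sendtargets -p 192.0.2.10                    # discover the remote target (placeholder address)
iscsiadm -m node -T iqn.2010-01.example:store1 -p 192.0.2.10 --login  # attach the remote iSCSI share
pvcreate /dev/sdX                                                     # initialise the new disk for LVM
vgextend vg0 /dev/sdX                                                 # add it to the volume group
lvconvert -m1 --mirrorlog core vg0/data /dev/sdX                      # mirror the volume onto the remote storage
lvconvert -m0 vg0/data /dev/sdOld                                     # once in sync, drop the original leg

pvmove would do the same job in one step; the mirror-and-split route just makes the point of no return more explicit.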
It’d be nice if storage vendors could interoperate, but this isn’t their MO; that’s left to the OS guys.