So disk utilisation rears its ugly head again and again and again; some reports suggest that things are getting worse, not better. And you know what, I think they may get worse still in a lot of shops. There are a few trends which lead me to this conclusion.
1) Focus on the cost per byte stored; disks are getting larger and so everyone expects the costs to fall.
2) Greater choice in disk speeds, sizes and types.
4) Continued data growth.
5) Poor management tools.
If we continue to focus merely on the cost per byte, without considering the cost of accessing the data in a manner which is acceptable to the business, we will find that procurement teams and CIOs focus on purchasing capacity at the cheapest price.
This may actually be acceptable; it may be that you can buy lots of slower disks which give you the performance you require, but you will be sacrificing 50% of that capacity because the drives are saturated. So rather than focusing on capacity utilisation figures alone, a more complex model needs to be built which takes account of all the factors, be it physical space, energy consumption, management overhead and so on.
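By way of illustration, here is a minimal sketch of the kind of model I mean, in Python; the function name, the cost categories and every figure in it are assumptions invented for the example, so substitute your own numbers and add whatever factors matter in your shop.

```python
# Rough cost-per-usable-TB model: a sketch only, with invented figures.
# Substitute your own prices, power ratings and overheads.

def cost_per_usable_tb(raw_tb, price_per_raw_tb, usable_fraction,
                       watts_per_tb, price_per_kwh, years,
                       rack_units, price_per_ru_year,
                       admin_hours_per_tb_year, hourly_rate):
    """Total cost of ownership divided by the capacity you can actually use."""
    capex = raw_tb * price_per_raw_tb
    energy = raw_tb * watts_per_tb / 1000 * 24 * 365 * years * price_per_kwh
    space = rack_units * price_per_ru_year * years
    admin = raw_tb * admin_hours_per_tb_year * years * hourly_rate
    usable_tb = raw_tb * usable_fraction
    return (capex + energy + space + admin) / usable_tb

# Cheap, slow drives where saturation limits you to 50% of the capacity
cheap = cost_per_usable_tb(raw_tb=500, price_per_raw_tb=150, usable_fraction=0.5,
                           watts_per_tb=8, price_per_kwh=0.15, years=4,
                           rack_units=40, price_per_ru_year=200,
                           admin_hours_per_tb_year=0.5, hourly_rate=60)

# Dearer, faster drives that you can drive to 80% of the capacity
fast = cost_per_usable_tb(raw_tb=300, price_per_raw_tb=400, usable_fraction=0.8,
                          watts_per_tb=10, price_per_kwh=0.15, years=4,
                          rack_units=24, price_per_ru_year=200,
                          admin_hours_per_tb_year=0.3, hourly_rate=60)

print(f"cheap tier: {cheap:,.0f} per usable TB")
print(f"fast tier:  {fast:,.0f} per usable TB")
```

The point is that the cheapest cost per raw byte is not necessarily the cheapest cost per byte you can actually use.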
Greater choice leads to more chance of the wrong disks being procured, but it also introduces more complexity into the environment. Tiering may actually result in poorer utilisation figures if it is poorly planned and implemented. My experience suggests you need a lot less Tier 1 than you think, and depending on your procurement model, you may find you've purchased more Tier 1 than you require. This leads to an oversupply of Tier 1; there will be an initial reluctance to use it as Tier 2, but I suggest that pragmatism is the order of the day. You should be able to move your data around later!
Data growth means that hard-pushed teams are being asked to manage more and more storage. There simply isn't the time to move the data around manually and ensure that it is in the right place.
Management tools do not make it easy to report on the storage estate or to get a consolidated view of it.
So what’s the answer?
Firstly, understand that the utilisation model is more complex than some of the vendors would have you believe. Use your own figures and models to work out what poor utilisation costs you, and run some comparison figures; don't believe everything you are told.
Secondly, investigate thin provisioning and wide striping; they are going to help! There is no doubt in my mind about this.
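To show the sort of back-of-the-envelope sum that makes the case for thin provisioning, here is a rough sketch; the volume figures and the 20% headroom are invented assumptions, not anyone's real estate.

```python
# Back-of-the-envelope thin provisioning comparison: invented figures only.
# Each tuple is (TB allocated to a host, TB actually written by that host).
volumes = [(10, 3), (20, 5), (8, 8), (50, 12), (15, 4)]

allocated = sum(alloc for alloc, _ in volumes)
written = sum(used for _, used in volumes)

# Fat provisioning has to buy the full allocation up front;
# thin provisioning only consumes what is written (plus some headroom).
headroom = 1.2  # 20% safety margin, an arbitrary assumption
thin_required = written * headroom

print(f"allocated: {allocated} TB, actually written: {written} TB")
print(f"fat provisioning purchase: {allocated} TB")
print(f"thin provisioning purchase: {thin_required:.0f} TB "
      f"({100 * (1 - thin_required / allocated):.0f}% less)")
```

Run your own numbers; the gap between what hosts ask for and what they write is usually where the utilisation figures go to die.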
Vendors will tell you that virtualisation will help as well; it might, but it might simply give you more headaches. Look at it carefully and think about how you might use it, but it is not a magic bullet.
But the one thing which is going to help you most is automagic, rules-based management tools. Face it: with multiple disk tiers, multiple applications with poorly defined NFRs, incessant data growth and the expectation of doing more with the same number of people (if you are lucky), you are not going to cope.
We are building an environment which is too complex and large to manage without automation; you will no longer be able to tune it by ear and the discordance you will introduce by trying will make a cat's choir sound like music.
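To make the idea concrete, here is a sketch of the sort of rules such a tool might apply automatically; the thresholds, the tier names and the Dataset fields are purely illustrative assumptions, not any real product's policy engine.

```python
# Illustrative rules-based tiering policy: thresholds and tier names are
# invented for the sketch, not taken from any real product.
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    iops_per_tb: float      # observed I/O density
    days_since_access: int  # age of the most recent access

def place(ds: Dataset) -> str:
    """Pick a tier from simple, ordered rules; first match wins."""
    if ds.days_since_access > 180:
        return "archive"
    if ds.iops_per_tb > 500:
        return "tier1"
    if ds.iops_per_tb > 50:
        return "tier2"
    return "tier3"

for ds in [Dataset("billing-db", 900, 0),
           Dataset("file-shares", 60, 2),
           Dataset("old-projects", 1, 400)]:
    print(f"{ds.name}: {place(ds)}")
```

The point is not these particular rules but that the placement decision is codified and repeatable, so it can run across the whole estate without a human tuning each volume by ear.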
And when you find these automagic, rules-based management tools which are self-healing, self-tuning and self-managing, drop me an email! But don't tell everyone about them or we'll all be out of jobs!