As data volumes increase in all industries and the challenges of data management continue to grow, we look for places to store our increasing data hoard, and inevitably the subject of archiving and tape comes up.
Tape is the cheapest place to archive data by some way; my calculations currently give it a four-year cost somewhere in the region of five to six times cheaper than the cheapest commercial disk alternative. However, tape’s biggest advantage is almost its biggest problem: it is considered to be cheap, and hence for some reason no-one factors in the long-term costs.
Archives by their nature live for a long time; more and more companies are talking about archives which will grow and exist forever. And as companies no longer seem able to categorise data into data to keep and data not to keep, exponential data growth combined with generally bad data management means that multi-year, multi-petabyte archives will eventually become the norm for many.
This could spell the death of the tape archive as it stands, or it will necessitate some significant changes in both user and vendor behaviour. A ten-year archive will, on average, see at least four refreshes of the LTO standard; since LTO drives have traditionally read only two generations back, this means that your latest tape technology will not be able to read your oldest tapes. It is also likely that you are looking at some kind of extended maintenance, and its associated costs, for your oldest tape drives; they will certainly be End of Support Life. Media may be certified for 30 years; drives aren’t.
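The generation arithmetic can be sketched in a few lines of Python. This is only an illustration: the 2.5-year refresh interval is an assumption derived from "four refreshes in ten years", the two-generation read-back window is the rule LTO drives have traditionally followed, and the function names are hypothetical.

```python
def generations_behind(tape_age_years: float,
                       refresh_interval_years: float = 2.5) -> int:
    """Number of newer LTO generations released since a tape was written.

    The 2.5-year interval is an assumption (four refreshes per decade).
    """
    return int(tape_age_years // refresh_interval_years)


def newest_drive_can_read(tape_age_years: float,
                          refresh_interval_years: float = 2.5,
                          read_back: int = 2) -> bool:
    """True if the current-generation drive can still read the tape.

    read_back=2 assumes the traditional 'read two generations back' rule.
    """
    behind = generations_behind(tape_age_years, refresh_interval_years)
    return behind <= read_back


if __name__ == "__main__":
    for age in (2, 5, 8, 10):
        print(age, generations_behind(age), newest_drive_can_read(age))
```

Under these assumptions a tape falls outside the read window of the newest drive once it is three generations old, which is well inside the life of a ten-year archive.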
Migration will become a way of life for these archives and it is this that will be a major challenge for storage teams and anyone maintaining an archive at scale.
It currently takes 88 days to migrate a petabyte of data from LTO5 to LTO6; this assumes running 24×7 with no drive issues, no media issues and a single pair of drives to migrate the data. You will also be loading about 500 tapes and unloading about 500 tapes. You can cut this time by putting in more drives, but your costs will soon start to escalate as SAN ports, servers and peripheral infrastructure mount up.
And then all you need is for someone to recall the data whilst you are trying to migrate it; 88 days is extremely optimistic.
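For anyone wanting to play with these numbers, here is a rough back-of-envelope sketch in Python. It assumes a decimal petabyte (10^15 bytes), perfectly sustained streaming and perfectly linear scaling as drive pairs are added; none of those hold in practice, and the function names are purely illustrative. Rather than assuming a drive speed, it backs out the sustained rate that the 88-day figure implies, which comes to roughly 131 MB/s.

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 1e15  # decimal petabyte, an assumption


def implied_throughput_mb_s(data_bytes: float, days: float) -> float:
    """Sustained rate (MB/s) needed to move data_bytes in the given days."""
    return data_bytes / (days * SECONDS_PER_DAY) / 1e6


def migration_days(data_bytes: float,
                   throughput_mb_s: float,
                   drive_pairs: int = 1) -> float:
    """Ideal migration time in days.

    Assumes every read/write drive pair streams at throughput_mb_s
    around the clock, with no mounts, faults or recalls in the way.
    """
    seconds = data_bytes / (throughput_mb_s * 1e6 * drive_pairs)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = implied_throughput_mb_s(PETABYTE, 88)
    print(f"implied sustained rate: {rate:.1f} MB/s")
    for pairs in (1, 2, 4):
        days = migration_days(PETABYTE, rate, pairs)
        print(f"{pairs} drive pair(s): {days:.1f} days")
```

Doubling the drive pairs halves the ideal time, but this is very much a best case: tape mounts, media faults and competing recalls are not modelled, and each extra pair brings its own SAN ports and servers.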
Of course a petabyte seems an awful lot of data, but archives of a petabyte or more are becoming increasingly common. The vendors are pushing the value of data, so no-one wants to delete what is potentially a valuable asset. In fact, working out the value of an individual datum is extremely hard, and hence we tend to place the same value on every byte archived.
So although tape might be the only economical place to store data today, as data volumes grow it becomes less viable as a long-term archive unless it is a write-once, read-never (and I mean never) archive…if that is the case, perhaps, in Unix parlance, /dev/null is the only sensible place for your data.
But if you think your data has value, or more importantly your C-levels think that your data has value, there’s a serious discussion to be had…before the situation gets out of hand. Just remember, any data migration which takes longer than a year will most likely fail.