I think for a great many people in the storage industry, the Storage Array is the answer, the one true answer: do everything in the array. When this gets challenged, we get very defensive; the recent Forrester report asking whether we need a SAN any more elicited some fairly prickly responses. But it's a question that we need to be asking ourselves more often; we need to be challenging ourselves regularly on this point. The core of our storage strategies may still be Networked/Shared Storage, but this does not preclude doing things in a different way.
EMC, I think, have woken up to this fact with Atmos; I'm not saying that Atmos is unique or special, I'll leave EMC to say that, but Atmos is arguably the first attempt by the big storage boys to look at storage delivery in a different way. It really acknowledges that the special sauce is software; there is no real reason why Atmos could not be delivered as a software appliance, as it runs on industry-standard hardware utilising pretty dumb disk at the back-end. It is an *Interesting* product and I look forward to seeing it developed, matured and delivered.
But still Atmos delivers storage as a discrete entity, abstracted away from the application platform. I am beginning to come round to the idea that the application platform needs to become more tightly integrated with the storage, and at times the application may well do things that the storage has done in the past.
Replication, for example; don't get me wrong, array-based replication has served us well: integration, testing, maintenance, complexity and the magic pixie dust which makes replication work have kept many a storagebod fed, clothed and kept in the style to which they have become accustomed. But perhaps there is a better way; perhaps the applications themselves should be managing their replication, ensuring that they are transactionally consistent; after all, they have a better view of what transactionally consistent may mean than us guys at the bottom of the stack. We should be starting at the top of the application stack and asking the question: would it be better to do it here?
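To make that a little more concrete, here is a minimal, purely illustrative sketch in Python of what application-managed replication means; the `Replica` class and its `commit()` method are invented for this example and are not anyone's real replication API. The point is that the application defines the transaction boundary, so the copy never contains a half-applied change, which is exactly the context an array-level block copy does not have.

```python
# Illustrative only: a toy application-level replicator. Replica and its
# commit() method are assumptions made up for this sketch, not a real API.

class Replica:
    """A stand-in for a copy of the data that accepts whole transactions."""
    def __init__(self, name):
        self.name = name
        self.committed = []            # transactions applied, in order

    def commit(self, txn):
        # In reality this would be a network call with an acknowledgement.
        self.committed.append(txn)
        return True


def replicated_commit(txn, primary, replica):
    """Apply a transaction to the primary and the replica as one unit."""
    if not primary.commit(txn):
        return False
    if not replica.commit(txn):
        # The application understands what the transaction means, so it can
        # compensate; a block-level copy has no such context.
        primary.committed.pop()
        return False
    return True


if __name__ == "__main__":
    primary, dr_copy = Replica("primary"), Replica("dr-copy")
    replicated_commit({"order": 42, "status": "paid"}, primary, dr_copy)
    print(dr_copy.committed)           # the DR copy only ever sees whole transactions
```

Real products obviously do this with far more care (ordering, retries, two-phase commit), but the principle stands: the transaction boundary lives in the application, not in the array.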
Clustering, back-up, archiving: many traditional infrastructure functions may be better carried out at the application level. At least ask the question and think about the answer.
So why doesn't this happen? Firstly, Non-Functional Requirement gathering is generally an afterthought! That's boring stuff and anyway, the infrastructure teams look after that stuff for us. Secondly, the clue was in the words "infrastructure teams"; application teams and infrastructure teams rarely have a close relationship; often the first contact the infrastructure team will have is when an application is delivered to be integrated into the infrastructure and they try to get the application to meet its NFRs, SLAs etc. At this point, the relationship rapidly becomes antagonistic and fractious; surely the better thing would be for infrastructure teams and application teams to work more closely together. The boundaries between the teams need to become blurred and some sacred cows need to be sacrificed. And yes, there is the belief that it will constantly mean re-inventing the wheel, but with modern development methodologies and code re-use, this should not be the case.
Turning to the infrastructure to fix application problems, design flaws and oversights should become the back-stop; yes, we will still use infrastructure to fix many problems but less often and with a greater understanding of why and what the implications are.
I am hoping that, as the downturn begins to bite, we will be forced to do this and that 2009 will become the year that applications and infrastructures become more integrated. Yes, you will still need storage, servers and networks, but as the boundaries blur between infrastructure teams, so the boundaries between infrastructure and applications should blur as well. We all have a common goal in that we provide a service. I live in hope…eternal optimist!!
I know you expect a howl on this, so I’ll start with the obligatory “have you lost your mind?” question and move on from there.
Storage people have been arguing for years about whether or not things like protocols matter. Some like to think that a protocol mashup is just fine and argue that nobody cares or wants to know about them. Of course, they matter a great deal, and I can't imagine a world of application-directed storage functions and the resulting protocol mess this would unleash.
Many believe the cost of storage infrastructures is too high. But what should that cost be? We look at an off-the-shelf disk drive, compare that to the cost of storage in arrays and scratch our heads at the discrepancy. The answer, of course, is the availability of data and the management power that allows people to maintain predictable data access.
If storage arrays were better at driving usage efficiencies, the pain of paying so much per TB of storage would be assuaged (some). But storage efficiency has mostly been an afterthought in many leading enterprise array designs.
A real focus on efficiency is badly needed, much more than a focus on shiny, faster and very expensive gadgets (flash SSDs) – which is the grand distraction our industry likes to pursue.
Back to application-driven storage: one answer would be better protocols that clearly identify the application and provide "tracks" (metadata) for implementing data-driven management policies. But the creation of new protocols does not seem all that promising. You mention the problems of requirements gathering in your blog – and therein lies the reason why.
Another answer might be finer granularity of systems and virtual systems to create explicit mappings of storage resources to applications. Products with SANscreen-like functionality might be able to make that possible.
Of all the available paths to reach application-driven storage policies, I'd put my money on virtualization technology as the most realistic. Yes, there would be lots of small virtual things to manage, but machines are pretty good at that sort of thing, even if people aren't. At the end of the day, IT is going to have to resign itself to being managed with greater degrees of automation – just as we have been telling our customers for all these decades. The shoemaker's children need new shoes.
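To make the mapping idea slightly more concrete, here is a rough Python sketch; every application name, policy field and function in it is invented for illustration and does not reflect any vendor's actual API. It simply shows an automated provisioner shaping each virtual volume from a policy declared per application.

```python
# Illustrative only: per-application storage policies driving automated
# provisioning. Every name and field here is invented for the sketch.

POLICIES = {
    "billing": {"tier": "fast", "replicas": 2, "snapshot_hours": 1},
    "archive": {"tier": "bulk", "replicas": 1, "snapshot_hours": 24},
}

DEFAULT_POLICY = {"tier": "bulk", "replicas": 1, "snapshot_hours": 24}


def provision(app_name, size_gb):
    """Describe a virtual volume shaped by the owning application's policy."""
    policy = POLICIES.get(app_name, DEFAULT_POLICY)
    return {
        "app": app_name,        # the application-to-storage mapping itself
        "size_gb": size_gb,
        **policy,
    }


if __name__ == "__main__":
    print(provision("billing", 500))
    # {'app': 'billing', 'size_gb': 500, 'tier': 'fast', 'replicas': 2, 'snapshot_hours': 1}
```

The interesting part is not the code but that the application-to-storage mapping becomes explicit and machine-readable, which is what makes "let the machines manage lots of small virtual things" a workable proposition.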
Good post –
If history is our guide – the OS layer served the mainframe market well. The “storage in the fabric” trend died down. I don’t know if application vendors have the resource, dedication, or expertise to pull it off. EMC is thinking of storage in a new way – Atmos and their Cloud initiatives are great evidence of this.
The only two vendors that could pull off storage in the OS for open systems, I think, are IBM and Sun. IBM hasn't put a lot of development towards this, and while Sun is coming close with recent ZFS innovation, they can't seem to get beyond their marketing and business challenges.
I do think we will still have SANs and these point products for years to come. But I think storage functionality will pop up in more unique ways. More than likely, the best approach may come from a new or innovative player using an older technology. Look what NetApp did with NFS.
I wasn't expecting a huge howl; I was expecting some thoughtful responses, ripostes etc!!! If I look round our estate, there are large chunks of storage managed by non-storage engineers. They are discrete point solutions and there are a lot of them; some are frighteningly expensive and some are frighteningly cheap.
Almost all of them use application-level functionality to provide services that are normally provided in the array in the IT world. Granted, they are fairly small: 100 terabytes here, 150 terabytes there. A lot of the disk sits in things you can hardly call a rack, let alone an array, and it works. It's a very interesting contrast with what happens in the corporate IT world.
So I am being challenged to come up with reasons why this won't work for some of our corporate applications and, in a lot of cases, it's a struggle. So much so that it has led me to think that I've got a lot to learn from these amateurs. You see, no-one has ever told them that the right way is to use a SAN or a NAS; they've just hacked together something which works.
Martin – fully agree; more and more, the application layers are providing the resilience functions that infrastructure used to have to provide. Similarly, we increasingly see groupings of application architectures that can use 'somewhat less traditional' storage approaches. The technology to do this exists already, and frankly what is now being sold (or hyped) by 'the big 4' isn't unique or new – rather it legitimises what less well-known entities have been doing for some time.
What I'm struggling with is that 'the big 4' are so busy obsessing over their one-upmanship 'widget' fights that they are completely missing the game change going on. It's easy to say that today's storage technology is made less relevant (even negated) by tomorrow's (i.e. 2009) NFRs, but the big one for me is that today's economics (i.e. the vendor storage revenue stream) simply will not work for tomorrow's NFRs. Infrastructure TCO funding is dropping rapidly, with less of the cake to be sliced up, but few seem to understand that this will force real change.
So yes, for real storage innovation I'm off looking under our developers' desks (and at their homes) to see what they are putting together for themselves at no cost…
As we are all too aware, the cost of storage is not falling at the same rate as the data-growth CAGR. So storage becomes an increasing percentage of the infrastructure spend and there is a real focus on storage costs.
Dedupe, thin provisioning and virtualisation all help somewhat, but they are temporary fixes. So costs have to fall and change will have to come; now, do the vendors slice margins (which are wafer-thin according to my starving salesman) or do we change the game?
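To put some very rough numbers on it (entirely assumed figures: 50% annual data growth, $/TB falling 25% a year, a one-off 30% saving from dedupe and thin provisioning), a quick Python sketch shows why those fixes only delay the problem:

```python
# Illustrative arithmetic only; the growth, price and saving figures are
# assumptions picked to show the shape of the trend, not real market data.

capacity_tb = 100.0       # assumed estate at year 0
price_per_tb = 5000.0     # assumed cost per TB at year 0 (arbitrary units)
growth = 0.50             # assumed annual data growth
price_decline = 0.25      # assumed annual fall in $/TB
one_off_saving = 0.30     # assumed saving from dedupe/thin provisioning in year 1

for year in range(1, 6):
    capacity_tb *= (1 + growth)
    if year == 1:
        capacity_tb *= (1 - one_off_saving)   # the saving persists but never recurs
    price_per_tb *= (1 - price_decline)
    print(f"year {year}: spend = {capacity_tb * price_per_tb:,.0f}")
```

With those assumed numbers the bill still climbs around 12% a year after the first; the one-off efficiency gain shifts the curve down once but doesn't change its direction.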
The game has changed before: the PC, Open Systems, Open Source, SAN etc.; the game will change again.
My reply got a bit long so …. -> http://mattpovey.wordpress.com/2009/01/05/building-resilience-into-applications/