
Cloud

Beer, beer….we want more #storagebeer!

Yet another #storagebeers is to be held on the Tuesday after the SNIA Academy. This is an important one as we hopefully have not one but two special guests from the blogosphere. There will be other bloggers there too; I'm not saying that you aren't special, you all are, but these two are travelling some way to be with us!

Firstly, we have Alex MacDonald of NetApp; hopefully Alex will be joining us from the frozen North. Okay, Scotland, but it's been so cold recently that I think it counts as the frozen North.

And secondly, we have another NetApp blogger in the form of Val Bercovici who will be joining from warmer climes! 

So if you want an informal and hopefully wide-ranging discussion from these two members of the Storage Blogosphere; please come along! 

We shall be meeting in the Cheshire Cheese, which is close to the SNIA venue and also to Tower Hill tube. Please note, this is not the more famous Ye Olde Cheshire Cheese on Fleet Street!

p.s to make #storagebeers entries easier to find, I have added a #storagebeers category!

Future Imperfect?

The Anarchist posts an account written by one of EMC's engineers imagining a day in the life of a future application administrator; managing storage has by then become so simple that we no longer need specialist storage administrators. It's a compelling vision but how likely is it to come to pass?

That's a hard question to answer. EMC certainly believe that they are going to announce technology that will enable this and all the indications are that they will be setting out more details at EMC World. It'll certainly be interesting to see how ready the technology is and how near to shipping it is; let's hope that they can do better than they have with FAST v2.

But as I keep saying this is not about technology, it's about people and process. 

Firstly, EMC is not the storage world; there are plenty of companies out there who are not running EMC storage, or at least are running EMC storage alongside something else. If this is an EMC-only tool with a huge cost to migrate into, there will be resistance; and it's not only the cost of migrating in, there is also the cost of getting out again. Technology refreshes are not just upgrades to the latest from the incumbent vendor.

I suspect that somewhere along the line there will be some interesting changes in maintenance costs; you will be incentivised to upgrade to the latest and greatest. I mean, it is easy, so why wouldn't you? And if the maintenance costs for years 4+ are high, well, the TCO 'savings' would stack up on paper.

Anyway, let's just say that you look at the figures and you decide that actually this all stacks up for you over the next 5-10 years. So off you set on this journey. 

Firstly, you've got to define a service catalogue; now this isn't quite as easy as you might think. Yes, I can define a service, but what does that actually mean in Business terms? What do 5,000 IOs per second, a 5 ms response time and 500 Megabytes per second really mean to a Business User? Do you review all your current applications and define service tiers around those? RTO and RPO? We talk to the Business about those already and still we struggle to get meaningful answers.

Nothing spoken about in Anarchist's blog entry really addresses these problems. 

Secondly, how does this new dynamic world fit into the current processes? Now, we can sit and moan about whether the current processes are fit for purpose; whether the current change, configuration, problem and incident management processes really work. But many of these processes have been in place for years and actually do a pretty good job with today's technology. 

But we aren't talking about today's technology, you say? Sorry, this is not a black and white situation; you will not suddenly stop using today's technology, and if there is one thing which will cause chaos, it is having two different sets of processes in place (unless you try to run as two different IT organisations). This is a non-trivial problem.

I also love the comments about chargeback. Chargeback in nearly every organisation I've worked for/with has been one of those things which seems like a good idea until the Finance Department get involved. We get into complex areas about Capital versus Operational Expenditure, what the chargeback actually covers, etc, etc.

EMC's vision is marvellous and compelling, and the technology is the least of the problems, but I will add something on that. I seem to recall that we were supposed to have no application developers by now; code-generation was going to get so simple that a business user would be able to define their applications based on business rules and the business application would be generated. I suspect that there are more application developers now than there have ever been.

Buy Bycast

I was fascinated and indeed slightly surprised at yesterday's announcement that NetApp had bought Bycast, an object storage company. Why so?

Well I distinctly remember Val saying just after the Atmos launch that NetApp had an object storage strategy and something would be coming soon; I'm sure someone can probably find the exact comment. So if NetApp were already working on object storage, why did they purchase Bycast? Either Val was being economical with the truth, and when he said that NetApp were working on object storage he meant that they were looking at buying something; or perhaps NetApp have struggled internally to develop something. Certainly there are rumours that the focus was on getting OnTap 8 out of the door and all development resource was really focused on that.

Now this purchase leaves NetApp with something of a problem as to what to actually do. 

1) Do they integrate Bycast into OnTap? I suspect that this would be the preferred route and would allow them to keep the 'Unified Storage' tag but this route is fraught with difficulty and the Spinnaker experience may well have scared them off.

2) Do they simply continue with Bycast as a separate product? This is a much more pragmatic approach but it leaves them open to criticism that they no longer sell 'Unified Storage'. 

It's an interesting position for NetApp to find themselves in.

Object Storage is an interesting technology area and one which brings a number of challenges of its own. Actually implementing an object store is not especially hard, but interfacing the object store with a multitude of different applications is going to be the biggest challenge. Meta-data continues to be challenging too, with many custom schemas being deployed for different uses.

It's still very early days for large-scale deployments of object storage; I suspect that this market has some way to go before any kind of front runner/leader can be declared. I think we are still stumbling over the early fences at the moment.  

It will also be interesting to see how NetApp position the product; it could feasibly be positioned as a direct competitor to OnTap 8. Could they fall into some of the internal politics that EMC's Atmos has suffered from, with sales-guys worrying that it could cannibalise their traditional revenue streams?

Or do they use it as a way to compete with EMC's DaaD strategy? 

Do they try to build an application infrastructure around it? 

Interesting times….

The Doctor and The Director

Imagine if you will, a whooshing, swooshing noise like an unearthly wind and as it fades away, a blue box fades in; the words 'Police Box' can be seen written upon it and there is silence.

The door opens and a young man pops his head out, looks around with the eyes of the trickster.

'We're here; this is the place, follow me.'

And following the young man out of the box comes a conservatively suited man; greying, looking slightly confused, like a banker out of time.

'This is it, this is your future; the future of IT in 2010, twenty years on!'

You know who the young man is but who is the other man? He's an IT Director of a major bank from twenty years ago; they were called IT Directors generally back then, not CIOs or CTOs.   

So what does he see, what changes does he see? And why does he look so horrified at what he sees? Let's ask him!

'Where's all the space gone? In my day, this room was nearly empty; there were a few big boxes with IBM written on them, a few boxes with Tandem written on them and lots of space! We built this data-centre in the seventies to cater for the huge systems but by the late eighties, the systems had shrunk so much that you could have a game of soccer in here, or at least a five-a-side game!'

'What are these racks full of kit? Just look at all the cables! Where's the mainframe, are all these racks mainframes? Why do we need so much?'

He wanders around staring at all of the unfamiliar names and the young man explains what he's seeing.

'That's the Intel systems..'

'You mean Intel, like in my PC?'

'Oh yes, just like in your PC but now they run business critical systems too and over here, here's your Open Systems'

'Open what?'

'Open Systems, it's another word for Unix'

'Unix? You mean that academic operating system written by those hippies in America?'

'And over here, well you know what this is'

'Well it's big and it's got IBM written on it, so I guess it's a mainframe?'

'Indeed, it's a mainframe but it doesn't run MVS any more, it runs z/OS!'

'z/OS?'

'It's just MVS; marketing, don't let it worry you!'

'And what the hell are all these boxes with hundreds of cables in? It looks a mess'

'Well, those are your Cisco routers and switches; they run your network!'

'SNA?'

'Oh no, TCP/IP'

'What the hell is that? Oh don't tell me! This all looks like a total mess!'

'Well if you think this is bad; just let me introduce you to the teams who run this lot'

So the Doctor and the IT Director exit the data-centre and take a tour; the IT Director shaking his head as he goes.

'So where we had a Mainframe storage team, a Mainframe network team, Mainframe System Programmers and operators; you now have all of those teams, plus a Unix team, a Linux team (which is like Unix but not), a Windows team, a LAN team, a WAN team, a Storage team, a Backup team, a Security team, a multitude of architects, a multitude of procurement people, and you are also telling me that we've got a new team called the 'Virtualisation Team', which is a bit like VM but on PCs?'

'We've also got DBAs supporting several flavours of databases, all built on a standard which no-one follows? More application developers than you can shake a stick at? A team to support the PCs which everyone appears to have now, and lord knows what else?'

'And this is progress?'

'Don't you worry!', the young man says with a grin and a twinkle, 'Because there is a plan, a new technology called Cloud, it's going to change everything! Some people like to call it 'the Software Mainframe'! It'll change everything, I promise…'

The IT Director sighs and shakes his head; 

'So in twenty years, we'll have all the previous teams? And a whole bunch of new ones to run this 'Software Mainframe'?'

'We could take a look if you want..'

'No, let's go back…!'

'Well if you insist but I'll warn you; there'll be no more of me on TV for a while; that Michael Grade hates me!'

BFI

BFI is an acronym which gets thrown around a bit and could stand for many things.

Brute Force and Ignorance is one…but I've hopefully come up with a new one which goes along with it: Big F**king Infrastructure. And this is my problem with Cloud at present; there seems to be a trend at the moment that the point of Cloud is to build Big F**king Infrastructure.

Now as an infrastructure bod, I can appreciate this; indeed, the part of me which likes looking at big tanks, fighter jets, aircraft carriers and the like finds BFI cool! Who wouldn't want to build the biggest, baddest data centre in the world?

But is it really the point of Cloud? And this is what concerns me! Cloud should not just be about building infrastructures; it certainly should not be about turning data centres into Building Blocks. Cloud needs to be more than that.

It needs to be about something more; it needs to be about changing development methodologies and tools. If we simply use it to replicate at scale what we do today, then I think we have failed. It certainly needs to be more than packaging and provisioning. It needs to be about elegance and innovation.

I really don't want Cloud to turn into something like Java; what do I mean? Don't get me wrong, Java is great (and the JVM is greater) but how much of the Java written is simply C written in Java? Lots, believe me! I don't believe that Java has changed development paradigms nearly as much as some people like to believe. A large amount of C++ code is also simply C written using some of the features of C++, without the fundamental structural changes brought by C++. And so it goes on.

Cloud brings elasticity to infrastructure; applications need to be designed with this elasticity in mind. A database needs to be able to scale up on demand and then gracefully shrink back down again; perhaps it needs to be able to start additional instances of itself on different machines to meet a peak, and then when the load falls away, it should remove those instances whilst maintaining transactional consistency and integrity.

Developers need to be able to design applications which wax and wane with demand. Yes, we can fix a lot of these sorts of issues at an infrastructure level but is that actually the right place to do it? We can fix a huge number of problems with BFI but are we bringing sledgehammers to bear?

So Cloud needs to be more than BFI! And that is why I was glad to see the story about VMware and Redis; like Zilla writes, I also know >.< about NoSQL apart from a couple of presentations at Cloud Camp and what I've read on the Net. After sitting in presentations by VMware employees where they seemed to be equating Virtualisation with Cloud, it is great to see that they are looking beyond that. Let's hope it continues.

Seller of Dreams

I like Chuck's blog and I suspect I'd like Chuck in the flesh as well; I like the seller of dreams. And that is what he does, but I'm going to call EMC and him on the latest dream being sold. Not because it's not a good dream; it has the potential to be a very good dream, if EMC can actually get a grasp of it.

About this time last year, EMC announced a product with a big fanfare; you may remember it, it was called V-MAX and it was announced with a load of exciting features. The most important of these was FAST, which was to come in two tranches:-

FAST v1, which pretty much everyone not drinking the EMC kool-aid decided was not very exciting. In fact many people struggled to see the difference between FAST v1 and the existing SymOptimizer product…apart from the fact that you could actually buy the SymOptimizer product!

FAST v2, which pretty much everyone who wasn't drinking somebody else's kool-aid decided was pretty damn exciting. Okay, Compellent had got there first but this was large Enterprise scale stuff. 

And here we are, pretty much a year on; FAST v1 struggled out of the door before the end of 2009 but we are now in 2010…and there's another big EMC announcement, this time even more spectacular than the V-MAX announcement. Spectacular in two ways:-

    1) The vision is huge; this is EMC's moonshot! 

    2) There is no product at present (but we are promised big things at EMC World)

Actually, comparing it to the moonshot may turn out to be a very good comparison because when Kennedy announced the race to the Moon, the US were massively behind the Soviets. Arguably, EMC are massively behind companies like IBM, who have a very good product in Sysplex; but that's a mainframe product.

But before getting to the moon, can you finish what you've started elsewhere? So before you start on the grandiose plans, deliver what you explicitly promised last year; that is, block-level FAST working with virtually-provisioned volumes. Because at the moment, it looks like 3PAR have beaten you to the punch!

And how about delivering what you have been promising in private for some time: a light-weight, performant and reliable SRM tool?

So EMC get:-

A* for Ambition

C- for Delivery

p.s Chuck, if you manage to get to the moon, work out what you are going to do there! It'd be unfortunate to get there, have everyone applaud the effort and then simply get on with what they were doing before. Make sure that people actually want this, not just a few sad geeky dreamers.

p.p.s Still, I note from the NOW web-site that, despite Alex's assurances some weeks ago that OnTap 8 was going to be GA 'next week', OnTap 8 is still sitting at RC3.

Autonomic for the People

Autonomic computing was a phrase coined by IBM in 2001; arguably the frameworks which IBM defined as part of this initiative could form much of what is considered Cloud Computing today.

And now 3Par have taken the term Autonomic and applied it to storage tiering. This is really a subset of the Autonomic Computing vision but nonetheless it is one which has recently gained a lot of mind-share in the Infrastructure world, especially if you were to replace the word Autonomic with the word Automatic, leaving you with Automatic Storage Tiering. But I think autonomic means rather more than mere automation; autonomic implies some kind of self-management.

An autonomic system should be:
  • Self Configuring
  • Self Healing & Protecting
  • Self Optimising 
IBM themselves defined five levels of evolution on the path to autonomic computing:
  1. Basic 
  2. Managed 
  3. Predictive
  4. Adaptive
  5. Autonomic
Here I shall crib from the IBM press release dated 21st October 2002:
"The basic level represents the starting point where a significant number of IT systems are today. Each element of the system is managed independently by systems administrators who set it up, monitor it, and enhance it as needed.

At the managed level, systems management technologies are used to collect information from disparate systems into one, consolidated view, reducing the time it takes for the administrator to collect and synthesize information.

At the predictive level, new technologies are introduced that provide correlation among several elements of the system. The system itself can begin to recognize patterns, predict the optimal configuration and provide advice on what course of action the administrator should take. As these technologies improve, people will become more comfortable with the advice and predictive power of the system.

The adaptive level is reached when systems can not only provide advice on actions, but can automatically take the right actions based on the information that is available to them on what is happening in the system.

Finally, the full autonomic level would be attained when the system operation is governed by business policies and objectives. Users interact with the system to monitor the business processes, and/or alter the objectives."
As press-releases go, it's really rather good and has applicability in much that we are trying to achieve with dynamic infrastructures. It would behove many vendors to look honestly at their products and examine where they are on this scale. IBM never really managed to deliver on their vision but has any vendor come close yet?

I wonder if 3Par are really at level five of the evolutionary process; in fact they actually talk about Adaptive Optimisation as well as Autonomic Storage Tiering. A subconscious admission that they are not quite there yet?

But an Autonomic Computing Infrastructure is something that all vendors and customers should be aspiring to. Of course, there is the long-term issue of how we get the whole infrastructure to manage itself as an autonomic entity, and how we do this within a heterogeneous environment is surely a challenge. Still, surely it is the hard things which are worth doing?

What is Dynamic?

There's a lot of talk about Dynamic Data Centres and Dynamic Infrastructures; mostly in a cloudy context and mostly as some over-arching, vendor-focused architectural vision. At times I wonder, when a vendor talks about a 'Dynamic Infrastructure', whether they actually mean: you can use as much of OUR infrastructure as you like? You can flex up and down on OUR infrastructure.

This is rather limiting from an end-user IT consumer's point of view because you still find yourself locked into a vendor or a group of vendors. So it's only dynamic with constraints; actually, I think Amazon got it right in their naming: it's Elastic but not truly Dynamic.

So as a good architect/designer/bodge-it-and-scarper-type person, you should be asking this question every time; if I do this, can I get out? What is my exit plan? Can I change any key component of the stack without major process/capability impact? Is the lock-in which comes with any unique feature worth it? 

And when I say any component, I mean all the way up to the application. So as part of the non-functional requirements of any application, there should be

1) Data Export/Import

2) Archival

standards defined and actually implemented. This goes for any off-the-shelf application as well. 

For Cloud to truly change the way IT is done and delivered, this has to happen…otherwise the only way forward is vertically integrated stacks, which ultimately lead to long-term lock-in. There are still mainframes in existence, not only because they are the right platform for some workloads but also because people are struggling to unpick the complex interdependencies which exist.

Seeding the Media Cloud – Part 2

In Part 1, I described GPFS, focussing on its ability to scale massively; but scale alone doesn't make a Cloud, whatever anyone would have you believe.

GPFS has two features which could allow it to become almost Cloud-like.

Firstly, the move to Linux allows it to be based on commodity hardware. GPFS doesn't care what its back-end disk is; it could be direct-attached SATA, it could be V-MAX attached via a SAN, it could be iSCSI or it could be EFDs; it really doesn't care. As long as it appears as a block device, it should be able to use it.

As for the network, as long as it's IP then, as far as I can tell, it doesn't care either. The faster the better, obviously!

So you can scale-out on fairly cheap hardware.

But it is GPFS's ability to move data around for you which could enable the most Cloud-like attributes.

GPFS, like other IBM storage products, supports the concept of storage pools. A storage pool is simply a group of storage devices, likely with similar characteristics; performance and reliability come to mind. A file-system must consist of at least one storage pool but may be made up of up to eight, and this includes the ability to define a storage pool external to GPFS, such as TSM, to allow data to be migrated to tape.

There is also the concept of a fileset, which is basically a sub-tree of the GPFS global namespace; it allows administrative operations to be carried out on a portion of the filesystem. Each fileset has its own root directory and all files belonging to the fileset are only accessible via that root directory. Please note, this does not give you secure multi-tenancy, but conceivably filesets could become the foundation of a secure multi-tenancy capability.
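As a rough sketch (from memory, so do check the GPFS Administration Guide for the exact syntax on your release), creating a fileset and linking it into the namespace looks something like this; the file-system name mediafs and the junction path are made up for illustration:

mmcrfileset mediafs finance

mmlinkfileset mediafs finance -J /gpfs/mediafs/finance

The fileset only becomes visible in the namespace once it has been linked in.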

GPFS includes a policy engine which allows it to manage file data automatically using a set of rules. This allows you to control the initial placement of files at creation time but also to move those files over time.

Placement rules are obviously run when a file is created.

Migration rules are run on demand from the command line, but normally they are driven from a job scheduler like cron.
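The command which actually evaluates and applies the rules is mmapplypolicy; as a minimal sketch (the file-system name, policy file path and schedule are all made up for illustration), a cron entry might look something like:

0 2 * * * /usr/lpp/mmfs/bin/mmapplypolicy mediafs -P /etc/gpfs/policy.rules

which would sweep the file-system against the rules at 2am every night.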

The rules are written using an SQL-like language and can act on a number of attributes, depending on whether it is a placement rule or a migration rule; placement rules can work on user, group, fileset/sub-directory or filename.

So you could do something like

Rule 'mp3' SET POOL 'SATA' WHERE UPPER(NAME) LIKE '%MP3'

Rule 'db' SET POOL 'FC' FOR FILESET ('database')

Rule 'sox' SET POOL 'Encrypted-disk' REPLICATE (2) FOR FILESET ('finance')
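(That last rule does double duty: as well as placing anything in the finance fileset on encrypted disk, the REPLICATE (2) clause tells GPFS to keep two copies of the file data.)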

Migration rules can act on things like last modified time, last accessed time and file size, as well as all of the attributes available to placement rules.

So you can do things like

Rule 'logfiles' MIGRATE TO POOL 'sata' WHERE UPPER(name) LIKE '%LOG'

Rule 'core' DELETE WHERE UPPER(name) LIKE 'CORE'

You can also carry out actions based on a date; so you could move all the end of year reporting files onto your fastest disk before they were needed and then back off again the next week.
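To give a flavour of the syntax, here is a hedged sketch of a related, age-based rule (the pool names are made up and the exact syntax should be checked against the documentation for your release):

Rule 'stale' MIGRATE FROM POOL 'FC' TO POOL 'SATA' WHERE (CURRENT_TIMESTAMP - ACCESS_TIME) > INTERVAL '30' DAYS

This would sweep anything untouched for a month down onto cheaper disk.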

IBM in their typical fashion call this ILM and hence everyone ignores it because ILM is deeply unsexy. But they have the basis of a simple but powerful policy engine. And yet they don't tell anyone about it.

So do IBM have a storage cloud? No, not yet but they are close. The policy engine is simply not powerful enough yet but it could evolve into something and it's been around for long enough that you should be able to trust your data to it. 

And it does have some other nice features, like replication, snapshots, multi-clusters, GPFS over a WAN. I just wish IBM would make it easier for you guys to play with it but if you've got an IBM account manager, hassle them for an evaluation copy. 

If they try to tell you how complex it is…they're living in the past. It's much easier than it used to be and you can build yourself a small virtual cluster quickly and easily in the virtualisation environment of your choice. I'd suggest building on top of CentOS 5, but just remember to edit the right files to get it to pretend to be proper Red Hat.
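(If memory serves, the file in question is /etc/redhat-release, which the GPFS installation and portability-layer build scripts check; replacing the CentOS string with the equivalent Red Hat Enterprise Linux release string is usually enough. Treat that as a hint rather than gospel.)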

As I rediscover this well-hidden IBM product, I'll be sure to share my findings.

Seeding the Media Cloud – Part 1

If you are a regular reader of this blog and a follower of my tweets, you will be aware that I work for one of the UK's largest broadcasters and at the moment I'm working on a massive media archive. I doubt that there is a media company in the world which isn't trying to work out how to move away from what is a fairly tortuous workflow; one which involves a huge number of systems and eventually ends up on a tape.

When I say tape, it is important to differentiate between video-tape and data-tape; life can get very confusing when talking about digital workflows and you find people talking about tapeless systems which actually use a huge amount of tape. But this is about moving from a video-tape based system to a data-tape archive.

But this little entry isn't about tape; it's about disk and, more importantly, it's about how to build a massively scalable archive with some very demanding throughput requirements. In fact, what we are building is a specialised storage cloud and much of what we are doing has application in many industries.

The core of this storage cloud is a cluster file-system; more importantly, it is a parallel cluster file-system, allowing multiple servers to access the same data. The file-system appears to the users/applications as a standard file-system and is pretty much transparent to them. There are a number of cluster file-systems around but we have chosen the venerable GPFS from IBM.

When I say venerable, I mean it; I first came across GPFS over ten years ago but at that time it was known as MMFS (the multimedia filesystem) or Tiger Shark; it was an AIX-only product and it was a pig to install and get working. But it supported wide-striping of data and its read performance was incredible. I put it into what was then a large media archive at a UK university (it is fair to say that I probably have more media on my laptop now) but it was very cool at the time.

MMFS then mutated into GPFS and what was already arcane descended into the world of the occult. GPFS was aimed at the HPC community and as such ran on IBM's SP2; I suspect it was at this point that GPFS got its reputation as incredibly powerful but extremely complex to manage. And running on top of AIX, it was never going to set the world alight; but in its niche, it was a very popular product. However, at about GPFS 1.x, I moved away from the world of HPC and never thought I would touch GPFS again.

Between then and now, various releases have come and gone, largely un-noticed by the world at large I suspect. However, in 2005 an important release was made: GPFS 2.3 brought in Linux support. I suspect it was at this point that GPFS started to make a quiet comeback and move back into what was its original heartland; the world of media.

So here we are in 2010; GPFS now sits at version 3.3 and supports Linux, AIX and Windows 2008. Solaris was road-mapped but has fallen off, and there appear to be no plans to support Solaris any more.

Figures are always good to throw in I guess; so let's get some idea of how far this will scale. 

  • Maximum nodes in a cluster: 8,192 (architectural); 3,794 nodes tested (Linux)
  • Maximum size of a file system: 2^99 bytes (architectural); 4 PB tested
  • Maximum number of files in a file system: 2 billion
  • Maximum number of file systems in a GPFS cluster: 256

So it'll scale to meet most needs and, more importantly, there are clusters out there which drive over 130 Gigabytes per second of throughput to a single sequential file. We don't need anything like that throughput yet.

And of course it meets those enterprise requirements such as the ability to add and remove nodes, disks, networks etc on the fly. Upgrades are allegedly non-disruptive; I say allegedly because we've not done one yet.

It also has a number of features which lead me to feel that GPFS is almost Cloud-storage-like in nature; it's not quite, but it's very close and could well become so.

More in Part 2….