
Cloud

Cloud-Bailing

Chuck Hollis pondered whether using a remote DR facility could ever be considered ‘Cloud Bursting’ and much conversation is ensuing along the lines that ‘Cloud Bursting’ is a marketing thing which currently doesn’t exist and won’t exist until applications can be architected which automagically scale and move themselves to utilise capacity wherever else it may be. I am paraphrasing some much brighter people than me who know a lot more about Cloud than a mere Storagebod, but that’s kind of the message I took away.

Anyway, what Chuck was pondering is not exactly new; for decades, we have moved workloads about, sometimes moving them temporarily to a DR site or second site to free up capacity for a transient demand or whilst waiting for a capacity upgrade. Mainframe houses are/were pretty well versed in this: shifting workloads at peak times and then bringing them back once the crisis has been averted.

It’s not really bursting; bursting is something which just happens and is dynamic, immediate and exciting. This sort of workload management is more akin to bailing out a boat, or perhaps transferring a liquid from a now too-small container whilst you either stem the flow or find a bigger container.

And yes, you could use the public Cloud as your temporary container, but you could also use your DR site, or perhaps your development kit…but to ‘Cloud-Wash’ it as ‘Cloud-Bursting’ is probably pushing things a little far. So perhaps ‘Cloud-Bailing’, simply on the grounds that it brings to mind something which is not exactly elegant and a little haphazard.

Or perhaps we could just call it Workload Management and consider that it’s a general discipline which could be applied equally to Cloud and to more traditional IT?

Big Chief Running Cloud

This piece on cio.com caught my attention; Service Providers and Outsourcing advisors have been marking CIOs on their knowledge of Cloud and their capability to structure cloud deals and, not surprisingly, the scores are not especially flattering. I wonder what would happen if it were reversed and CIOs and Senior IT Executives were to mark Service Providers and Advisors on their knowledge of Cloud and how it could be applied to their business?

I suspect that in many cases it would not be flattering either. I suspect that many would say that their partners are confused, present conflicting messages, have few meaningful metrics to measure success and struggle to present a clear direction.

The problem is that there needs to be a genuine conversation about Cloud and what it means to each individual business opportunity. The technology is mostly ready, but it is a complex engagement because it is not a traditional IT engagement; if both sides treat this like an outsourcing deal, opportunity is going to be missed.

Outsourcing deals are often filled with conflict, with tensions between the two organisations and a focus on setting firm and fast rules of engagement in place. For a Cloud engagement, I think there is going to be a need for more flexible frameworks and certainly a more partnership-orientated approach.

This requires a re-examination of process from both sides and will drive some interesting and uncomfortable conversations. Vendors may find themselves in the position where they have to almost admit that everything that both sides once thought they knew is wrong!

Cloud business cases will need metrics which reflect value and not cost containment. CIOs know how to contain cost, they’ve got very good at it, but they need help in defining value. Yet pretty much every vendor and Service Provider will still talk about how they will help you contain cost, not how they add real value.

There needs to be an element of business transformation in every deal; if you underplay this requirement, the Cloud engagement will surely fail.  Any engagement needs to have a transformation metric; in many cases, you are going to have long-term engagement at all levels of the IT organisation. Most sensible commentators know that Cloud is not going to reduce the number of people that are needed in IT but it is going to change the skills required; this again can change the nature of the conversation away from the normal outsourcing conversation.

Big Chiefs Running Clouds are going to need a lot of help.

And you need to remember that this is new territory for a lot of people and in many cases, you will be able to count the number of Cloud engagements that your people have carried out on the fingers of one thumb. As a service provider, you have a lot of responsibility to make these engagements work.

OpenStack Day – EMEA

Tomorrow, July 13th, sees the first OpenStack EMEA Day, devoted to the open-source cloud computing platform. This event sold out very quickly, was then moved to a larger venue and proceeded to sell out again; looking at the list of attendees, it seems that the great and good of the UK storage and virtualisation twitterati will be in attendance, with the exception of the VCE gang (actually there are a couple of Cisco attendees, but no EMC or VMware).

A good mix of vendors, service providers and end-users bodes well for the day and shows that there is an appetite to examine alternatives to the nascent incumbents in this space.

I’m looking forward to seeing what OpenStack can bring to the party; I’ve already looked at the OpenStack storage components and am quietly impressed, but I’m looking forward to seeing more of the Compute and Provisioning components. I think if the latter is strong, then the whole offering has real legs; but if it is weak and complex, then it’s a chance going begging.

Provisioning is still one of the areas in Cloud Computing which has some distance to travel…


The Call of the Cephalopod

As I continue my experiments with scalable storage systems, I came across Ceph. Apart from having a very cool logo, Ceph is a scalable, distributed storage and file system which is based on an object store, with a metadata management cluster which allows standard POSIX file-system interaction with the objects in the store.

As well as standard file-system access, Ceph also allows applications to interact directly with the object store; it offers an S3-compatible REST interface, allowing applications designed to work with Amazon S3 to use Ceph as a private S3 Cloud; and finally it provides a network block device driver, enabling Ceph to serve as an Ethernet block device rather like iSCSI. The block device can be striped and replicated across the cluster, allowing greater reliability and scalability (both capacity and performance).

Ceph can currently be considered alpha code and you probably should not store your life’s work on it. That said, it is relatively easy to set up and play with. And as it is released under the LGPL, you have full access to the source and can do with it what you will.

I built my cluster on top of Ubuntu, across three VMs; now, if I were building for performance, I would probably have ensured that these were built on separate spindles, but I wasn’t; I was just seeing how quickly I could build it and get it working.

It probably took me an hour or so to build the VMs, install the packages, configure the cluster and bring up a file-system. At that point, I just played with Ceph and treated it like a normal file-system, and it works; a file written on one node appears on all of the nodes in the cluster.
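For the curious, the whole thing boils down to a single configuration file and a couple of commands. Here is a minimal sketch of what mine looked like; the hostnames, addresses and paths are just my lab names, the exact steps will vary with the version you grab, and this assumes authentication is switched off:

    # /etc/ceph/ceph.conf - one monitor, one metadata server and an OSD per node
    # (node1/node2/node3 and the addresses below are hypothetical)
    [mon.a]
        host = node1
        mon addr = 192.168.1.101:6789
    [mds.a]
        host = node1
    [osd.0]
        host = node1
        osd data = /srv/osd.0
    [osd.1]
        host = node2
        osd data = /srv/osd.1
    [osd.2]
        host = node3
        osd data = /srv/osd.2

    # create the cluster across all hosts, start the daemons and mount it
    mkcephfs -a -c /etc/ceph/ceph.conf
    /etc/init.d/ceph -a start
    mount -t ceph 192.168.1.101:/ /mnt/ceph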

I’ve not had the chance to play with either the S3 or the native Object interface but I must say it all looks very promising.
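When I do get to the S3 interface, I would expect it to behave like any other S3 endpoint once the gateway component is running; something along these lines with s3cmd, with your .s3cfg pointed at the Ceph gateway rather than Amazon (the bucket and file names here are made up):

    # in ~/.s3cfg, set host_base and host_bucket to the Ceph gateway's address
    # then the usual S3 workflow should just work
    s3cmd mb s3://testbucket
    s3cmd put bigfile.raw s3://testbucket/
    s3cmd ls s3://testbucket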

There is an increasingly large amount of good work being done on scalable file-systems in the Open Source community, but if you want to run a mixed cluster with Windows and Linux sharing block access to file-systems, you are currently stuck with commercial options such as StorNext and GPFS.

However, if you are looking to build your own scalable NAS or Cloud, you have lots of options. And even if you still go with a commercial option, building your own is a good place to learn what questions to ask.


Apple kills the Cloud!

I think Apple might have put the final nail in the coffin of the term Cloud; no more will vendors want to call their offerings Cloud.

‘Cloud, isn’t that something that Apple do? Isn’t that somewhere people can save their iTunes? Why would I want to implement one here?’

‘Run my applications in the Cloud? We use Windows and Unix, not Macs; surely it’ll be very expensive to move all my applications to Mac!’

‘Cloud, isn’t that a bit like iCloud but not as cool and user-friendly?’

So let’s hope that in their maniacal plan to take over the world, Apple have done us all a favour and killed the term Cloud in ‘corporate IT’.


Open Source Scale Out Storage

So you want to build yourself a storage cloud but you don’t have the readies to build one using one of the commercial products which are available.  Well, don’t worry, there are open source alternatives which might allow you to get a taste of Scale Out without breaking the bank.

Gluster is one such open source alternative and is now part of OpenStack, the open source cloud computing platform being built by a number of developers and vendors.

Gluster is available as a commercial software appliance, or you can simply download the packages and install them on a variety of Linux distributions, including Ubuntu and Redhat derivatives. I have recently built a small cluster using Scientific Linux 6.0 (Scientific Linux is my new favourite Redhat-derived Linux, and SL 6.0 is based on RHEL 6) and ESXi.

The initial set-up is pretty easy and it took me less than a couple of hours to stand up a three-node cluster and build a small environment. The documentation is clear and should be simple to follow for anyone with a modicum of Linux knowledge.
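As a rough sketch of what is involved (the volume and host names here are just my lab’s, and I built a plain distributed volume, which is the default):

    # from the first node: introduce the peers, then create, start and mount a volume
    gluster peer probe node2
    gluster peer probe node3
    gluster volume create testvol node1:/export/brick node2:/export/brick node3:/export/brick
    gluster volume start testvol
    mount -t glusterfs node1:/testvol /mnt/gluster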

I will give people a couple of tips: if you do not want to play with iptables, turn it off to get yourself up and running. And the latest version of Gluster requires rsync 3.0.7 for its geo-replication; there does not appear to be an RPM for RHEL 6.0 at present, but the Fedora RPM appears to work fine.
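In other words, something along these lines; strictly for a test rig, mind:

    # disable the firewall whilst experimenting - do not do this in production
    service iptables stop
    chkconfig iptables off
    # geo-replication wants rsync >= 3.0.7; the Fedora package installs cleanly
    # (the exact package file name will vary)
    rpm -Uvh rsync-3.0.7-*.fc*.x86_64.rpm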

Adding additional nodes is simple; I quickly added a fourth virtual node non-disruptively, and then it was simply a case of telling Gluster to rebalance the files across the nodes.
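Assuming the same hypothetical volume as above, it is only a handful of commands:

    # bring the new node into the trusted pool, add its brick and rebalance
    gluster peer probe node4
    gluster volume add-brick testvol node4:/export/brick
    gluster volume rebalance testvol start
    gluster volume rebalance testvol status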

But natively supporting only Linux means that if you want to serve files to other operating systems, you have to utilise NFS or CIFS. There seems to be conflicting information on whether Gluster supports CTDB and the necessary locking, so at present I am only exporting NFS from a single node with no fail-over support. My next experiment will be to see if I can get it configured as a true scale-out NAS solution.

I will let you know how I get on!!


Future Positive

Chuck’s post about Big Data Storage (as opposed to your tiny data storage) is pretty much on the money; simplicity is very much the key and it is this ‘no frills’ approach to storage which means that small teams can manage large amounts of storage with the minimum amount of fuss.

The key for us is the ability to scale quickly and easily: add storage in and then simply use software to do the clever stuff like balancing; be it OneFS, GPFS or StorNext, the job is relatively simple, assuming that you’ve done the initial set-up correctly.

Much of what he says about Snaps, DeDupe and the other features of the more traditional general-purpose storage arrays also rings very true. Much of the data we deal with does not lend itself to DeDupe, and there are some other interesting aspects to our data; for one, once a file is written, it is never changed.

Think of it like a RAW file from a digital camera: you don’t actually change the file, you develop it. In some applications, we save a file which details the edits and transformations required to produce the processed file; in others, we save a copy of the processed file.

Replication and archive are handled at an application level; we could do it at the file-system or storage level, but it is easier to let the application handle it, and this means we can be completely storage-agnostic.

We are relinquishing a certain amount of control and empowering the user to take responsibility for their environment. They decide on the number of replicas they require and to a certain extent, they decide as to where these replicas are stored; if they want to store eight copies locally, then they are empowered to do so.

We do need better instrumentation to allow us to look at ways of telling them exactly how much data they are storing and also how much bandwidth etc they are consuming. This could develop into a charge-back system but I suspect it will be more an awareness exercise. It would also allow us to model the impact of moving to a public cloud provider for instance where both available bandwidth and bandwidth consumption are important factors.

Looking at this model may have longer-term implications for general-purpose storage; if VMware and other operating systems continue to add features such as snapshots and replication to their software stacks, then the back-end array becomes almost entirely commodity.

If we consider the impact of caching flash moving up into the server stack, yet more array functionality becomes server-level functionality. Of course, there are still challenges in this space around ensuring that the array is aware of and co-operating with the server, but if replication and the other functions are actually carried out at the server level, then this becomes more feasible.

I can imagine a time when the actual brains of the array are completely virtualised and live on the servers; virtualised VMAXs, VNXs, Filers, v7000s etc are all within the realms of possibility in the near future and probably exist today in various vendor labs. The rust/SSDs become a complete commodity item.

Where does this leave the storage team? In the same place it is today, at the core of the Enterprise working to provide simple and effective access to the information asset; they may be working at different levels and collaborating with different people but they’ll still be there. But they might have smiles on their faces instead of frowns…now there’s a thought!

The Complexity Conspiracy

For many years, there’s been a cosy little conspiracy between vendors, IT departments, analysts and just about everyone else involved in Enterprise IT: that it is complex, hard and generally baffling to everyone else. And in our chosen specialism, Enterprise Storage, we are amongst the most guilty of all.

We cloak simple concepts in words which mean little or can have multiple meanings; we accept bizarre and arcane commands for basic operations; we accept weird and wonderful capitalisations; we map obsolete commands onto new functions, adding yet further obfuscation; and we allow vendors to continue to foist architectures on us which made perfect sense fifteen years ago but have no real validity now.

At times I wonder just what a mess we would be in if new vendors such as 3PAR and Compellent had not come along and massively simplified things; hands have been forced a bit, but arguably this has not yet gone far enough. The mid-range systems are generally better than they were, but we need to see this pushing up the stack to the high-end.

It is not enough to sit back and plead ‘backwards compatibility’ as an excuse for not revisiting tools and terminology.

I think what I would like to see is a big push to simplify and clarify terminology; let people in and stop veiling with false complexity. And ironically enough, I think if we were to do so; we might find that in de-mystifying what we do, our users appreciate what we do more.

It will become easier to explain concepts such as availability and recoverability; the concepts will become better understood and appreciated, and with that understanding there will be more demand for them. Hand-waving and muttering that it’s complex and pretending to be the gate-keepers to IT nirvana is no longer really a tenable position. Our users are going off to do their own things and they no longer believe in the power of the IT department.

They *know* that this stuff is not complex and they are going to prove it themselves. But although it is not fundamentally complex, it is not always easy; we only have to look at the impact that large Cloud outages have. We do know how to do this, but we have to share and embrace; people often talk about how Cloud can enable collaboration, and it is true: it enables and encourages collaboration at a multitude of levels.

But this is not just about Cloud; it is about a change in how we work with our end-user colleagues; it is about telling our vendors that the systems they are shipping are unnecessarily complex. We should be demanding simplicity and not allowing them to ship us product which is only usable by occultists, engineers or whatever else we want to call ourselves; we need to invest in simplifying our environments and we need to stop being complicit in a ‘Conspiracy of Complexity’.

VMware gets a Rocket

So VMware carry on along the acquisition trail, now picking up SlideRocket. SlideRocket, a SaaS provider specialising in enabling the creation and sharing of presentations (think Cloud PowerPoint), appears to be another brick in the strategy of enabling VMware to cover all bases when it comes to all things Cloud.

I guess we have got to expect this to be integrated somehow into the Zimbra Collaboration Appliance, and I guess the next thing on the shopping list is going to be a full Cloud-based Office suite as well. It seems that VMware plan an aggressive expansion into as many parts of the enterprise stack as possible.

Or is it that VMware have a lot of vapourware presentations that they want to get out there and share? Perhaps they are fed up with paying Microsoft licensing fees for PowerPoint?

Who knows? But let’s keep that bubble expanding….the vBubble!

Cloud Storage without Cloud….

Another day, another new Cloud Storage Service. Today I got an invite to AeroFS, which is a Cloud Storage Service with a difference: it doesn’t necessarily store your data in the Cloud unless you ask it to. What it does do is manage a number of folders (AeroFS calls them libraries) and allow you to sync them between your various machines using a peer-to-peer protocol.

You can share the folders with other people on the service, and you can also decide which of the folders get synced to each of your machines, which gives you fairly coarse-grained sync control. You also decide which of the folders get backed up to the Cloud, so it is possible to back up just those folders that are important.

There is client support for Windows, Mac and Linux at present.

Currently the service is an invite-only alpha and I’ve not had a huge amount of time to play with it, but it looks like a potentially interesting alternative to Dropbox, though it will need mobile clients to truly compete. I do like the P2P aspects of the service and I do like that I can sync pretty much unlimited data between the clients. It is certainly one to watch.

AeroFS is here.