If you are a regular reader of this blog and a follower of my tweets, you will be aware that I work for one of the UK's largest broadcasters and that, at the moment, I'm working on a massive media archive. I doubt there is a media company in the world that isn't trying to work out how to move away from what is a fairly tortuous workflow, one which involves a huge number of systems and eventually ends up on a tape.
When I say tape, it is important to differentiate between video-tape and data-tape; life can get very confusing when talking about digital workflows, and you find people talking about tapeless systems which actually use a huge amount of tape. What we are doing is moving from a video-tape based system to a data-tape archive.
But this little entry isn't about tape; it's about disk and, more importantly, about how to build a massively scalable archive with some very demanding throughput requirements. In fact, what we are building is a specialised storage cloud, and much of what we are doing has application in many industries.
The core of this storage cloud is a cluster file-system; more importantly, it is a parallel cluster file-system, allowing multiple servers concurrent access to the same data. The file-system appears to users and applications as a standard file-system and is pretty much transparent to them. There are a number of cluster file-systems around, but we have just chosen the venerable GPFS from IBM.
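To give a feel for what "transparent" means here: applications just use ordinary file calls and GPFS handles the striping underneath. A minimal Python sketch, with the /gpfs/archive mount point and file name invented purely for illustration:

```python
# Minimal sketch: GPFS presents itself as an ordinary POSIX file-system,
# so applications need no special API. The mount point and file name
# below are hypothetical -- substitute your own.
import os

MOUNT = "/gpfs/archive"
path = os.path.join(MOUNT, "clip0001.meta")

# Write and read exactly as you would on any local disk; the cluster
# file-system stripes the data across the servers behind the scenes.
with open(path, "w") as f:
    f.write("title=Evening News\nduration=00:28:30\n")

with open(path) as f:
    print(f.read())
```

That transparency is what makes it attractive for a media workflow: the applications moving material in and out of the archive don't need to know they are sitting on a clustered file-system.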
When I say venerable, I mean it; I first came across GPFS over ten years ago, but at that time it was known as MMFS (multimedia filesystem), or Tigershark; it was an AIX-only product and a pig to install and get working. But it supported wide-striping of data and its read performance was incredible. I put it into what was then a large media archive at a UK university (it is fair to say that I probably have more media on my laptop now), but it was very cool at the time.
MMFS then mutated into GPFS, and what was already arcane descended into the world of the occult. GPFS was aimed at the HPC community and as such ran on IBM's SP2; I suspect it was at this point that GPFS got its reputation as an incredibly powerful but extremely complex environment to manage. And running on top of AIX, it was never going to set the world alight. But in its niche, it was a very popular product. However, at about GPFS 1.x, I moved away from the world of HPC and never thought I would touch GPFS again.
Between then and now, various releases have come and gone, largely unnoticed, I suspect, by the world at large. However, 2005 brought an important release: GPFS 2.3 added Linux support. I suspect it was at this point that GPFS started to make a quiet comeback and move back into what was its original heartland: the world of media.
So here we are in 2010; GPFS now sits at version 3.3 and supports Linux, AIX and Windows Server 2008. Solaris was on the roadmap but has fallen off, and there appear to be no further plans to support it.
Figures are always good to throw in, I guess, so let's get some idea of how far this will scale:
- Maximum nodes in a cluster: 8,192 (architectural); 3,794 nodes tested (Linux)
- Maximum file-system size: 2^99 bytes (architectural); 4 PB tested
- Maximum number of files in a file system: 2 billion
- Maximum number of file systems in a GPFS cluster: 256
So it'll scale to meet most needs, and more importantly there are clusters out there driving over 130 gigabytes per second of throughput to a single sequential file. We don't need anything like that throughput yet.
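Just to put those figures next to each other, a quick back-of-the-envelope sketch in Python, taking the quoted numbers as-is and assuming decimal units:

```python
# Back-of-the-envelope: how do the tested figures relate to each other?
# 4 PB tested file-system size, ~130 GB/s to a single sequential file
# (both figures as quoted above; decimal units assumed).

TESTED_FS_BYTES = 4 * 10**15     # 4 PB
THROUGHPUT_BPS = 130 * 10**9     # 130 GB/s

seconds_to_fill = TESTED_FS_BYTES / THROUGHPUT_BPS
print(f"Filling 4 PB at 130 GB/s takes ~{seconds_to_fill / 3600:.1f} hours")

# The architectural limit of 2^99 bytes is another matter entirely:
print(f"2^99 bytes is roughly {2**99 / 10**15:.2e} PB")
```

In other words, even at that sort of throughput the tested file-system size is well under a day's worth of writing, and the architectural limit sits some fourteen orders of magnitude beyond that again.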
And of course it meets those enterprise requirements such as the ability to add and remove nodes, disks, networks and so on on the fly. Upgrades are allegedly non-disruptive; I say allegedly because we've not done one yet.
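To give an idea of what "on the fly" looks like, growing the cluster is an admin command rather than a maintenance window. A rough sketch, assuming the GPFS admin commands are on the PATH of an existing cluster node with admin rights; the node name is invented and exact flags can vary between releases:

```python
# Rough sketch of adding a node to a running GPFS cluster, no outage
# required. Assumes the GPFS admin commands (mmaddnode, mmlscluster)
# are installed and on the PATH, and that this runs on an existing
# cluster node with admin rights. "mediaserver05" is a made-up name.
import subprocess

NEW_NODE = "mediaserver05"

# Join the new node to the live cluster.
subprocess.run(["mmaddnode", "-N", NEW_NODE], check=True)

# Confirm the cluster now lists the new node.
subprocess.run(["mmlscluster"], check=True)
```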
It also has a number of features which lead me to feel that GPFS is almost cloud-storage-like in nature; it's not quite there, but it's very close and could well become so.
More in Part 2….