News

Storage basics

Simon Sharwood

Delivering data more quickly is a challenge every IT shop faces, but never more so than when trying to get information off disks and in front of the people who want to wield it. Indeed, there are few IT pros who do not realise that storage is something of a bottleneck as they go about their everyday chores.

Storage has achieved this dubious status for two reasons. The first is that data flows in and out of storage arrays in quantities that networks simply were not designed to carry over sustained periods.

The second is that disks themselves can only bury and disinter data so quickly.

So how can businesses increase the speed at which data moves from disk to user, disk to machine or disk to application?

Technology itself offers several answers, as there is a definite hierarchy of speed in the storage arena, and the technology choices that determine where a system sits in that hierarchy are among the most hotly debated ways to make storage operate faster.

The most basic choice is that of disk drive. Hard disk makers have all sorts of tricks to make it possible for their machines to read data from their platters and send it on its way to the processor of a PC or a network.

The most basic of those techniques is the speed at which a hard disk rotates. Slow drives turn a mere 4200 times per minute and, because any given sector passes under the drive's read heads only once per revolution, offer far fewer chances to capture data than a drive that spins at 15,000 RPM. Faster drives enable faster access by reducing rotational latency, the time taken before the required data reaches the read head. 4200 RPM drives can require over seven milliseconds to bring the required data to a position where the read head can do its job. 15,000 RPM drives can get the same task done in not much more than two milliseconds, and still have time to roll a metaphorical cigarette while they wait for their next task!
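For the curious, the arithmetic behind those figures is simple enough to sketch in a few lines of Python: on average the platter must turn half a revolution before the required sector arrives under the head, so latency falls in direct proportion to spindle speed. These are idealised averages, not measured drive specifications.

```python
# Idealised average rotational latency: on average the platter must turn
# half a revolution before the required sector passes under the read head.
def avg_rotational_latency_ms(rpm: int) -> float:
    seconds_per_revolution = 60.0 / rpm
    return (seconds_per_revolution / 2.0) * 1000.0  # half a turn, in milliseconds

for rpm in (4200, 7200, 10000, 15000):
    print(f"{rpm:>6} RPM -> {avg_rotational_latency_ms(rpm):.1f} ms average rotational latency")
```

Run it and the 4200 RPM drive comes in at roughly 7.1 milliseconds, the 15,000 RPM drive at 2.0.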

It will come as no surprise to learn that faster drives cost more money, quickly making a cost/benefit equation a part of any attempt to speed storage performance.

A second technique used to make hard drives faster is increasing the density of data on each platter. This is a worthwhile endeavour because the denser the data, the less physical movement involved in retrieving it. Hard disk makers are therefore in perpetual pursuit of ways to pack more and more data into each square centimetre of disk space, the better to facilitate quick retrieval of information.
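A rough sketch shows why density matters for speed: sustained transfer rate is roughly the data packed along a track multiplied by how many times that track passes the head each second, so doubling linear density at a fixed spindle speed roughly doubles throughput. The per-track capacities below are invented for illustration, not real drive specifications.

```python
# Rough model: sustained sequential throughput is roughly the data packed
# along one track multiplied by the number of revolutions per second.
# The per-track capacities used here are invented for illustration only.
def sequential_mb_per_s(track_kb: float, rpm: int) -> float:
    revolutions_per_second = rpm / 60.0
    return (track_kb / 1024.0) * revolutions_per_second  # MB passing the head per second

rpm = 7200
for track_kb in (256, 512, 1024):  # each step doubles the linear density
    print(f"{track_kb:>5} KB per track at {rpm} RPM -> ~{sequential_mb_per_s(track_kb, rpm):.0f} MB/s")
```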

The latest and greatest trick from hard disk makers is perpendicular writing to the disk surface. Traditional hard drives align the poles of the media they use horizontally, "smearing" each piece of data onto a region of the disk's surface.

Perpendicular storage technology instead seeks to align the poles of the media vertically, almost "poking" data down into the media. This technique requires less physical space on the media and therefore enables greater density, which in turn allows faster access and therefore faster retrieval.

Perpendicular technology is coming to market now, with Hitachi's newly-available one-terabyte disk an early example.

Off the disk and onto the bus

Data's next step in its journey off the disk is through the bus connecting media to other components in a computer or storage device.

The main current contenders in this space are the Serial ATA (SATA) and Small Computer System Interface (SCSI) standards, each of which is built into hard drives to give them an interface to the rest of the electronic world.

Each has its advantages, with SCSI enjoying a theoretical speed advantage and higher sustained throughput. SCSI also has an edge in reliability, as SCSI drives will typically outlast SATA drives.

But SCSI is not cheap to implement and UltraSCSI drives (UltraSCSI is the most advanced form of the standard and the most common today) are therefore more expensive than their SATA cousins.

SATA is also said to be a simpler standard to implement, which means cheaper drives, although that economy comes with reliability issues. On the upside, however, is a newish SATA feature called native command queuing (NCQ), which lets a drive hold several outstanding commands at the same time and service them in a more efficient order, speeding access.
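A toy illustration of what command queuing buys is sketched below. The request addresses and the crude nearest-first reordering are invented for the example; real drives use their own, more sophisticated scheduling. The point is simply that, with several requests outstanding, a drive can service them in an order that minimises head travel rather than strictly first come, first served.

```python
# Toy model of command queuing: the drive holds several outstanding
# requests and services them in an order that reduces head travel,
# rather than strictly first come, first served.
# The logical block addresses (LBAs) below are arbitrary examples.
def total_head_travel(start: int, order: list[int]) -> int:
    travel, position = 0, start
    for lba in order:
        travel += abs(lba - position)
        position = lba
    return travel

pending = [9000, 150, 8800, 300, 8900]   # queued requests, in arrival order
head = 200                               # current head position

fifo_order = pending                                          # no queuing: arrival order
reordered = sorted(pending, key=lambda lba: abs(lba - head))  # crude nearest-first stand-in

print("Head travel, arrival order:", total_head_travel(head, fifo_order))
print("Head travel, reordered:    ", total_head_travel(head, reordered))
```

The reordered queue covers a fraction of the distance of the arrival-order queue, which is where the speed-up comes from.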

Consensus suggests that SCSI is the best option for core storage tasks, with SATA the better candidate for secondary storage roles where speed is important but constant, heavy-duty data wrangling is less likely to be required.

Squeezing extra speed out of either standard is very much a case of understanding where each can make a difference when faced with particular storage needs.

Network complications

Complicating matters further is the choice of network over which to carry data, and the question of which network can afford greater speed.

The complication comes from the fact that standards like Fibre Channel ATA (FATA) can let SATA drives connect to a Fibre Channel network. Fibre Channel is, of course, a networking protocol and was initially developed as the successor to HIPPI, a 1980s favourite for machine-to-machine networks that screamed along at then-amazing speeds of 800Mbit/s.

Fibre Channel aimed to exceed that speed and was later adapted as a way to connect SCSI disks in recognition of the need for high-speed data exchange between storage arrays, with its 4Gbit/s throughput a delightfully speedy alternative.

Fibre Channel can now connect to SATA or SCSI drives with few worries about the kind of disk at the other end. It is fast, reliable and widely used.

It remains, however, a separate networking technology in its own right. With Ethernet the world's dominant networking protocol, many organisations wonder whether it makes sense to acquire Fibre Channel and the equipment and skills needed to operate it, especially now that Ethernet has reached 10Gbit/s and there are storage devices more than happy to take advantage of that speed.
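The raw line-rate arithmetic behind that temptation is easy to sketch, bearing in mind that it ignores protocol overhead, array performance and congestion, all of which matter enormously in practice.

```python
# Raw line-rate comparison only: protocol overhead, array performance and
# network congestion are ignored, and all matter enormously in practice.
def hours_to_move(terabytes: float, gigabits_per_second: float) -> float:
    bits = terabytes * 8 * 1000**4                    # decimal terabytes to bits
    return bits / (gigabits_per_second * 1e9) / 3600  # seconds to hours

for name, gbps in (("HIPPI, 0.8", 0.8), ("Fibre Channel, 4", 4.0), ("10Gb Ethernet, 10", 10.0)):
    print(f"{name:>18} Gbit/s: ~{hours_to_move(1.0, gbps):.2f} hours to shift 1TB")
```

At line rate, a terabyte that takes nearly three hours over HIPPI takes a little over half an hour over 4Gbit/s Fibre Channel and around 13 minutes over 10Gbit/s Ethernet.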

Organisations looking for storage speed are now therefore conducting interesting trade-offs of speed vs. reliability vs. cost of acquisition vs. cost of ownership.

Less is more

Another issue affecting the search for speed is the growing array of tools, techniques and technologies that seek to reduce the amount of data that must be carried, with data de-duplication to the fore, offering users a chance simply to store less data.

Most approaches to this issue seek to shrink the amount of data stored, the better to speed its passage over networks, disks and buses alike. Few vendors' approaches to the problem are the same and all therefore deserve individual consideration.
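The core idea behind block-level de-duplication can nonetheless be sketched in a few lines: chop data into chunks, identify each chunk by a hash of its contents, and store any given chunk only once. The sketch below is a bare-bones illustration of that principle, not any particular vendor's implementation, which will differ in chunking, hashing, indexing and collision handling.

```python
import hashlib

# Bare-bones block-level de-duplication: split data into fixed-size chunks,
# identify each chunk by a hash of its contents, and store each unique
# chunk only once. Real products differ in chunking, hashing, indexing
# and collision handling; this shows only the principle.
CHUNK_SIZE = 4096
chunk_store: dict[str, bytes] = {}   # chunk hash -> chunk contents

def dedup_write(data: bytes) -> list[str]:
    """Store data, returning the list of chunk hashes needed to rebuild it."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)   # duplicate chunks are not stored again
        recipe.append(digest)
    return recipe

def dedup_read(recipe: list[str]) -> bytes:
    return b"".join(chunk_store[digest] for digest in recipe)

payload = b"A" * 8192 + b"B" * 4096 + b"A" * 4096   # 16KB of data with repeated chunks
recipe = dedup_write(payload)
stored = sum(len(chunk) for chunk in chunk_store.values())
print(f"{len(payload)} bytes written, {stored} bytes actually stored")
assert dedup_read(recipe) == payload
```

In this contrived example, 16KB of logical data lands on disk as 8KB, and the saving grows as duplicate data accumulates.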

That kind of careful assessment, to ensure that storage systems meet your need for speed and general utility, underpins the approach preferred by Kevin McIsaac, an analyst with Intelligent Business Research Services.

Mr McIsaac's February 2007 paper "How to get storage acquisition and planning right" argues that "new storage acquisitions result in sub-optimal storage infrastructure as the requirements are driven by the needs and demands of a new high-profile application implementation."

"This results in a storage infrastructure that cannot be shared, or optimised, across the application portfolio because the planning does not take into account the needs of other applications, either existing or planned."

Mr McIsaac therefore advocates a three-phase planning process to ensure that storage meets a business' needs. The first is a requirements analysis phase that looks beyond purely speed-oriented metrics such as the RPM figures outlined above, instead considering the desired outcome from a new storage purchase and defining "Storage Service Levels" that are "based on the value of the application to the business' operations."

Phase two, he says, is about solution mapping, and is the time when specific technologies and their capabilities can be mapped against business needs to deliver the required Storage Service Levels.

The final phase, negotiation, sees the cost of the technology compared to the required service levels to "negotiate for trade-offs in the service levels for lower costs." The outcome of this phase, the paper says, can be changes to priorities as the realities of the costs involved in achieving desired service levels are traded off against their affordability or necessity.

The outcome of that process, it is hoped, is that organisations get the storage and the storage performance needed to make IT professionals' daily lives just a little bit easier, at least with regard to data whirring about on disks.

Yet with a vast set of different storage technologies on offer and even more complex sets of circumstances in which they can be deployed, this will surely remain an area of IT that is hardly ever easy to tackle, even if the technologies on offer are always improving.