The code that computers haven't cracked

Too much information: IT systems may be getting faster all the time, but they can't handle the growing mountains of data

Stephen Pritchard
Sunday 05 December 2004 01:00 GMT

The computer hard disk will be 50 years old next year. The first drive, the IBM 350, was part of the IBM 305 RAMAC computer and consisted of a stack of two-foot-wide rotating magnetic platters. The device could store five million characters, at that time an enormous quantity of data. A 250GB hard disk - storing 50,000 times as much data - now costs under £100. And drive capacities are doubling roughly every 18 months, while prices stay the same.

But this growth is posing a threat to business, which has capitalised on developments since Colossus, one of the world's first electronic computers, was cracking Hitler's wartime codes at Bletchley Park in Buckinghamshire.

Moore's Law, popularly taken to mean that microprocessors will double in power every 18 months, has led to faster, cheaper computers that have boosted productivity in business. But the growth in storage threatens to have the opposite effect. Computer systems are struggling to keep up with the need to handle ever-greater quantities of data. IT departments face the prospect of spending more and more time and money on data management and archiving.

Research by US consultancy Horison shows that the volume of data being stored worldwide for critical business applications has increased from four exabytes (one exabyte is 1 billion gigabytes) in 2002 to 10.2 exabytes this year. Add in data such as digital photos and music files - much of which ends up stored by individuals on their office computers - and the volume rises even more dramatically, to 51 exabytes. This will grow more than fourfold by 2007.

Already, data storage accounts for between 15 and 18 per cent of IT budgets among Fortune 500 companies, says Dave Quinones, a senior partner at consultants Accenture. Much of the initial investment took place during the IT spending boom of the late 1990s.

But, Mr Quinones points out, IT departments bought dedicated storage for specific applications, leading to "islands of storage" that were neither linked, nor used efficiently.

"As a result, utilisation rates are low, storage is not being well managed and the total costs of ownership are going through the roof," he says. On Windows-based systems - considered in the industry to be the least efficient for storage - as much as 80 per cent of hard drive space goes unused.

For a while, the falling costs of hard drives masked the problem of inefficient storage. But ever-larger hard disks tempted individual users and IT managers to keep more and more data on their PCs or servers, rather than archiving it to cheaper storage, such as tape, or even deleting it. Email, in particular, is to blame. "Any chief information officer will say it is email that is the data hog," says Joe Tucci, the chief executive of storage equipment maker EMC. "But emails themselves generate very little data. The big data is the attachments, and these attachments get duplicated."

Unfortunately for IT departments, adding more storage, however cheap, to a system increases management costs. It can hit the performance of computers, too. According to Phil Dawson, an enterprise computing specialist at research company Meta Group, capacity growth is outpacing the ability of manufacturers to move data on and off the drives. This slows down computer systems, forcing companies either to invest in more expensive drive technologies with better read-write speeds, or to build their storage from multiple, smaller and faster drives. "The dilemma is that the [disk] platters are doubling in capacity year on year or maybe every 18 months, but the performance of the drives is not," says Mr Dawson.

And businesses also need to think about how they will back up the data from their growing storage mountains. Tape technology has not advanced as quickly as disk drive designs, forcing companies to invest in more expensive technologies, such as auto-loading tape libraries or even systems that fool back-up software into thinking that it is copying data not to tape, but to another hard disk.

As well as the cost of the hardware, these systems need software to run properly, and trained staff to manage them. Pat Martin, the chief executive at US data management company StorageTek, estimates that the overheads for storage currently account for between $2 and $3 for every dollar spent on the hardware itself. As companies hold more information, he warns that this figure could rise to between $3 and $4.
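Mr Martin's ratio can be turned into a rough total-cost estimate. The short Python sketch below is illustrative only: the £100,000 hardware budget and the function name are assumptions made for the example, not figures from StorageTek.

```python
def total_storage_cost(hardware_spend: float, overhead_per_unit: float) -> float:
    """Hardware spend plus management overhead.

    overhead_per_unit is the overhead incurred for every unit of
    currency spent on hardware (2 to 3 today, 3 to 4 projected).
    """
    return hardware_spend * (1 + overhead_per_unit)


# Hypothetical hardware budget, chosen only for illustration.
hardware = 100_000

scenarios = {"current": (2, 3), "projected": (3, 4)}
for label, (low, high) in scenarios.items():
    low_total = total_storage_cost(hardware, low)
    high_total = total_storage_cost(hardware, high)
    print(f"{label}: {low_total:,.0f} to {high_total:,.0f} in total")
```

On these assumptions, hardware accounts for only a quarter to a third of the total bill today, and that share would shrink further if overheads rise as predicted.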

Deleting data, however, might not be an option: US government regulations are demanding that companies keep more, not less data. "Because of the Sarbanes-Oxley Act and the SEC [the US Securities and Exchange Commission], businesses are required to save email for between three and five years," says Mr Martin. "Some information must be kept much longer: aircraft design and test information has to be kept for 25 years, and test results for chemicals for 50 years."

To keep information for this long, and to ensure that regulators and other interested parties can find it again in the future, businesses need to turn to technology to manage data in a more active way. Technologies known as information lifecycle management (ILM) or hierarchical storage systems move older and less used data from expensive main disk systems to cheaper backup drives and then, eventually, to tape for archiving. Storage companies have also developed techniques that find duplicate copies of data, such as email attachments, and ensure that only one copy, rather than dozens, is kept on hard drives.
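The two ideas in the paragraph above, tiering by age and keeping a single copy of duplicated data, can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's product: the age thresholds, tier names and sample files are all assumptions, and duplicates are detected here by comparing SHA-256 hashes of file contents, which is one common approach among several.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str
    content: bytes
    days_since_access: int


def choose_tier(days_since_access: int, disk_days: int = 30, backup_days: int = 365) -> str:
    """Pick a storage tier from how recently the data was used.

    The 30-day and 365-day thresholds are illustrative assumptions.
    """
    if days_since_access <= disk_days:
        return "main disk"
    if days_since_access <= backup_days:
        return "backup disk"
    return "tape archive"


def deduplicate(files):
    """Store each distinct content once, keyed by its SHA-256 hash.

    Returns the unique contents and a map from file name to hash,
    so every file can still be found through its reference.
    """
    unique = {}
    references = {}
    for f in files:
        digest = hashlib.sha256(f.content).hexdigest()
        unique.setdefault(digest, f.content)
        references[f.name] = digest
    return unique, references


# Hypothetical files: the same attachment saved by two users, plus an old document.
files = [
    StoredFile("alice/report.pdf", b"quarterly report", 3),
    StoredFile("bob/report.pdf", b"quarterly report", 90),
    StoredFile("carol/old_plan.doc", b"2001 plan", 900),
]

unique, references = deduplicate(files)
for f in files:
    print(f.name, "->", choose_tier(f.days_since_access))
print(len(files), "files,", len(unique), "copies actually stored")
```

In this sketch the two identical attachments share one stored copy, while each file is still assigned a tier according to how recently it was used.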

Such systems can also ensure that companies comply with legal and regulatory requirements for keeping data. Regulators are asking to see evidence of companies' archiving arrangements at a very technical level to ensure data really can be retrieved. But, according to Mr Quinones, there is an upside to such pressures.

"Compliance is creating an opportunity for companies to tackle their total cost of ownership issues," he explains.

This is because the same systems that businesses will need to make sure they meet the regulators' data archiving requirements should also make storage far more efficient in the longer term. But boards will have to accept that they need to spend more on storage now, to save money in the future.
