Whether you're an insurance company trying to detect fraud, an oil company analyzing exploration data to find that next lucrative well, or a retailer who needs to optimize pricing of merchandise, the issue is the same: too much data, and not enough time and horsepower to do a thorough analysis of it. How can you process all that data in the shortest possible time?
One huge bottleneck is moving the data from where it's stored on a hard disk into the computer's memory, where computations happen. It takes time to access a disk, find the right data and transfer it to memory, and even fractions of a second count when you're working with millions of records. In many cases, companies work around the problem by analyzing only a representative subset of the data, hoping that the result is the same as it would have been had they analyzed all of it.
The solution? In-memory computing – bypassing the disk and storing the data in the machine's main memory during computations, according to vendors such as Software AG, SAP AG, Tibco Software and SAS Institute. Analysts at research firm Gartner agree, naming in-memory computing as one of the top 10 strategic technology trends for 2012.
Tech companies are taking two approaches. In the one espoused by Software AG, the entire database is permanently stored in memory rather than on disk. With its Next-Generation Data Management Platform, based on Terracotta BigMemory technology, the company claims that running a risk analysis of a large amount of data can drop to 45 seconds from 45 minutes because the data is already readily available.
In fact, Karl-Heinz Streibich, Software AG's chief executive officer, says the traditional disk-based SQL database is nearing the end of its usefulness in handling large volumes of data. With the cost of in-memory technology falling, real-time analytics have become practical.
Now that memory has become inexpensive, it's feasible to store gigabytes (a gigabyte is 1024 megabytes), or even petabytes (1,048,576 gigabytes), of data exclusively in memory.
How inexpensive is it? One megabyte that would have cost $90 in 1990 goes for a whopping half a cent today, according to computer scientist and researcher John C. McCallum's listing of prices dating back to 1957 on jcmit.com.
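To put those prices in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the two per-megabyte figures quoted above and the 1,024-based unit sizes; the variable names are illustrative, and real-world pricing for server-grade memory will differ.

```python
# Prices per megabyte, in cents, as quoted in the article.
# Working in cents keeps the arithmetic exact (0.5 is exact in binary).
CENTS_PER_MB_1990 = 9000.0   # $90 per megabyte in 1990
CENTS_PER_MB_TODAY = 0.5     # half a cent per megabyte today

MB_PER_GB = 1024
GB_PER_TB = 1024

# How many times cheaper a megabyte has become.
drop_factor = CENTS_PER_MB_1990 / CENTS_PER_MB_TODAY

# Cost in dollars of one gigabyte and one terabyte at each price.
dollars_per_gb_1990 = CENTS_PER_MB_1990 * MB_PER_GB / 100
dollars_per_gb_today = CENTS_PER_MB_TODAY * MB_PER_GB / 100
dollars_per_tb_today = dollars_per_gb_today * GB_PER_TB

print(f"Price drop factor:    {drop_factor:,.0f}x")
print(f"1 GB at 1990 prices:  ${dollars_per_gb_1990:,.2f}")
print(f"1 GB at today's price: ${dollars_per_gb_today:,.2f}")
print(f"1 TB at today's price: ${dollars_per_tb_today:,.2f}")
```

At those figures a megabyte is 18,000 times cheaper than it was in 1990, and a gigabyte that would have cost $92,160 then works out to about $5.12 now.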
Does in-memory storage spell the end of the hard disk? Probably not – there are excellent reasons to continue to use disks for some applications. But it gives users a choice.
Another approach to in-memory computing, taken by the SAS Institute, combines parallel processing with in-memory technology. Its new LASR analytic server, part of its High Performance Analytics platform, can process a billion rows of data in minutes, or even seconds, rather than hours or days. That means business decisions can happen quickly, rather than lagging far behind the results that drove them.
Parallel processing has been around for a while, but not in mainstream computing. Think of it this way: if you have a giant jar of jellybeans to count all by yourself, it'll take quite a while, but if you gather some friends, pour the candies into separate dishes, and let each person count their own dish, it goes a lot quicker. Add up the totals your pals have come up with, and you have a result in a fraction of the time it would have taken to count the jellybeans yourself. That's parallel processing.
In SAS's world, the jar of jellybeans is a database. The company has built technology that divvies up the records and stashes them in the memories of a bunch of interlinked computers so the data is virtually instantly accessible. Each computer processes its own subset of the data, then the results are combined to give you your answer. At a recent conference in Orlando, Fla., the company demonstrated the technology with one billion records, whipping through them in 32 seconds on the LASR server (which was composed of 48 blade computers processing in parallel).
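The divide, count and combine pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not SAS's implementation: the function names and the jellybean data are made up, and the worker threads here stand in for what would be separate computers, each holding its share of the records in its own memory.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_dishes(records, workers):
    """Deal the records into `workers` roughly equal partitions."""
    return [records[i::workers] for i in range(workers)]


def count_red(partition):
    """Each worker counts the red jellybeans in its own dish only."""
    return sum(1 for colour in partition if colour == "red")


def parallel_count(records, workers=4):
    """Split the data, count each part concurrently, then add the totals."""
    partitions = split_into_dishes(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_red, partitions))
    # Combining step: add up what each worker reported.
    return sum(partial_counts)


# Hypothetical jar: 10,000 jellybeans, two in every five of them red.
jellybeans = ["red", "green", "yellow", "red", "black"] * 2000
total_red = parallel_count(jellybeans)
print(total_red)
```

Threads keep the sketch short, but in standard Python they will not actually speed up a counting job like this one. The point is the shape of the work: each worker touches only its own partition, and only the small partial totals have to be brought together at the end.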
In each approach, the critical component is memory. Lots and lots of memory, and of course software that can work with it. The more data you need to process at once, the more important it is to do the work in memory so you get results quickly. Quick results mean quick decisions – or the time to adjust the analysis and try again.
Jim Davis, SAS's chief marketing officer, believes in-memory computing is on its way to becoming a commodity. And the way memory prices are dropping, it's a distinct possibility.