For thoughtful commentary on all kinds of database and data storage systems, one of the best sources is Curt Monash’s DBMS2 blog. Recently he posted an article called Traditional Databases will eventually wind up in RAM. I have two comments about his points from that article.
I’m still not totally comfortable with Curt’s distinction between “human-generated” and “machine-generated” data. Data from humans always goes through machines, so at some level all data is machine-generated. I think what you’re saying is that the number of humans is roughly constant (on the time scale you mean), and they only have so much time in a day to key in data, etc. But what about trends that create more bits from any particular bit of human activity?
In the old days, records in databases were created when a person “keyed in” some fields. Now, data is generated every time you click on something. As data systems increase in capacity, won’t computers start gathering more and more data for each human interaction? For example, every time I click, the system records what I clicked on, plus such context as the entire contents of my browser screen, how long it has been since my last click, the times of each of my previous 1,000 clicks, everything it currently knows (this keeps changing) about my buying habits, etc.
That may be far-fetched, but I’m not so sure: betting on things staying the size they are today has usually turned out to be less than prescient. In any case, the underlying principle is analogous to the “Freeway Effect”: just as adding lanes attracts more traffic, higher data rates and bigger databases mean there will never be “enough”.
We’ll find more data to transmit and more to store, forever and ever.
In-RAM Database Systems
Having a database “in RAM” can mean more than one thing.
In traditional DBMS design, data “in RAM” is vulnerable to a very common failure mode, namely, the machine crashing. So no database data is considered to be durable (the “D” in “ACID”) until it has been written to disk, which is less vulnerable, especially if you use RAID, etc. So traditionally writes are sent to a log and forced to disk. You can still keep the data itself in RAM, but recovery from the log will take longer and longer as the log grows in size, so you “checkpoint” the data by writing it out to disk. That can be done in the background if everything is designed properly. This is utterly standard.
It’s also traditional that there isn’t enough RAM to hold the whole database, so RAM is used as a cache. This creates some issues when you have to write a modified page back to disk NOT as part of a checkpoint, and there are very standard ways to deal with that.
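One standard way to handle that forced write-back is an LRU buffer pool: a fixed number of page frames in RAM, with dirty pages flushed to disk when they are evicted. The sketch below is a deliberately simplified illustration (the `BufferPool` class is my own, and a plain dict stands in for the disk), not how any particular DBMS does it.

```python
from collections import OrderedDict

class BufferPool:
    """Hypothetical sketch of RAM-as-cache: a fixed number of page
    frames over a backing store. Evicting a modified ("dirty") page
    forces a write back to disk first."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk             # page_id -> data; stands in for the disk file
        self.frames = OrderedDict()  # page_id -> (data, dirty), in LRU order

    def read(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)  # mark as most recently used
            return self.frames[page_id][0]
        self._admit(page_id, self.disk[page_id], dirty=False)
        return self.frames[page_id][0]

    def write(self, page_id, data):
        self._admit(page_id, data, dirty=True)

    def _admit(self, page_id, data, dirty):
        if page_id in self.frames:
            dirty = dirty or self.frames[page_id][1]
        self.frames[page_id] = (data, dirty)
        self.frames.move_to_end(page_id)
        while len(self.frames) > self.capacity:
            victim, (vdata, vdirty) = self.frames.popitem(last=False)
            if vdirty:
                # The forced write-back: this happens on eviction,
                # NOT as part of a checkpoint.
                self.disk[victim] = vdata
```

When the cache is big enough that eviction never happens, you get case (a) below for free; the write-back path simply never runs.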
“In RAM” can mean (a) as above, but the RAM cache is so big that you rarely or never overflow it; (b) the database system is designed so that all data must fit in RAM, which can simplify buffer management and recovery algorithms; (c) you get around the machine-crash problem some way or other and really do keep everything only in RAM.
One way to do (c) is to keep all data in (at least) two copies, such that they’ll never both be down. This requires that the machines (1) have very, very independent failure modes, which is not as easy to arrange as one might think, and (2) get fixed very quickly, since while one is down you have fewer copies. Issue (2) is one reason to keep more than two copies; usually three copies are recommended, with one at a “distant” data center.
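The multi-copy idea can be made concrete with a small sketch. Everything here is hypothetical (the `Replica` and `ReplicatedStore` classes and the all-live-replicas acknowledgment rule are my own simplifications, not any product’s protocol): durability comes from having the write applied in the RAM of several independent machines before it is acknowledged, with no disk involved.

```python
class Replica:
    """One in-RAM copy of the data; the `up` flag models a machine crash."""
    def __init__(self):
        self.data = {}
        self.up = True

    def apply(self, key, value):
        if not self.up:
            raise ConnectionError("replica is down")
        self.data[key] = value

class ReplicatedStore:
    """Hypothetical sketch of approach (c): durability from multiple
    RAM copies rather than from disk. A write counts as durable only
    if enough independent copies applied it."""

    def __init__(self, replicas, min_copies=2):
        self.replicas = replicas
        self.min_copies = min_copies

    def put(self, key, value):
        acks = 0
        for r in self.replicas:
            try:
                r.apply(key, value)
                acks += 1
            except ConnectionError:
                # A down replica must be repaired and re-synced quickly:
                # until then, every write lives in fewer copies.
                pass
        if acks < self.min_copies:
            raise RuntimeError("too few live copies to call this write durable")
        return acks
```

The `min_copies` check is why issue (2) matters: with only two replicas, one crash leaves every new write in a single copy of RAM, which is exactly the exposure the scheme exists to avoid.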
This approach can be used for the log even if not for the whole DBMS. HDFS, the Hadoop Distributed File System, and VoltDB consider this the preferred/canonical way to go. In both cases, some users still feel uncomfortable with approach (c), so both have added ways to commit the log to a conventional disk. The hope is that as approach (c) proves itself in real production environments over the years, it will become more and more accepted.