Apologies for the esoteric post, folks... but this is kind of important... Two folks from Yahoo, plus two folks from UCLA, have just released a paper on the ACM about a new kind of parallel algorithm: Map - Reduce - Merge.
If you don't know about MapReduce, its the algorithm that makes most of Google possible. Its a simple algorithm that allows you to break a complex problem into hundreds of smaller problems, use hundreds of computers to solve it, then stitch the complete solution back together. Google says its excellent for:
"distributed grep, distributed sort, web link-graph reversal, term-vector per host, web access log stats, inverted index construction, document clustering, machine learning, statistical machine translation..."
bla bla bla... but MapReduce can't do joins between relational data sets. In other words, its great for making a search engine, but woefully impractical for virtually every business application known to man... although some MapReduce-based databases are trying anyway (CouchDB, Hadoop, etc.)
UPDATE: Some Hadoop fans mentioned in the comments that MapReduce can do joins in the Map step or the Reduce step... but its highly restrictive on the Map step, and sometimes slow in the Reduce step... joins are possible, but sometimes impractical.
Well... this latest twist from the Yahoo folks fixed that: they claim MapReduceMerge now supports table JOINs. No proof as of yet, but there are a lot of folks staking their reputation on this... so its a fair bet. The Hadoop folks seem to be experimenting with MapReduceMerge... so if they spit out some new insanely fast benchmarks, my guess is that this is for real...
What does this mean for relational database like Oracle? Uncertain... but I did hear a juicy rumor about 15 months back: some guy from Yahoo sat down in a room with Oracle's math PHDs, and spent a day discussing an algorithm for super-fast multidimensional table joins... like sub-second performance on 14-table relational queries, with no upper limit. My sources told me the Oracle dudes were floored, and started making immediate plans to integrate some new stuff into their database. The Yahoo connection made me think this might be the MapReduceMerge concept...
Coincidence? Perhaps... but a juicy rumor nonetheless.