My latest assignment at work is to build a dedicated storage system that can beat the performance of an RDBMS when computing statistics. I know it seems strange, but the first tests show that a dedicated storage system can be something like 5 times faster than an RDBMS, even without optimizing the storage.
This is in line with a recent article from the MIT guys indicating that, for specific applications, a dedicated storage system can be 10 or more times faster than a conventional RDBMS, even a really expensive one. Well, I've known this all along: an RDBMS needs to be sufficiently generic, and if you know the nature of the data and the operations that will be performed on it, you can always make something faster.
The problem with attempting to make something faster is that you have to solve some of the problems an RDBMS has already solved. No, I'm not talking about transaction management, but about how data is stored. You need to structure your storage in a way that lets you update, read and delete efficiently. And believe me: although this might seem simple, it is not!
Now my days are spent managing files in blocks, splitting blocks into slots, building linked lists inside files, metadata, storage formats, etc., etc. And I still have to think about defragmentation techniques...
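To give a flavour of the block-and-slot bookkeeping involved, here is a minimal in-memory sketch of one slotted block. Everything here (the 4 KB block size, the tombstone-on-delete approach, the `next_block` linked-list pointer) is my own illustrative assumption, not necessarily how the real system ends up looking:

```python
import struct

BLOCK_SIZE = 4096  # assumed block size for illustration
# Assumed header layout: next-block pointer, slot count, free-space offset
HEADER_FMT = "<iih"
HEADER_SIZE = struct.calcsize(HEADER_FMT)

class Block:
    """One fixed-size block holding variable-length records in slots.

    Records are packed from the end of the block toward the header; the
    slot directory (offset, length pairs) grows from the header toward
    the records, so the free space sits in the middle.
    """
    SLOT_FMT = "<hh"
    SLOT_SIZE = struct.calcsize(SLOT_FMT)

    def __init__(self):
        self.buf = bytearray(BLOCK_SIZE)
        self.next_block = -1        # file offset of the next block in the chain
        self.slots = []             # (offset, length) per record
        self.free_end = BLOCK_SIZE  # records are packed from this point down

    def free_space(self) -> int:
        used_by_dir = HEADER_SIZE + len(self.slots) * self.SLOT_SIZE
        return self.free_end - used_by_dir

    def insert(self, record: bytes) -> int:
        """Store a record and return its slot id; raise if the block is full."""
        if len(record) + self.SLOT_SIZE > self.free_space():
            raise ValueError("block full")
        self.free_end -= len(record)
        self.buf[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1

    def read(self, slot_id: int) -> bytes:
        off, length = self.slots[slot_id]
        if length == 0:
            raise KeyError("slot was deleted")
        return bytes(self.buf[off:off + length])

    def delete(self, slot_id: int) -> None:
        # Tombstone: keep the slot id stable, reclaim the bytes later
        # during compaction/defragmentation.
        off, _ = self.slots[slot_id]
        self.slots[slot_id] = (off, 0)
```

The tombstone in `delete` is exactly why defragmentation becomes a topic: deleted records leave holes that only a compaction pass can reclaim.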