Novel storage architectures and pointer-free search trees for database systems
MetadataShow full item record
Database systems research is an old and well-established field in computer science. Many of the key concepts appeared as early as the 60s, while the core of relational databases, which have dominated the database world for a while now, was solidified during the 80s. However, the underlying hardware has not displayed such stability in the same period, which means that a lot of assumptions that were made about the hardware by early database systems are not necessarily true for modern computer architectures. In particular, over the last few decades there have been two notable consistent trends in the evolution of computer hardware. The first is that the memory hierarchy of mainstream computer systems has been getting deeper, with its different levels moving away from each other, and new levels being added in between as a result, in particular cache memories. The second is that, when it comes to data transfers between any two adjacent levels of the memory hierarchy, access latencies have not been keeping up with transfer rates. The challenge is therefore to adapt database index structures so that they become immune to these two trends. The latter is addressed by gradually increasing the size of the data transfer unit; the former, by organizing the data so that it exhibits good locality for memory transfers across multiple memory boundaries.We have developed novel structures that facilitate both of these strategies. We started our investigation with the venerable B+-tree, which is the cornerstone order-preserving index of any database system, and we have developed a novel pointer-free tree structure for its pages that optimizes its cache performance and makes it immune to the page size. We then adapted our approach to the R-tree and the GiST, making it applicable to multi-dimensional data indexes as well as generalized indexes for any abstract data type. Finally, we have investigated our structure in the context of main memory alone, and have demonstrated its superiority over the established approaches in that setting too. While our research has its roots in data structures and algorithms theory, we have conducted it with a strong experimental focus, as the complex interactions within the memory hierarchy of a modern computer system can be quite challenging to model and theorize about effectively. Our findings are therefore backed by solid experimental results that verify our hypotheses and prove the superiority of our structures over competing approaches.