• Micro-Specialization: Dynamic Code Specialization in DBMSes

      Zhang, Rui (The University of Arizona, 2012)
      Database management systems (DBMSes) form a cornerstone of modern IT infrastructure, and it is essential that they have excellent performance. In this research, we explore the opportunities for applying dynamic code specialization to DBMSes, focusing in particular on the runtime invariants present in DBMSes during query evaluation. Query evaluation involves extensive references to the relational schema, predicate values, and join types, which are all invariant during query evaluation and are thus subject to dynamic value-based code specialization. We observe that DBMSes are general in the sense that they must contend with arbitrary schemas, queries, and modifications; this generality is implemented using runtime metadata lookups and tests that ensure that control is channelled to the appropriate code in all cases. Unfortunately, these lookups and tests are carried out even when information is available that renders some of these operations superfluous, leading to unnecessary runtime overheads. We introduce micro-specialization, an approach that uses relation- and query-specific information to specialize the DBMS code at runtime and thereby eliminate some of these overheads. We develop a taxonomy of approaches and specialization times and propose a general architecture that isolates most of the creation and execution of the specialized code sequences in a separate DBMS-independent module. We show that this approach requires minimal changes to a DBMS and can simultaneously improve performance across a wide range of queries, modifications, and bulk-loading operations, in terms of storage, CPU usage, and I/O time, on the TPC-H and TPC-C benchmarks. We also discuss an integrated development environment that helps DBMS developers apply micro-specializations to identified target code sequences.
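
To make the idea in the abstract above concrete, the following is a minimal sketch of how a schema that stays fixed during query evaluation lets a per-call metadata walk be replaced by a routine built once. It is an illustration only, not the dissertation's implementation: the schema layout, the type names, and the functions `get_attr_generic` and `specialize_get_attr` are all hypothetical, and Python closures stand in for the specialized code sequences that the abstract describes.

```python
import struct
from typing import Callable, List, Tuple

# Hypothetical schema representation: (attribute name, type name) pairs.
Schema = List[Tuple[str, str]]

_SIZES = {"int32": 4, "float64": 8}      # byte width per type
_CODES = {"int32": "i", "float64": "d"}  # struct format code per type


def get_attr_generic(schema: Schema, row: bytes, index: int):
    """Generic path: consults the schema metadata on every call."""
    offset = 0
    for i, (_, typ) in enumerate(schema):
        if i == index:
            return struct.unpack_from("<" + _CODES[typ], row, offset)[0]
        offset += _SIZES[typ]
    raise IndexError(index)


def specialize_get_attr(schema: Schema, index: int) -> Callable[[bytes], object]:
    """Specialized path: the offset and type are invariant for a given
    relation, so they are resolved once and captured in a closure."""
    offset = sum(_SIZES[typ] for _, typ in schema[:index])
    unpack = struct.Struct("<" + _CODES[schema[index][1]]).unpack_from
    return lambda row: unpack(row, offset)[0]


if __name__ == "__main__":
    schema = [("id", "int32"), ("price", "float64"), ("qty", "int32")]
    row = struct.pack("<idi", 7, 9.5, 3)
    get_price = specialize_get_attr(schema, 1)  # built once per relation
    print(get_attr_generic(schema, row, 1), get_price(row))
```

Both paths return the same value; the specialized one simply skips the loop and type test on each tuple, which is the kind of superfluous lookup the abstract refers to.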
    • Supporting the Procedural Component of Query Languages over Time-Varying Data

      Gao, Dengfeng (The University of Arizona, 2009)
      As everything in the real world changes over time, the ability to model this temporal dimension of the real world is essential to many computer applications. Almost every database application involves the management of temporal data. This applies not only to relational data but also to any data that models the real world, including XML data. Expressing queries on time-varying (relational or XML) data by using a standard query language (SQL or XQuery) is more difficult than writing queries on nontemporal data. In this dissertation, we present minimal valid-time extensions to XQuery and SQL/PSM, focusing on the procedural aspect of the two query languages and efficient evaluation of sequenced queries. For XQuery, we add valid-time support by minimally extending the syntax and semantics of XQuery. We adopt a stratum approach which maps a τXQuery query to a conventional XQuery. The first part of the dissertation focuses on how to perform this mapping, in particular, on mapping sequenced queries, which are by far the most challenging. The critical issue of supporting sequenced queries (in any query language) is time-slicing the input data while retaining period timestamping. Timestamps are distributed throughout an XML document, rather than uniformly in tuples, complicating the temporal slicing while also providing opportunities for optimization. We propose five optimizations of our initial maximally-fragmented time-slicing approach: selected node slicing, copy-based per-expression slicing, in-place per-expression slicing, and idiomatic slicing, each of which reduces the number of constant periods over which the query is evaluated. We also extend a conventional XML query benchmark to effect a temporal XML query benchmark. Experiments on this benchmark show that in-place slicing is the best. We then apply the approaches used in τXQuery to temporal SQL/PSM. The stratum architecture and most of the time-slicing techniques work for temporal SQL/PSM. Empirical comparison is performed by running a variety of temporal queries.
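
The notion of "constant periods" in the abstract above can be illustrated with a short sketch. It assumes half-open integer periods `(begin, end)` and a hypothetical helper `constant_periods`; it shows only the basic idea of cutting the timeline at every timestamp boundary so that no input changes within a slice, and is not the dissertation's slicing algorithm or any of its optimizations.

```python
from typing import List, Tuple

Period = Tuple[int, int]  # half-open valid-time period [begin, end)


def constant_periods(periods: List[Period]) -> List[Period]:
    """Cut the timeline at every begin/end point and keep the slices
    that are covered by at least one input period."""
    points = sorted({p for begin, end in periods for p in (begin, end)})
    slices = []
    for lo, hi in zip(points, points[1:]):
        # A slice is relevant only if some timestamped item is valid in it.
        if any(begin <= lo and hi <= end for begin, end in periods):
            slices.append((lo, hi))
    return slices


if __name__ == "__main__":
    stamps = [(1, 5), (3, 8), (10, 12)]
    for period in constant_periods(stamps):
        print(period)
```

A sequenced query would then, conceptually, be evaluated once per slice, which is why reducing the number of constant periods, as the optimizations named in the abstract aim to do, reduces the evaluation work.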