Abstract
Indexing is a fundamental mechanism for efficient data access. Recently, we proposed the Variable-Aligned Length (VAL) bitmap index encoding framework, which generalizes the commonly used word-aligned compression techniques. VAL presented a variable-aligned compression framework, which allows columns of a bitmap to be compressed using different encoding lengths. This flexibility creates a tunable compression that balances the trade-off between space and query processing time. The variable format of VAL presents several unique opportunities for query optimization.
In this paper we explore multiple algorithms to optimize both point queries and range queries in VAL. In particular, we propose a dynamic encoding-length translation heuristic to process point queries. For range queries, we propose several column orderings based on the bitmap's metadata: largest segment length first (lsf), column size (size), and weighted size (ws). In our empirical study over both real and synthetic data sets, we show that our dynamic translation selection scheme produces query execution times only 3:5% below the optimal. We also found that the weighted size column ordering significantly and consistently out-performs other ordering techniques. Finally, we show that algorithms scale to data sets that are row-ordered.