Workload-Aware Cache Management of Bitmap Indices

Julia Kaeppel; Jason Sawin; David Chiu

doi:10.1145/3632366.3632386

Conference proceeding

Open access

Proceedings of the IEEE/ACM 10th International Conference on Big Data Computing, Applications and Technologies, pp.1-10

ACM Conferences

BDCAT '23: IEEE/ACM 10th International Conference on Big Data Computing, Applications and Technologies

12/04/2023

DOI: https://doi.org/10.1145/3632366.3632386

Information systems

Information systems -- Data management systems

Information systems -- Data management systems -- Database management system engines

Information systems -- Information retrieval

Information systems -- Information storage systems

Theory of computation

Theory of computation -- Theory and algorithms for application domains

Theory of computation -- Theory and algorithms for application domains -- Database theory

Theory of computation -- Theory and algorithms for application domains -- Database theory -- Database query processing and optimization (theory)

Abstract

Big-data management systems must handle multiple concurrent queries over multi-dimensional data sets. To achieve high throughput, such systems can implement various techniques to avoid redundant computations and data fetches. One such approach is to cache a subset of the query results and reuse these results to (partially) fulfill future query requests. This approach can be quite effective for query-at-a-time processing. However, we suspect that even greater performance is being left on the table if queries are only optimized in isolation, and that higher throughput can be extracted through a systematic examination of the relationships between queries in a given workload. This paper describes a framework that captures inter-query relationships to reveal increased opportunities to exploit caching. We present a heuristic for scheduling queries and a novel workload-informed cache replacement policy. When these methods are applied in combination, our system achieves an impressive speedup in the total execution time of batches of queries, using only modest cache sizes. We show that the proposed replacement algorithm easily outperforms the classic FIFO and LRU algorithms. Under certain conditions, our system achieved a roughly 2x to 4x speedup over these traditional replacement schemes.
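To make the contrast between classic and workload-informed replacement concrete, here is a minimal sketch. It is not the paper's algorithm, since the abstract does not specify the policy. As a stand-in for "workload-informed", it evicts the cached item whose next use in the known batch is farthest away (Belady's MIN rule). All function names, the request trace, and the cache capacity are hypothetical, chosen only for illustration.

```python
from collections import OrderedDict, deque


def misses_fifo(requests, capacity):
    """Count cache misses under first-in, first-out eviction."""
    cache, order, misses = set(), deque(), 0
    for key in requests:
        if key in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            cache.discard(order.popleft())  # evict the oldest insertion
        cache.add(key)
        order.append(key)
    return misses


def misses_lru(requests, capacity):
    """Count cache misses under least-recently-used eviction."""
    cache, misses = OrderedDict(), 0
    for key in requests:
        if key in cache:
            cache.move_to_end(key)  # mark as most recently used
            continue
        misses += 1
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict the least recently used
        cache[key] = True
    return misses


def next_use(requests, start, key):
    """Index of the next request for key at or after start, or infinity."""
    for j in range(start, len(requests)):
        if requests[j] == key:
            return j
    return float("inf")


def misses_lookahead(requests, capacity):
    """Count misses when the whole batch is known in advance.

    Assumption: a farthest-next-use rule stands in for a
    workload-informed policy; the paper's actual policy may differ.
    """
    cache, misses = set(), 0
    for i, key in enumerate(requests):
        if key in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            victim = max(cache, key=lambda k: next_use(requests, i + 1, k))
            cache.discard(victim)
        cache.add(key)
    return misses


# Hypothetical trace: each letter is one cached item requested by a batch.
batch = ["a", "b", "c", "a", "b", "d", "a", "b", "c", "d"]
for policy in (misses_fifo, misses_lru, misses_lookahead):
    print(policy.__name__, policy(batch, capacity=3))
```

On this toy trace with a capacity of 3, FIFO incurs 8 misses, LRU 6, and the lookahead rule 5. The gap comes only from knowing the rest of the batch. It says nothing about the speedups reported in the paper.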

Files and links (1)

url

https://doi.org/10.1145/3632366.3632386

Published (Version of record) Open

Details

Title: Workload-Aware Cache Management of Bitmap Indices
Author/Creator: Julia Kaeppel - University of Puget Sound
Jason Sawin - University of St. Thomas - Minnesota
David Chiu - University of Puget Sound
Publication Details: Proceedings of the IEEE/ACM 10th International Conference on Big Data Computing, Applications and Technologies, pp.1-10
Conference: BDCAT '23: IEEE/ACM 10th International Conference on Big Data Computing, Applications and Technologies
Series: ACM Conferences
Publisher: ACM
Academic Unit: Computer & Information Sciences; College of Arts and Sciences
Language: English
Resource Type: Conference proceeding
Record Identifier: 991015184886803691