TY - GEN
T1 - Optimizing Read Performance of HBase through Dynamic Control of Data Block Sizes and KVCache
AU - Chae, Sangeun
AU - Kim, Wonbae
AU - Han, Daegyu
AU - Kim, Jeongmin
AU - Nam, Beomseok
N1 - Publisher Copyright: © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
PY - 2024/4/8
Y1 - 2024/4/8
N2 - LSM-Tree-based key-value stores such as HBase, RocksDB, and Cassandra use a fixed data block size. In this study, we show that using a fixed block size can lead to unnecessary read amplification and cache pollution. To address this issue, we propose a dynamic data block size control method to store small key-values in small data blocks and large key-values in large data blocks to minimize disk I/Os. However, using small data blocks for small key-values can result in performance issues due to increased disk seeks. To mitigate this problem, we implement a two-level cache system, which involves a lower level conventional BlockCache for storing larger, coarse-grained data blocks and an upper level cache, KVCache, for storing smaller, fine-grained key-value pairs. Our experiments show that the dynamic data block size control and fine-grained KVCache help effectively reduce read amplification and improve read performance in HBase.
AB - LSM-Tree-based key-value stores such as HBase, RocksDB, and Cassandra use a fixed data block size. In this study, we show that using a fixed block size can lead to unnecessary read amplification and cache pollution. To address this issue, we propose a dynamic data block size control method to store small key-values in small data blocks and large key-values in large data blocks to minimize disk I/Os. However, using small data blocks for small key-values can result in performance issues due to increased disk seeks. To mitigate this problem, we implement a two-level cache system, which involves a lower level conventional BlockCache for storing larger, coarse-grained data blocks and an upper level cache, KVCache, for storing smaller, fine-grained key-value pairs. Our experiments show that the dynamic data block size control and fine-grained KVCache help effectively reduce read amplification and improve read performance in HBase.
KW - key-value stores
KW - log-structured merge tree
UR - https://www.scopus.com/pages/publications/85197662831
U2 - 10.1145/3605098.3635898
DO - 10.1145/3605098.3635898
M3 - Conference contribution
AN - SCOPUS:85197662831
T3 - Proceedings of the ACM Symposium on Applied Computing
SP - 1495
EP - 1503
BT - 39th Annual ACM Symposium on Applied Computing, SAC 2024
PB - Association for Computing Machinery
T2 - 39th Annual ACM Symposium on Applied Computing, SAC 2024
Y2 - 8 April 2024 through 12 April 2024
ER -