Benchmarking key-value stores via trace replay

Abstract:

Key-value stores are an important component of cloud applications. These NoSQL databases typically do not support ACID transactions; hence, their applications have diverged from traditional database workloads and are not well represented by benchmarks like TPC-C. In this context, YCSB emerged as the de facto benchmark for cloud serving stores, supporting a wide variety of synthetic workloads. However, no publicly available tool exists that can replay real traces against these systems. This is harmful to the community, as the replay of real workloads is an important system evaluation technique. As a result, others have developed ad hoc, closed-source replay tools that use undocumented replay models, thus making their results impossible to replicate. Furthermore, choosing and implementing the right replay model is not trivial. We propose a trace replay model suitable for key-value stores and describe KV-replay, an open-source replayer that implements this model. We show that using synthetic workloads leads to significant evaluation errors: as much as a 33% error in miss rate for small cache sizes and an 18% overestimation of speedup. Evaluations show that KV-replay is accurate, fast, and useful.