Storage-efficient data replica number computation for multi-level priority data in distributed storage systems

Abstract:

Distributed storage systems often use replication to improve availability, performance, and scalability. In this paper, we consider the use of file replication to improve the availability of different classes of files, where some classes are more 'important' than others and therefore receive more replicas. The question we attempt to answer is: given a fixed storage budget for replicas, how many replicas of each file class should be created to maximize the weighted overall availability of files? We present our work towards a replica number computation algorithm that takes into account a storage budget, a configurable maximum expected percentage of failed nodes, and weights for the different file classes. Simulation results show that our algorithm improves the availability of the prioritized files with higher weights, has a low computation time, and utilizes storage space efficiently as the total storage space scales to a large size. © 2013 IEEE.
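To make the problem statement concrete, the following is a minimal sketch of one possible way to choose replica counts under a storage budget. It is not the algorithm from the paper, which the abstract does not describe. It assumes that nodes fail independently with probability equal to the maximum expected fraction of failed nodes, so that a class with r replicas on distinct nodes is available with probability 1 - p^r, and it uses a greedy heuristic that repeatedly adds the replica with the largest weighted availability gain per byte. The function name `replica_counts`, the `max_replicas` cap, and the input layout are all hypothetical.

```python
import heapq


def replica_counts(classes, budget, fail_frac, max_replicas=10):
    """Greedy heuristic (an assumption, not the paper's method).

    classes:   dict mapping class name -> (weight, total_bytes_per_replica)
    budget:    total storage available for all replicas, in bytes
    fail_frac: assumed probability p that any single node has failed
    Returns (counts, used_bytes).
    """
    # Every class starts with one copy; fail early if even that does not fit.
    counts = {name: 1 for name in classes}
    used = sum(size for _, size in classes.values())
    if used > budget:
        raise ValueError("budget cannot hold one copy of every class")

    def gain_per_byte(name):
        # Going from r to r+1 replicas raises availability by p^r * (1 - p).
        weight, size = classes[name]
        r = counts[name]
        return weight * (fail_frac ** r) * (1 - fail_frac) / size

    # Max-heap on marginal gain per byte (negated for heapq's min-heap).
    heap = [(-gain_per_byte(name), name) for name in classes]
    heapq.heapify(heap)

    while heap:
        _, name = heapq.heappop(heap)
        _, size = classes[name]
        # A class that no longer fits, or hit the cap, is dropped for good.
        if used + size > budget or counts[name] >= max_replicas:
            continue
        counts[name] += 1
        used += size
        heapq.heappush(heap, (-gain_per_byte(name), name))

    return counts, used
```

For example, under these assumptions, `replica_counts({"critical": (4.0, 10), "normal": (1.0, 10)}, budget=50, fail_frac=0.1)` returns `({"critical": 3, "normal": 2}, 50)`: the higher-weight class receives the extra replica. Because the problem resembles a knapsack problem, a greedy pass of this kind is fast but not guaranteed to be optimal.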