
Failures of all kinds are common in today's data centers. As
data scales up, guaranteeing its availability becomes more
complex, and different availability levels may be required per
application or per data item. Skute is a self-managed key-value
store that dynamically allocates the resources of a data cloud
to several applications in a cost-efficient and fair way. Our
approach offers and dynamically maintains multiple
differentiated availability guarantees for each application
despite failures. We employ a virtual economy in which each data
partition acts as an individual optimizer and chooses whether
to migrate, replicate or remove itself so as to maximize its net
benefit, that is, the utility offered by the partition minus its
storage and maintenance cost. Comprehensive experimental
evaluations suggest that our solution is highly scalable and
adaptive to query rate variations and to resource upgrades and
failures.
Skute has the following properties:
- it provides geographical replication of data
- it ensures high availability of data by maximizing the geographical diversity of replicas. Hence, the probability that two replicas of the same data partition are hosted on two servers in the same rack is minimal
- it handles load peaks and flash crowds by replicating popular data partitions close to the users
- thanks to our economic model, the load is balanced among the servers in the cloud, resulting in better overall resource usage
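The geographical-diversity property can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that every server carries a hierarchical location label (continent, country, data center, rack, server) and that the diversity between two servers is the number of label levels below their common prefix. The `diversity` and `pick_replica_host` functions and this scoring rule are hypothetical and are not taken from Skute's implementation.

```python
from typing import Sequence, Tuple

# Assumed location label: (continent, country, datacenter, rack, server).
Location = Tuple[str, ...]


def diversity(a: Location, b: Location) -> int:
    """Count the label levels below the longest common prefix of a and b."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return len(a) - common


def pick_replica_host(existing: Sequence[Location],
                      candidates: Sequence[Location]) -> Location:
    """Pick the candidate whose closest existing replica is farthest away."""
    def worst_case(candidate: Location) -> int:
        # With no replica yet, every candidate counts as maximally diverse.
        return min((diversity(candidate, e) for e in existing),
                   default=len(candidate))
    return max(candidates, key=worst_case)


replicas = [("EU", "CH", "dc1", "rack1", "s1")]
servers = [
    ("EU", "CH", "dc1", "rack1", "s2"),  # same rack: diversity 1
    ("EU", "CH", "dc1", "rack2", "s3"),  # same data center: diversity 2
    ("US", "CA", "dc9", "rack1", "s7"),  # other continent: diversity 5
]
print(pick_replica_host(replicas, servers))
```

In this example the server on the other continent is selected, and a host in the same rack as an existing replica is chosen only when no more distant candidate is offered.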
Figure 1 - Virtual rings: three applications with different availability levels.
Our approach combines the following innovative characteristics:
- it enables a computational economy for cloud storage resources.
- it provides differentiated statistical availability guarantees to different applications despite failures, by geographically diversifying replicas.
- it applies a distributed economic model for the cost-efficient self-organization of data replicas in cloud storage, adapting to the addition of new storage, to node failures and to client locations.
- it utilizes cloud resources efficiently and fairly by balancing load in the cloud adaptively to the query load.
Optimal replica placement is based on distributed net benefit maximization, where the net benefit is the query response throughput minus the storage and communication costs, under the availability constraints. The optimality of the approach is validated by comparing simulation results to those expected from numerically solving an analytical form of the global optimization problem. In addition, a game-theoretic model is employed to observe the properties of the approach at equilibrium. A series of simulation experiments confirms the aforementioned characteristics of the approach. Finally, using a fully working prototype of Skute, we experimentally demonstrate its applicability in real settings.
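To make the per-partition decision more concrete, here is a minimal Python sketch of how a single replica might choose an action in one decision epoch. It is only an illustration under stated assumptions: the `EpochState` fields, the break-even rule for migration and the `new_replica_cost` threshold for replication are hypothetical and do not reproduce the exact rules or formulas used by Skute.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    STAY = "stay"
    REPLICATE = "replicate"
    MIGRATE = "migrate"
    REMOVE = "remove"


@dataclass
class EpochState:
    """Inputs one replica sees in one decision epoch (all names are assumed)."""
    utility: float                  # value of the queries it answered
    storage_cost: float             # rent paid to the hosting server
    communication_cost: float       # cost of serving and syncing traffic
    availability: float             # estimated availability of the partition
    required_availability: float    # level promised to the application
    availability_without_me: float  # estimate if this replica disappeared
    cheapest_rent_elsewhere: float  # lowest rent offered by another server
    new_replica_cost: float         # assumed cost of creating one more copy


def net_benefit(s: EpochState) -> float:
    """Utility minus storage and communication costs."""
    return s.utility - s.storage_cost - s.communication_cost


def decide(s: EpochState) -> Action:
    # The availability constraint comes first: repair before optimizing.
    if s.availability < s.required_availability:
        return Action.REPLICATE
    net = net_benefit(s)
    if net < 0:
        # Losing money: migrate if a cheaper host would make the replica
        # break even, otherwise remove it if the constraint still holds.
        savings = s.storage_cost - s.cheapest_rent_elsewhere
        if savings >= -net:
            return Action.MIGRATE
        if s.availability_without_me >= s.required_availability:
            return Action.REMOVE
        return Action.STAY
    # Earning enough to pay for another copy: spread the load.
    if net >= s.new_replica_cost:
        return Action.REPLICATE
    return Action.STAY
```

Every replica evaluates `decide` locally on its own state, which is the sense in which the optimization is distributed: no central component is needed to arrive at a placement.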
Virtual Ring
Our approach employs the concept of multiple virtual rings on a single cloud in an innovative way. As explained below, this allows multiple applications to share the same cloud infrastructure while offering differentiated availability guarantees per data item and per application, without performance conflicts. Each application uses its own virtual rings, with one ring per availability level, as depicted in Figure 1. Each virtual ring consists of multiple virtual nodes that are responsible for different data partitions of the same application that demand a specific availability level. This approach provides the following advantages over existing key-value stores:
- Multiple data availability levels per application. Within the same application, some data may be crucial and some less important, so an application provider may want to store data with different availability guarantees. Unlike existing approaches, Skute allows fine-grained control of the resources of each server, as every virtual node of each virtual ring acts as an individual optimizer, thus minimizing interference among applications.
- Geographical data placement per application. Data that is mostly accessed from a given geographical region should be moved close to that region. Without the concept of virtual rings, if multiple applications were using the same data store, data of different applications would have to be stored in the same partition, which would remove the ability to move data close to the clients. By employing multiple virtual rings, Skute provides one virtual store per application, allowing the geographical optimization of data placement.
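The idea of one ring per application and availability level can be sketched in Python as follows. This is a simplified illustration, not Skute's or Project Voldemort's actual code: it assumes Dynamo-style consistent hashing within each ring, and the `VirtualRing` and `RingDirectory` classes, the `token` function, the hash function and the node names are all hypothetical.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def token(text: str) -> int:
    """Map a string to a ring position (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class VirtualRing:
    """One ring for one (application, availability level) pair."""

    def __init__(self, app: str, level: str, virtual_nodes: List[str]) -> None:
        if not virtual_nodes:
            raise ValueError("a ring needs at least one virtual node")
        self.app = app
        self.level = level
        points = sorted((token(f"{app}/{level}/{n}"), n) for n in virtual_nodes)
        self._tokens = [t for t, _ in points]
        self._nodes = [n for _, n in points]

    def lookup(self, key: str) -> str:
        """Return the virtual node responsible for key (clockwise successor)."""
        i = bisect.bisect_right(self._tokens, token(key)) % len(self._tokens)
        return self._nodes[i]


class RingDirectory:
    """Routes a request to the ring of its application and availability level."""

    def __init__(self) -> None:
        self._rings: Dict[Tuple[str, str], VirtualRing] = {}

    def add(self, ring: VirtualRing) -> None:
        self._rings[(ring.app, ring.level)] = ring

    def route(self, app: str, level: str, key: str) -> str:
        return self._rings[(app, level)].lookup(key)


directory = RingDirectory()
directory.add(VirtualRing("shop", "high", ["shop-h-0", "shop-h-1", "shop-h-2"]))
directory.add(VirtualRing("shop", "low", ["shop-l-0", "shop-l-1"]))
directory.add(VirtualRing("wiki", "high", ["wiki-h-0", "wiki-h-1"]))
print(directory.route("shop", "high", "order:42"))
```

Because each ring holds only the partitions of one application at one availability level, the virtual nodes of a ring can be placed and moved independently of every other ring, which is what makes per-application geographical placement possible.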
We have implemented a fully working prototype of Skute on top of Project Voldemort, an open-source implementation of Amazon Dynamo written in Java. Servers are not synchronized and no centralized component is required.
Publications
A self-organized, fault-tolerant and scalable replication scheme for cloud storage
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aberer
In ACM Symposium on Cloud Computing 2010 (SOCC2010), June 10-11, 2010, Indianapolis, USA

Cost-efficient and Differentiated Data Availability Guarantees in Data Clouds
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aberer
In 26th IEEE International Conference on Data Engineering (ICDE2010), March 1-6, 2010, Long Beach, California, USA

Dynamic Cost-Efficient Replication in Data Clouds
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aberer
In Automated Control for Datacenters and Clouds (ACDC09), June 15-19, 2009, Barcelona, Spain

© 2007-2011 Nicolas Bonvin | Last Modified 2011-05-09 16:57:50