RoVEr: Robust and Verifiable Erasure Code for Hadoop Distributed File Systems

Document Type

Conference Proceeding

Publication Date



Erasure Coding based Storage (ECS) is replacing traditional replica-based systems because of its low storage overhead. In an ECS, however, every task needs to fetch remote pieces of data for its execution, and data verification is missing from the current framework. As security incidents on big data platforms continue to occur, a compromised node in a computing cluster may manipulate the data it hosts before serving it to other nodes, yielding misleading results. Without replicas, it is challenging to verify data integrity efficiently in an ECS. In this paper, we develop ROVER, an efficient and verifiable ECS for big data platforms. In ROVER, every piece of data is monitored through its checksums, which are stored on a set of witnesses. Each witness uses a Bloom filter to keep records of the checksums efficiently. Data verification is based on majority voting. ROVER also supports quick reconstruction of a Bloom filter when a node recovers from a failure. We present a complete system framework, a security analysis, and guidelines for setting the parameters. The implementation and evaluation show that ROVER is robust and efficient against attacks from compromised nodes.
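The verification idea summarized in the abstract (checksums recorded in Bloom filters on several witnesses, then a majority vote) can be illustrated with a minimal sketch. This is not ROVER's implementation: the names `BloomFilter`, `Witness`, `record_key`, and `verify_by_majority`, the use of SHA-256, and the filter size and hash count are all assumptions chosen for illustration, and the paper's actual parameters and protocol may differ.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; the size and hash count are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def record_key(block_id: str, data: bytes) -> bytes:
    """Hypothetical record format: block identifier plus checksum of the data."""
    checksum = hashlib.sha256(data).hexdigest()
    return f"{block_id}:{checksum}".encode()


class Witness:
    """A node that remembers checksums of data blocks in a Bloom filter."""

    def __init__(self):
        self.filter = BloomFilter()

    def record(self, block_id: str, data: bytes) -> None:
        self.filter.add(record_key(block_id, data))

    def attest(self, block_id: str, data: bytes) -> bool:
        # True if this witness has (probably) seen this exact checksum.
        return record_key(block_id, data) in self.filter


def verify_by_majority(witnesses, block_id: str, data: bytes) -> bool:
    """Accept the data only if a strict majority of witnesses attest to it."""
    votes = sum(w.attest(block_id, data) for w in witnesses)
    return votes > len(witnesses) // 2


if __name__ == "__main__":
    witnesses = [Witness() for _ in range(5)]
    for w in witnesses:
        w.record("block-7", b"original payload")
    print(verify_by_majority(witnesses, "block-7", b"original payload"))
    print(verify_by_majority(witnesses, "block-7", b"tampered payload"))
```

A Bloom filter never produces false negatives, so honestly recorded data is always attested by an honest witness. It can produce false positives, so in a design of this kind the filter size and the number of witnesses would need to be chosen so that a tampered block is very unlikely to collect a majority of votes.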


