Abstract
With data volumes growing rapidly and storage cost a persistent concern, shingled magnetic recording (SMR) drives have been developed to provide low-cost, high-capacity storage by increasing the areal density of hard disk drives. At the same time, data deduplication techniques are becoming popular in data-centric applications because they reduce the amount of data that needs to be stored by eliminating duplicate data chunks. However, directly applying deduplication techniques to SMR drives can significantly degrade the runtime performance of the deduplication system, because the sequential write constraint of SMR drives makes space reclamation time-consuming. This paper proposes an SMR-aware deduplication scheme that takes the sequential write constraint into account to improve the runtime performance of SMR-based deduplication systems. Moreover, to bridge the information gap between the deduplication system and the SMR drive, the lifetime information of data chunks is extracted so that chunks with different lifetimes are placed in different regions of the SMR drive, further reducing the SMR space reclamation overhead. A series of experiments was conducted with a set of realistic deduplication workloads. The results show that the proposed scheme can significantly improve the runtime performance of the SMR-based deduplication system with limited system overhead.
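The benefit of separating chunks by lifetime can be illustrated with a toy model. The sketch below is not the paper's algorithm: the abstract does not say how lifetimes are extracted, so each chunk simply carries a hypothetical `expiry` value, and `ZONE_CAPACITY`, `place`, and `reclamation_cost` are illustrative names. Zones are append-only to mimic the sequential write constraint, and reclaiming a zone that holds any expired chunk is assumed to require copying its live chunks elsewhere first.

```python
ZONE_CAPACITY = 4  # chunks per zone (hypothetical, kept tiny for illustration)


class Zone:
    def __init__(self):
        self.chunks = []  # append-only: models the sequential write constraint

    def is_full(self):
        return len(self.chunks) >= ZONE_CAPACITY


def place(chunks, key):
    """Append (fingerprint, expiry) chunks to zones.

    Chunks whose key(expiry) values match share the same open zone.
    """
    open_zone = {}
    zones = []
    for fp, expiry in chunks:
        k = key(expiry)
        zone = open_zone.get(k)
        if zone is None or zone.is_full():
            zone = Zone()
            zones.append(zone)
            open_zone[k] = zone
        zone.chunks.append((fp, expiry))
    return zones


def reclamation_cost(zones, now):
    """Count live chunks that must be copied out of zones containing garbage."""
    cost = 0
    for zone in zones:
        live = [c for c in zone.chunks if c[1] > now]
        if len(live) < len(zone.chunks):  # at least one expired chunk
            cost += len(live)
    return cost


# Eight unique chunks alternating between short (expiry 1) and long (expiry 10) lifetimes.
chunks = [(f"fp{i}", 1 if i % 2 == 0 else 10) for i in range(8)]

mixed = place(chunks, key=lambda expiry: 0)           # lifetime-oblivious placement
separated = place(chunks, key=lambda expiry: expiry)  # one open zone per lifetime class

mixed_cost = reclamation_cost(mixed, now=5)
separated_cost = reclamation_cost(separated, now=5)
```

In this toy workload the lifetime-oblivious layout leaves live chunks stranded next to expired ones in every zone, whereas the separated layout lets the zone of short-lived chunks be reset without copying anything.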
Original language | English |
---|---|
Journal | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems |
State | Accepted/In press - 2021 |
Keywords
- Costs
- Data Deduplication
- Data mining
- Degradation
- Drives
- Garbage Collection
- Hardware/Software Codesign
- Performance evaluation
- Quality of service
- Runtime
- Runtime Performance
- Shingled Magnetic Recording (SMR)
- Vertical Integration