Unistore: A Unified Storage Architecture for Cloud Computing
Progress and Proposal

Presenter: Wei Xie
Project members: Wei Xie, Jiang Zhou, and Yong Chen
Data-Intensive Scalable Computing Laboratory (DISCL), Computer Science Department, Texas Tech University

We are grateful to Nimboxx and the Cloud and Autonomic Computing site at Texas Tech University for their valuable support of this project.

Unistore Overview

- Goal: build a unified storage architecture (Unistore) for cloud storage systems with the co-existence and efficient integration of heterogeneous HDDs and SCM (Storage Class Memory) devices
- Provide scalable, high-performance storage for virtual machines
- Design goals: scalability, manageability, elasticity, performance, data reliability, fault tolerance, energy efficiency, and low cost
- Components: Data Characterization, Data Placement (with a heterogeneous placement algorithm and a heterogeneous replication algorithm), VM Provisioning, Deduplication, and Data Migration/Cache

Project Highlights

- Two papers published:
  - SUORA, published at IEEE NAS'16
  - Hierarchical Consistent Hashing, published at IEEE ISPA'16
- One paper submitted:
  - Strategy Consistent Hashing, submitted to PPoPP'17
- Ongoing work and new proposal:
  - Elasticity in decentralized storage
  - Deduplication of SSD caches in cloud storage

Consistent Hashing for Heterogeneous Storage

Strategy CH:
- Adopt a single unified hash ring of virtual nodes (A1/A2, B1/B2, C1/C2 in the slide's diagram) to manage heterogeneous nodes
- Maintain the attributes of each node, such as capacity (c1, c2, c3) and bandwidth (b1, b2, b3)
- Use a data placement strategy to map data to nodes: a location strategy, a uniform strategy, or a performance strategy

[Figure: Strategy CH ring of virtual nodes, annotated with each physical node's capacity and bandwidth]

A minimal sketch of this idea appears below.

Hierarchical CH (HiCH):
- When new data objects arrive, the first choice is to place them on the SSD ring; a MetaTree records the objects stored on SSDs so that reads can be served from them
- Data moves between the SSD and HDD rings: objects are demoted to the HDD ring when the SSD ring is full and promoted back when they become hot

[Figure: Diagram of the HiCH algorithm, showing an SSD ring, an HDD ring, and the MetaTree of objects on SSDs]

A sketch of the two-ring placement follows the Strategy CH sketch.
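The slides name three placement strategies but do not spell out their rules. The Python sketch below is one plausible reading, not the paper's algorithm: it assumes "location" means the plain clockwise successor, "uniform" means evening out utilization across heterogeneous capacities, and "performance" means preferring the highest-bandwidth candidate. The class name StrategyRing and every parameter here are illustrative.

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    """Map a string onto the ring's key space."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class StrategyRing:
    """A single hash ring of virtual nodes; each physical node keeps attributes."""

    def __init__(self, vnodes_per_node: int = 4):
        self.vnodes_per_node = vnodes_per_node
        self.attrs = {}   # node -> {"capacity": ..., "bandwidth": ...}
        self.used = {}    # node -> bytes placed so far (for the uniform strategy)
        self.ring = []    # sorted list of (hash, node)

    def add_node(self, node: str, capacity: float, bandwidth: float) -> None:
        self.attrs[node] = {"capacity": capacity, "bandwidth": bandwidth}
        self.used[node] = 0
        for i in range(self.vnodes_per_node):
            bisect.insort(self.ring, (ring_hash(f"{node}#{i}"), node))

    def _candidates(self, key: str, k: int) -> list:
        """Collect the first k distinct nodes clockwise of the key."""
        start = bisect.bisect(self.ring, (ring_hash(key), ""))
        out = []
        for j in range(len(self.ring)):
            node = self.ring[(start + j) % len(self.ring)][1]
            if node not in out:
                out.append(node)
            if len(out) == k:
                break
        return out

    def place(self, key: str, size: int, strategy: str = "location") -> str:
        cands = self._candidates(key, k=3)
        if strategy == "location":       # classic CH: nearest clockwise successor
            node = cands[0]
        elif strategy == "uniform":      # keep utilization even across capacities
            node = min(cands, key=lambda n: self.used[n] / self.attrs[n]["capacity"])
        elif strategy == "performance":  # prefer the fastest candidate (e.g., SCM/SSD)
            node = max(cands, key=lambda n: self.attrs[n]["bandwidth"])
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        self.used[node] += size
        return node

if __name__ == "__main__":
    ring = StrategyRing()
    for name, cap, bw in [("A", 4e12, 0.2), ("B", 4e12, 0.2), ("C", 5e11, 2.0)]:
        ring.add_node(name, capacity=cap, bandwidth=bw)
    print(ring.place("vm-image-42", size=4096, strategy="performance"))  # -> "C"
```

The point of keeping all heterogeneous nodes on one ring, rather than one ring per device type, is that the final choice can be steered by per-node attributes while the ring itself stays uniform.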
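In the same hedged spirit, here is a toy rendering of HiCH's two-ring placement. The HiCH class, the crude fullness counter, and the dict standing in for the MetaTree are assumptions made for brevity; the actual algorithm is in the ISPA'16 paper.

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Minimal consistent-hash ring (virtual nodes omitted for brevity)."""
    def __init__(self, nodes):
        self.points = sorted((ring_hash(n), n) for n in nodes)

    def locate(self, key: str) -> str:
        i = bisect.bisect(self.points, (ring_hash(key), ""))
        return self.points[i % len(self.points)][1]

class HiCH:
    """Two-tier placement: new objects go to the SSD ring first; a plain
    dict stands in for the MetaTree that tracks which tier holds each object."""

    def __init__(self, ssd_nodes, hdd_nodes, ssd_limit: int):
        self.ssd = Ring(ssd_nodes)
        self.hdd = Ring(hdd_nodes)
        self.ssd_limit = ssd_limit   # crude stand-in for "the SSD ring is full"
        self.tier = {}               # object id -> "ssd" | "hdd"

    def _ssd_count(self) -> int:
        return sum(1 for t in self.tier.values() if t == "ssd")

    def put(self, oid: str) -> str:
        # First choice is the SSD ring; spill to the HDD ring when full.
        self.tier[oid] = "ssd" if self._ssd_count() < self.ssd_limit else "hdd"
        return self.locate(oid)

    def locate(self, oid: str) -> str:
        ring = self.ssd if self.tier[oid] == "ssd" else self.hdd
        return ring.locate(oid)

    def on_hot(self, oid: str) -> None:
        """Promote a hot object from the HDD ring back to the SSD ring."""
        if self.tier.get(oid) == "hdd" and self._ssd_count() < self.ssd_limit:
            self.tier[oid] = "ssd"
```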
Elasticity in Scale-out Storage

- Size-up and size-down in elastic storage:
  - Size up to meet I/O demand
  - Size down to save energy or to free up resources for other uses
- Agility to scale is critical to satisfying these goals

[Figure: a 4-hour window of the Facebook trace [1], illustrating how quickly I/O demand varies]

[1] L. Xu et al., "SpringFS: Bridging Agility and Performance in Elastic Distributed Storage", FAST'14

Consistent Hashing based Storage

- Consistent hashing:
  - Designed for decentralized distributed systems
  - Allows sizing up and down without completely changing the data layout
  - However, sizing down multiple nodes requires migrating their data first

Elasticity in CH based Storage

- Challenges to improving agility:
  - Scaling up: requires data movement to redistribute data; yet if a node was turned off (not failed), there is no need to move data when it is turned on again
  - Scaling down: keep data available while reducing data movement
  - Data consistency and failure handling: the redundancy level may decrease when scaling down, and data replicas must be kept consistent across multiple nodes

- Our solution:
  - Primary and secondary nodes: the primary nodes keep one copy of the whole data set, while the secondary nodes keep the replicas
  - Size-down: shut down only secondary nodes; re-replicate to ensure data availability; track modified data so that the data on shut-down nodes can be updated when they are turned back on
  - Size-up: consistent hashing sizes up automatically; migrate updated data to the new nodes, but there is no need to transfer existing data
  - Data consistency: data must be kept consistent while modified data is migrated, which motivates version consistent hashing

[Figure: a consistent-hashing ring of nodes 1 through 10, divided into primary and secondary nodes, with an object D1 placed on it]

Version Consistent Hashing Scheme

- Build versions into the virtual nodes
- Avoid data migration when adding nodes or when a node fails
- Maintain efficient data lookup

Data lookup algorithm, by example: if object D1 maps to nodes {1, 2} under version v1, {4, 1} under v2, and {4, 6} under v3 and v4, the lookup probes the versions from newest to oldest, giving the ordered location set {4, 6, 1, 2}.

[Figure: ring states for D1 across versions (Version 1 committed, Version 2 uncommitted, Version 3 uncommitted, Version 3 committed), each serving updates and lookups]

[Figure: performance improvement, plotting average lookup amplification on a log scale against the number of nodes (0 to 2000, half of them added) for plain Consistent Hashing, Commit CH, and Version CH]

A sketch of the versioned lookup follows.
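A minimal sketch, assuming each membership change simply appends a new ring version and that every version keeps k = 2 replica locations, which matches the slide's example; the names VersionCH, change_membership, and lookup are illustrative, not the paper's API.

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def make_ring(nodes):
    return sorted((ring_hash(str(n)), n) for n in nodes)

def successors(ring, key: str, k: int = 2):
    """The k distinct nodes clockwise of the key (primary plus replica)."""
    i = bisect.bisect(ring, (ring_hash(key), ""))
    out = []
    for j in range(len(ring)):
        node = ring[(i + j) % len(ring)][1]
        if node not in out:
            out.append(node)
        if len(out) == k:
            break
    return out

class VersionCH:
    """Membership changes create a new ring version instead of migrating
    data; a lookup probes the versions from newest to oldest."""

    def __init__(self, nodes):
        self.versions = [make_ring(nodes)]          # version 1

    def change_membership(self, nodes) -> None:
        self.versions.append(make_ring(nodes))      # commit a new version

    def lookup(self, key: str, k: int = 2):
        locations = []
        for ring in reversed(self.versions):        # newest version first
            for node in successors(ring, key, k):
                if node not in locations:
                    locations.append(node)
        return locations

# With the slide's placements for D1 (v1: {1, 2}, v2: {4, 1}, v3 and v4: {4, 6}),
# probing newest-first yields the ordered locations [4, 6, 1, 2].
```

The trade-off this makes explicit is lookup amplification: avoiding migration means a read may probe locations from several versions, which is exactly what the deck's lookup-amplification plot measures.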
On-going/Future Work

- Implement and evaluate elastic consistent hashing in Sheepdog, deployed on the DISCL cluster at TTU
- Use the Microsoft enterprise I/O trace to compare the agility of resizing against existing techniques
- Prepare a submission for the IPDPS'17 conference

Proposal of New Project

- Deduplication is critical, especially in storage systems with SSD caches, due to SSDs' limited endurance and capacity
- Current studies focus on single-machine deduplication, but redundancy can occur across machines
- We propose distributed deduplication of SSD caches in the cloud storage environment

[Figure: a typical scenario in cloud-scale data centers: virtual machine guests running on hosts with local SSD caches, backed by a de-duplicated storage array over a SAN]

Motivation of the Study

- VMs across hosts may share the same operating system and data set
- The SSD cache is usually host-local, so it is challenging to de-duplicate across the local storage of multiple nodes (how can data on a remote host be referenced?)

Proposed Solution

- Metadata (fingerprints) is distributed using memcached
- Data itself is not de-duplicated across hosts: if duplicate data is inserted, a reference to the remote host's copy is created
- If duplicate data is written back to primary storage, only a reference is created on primary storage
- Consider a global LRU instead of a local LRU: data recently used on one host may be reused on another host

  Fingerprint table        LRU table
  addr  fingerprint        host  fingerprint
  1     0ae31fq            1     0ae31fq
                           2     13ews19

A minimal sketch of this design appears at the end of this transcript.

Summary

- Elasticity is an important feature in cloud computing and is increasingly considered in scale-out storage systems
- We propose techniques that allow agile resizing with minimal performance degradation
- These techniques can help data centers optimize resource utilization and/or reduce power consumption
- We will continue investigating workload characterization based on statistical techniques
- We have proposed a de-duplication system for cloud storage as a new project, and we are seeking sponsorship either to continue the Unistore project or to start the new one

Thank You

Please visit: http://cac.ttu.edu/, http://discl.cs.ttu.edu/

Acknowledgement: The CAC@TTU is funded by the National Science Foundation under grants IIP-1362134 and IIP-1238338.

Please take a moment to fill out your L.I.F.E. forms at http://www.iucrc.com: select "Cloud and Autonomic Computing Center", then select the "IAB" role. What do you like about this project? What would you change? (Please include all relevant feedback.)
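Finally, returning to the proposed de-duplicated SSD cache above: the sketch below models the shared fingerprint table and the global LRU, with class-level dicts standing in for the memcached-backed tables in the slides. Every name in it (DedupSSDCache, insert, _evict) is hypothetical rather than part of any released system.

```python
import hashlib
import itertools

def fingerprint(block: bytes) -> str:
    return hashlib.sha1(block).hexdigest()

class DedupSSDCache:
    """One instance per host. The class-level dicts stand in for the
    memcached-distributed fingerprint table and global LRU table."""

    fingerprints = {}            # fingerprint -> (host, addr), shared by all hosts
    global_lru = {}              # fingerprint -> last-use tick, shared by all hosts
    clock = itertools.count()    # global logical clock for recency

    def __init__(self, host: str, capacity: int):
        self.host = host
        self.capacity = capacity
        self.blocks = {}         # addr -> fingerprint (this host's SSD contents)
        self.next_addr = itertools.count()

    def insert(self, block: bytes):
        fp = fingerprint(block)
        DedupSSDCache.global_lru[fp] = next(DedupSSDCache.clock)
        owner = DedupSSDCache.fingerprints.get(fp)
        if owner is not None:
            # Duplicate: create a reference to the existing copy, which may
            # live on a remote host; the data itself is not copied again.
            return ("ref", owner)
        if len(self.blocks) >= self.capacity:
            self._evict()
        addr = next(self.next_addr)
        self.blocks[addr] = fp
        DedupSSDCache.fingerprints[fp] = (self.host, addr)
        return ("stored", (self.host, addr))

    def _evict(self) -> None:
        # Global LRU: evict this host's block that is coldest cluster-wide,
        # since a block recently used on another host may soon be reused there.
        victim_addr, victim_fp = min(
            self.blocks.items(),
            key=lambda kv: DedupSSDCache.global_lru.get(kv[1], -1))
        del self.blocks[victim_addr]
        del DedupSSDCache.fingerprints[victim_fp]
```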