SnapScale Architecture: Blueprint to Success
Abstract
SnapScale™, the industry's simplest and largest enterprise scale-out storage, solves the problems of traditional storage by accommodating rapid and unpredictable data growth without adding management complexity. Built on a high-performance scale-out file system architecture, SnapScale allows granular scaling to hundreds of petabytes. Beyond performance and capacity scaling, SnapScale is inherently designed for very high data integrity and data protection. For the hyperscale data repositories of today and tomorrow, SnapScale is built on peer set protection technology that solves key scalability issues plaguing the traditional RAID and replication approach, by changing the resiliency paradigm and using an extremely efficient data distribution mechanism. SnapScale's parallel-stream continuous replication engine ensures line-speed data transfer between data repositories, drastically reducing backup windows. Additional features include built-in load balancing, tunable replication schemes, distributed campus storage clusters, and efficient use of both computing power and network bandwidth. Finally, SnapScale uses centralized management for a distributed IT environment, allowing the customer to manage the central SnapScale data repository along with branch-office SnapServers and SnapSANs from the same management utility, called SnapManager.
Table of Contents
Abstract
The Challenges Enterprises Face with Traditional Storage
The Solution: Scale-Out
The Simplest & Largest Scale-out: SnapScale
Who should use SnapScale?
What is SnapScale?
High Fault Tolerance
Scale-Up
Scale-Out
Performance Optimization
Campus Cluster
Immortal Storage Infrastructure
SnapScale Hardware Overview
Data Migration
Remote Monitoring
Conclusion
The Challenges Enterprises Face with Traditional Storage
Traditional NAS storage requires maintaining different filers for different lines of business, with clients mounted to the filer that belongs to their line of business. Many clients and servers mounted to different volumes means varying levels of utilization, so customers have to juggle and manually move data between volumes to keep data balanced and within rigid volume limits. The result is the added complexity of constantly creating and moving volumes and data mount points.
Even as traditional NAS storage vendors make claims about scale-out storage, their volumes are limited to 100TB. So even as they roll out new releases, the architectural gaps remain. The customer has to choose between: A. Productivity, B. Storage Utilization, and C. Business Growth.
Challenge: Traditional NAS leads to data jockeying.

The Solution: Scale-Out
A more scalable solution grows to multiple petabytes and tens of billions of files and objects in a single 'volume'. This effectively reduces operational cost and complexity for data-at-scale management challenges.

The Simplest & Largest Scale-out: SnapScale
There are a few defining characteristics of a scale-out storage system:
1. Distributed Storage System
   a. Single global namespace, so there is a single root data mount point that can grow to infinity (almost).
   b. Data striped across nodes, so the compute and spindle performance of multiple nodes and disks can be used for reads and writes (see the sketch after this list).
   c. High aggregate performance, so that as more clients connect to the storage cluster, the performance of the storage cluster continues to increase. This enables the storage infrastructure to scale to a large number of clients.
2. Granular Growth
   a. Start with partially populated nodes. X2 nodes need a minimum of 4 disks per node, so a scale-out storage cluster can be started with just 12 drives.
   b. Grow capacity or performance by simply adding drives or nodes to the existing storage infrastructure.
   c. Add drives or nodes non-disruptively, i.e. no downtime needs to be scheduled for unplanned expansions. There are no additional installation or manageability requirements for new nodes added to the storage cluster, as they learn their configuration from the cluster. X2 and X4 nodes can be mixed in the same storage pool.
3. Keep Growing
   a. Data and spares are auto-balanced during expansions or hardware failures.
   b. Performance is distributed among the various nodes and drives to maximize the performance of the storage cluster. Client sessions are redistributed in the event of a node failure.
   c. No known limits on the expansion of storage cluster capacity and performance.
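To make the striping idea concrete, the following is a minimal Python sketch, assuming an illustrative chunk size and node names, of how a file's chunks could be spread round-robin across cluster nodes so that many nodes and disks participate in each read and write. It is not SnapScale's actual data placement algorithm.

```python
# Minimal sketch (not the SnapScale implementation): stripe a file's chunks
# round-robin across cluster nodes so that multiple nodes and disks
# participate in each read or write.

CHUNK_SIZE = 4 * 1024 * 1024  # illustrative 4 MiB stripe unit


def stripe_chunks(data: bytes, nodes: list[str]) -> dict[str, list[bytes]]:
    """Assign consecutive chunks of `data` to nodes in round-robin order."""
    placement: dict[str, list[bytes]] = {node: [] for node in nodes}
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for index, chunk in enumerate(chunks):
        node = nodes[index % len(nodes)]       # round-robin placement
        placement[node].append(chunk)
    return placement


if __name__ == "__main__":
    cluster = ["node1", "node2", "node3"]      # minimum 3-node cluster
    payload = b"x" * (10 * CHUNK_SIZE)         # a 40 MiB example file
    layout = stripe_chunks(payload, cluster)
    for node, chunks in layout.items():
        print(node, "holds", len(chunks), "chunks")
```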
Who should use SnapScale?
SnapScale is an ideal solution for businesses with unpredictable growth of large data sets accessed by a large number of clients. Every business has unstructured data in the form of files, images, documents, media, records, objects and content, generated across all major verticals such as hospitals, pharmaceuticals, research firms, manufacturing firms, and oil & gas or energy companies. Capacity requirements can start at 24TB and grow quickly and indefinitely. The solution is a good fit for workloads with multiple parallel sessions accessing the data, where aggregate storage cluster performance is important. It also works well for customers who need the storage cluster to always remain on, i.e. where data protection and availability are important. An easy-to-use web-based user interface allows fast deployment and simple management to make the most effective use of IT time and resources.
What is SnapScale?
SnapScale is built from multiple nodes. A minimum of 3 nodes is required to start the cluster. The minimum number of drives per node is 4 (on SnapScale X2) or 12 (on SnapScale X4). SnapScale can grow in capacity by adding drives to a node (up to 12 on SnapScale X2 or up to 36 on SnapScale X4). There can be an indefinite number of nodes in the cluster.

High Fault Tolerance
Data is distributed in peer sets. Depending on the chosen data protection level, data is automatically striped across nodes and disks so that it can survive 1 or 2 disk failures per peer set, or 1 or 2 node failures. There is no data loss even if half the drives in the cluster fail, as long as at least one member of every peer set is still available, and there is full read/write data access through node and drive failures. SnapScale supports global hot spare drives, which are automatically used to replace failed hard drives in the cluster. Hot spares are kept evenly distributed across the nodes of the cluster to improve options for repair. Spare disks are immediately used to recover from disk failures by re-syncing from a healthy member of the peer set. If no spare is available, the corresponding peer set runs degraded and waits for a new disk to be inserted. It is recommended to keep at least 2 spare disks per node.
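As a rough illustration of the peer-set rule described above, the sketch below (a simplified model with made-up drive names, not product code) checks whether data remains available after a set of drive failures: the cluster survives as long as every peer set still has at least one healthy member.

```python
# Illustrative model of peer-set availability: data stays readable as long
# as every peer set retains at least one healthy member drive.

def cluster_available(peer_sets: list[set[str]], failed_drives: set[str]) -> bool:
    """Return True if every peer set still has a surviving member."""
    return all(peer_set - failed_drives for peer_set in peer_sets)


if __name__ == "__main__":
    # Data protection level 1: two members per peer set, mirrored across nodes.
    peer_sets = [
        {"node1:disk1", "node2:disk1"},
        {"node2:disk2", "node3:disk2"},
        {"node3:disk3", "node1:disk3"},
    ]
    # Half of the drives fail, but no peer set loses both members -> available.
    print(cluster_available(peer_sets, {"node1:disk1", "node2:disk2", "node3:disk3"}))  # True
    # Both members of one peer set fail -> data loss.
    print(cluster_available(peer_sets, {"node1:disk1", "node2:disk1"}))  # False
```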
Built-in Snapshots: A snapshot is a consistent, stable, point-in-time image of the cluster storage space that can be backed up independently of activity on the cluster storage. They can be used either for recovering files which have been deleted by accident or for backing up a consistent data set. Snapshot space is reserved on each peer set member drive.
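The snapshot behavior can be pictured with a small, generic point-in-time image sketch; this is an assumption-laden illustration of the concept, not a description of SnapScale's internal snapshot format.

```python
# Generic illustration of a point-in-time image: the snapshot keeps a frozen
# copy of the block map while the live storage keeps changing.

class Volume:
    def __init__(self, blocks: dict[int, bytes]):
        self.blocks = blocks
        self.snapshots: list[dict[int, bytes]] = []

    def snapshot(self) -> int:
        """Record a stable point-in-time image and return its index."""
        self.snapshots.append(dict(self.blocks))
        return len(self.snapshots) - 1

    def write(self, block_no: int, data: bytes) -> None:
        self.blocks[block_no] = data                 # live data keeps changing

    def read_snapshot(self, snap_id: int, block_no: int) -> bytes:
        return self.snapshots[snap_id][block_no]     # snapshot stays consistent


if __name__ == "__main__":
    vol = Volume({0: b"report-v1"})
    snap = vol.snapshot()
    vol.write(0, b"report-v2")
    print(vol.read_snapshot(snap, 0))  # b'report-v1' -> recover the old version
```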
Scale-Up
New disks should be added two at a time and into different nodes. If there is a degraded peer set, the new disk is used by that peer set. If there are no degraded peer sets, the number of spares is checked: if there are already enough spares, the newly inserted disks immediately form a new peer set. Disks in a cluster need to have the same rotation speed and connectivity; it is okay to mix drive sizes.
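The decision flow described above can be summarized in a short Python sketch. This is a simplified reading of the behavior with hypothetical class and field names, not the actual cluster software.

```python
# Sketch of the scale-up decision flow (simplified model, not product code):
# a newly inserted disk first repairs a degraded peer set, otherwise tops up
# the spare pool, otherwise waits to form a new peer set with other new disks.

from dataclasses import dataclass, field


@dataclass
class PeerSet:
    members: list[str]
    expected_size: int = 2            # data protection level 1 -> 2 members

    def is_degraded(self) -> bool:
        return len(self.members) < self.expected_size


@dataclass
class Cluster:
    peer_sets: list[PeerSet]
    spares: list[str] = field(default_factory=list)
    pending: list[str] = field(default_factory=list)
    nodes: int = 3
    spares_per_node_target: int = 2   # "at least 2 spare disks per node"

    def handle_new_disk(self, disk: str) -> str:
        degraded = [ps for ps in self.peer_sets if ps.is_degraded()]
        if degraded:
            degraded[0].members.append(disk)       # re-sync from a healthy member
            return "joined degraded peer set"
        if len(self.spares) < self.spares_per_node_target * self.nodes:
            self.spares.append(disk)               # top up the spare pool
            return "added as spare"
        self.pending.append(disk)                  # enough spares already
        if len(self.pending) >= 2:                 # pair new disks into a peer set
            self.peer_sets.append(PeerSet(self.pending[:2]))
            self.pending = self.pending[2:]
            return "formed new peer set"
        return "waiting for a second new disk"


if __name__ == "__main__":
    cluster = Cluster(peer_sets=[PeerSet(["node1:d1", "node2:d1"])],
                      spares=[f"spare{i}" for i in range(6)])
    print(cluster.handle_new_disk("node1:d5"))  # waiting for a second new disk
    print(cluster.handle_new_disk("node2:d5"))  # formed new peer set
```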
Scale-Out
Nodes should be added according to the data protection level. With data protection level 2 (3 members in a peer set), it is recommended to add three nodes when expanding the cluster; with data protection level 1 (2 members in a peer set), the recommendation is to add at least two nodes. This ensures easier and faster creation of peer sets. Data protection level 2 also allows extension by a single additional node, but this should be done during off-peak hours because it causes a redistribution of peer sets, which can consume time and compute cycles. All nodes are required to run the same OS revision.
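The recommended number of nodes to add simply mirrors the peer-set width, i.e. the data protection level plus one. The trivial helper below is illustrative only, not part of the product.

```python
# Illustrative helper: the recommended number of nodes to add when scaling out
# matches the peer-set width (data protection level + 1), so new peer sets can
# be formed immediately across distinct nodes.

def recommended_nodes_to_add(data_protection_level: int) -> int:
    peer_set_width = data_protection_level + 1   # level 1 -> 2 members, level 2 -> 3
    return peer_set_width


print(recommended_nodes_to_add(1))  # 2 nodes recommended
print(recommended_nodes_to_add(2))  # 3 nodes recommended
```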
Performance Optimization
There are two 10GbE network ports for the client side and two 10GbE ports for the back-end storage network. The two ports can be used in load-balancing mode or bonding mode; to get more performance on the client side, load-balancing mode is recommended. In addition, all IP addresses on the front-end network of all nodes should be registered in DNS. The DNS server will then rotate round-robin through the IP addresses, which enhances performance on the client side. There are two built-in tools that help correct imbalances caused by node or drive failures. The Spare Distributor evenly redistributes spares and peer set members across the cluster nodes; maintaining a balance of spare drives helps ensure that spares are available if a peer set member fails. The Data Balancer moves data from more to less heavily used peer sets to optimize utilization; maintaining a balance of peer set capacity improves performance by assuring a balance of read and write traffic across all peer sets.
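To picture how DNS round-robin spreads client sessions across the front-end addresses, here is a small sketch of the idea. The addresses and the resolver are hypothetical; in practice the rotation is performed by the DNS server, not by SnapScale code.

```python
# Generic illustration of DNS round-robin: successive lookups of the cluster
# name rotate through the front-end IP addresses of the nodes, spreading
# client sessions across the cluster.

from itertools import cycle

# Hypothetical front-end addresses registered under one DNS name.
frontend_ips = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
rotation = cycle(frontend_ips)


def resolve(hostname: str) -> str:
    """Return the next address in round-robin order for the cluster name."""
    return next(rotation)


for client in range(6):
    print(f"client {client} connects to", resolve("snapscale.example.local"))
```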
Campus Cluster
Because SnapScale distributes data across nodes for redundancy, those nodes can also be distributed across different locations to protect against data loss in case of a disaster at one location. This is called a Campus Cluster. A prerequisite is a fast dark-fiber connection between the locations. After recovering from a disaster, the affected node is automatically rebuilt. Utilizing this functionality saves administrators from duplicating hardware at a second site and from purchasing additional replication software. With latency below 1ms across the campus, this method is much faster than traditional replication. In addition, no RAID operations are necessary, which also accelerates writes at the different locations.
Immortal Storage Infrastructure
The storage cluster can survive hardware failures, as discussed in the fault tolerance section. It can also survive software refreshes through a rolling upgrade methodology: rolling upgrades allow one node to be upgraded at a time so that data access is not interrupted during the upgrade process. When data interruption is acceptable, the entire cluster can be upgraded in a single pass, which is much faster. Since hardware types can be mixed and matched, the storage cluster can also go through hardware refreshes without disruption to data access. For this immortality, the storage cluster assumes an uninterrupted power supply.
Power Recommendation: Because SnapScale is a high-capacity device storing hundreds of terabytes, it is highly recommended to install two UPS (uninterruptible power supply) units for the whole cluster. This protects the system against low-power and power-failure conditions. Connected to a UPS, SnapScale is able to shut down gracefully in the event of a power interruption. An APC® branded UPS should be used, as SnapScale can be easily connected to it via USB. When selecting a particular UPS, ensure it is capable of providing power to a SnapScale node for at least ten minutes. In addition, to allow the cluster sufficient time to shut down cleanly, the UPS must be configured to provide power for at least five minutes after entering a low battery condition.
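The rolling-upgrade idea, one node at a time so that data access continues, can be expressed as a short sketch. The helper functions and version string are stand-ins, not the actual upgrade tooling.

```python
# Sketch of a rolling upgrade (hypothetical helpers, not the real tooling):
# nodes are upgraded one at a time, and the next node is only taken down once
# the previous one has rejoined the cluster, so data access continues.

import time


def upgrade_node(node: str, version: str) -> None:
    print(f"upgrading {node} to {version} ...")
    time.sleep(0.1)                       # stand-in for the actual upgrade


def node_rejoined(node: str) -> bool:
    return True                           # stand-in for a cluster health check


def rolling_upgrade(nodes: list[str], version: str) -> None:
    for node in nodes:
        upgrade_node(node, version)       # only one node is offline at a time
        while not node_rejoined(node):
            time.sleep(1)                 # wait until the node is back in service


rolling_upgrade(["node1", "node2", "node3"], "next-release")
```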
SnapScale Hardware Overview

                        SnapScale X2                         SnapScale X4
Form Factor             2U, 12-drive                         4U, 36-drive
Networking Options      1/10GbE RJ-45, SFP+                  1/10GbE RJ-45, SFP+
Min/Max Capacity        24TB - 512PB                         72TB - 512PB
Drives per Node         4 - 12                               12 - 36
Types of Drives         2TB, 3TB, 4TB, 5TB, or 6TB NL-SAS    2TB, 3TB, 4TB, 5TB, or 6TB NL-SAS
Processor               Intel Quad-Core Xeon                 Dual Intel Six-Core Xeon
Node Memory             32GB                                 64GB
Data Migration
To make it easier to move to SnapScale, we have added a high-performance data migration engine that allows customers to move from third-party storage to SnapScale more easily and quickly. This tool leverages the same replication technology used between SnapScale systems, meaning the High Performance Data Import tool can pull data from external third-party storage using parallel data streams; the import workload is therefore distributed across multiple threads and cluster nodes for maximum performance. The import is performed via CIFS or NFS, so the data can come from virtually any NAS on the market; even with SANs, a Windows or Linux server simply needs to mount the SAN volumes and serve as the source.
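As a rough picture of a parallel-stream import, the sketch below copies files from a mounted source share using a pool of worker threads. The paths, mount points, and copy routine are illustrative assumptions, not the actual High Performance Data Import tool.

```python
# Rough illustration of a parallel-stream import: a pool of worker threads
# copies files from a mounted CIFS/NFS source share into the cluster mount.
# Paths and worker count are illustrative, not the actual import tool.

import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

SOURCE = Path("/mnt/third_party_nas")     # hypothetical mounted source share
TARGET = Path("/mnt/snapscale_volume")    # hypothetical cluster mount point


def copy_one(path: Path) -> str:
    destination = TARGET / path.relative_to(SOURCE)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(path, destination)       # one stream per worker thread
    return str(destination)


def parallel_import(workers: int = 8) -> None:
    files = [p for p in SOURCE.rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for copied in pool.map(copy_one, files):
            print("imported", copied)


if __name__ == "__main__":
    parallel_import()
```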
Remote Monitoring
Remote monitoring allows the SnapScale support and development teams to proactively monitor the health of the storage infrastructure. It is an optional feature that the customer can sign up for by providing contact information and accepting the terms and conditions. When this functionality is enabled, SnapScale sends email to support during error conditions, which results in a support ticket being opened automatically. SnapScale also uploads logs that allow the support engineer to determine the root cause of the error and proactively reach the customer with solutions. This is a licensed feature and can be used as long as the product is covered under warranty. All SnapScale nodes come with 1 year of Silver-level OverlandCare support. All Overland products are eligible for enhanced OverlandCare maintenance options to augment and/or extend the standard warranty. Specially priced multi-year and single-year OverlandCare support upgrades are available to protect your investment for the life of the product. Four levels of OverlandCare support are available:
Bronze: 9x5 telephone support; advanced parts replacement with 2-business-day delivery
Silver: 9x5 telephone support; next-business-day onsite (FRUs); advanced parts replacement of CRUs
Gold: 7x24 telephone support; next-business-day onsite (FRUs & CRUs)
Platinum: 7x24 telephone support; 24x7x365 onsite with 4-hour response
Conclusion
SnapScale is a clustered scale-out NAS solution that enables companies to grow capacity seamlessly as their storage requirements grow, without interruption and without growing management complexity. SnapScale allows an affordable entry-level configuration with the ability to grow instantly and almost indefinitely, to more than 500PB, by adding nodes to the cluster. Besides capacity growth, there is also the requirement of performance: high transfer rates and fast data access are a must for applications using large files and for companies with many concurrent users. The flexibility of SnapScale also allows performance to scale in several ways. By scaling up, more disks are added to a node, which yields higher capacity and additional performance from more spinning disks. By scaling out, more nodes are added, which yields higher throughput and performance from more network channels. Finally, peer sets provide significant advantages over RAID, with much higher performance and space efficiency for today's world of data explosion, disaster recovery, and highly geo-distributed enterprises.
Sales Offices Americas 125 S. Market Street San Jose, CA 95113 USA Tel: 1 (858) 571-5555
Asia Pacific 16 Collyer Quay Level 21 Singapore, 049318 Tel: +65 6818 9266
France 18 Rue Jean Rostand Orsay 91400, France Tel: +33 (0) 1 81 91 73 40
Germany Feldstraße 81 44141 Dortmund Germany Tel: +49 231 5436-0
©2015 Sphere 3D. All trademarks and registered trademarks are the property of their respective owners. The information contained herein is subject to change without notice and is provided “as is” without warranty of any kind. Sphere 3D shall not be liable for technical or editorial errors or omissions contained herein.
United Kingdom Regus Atlantic House Imperial Way Reading, RG2 0TD United Kingdom Tel: +44 1 189 898 000 OVSSX4-CS0415-01