Mastering Huawei OceanStor

Mastering HUAWEI OceanStor - In Only 4 Hours!

This document is intended for Huawei technical and product personnel and is for internal use only. For promotion data and policies, refer to the latest released promotion data and sales guide. Do not use this document as a commitment to customers.

HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2014. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions
[logo] and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice
The purchased products, services and features are stipulated by the commercial contract made between Huawei Symantec and the customer. All or part of the products, services and features described in this document may not be within the purchased scope or the usage scope. Unless otherwise agreed by the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China
Website: http://www.huawei.com
Email: [email protected]

Issue 01 (2009-04-10) Huawei Proprietary and Confidential © Huawei Technologies Co., Ltd.

Preface

What can we do in 4 hours? Hmm... We can take a flight, enjoy a big feast, or watch a movie. Now we have another option: to become a storage expert!
Of course, this shortcut to expertise only suits the gifted ones who meet all of the following conditions:
• Condition 1: You are a Huawei employee.
• Condition 2: You have some background knowledge of storage; for example, you know terms like "disks".
• Condition 3: You are not yet a storage expert.
• Condition 4: You have 4 hours of free time.
If you meet all the conditions, let's start the journey to explore the storage world. After 4 hours, you will find that storage is an easy thing to learn!

About This Document

Intended Audience and Document Overview
This document is intended for Huawei frontline marketing personnel. It has a base version and an advanced version.
After reading the base version, you will understand:
1. Basic storage knowledge
2. The storage market and its characteristics
3. Major storage manufacturers
After reading the advanced version, you will understand:
1. Huawei storage products and their characteristics (especially their differentiated competitiveness)
2. Development trends of Huawei storage
3. Sales strategies of Huawei storage
This document aims to help the audience attain basic storage knowledge and understand the unique value of Huawei storage. It gives only a glance at the technical features of Huawei storage. For more information, you can obtain the related Technical White Paper from the 3MS website. There are reference links at the end of the document.
This document is for internal use only and cannot be used as a commitment to customers. For promotion data and policies, refer to the latest released promotion data and sales guide.

Contents

1 General Storage Knowledge
1.1 Basics
1.1.1 What is storage? Where can we use storage?
1.1.2 What are the categories of storage?
1.1.3 What are the components in a storage system?
1.1.4 What are the key indexes to evaluate a storage system?
1.2 Storage Market
1.2.1 What is the storage market size?
1.2.2 What are the development trends of the storage market?
1.2.3 Why are there so many models of storage products?
1.3 Major Storage Manufacturers
1.3.1 Who are the major storage manufacturers?
1.3.2 What are the products provided by these major manufacturers?
1.4 Key Storage Technologies
1.4.1 What is RAID?
1.4.2 What are the differences among RAID levels?
1.4.3 What is reconstruction? Why is the reconstruction speed so important?
1.4.4 How to shorten the reconstruction process to improve storage reliability?
1.4.5 What is cache? Why is it important in improving storage efficiency?
1.4.6 What are backup and disaster recovery?
1.4.7 What are RTO and RPO?
1.4.8 What are the common data backup solutions?
1.4.9 How to improve the overall reliability of a storage system?
2 Acronyms and Abbreviations

1 General Storage Knowledge

1.1 Basics

1.1.1 What is storage? Where can we use storage?
Answer: Storage refers to the equipment that stores data. Based on application scenarios, storage is categorized into consumer storage and enterprise storage.
1. Consumer storage is the storage equipment used by individuals, such as laptops, pads, and mobile storage media. It has small capacity, low reliability, poor performance, and low cost.
2. Enterprise storage is the storage equipment used by enterprises, and its characteristics are the opposite of those of consumer storage.
The easiest storage to understand is a disk, but the reliability of a single disk is only two 9s (99%). Data centers require storage to achieve five-9s (99.999%) or even six-9s (99.9999%) reliability. In addition, the servers and applications in a data center must share data, so disks are installed outside servers.
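To make the gap between two 9s and five 9s concrete, the following minimal sketch converts a number of 9s into expected downtime per year. It assumes the percentages are read as availability (the share of time the equipment is usable), which is one common reading and not something this document states; the function name is hypothetical.

```python
# Assumption: the "9s" figures are interpreted as availability percentages.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the expected unavailable minutes per year (hypothetical helper)."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for label, pct in [("two 9s", 99.0), ("five 9s", 99.999), ("six 9s", 99.9999)]:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes of downtime per year")
```

Under this reading, two 9s allows about 3.65 days of downtime per year, while five 9s allows only about 5.3 minutes.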
Multiple storage engines use a redundancy algorithm and architecture to manage these disks simultaneously and to access disk and cache resources, achieving shared storage with high reliability, performance, and scalability. This is what Huawei enterprise storage does.
As the core equipment for storing data, enterprise storage is widely used in fields including government, finance, telecommunications, enterprises, energy, manufacturing, health care, and education. The market size has reached $100 billion and is growing every year.
Storage, computing, network, and security are the four fundamental elements in the IT infrastructure of enterprise data centers, and they cooperate to support the operation of upper-layer applications. The various combinations of these four elements produce a wide range of products and solutions, which are not detailed in this document.

1.1.2 What are the categories of storage?
Answer: Storage can be categorized into SAN and NAS based on usage.
On a SAN, dedicated storage equipment is used to house disks and provide storage services. Its interfaces and usage are similar to those of traditional disks. Compared with disks in servers, a SAN delivers higher performance, reliability, and scalability, and is better suited to databases.
If users want to store, share, and access unstructured data such as videos and files on a SAN, they have to set up servers with local file systems as file servers. NetApp simplified this practice by building the file system and file sharing functions into the SAN equipment, and this is how NAS came into being. Compared with the SAN + file server solution, professional NAS equipment has lower cost, higher reliability (with a redundancy architecture), higher performance, and more functions.
As the functions of server file systems have developed slowly in recent years and customers impose lower demands on NAS than on SAN, NAS is gradually replacing SAN and is becoming an arena where Huawei enterprise storage can achieve a great deal.
A popular trend in the storage industry is to combine SAN and NAS in the same equipment so that the equipment supports both database and file sharing applications. This practice greatly reduces network complexity, equipment purchase cost, and equipment maintenance cost.

1.1.3 What are the components in a storage system?
Answer: Storage can be regarded as a computer with a huge disk, so it has a computing unit (controllers) and a storage unit (disk enclosures).
1. The computing unit is a high-reliability, high-performance computer that runs a dedicated storage operating system. The enclosure where the computing unit resides is a controller enclosure. The front end of the controller enclosure connects to application servers through Fibre Channel or iSCSI links and handles storage I/O requests. The back end of the controller enclosure connects to disk enclosures and forwards the I/O requests to the relevant disks for data reads and writes.
2. The storage unit is the area where data is stored. We can compare the computing unit to the human brain and the storage unit to the human body. The storage unit consists of disks (HDDs and SSDs) that store data, and the enclosures that house these disks are called disk enclosures.
The following shows the exteriors of a controller enclosure and a disk enclosure.
[Figure: controller enclosure and disk enclosure]
Notes:
1. To ensure high storage reliability:
• A controller enclosure usually houses two controllers that mirror each other. This is called a dual-controller architecture.
• A controller enclosure must have redundant batteries that supply power to the controllers during external power failures and help the controllers write cache data to coffer disks.
2. Multi-controllers: Similar to a multi-engine plane, a storage system that has multiple controllers can deliver high performance and reliability, but it is technically difficult to build. In the past, low-end and mid-range storage systems usually supported two controllers, and only high-end storage systems supported more than two. Nowadays, as technologies develop, many storage manufacturers, including Huawei, add multiple controllers to their low-end and mid-range storage systems.
3. Disk and controller integration: To improve system integration and reduce costs, the controller enclosure in a low-end or mid-range storage system contains disks. With such a design, no extra disk enclosures are required in scenarios that need only small capacities. This design is called "disk and controller integration", and HUAWEI OceanStor S2600T and S5000T adopt it.
4. Disk and controller separation: In scenarios that require large capacities and high performance, controller enclosures do not contain disks, and disk enclosures are responsible for data storage. This design is called "disk and controller separation". HUAWEI OceanStor S5600T, S5800T, S6800T, and enterprise storage systems adopt it.

1.1.4 What are the key indexes to evaluate a storage system?
Answer: There are two groups of key storage indexes. The first group is hard indexes that make a storage system robust. These indexes assess the system's performance, capacity, hardware processing capability, and interface capability. The second group is soft indexes that make a storage system smart. These indexes assess the system's software functions in terms of resource utilization, service reliability, and user experience.
The following table lists the typical key indexes:

ID | Index | Description
1 | Capacity | Maximum data volume that can be stored in a storage system.
2 | IOPS | A performance index that counts the I/Os processed per second. It reflects the service volume that a system can process during a specified period of time.
3 | Latency | The time an I/O request takes to pass through all processing steps, from being issued to being completed.
4 | Failure rate | The number of failures that may occur during a specified period of time.
5 | Availability | The ability of an IT service and its components to provide the required functions during a specified period of time. It is expressed as a number of 9s.

1.2 Storage Market

1.2.1 What is the storage market size?
Answer: According to Gartner, the storage market size in 2013 was $22.5 billion (counting the revenue from sales of storage equipment but excluding storage software and services), and it is growing by 4.4% every year. The market size of storage equipment is expected to reach $28 billion in 2018. The revenue from sales of storage hardware and services will reach $40 billion, and the total revenue will reach $100 billion if the sales of software and consulting services are counted in.

1.2.2 What are the development trends of the storage market?
Answer: The major trends include the popularity of SSDs, cloud infrastructure, and software-defined storage (SDS).
1. Popularity of SSDs: The development of big data and mobile social networks causes fast data growth and places demanding requirements on storage performance and capacity. SSDs, with their proven reliability, superb performance, and affordable cost, are becoming more and more popular in the storage market.
2. Cloud infrastructure: A traditional service system usually consists of a server and a storage system, and is dedicated to one type of service only. This results in many information islands in customers' data centers, prevents server and storage resources from being shared, and wastes construction and maintenance costs. As these service systems grow fast, data center administrators have to constantly adjust the system configurations to address capacity and performance bottlenecks, which increases the maintenance cost. The cloud infrastructure combines server and storage resources and adds all service systems to a resource pool. In this way, capacity and performance can be shared. The infrastructure also employs smart data storage management technologies to help administrators resolve capacity and performance issues. The unified hardware platform, software platform, and management platform in the cloud infrastructure reduce system cost, provide services for customers on demand, and improve the user experience.
3. SDS: SDS achieves loose coupling of software and hardware. With it, storage software can run on general-purpose servers and virtual machines rather than dedicated hardware, so cost is reduced. In addition, using storage software, storage systems can attain higher performance, higher scalability, and easier maintenance, so the overall system efficiency is improved.

1.2.3 Why are there so many models of storage products?
Answer: Newcomers to the storage arena often have the same question: Why are there so many models of storage products in the same series?
Using the Huawei OceanStor series as an example, its high-end models include 18800 and 18500, and its low-end and mid-range models include S6800T, S5800T, S5600T, S5500T, S2600T, and S2200T. The same situation applies to the EMC VNX series and the NetApp FAS series.
Similar to the BMW 1/3/5 series, different storage products have different configurations, such as CPUs, memory, number of ports, and number of disks. Therefore, storage products are divided into many BANDs.
The well-known consulting firm Gartner defines 9 BANDs for storage products according to their prices. Customers can choose to buy products in different BANDs based on their service requirements and budgets. The following table lists the 9 BANDs and their prices:

BAND | Price
BAND9 | $1000K and above
BAND8 | $500K to $999.9K
BAND7 | $300K to $499.9K
BAND6 | $200K to $299.9K
BAND5 | $100K to $199.9K
BAND4 | $50K to $99.9K
BAND3 | $25K to $49.9K
BAND2 | $5K to $24.9K
BAND1 | $0 to $4.9K

The product prices increase from BAND1 to BAND9, where:
• BAND1 consists of basic storage arrays such as JBOD.
• BAND2 consists of entry-level storage arrays.
• BAND3 to BAND6 consist of mid-range storage arrays.
• BAND7 to BAND9 consist of high-end storage arrays.

1.3 Major Storage Manufacturers

1.3.1 Who are the major storage manufacturers?
Answer: In the global market, major storage manufacturers include Huawei, EMC, NetApp, IBM, HP, Dell, HDS, Fujitsu, and Oracle. In the China market, the manufacturers include MacroSAN, Tongyou, Sugon, Inspur, and UIT.
• Through well-targeted acquisitions, EMC has built a complete storage product family. However, these acquired products have different architectures and are hard to integrate with one another.
• IBM's storage products are sold together with its servers and consulting services. However, its product models are limited and their market positioning is imprecise. One product model usually covers multiple BANDs. The market share of IBM's storage products keeps shrinking, so IBM may cut down its investment in the storage arena.
• NetApp has seized the opportunity in unstructured data storage and has built a unified Data ONTAP platform. Its storage products offer abundant software functions and features, and have various differentiated highlights. However, the unified storage products provided by NetApp are a simple combination of NAS and SAN, which cannot maximize NAS and SAN performance at the same time. What's more, NetApp does not have a generally acknowledged high-end storage product.
• Huawei has more than 10 years of accumulated experience in the storage arena. By combining industry-leading technologies with its own innovation capability, Huawei has integrated SAN and NAS, low-end/mid-range/high-end products, SSDs and HDDs, primary storage and backup storage, and heterogeneous products. Huawei now provides a complete range of storage products and solutions with high security, proven reliability, on-demand configuration, easy operation, and high efficiency. According to Gartner's statistics for 2013 Q4, Huawei has surpassed Oracle and Fujitsu in sales revenue, number of units sold, and capacity sold in the global storage market. In addition, Huawei ranks No. 1 in all three aspects in the China market.

1.3.2 What are the products provided by these major manufacturers?
Answer: Competitive storage products nowadays include the HUAWEI OceanStor series, the EMC VNX and VMAX series, and the NetApp FAS series. The following table lists the major manufacturers and their flagship products:

Category | Huawei | EMC | NetApp | IBM | HP | HDS
High-end storage | OceanStor 18500/18800 | VMAX | (none listed) | DS8870, XIV G3 | 3PAR 10800/10400 | VSP
Mid-range storage | OceanStor S2600T/S5500T/S5600T/S5800T/S6800T | VNX2/VNX | FAS8000/FAS6000/FAS3000/FAS2000 | V7000U/V5000/V3000 | StoreServ 7000, EVA P6000/P2000 | HUS VM, HUS 150/130/110
Solid-state storage | Dorado | XtremIO | EF540 | FlashSystem 720/820 | StoreServ 7450 | HUS VM
Big data storage and NAS | OceanStor 9000, OceanStor N8500 | Isilon | FAS series | SONAS | StoreAll 9720/9320 | HNAS 3000 series

1.4 Key Storage Technologies

1.4.1 What is RAID?
Answer: RAID is short for redundant array of independent disks.
It is a data storage scheme that allows data to be stored and replicated in a hardware disk group consisting of multiple physical disks.
In the early stage, storage manufacturers were actually disk manufacturers. However, the capacity, performance, reliability, and data sharing capability of single disks are limited and cannot meet the requirements of enterprise businesses. This problem was not solved until 1987, when a paper discussing the RAID technology was published by the University of California, Berkeley. After that, the storage industry boomed and manufacturers such as EMC and IBM grew into storage giants. The following figure shows a photocopy of the paper.
[Figure: photocopy of the RAID paper]
The essence of the RAID technology is to combine multiple disks and to achieve high-reliability and high-performance reads and writes using a dedicated algorithm. The detailed implementation is as follows:
1. The group of disks is logically divided into data disks and parity disks.
2. Data writes: The data to be written to the disks is split into multiple segments, and parity data is calculated from them using an algorithm. The data segments and the parity data are written to the data disks and parity disks in parallel.
3. Data reads: When data needs to be read, the algorithm sends parallel read requests to the disk group, and the data segments are combined and then returned to the application.
4. Exception handling: If a disk fails and a data segment cannot be accessed, the algorithm uses the parity data to rebuild the lost segment, so data integrity is ensured. In this way, the failure of a single disk does not hamper the stable operation of the whole storage system.
To achieve a balance between reliability and performance, a traditional RAID group usually consists of 10 disks.
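The write and exception-handling steps above can be sketched in a few lines. This is a minimal illustration only: it assumes XOR as the parity algorithm (the document does not name the algorithm), keeps the "disks" as in-memory byte strings, and uses hypothetical function names.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data: bytes, data_disks: int):
    """Split data into equal segments (zero-padded) and compute one parity segment."""
    seg_len = -(-len(data) // data_disks)  # ceiling division
    padded = data.ljust(seg_len * data_disks, b"\x00")
    segments = [padded[i * seg_len:(i + 1) * seg_len] for i in range(data_disks)]
    return segments, xor_blocks(segments)


def recover_segment(segments, parity, failed_index):
    """Rebuild the segment of the failed disk from the surviving segments and parity."""
    survivors = [s for i, s in enumerate(segments) if i != failed_index]
    return xor_blocks(survivors + [parity])


segments, parity = write_stripe(b"HUAWEI OceanStor", data_disks=4)
rebuilt = recover_segment(segments, parity, failed_index=2)
print(rebuilt == segments[2])  # True: the lost segment is recovered
```

XOR parity can rebuild exactly one missing segment per stripe, which is why a single-parity group survives one disk failure but not two.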
A storage array can contain multiple RAID groups, which means that thousands of disks can work simultaneously to provide PB-level capacity.
Compared with a single disk, a disk array has the following advantages:
1. Enhanced reliability: The failure of one or even two disks does not affect the operation of the whole RAID group.
2. Improved performance: The read/write performance of mechanical disks is always a bottleneck. Using the RAID technology, read/write requests are evenly distributed to multiple disks for processing, so system performance is improved.

1.4.2 What are the differences among RAID levels?
Answer: Different RAID levels provide different levels of reliability, performance, and space utilization. You can select a RAID level to meet your specific requirements on storage system reliability, performance, and capacity.

RAID level | Description | Reliability | Redundancy | Available space | Performance
RAID 0 | Segments data into stripes and writes these stripes to multiple disks. It does not support redundancy. | Lowest; cannot tolerate any disk failure | None | 100% | Highest
RAID 1 | Divides an even number of disks into two groups that hold exactly the same data. | High; tolerates the failure of a single disk | Mirror redundancy | 50% | Lowest
RAID 5 | Segments data and parity data into stripes and writes these stripes to multiple disks. RAID 5 is one of the most commonly used RAID levels. | High; tolerates the failure of a single disk | Parity redundancy | (N-1)/N | High
RAID 6 | Similar to RAID 5, but it stores two independent sets of parity data; both are needed only when two disks fail at the same time. | Highest; tolerates the failure of two disks | Parity redundancy | (N-2)/N | High
RAID 10 | Incorporates the features of RAID 1 and RAID 0, that is, data striping and mirroring are used for data reading and writing. RAID 10 is also a widely used RAID level. | High; tolerates the failure of a single disk | Mirror redundancy | 50% | High
RAID 50 | Incorporates the features of RAID 5 and RAID 0, that is, data and parity are striped within RAID 5 subgroups, and data is striped again across the subgroups. | High; tolerates the failure of a single disk | Parity redundancy | (N-1)/N per RAID 5 subgroup | High

The typical configurations for common applications are as follows:
1. High-speed databases: RAID 10, for a balance of performance and reliability.
2. Ordinary applications: RAID 5, which tolerates the failure of a single disk. If two disks fail at the same time, the RAID group becomes invalid.
3. High-reliability applications: RAID 6, which tolerates the failure of two disks. However, its disk utilization and performance are lower than those of RAID 5.

1.4.3 What is reconstruction? Why is the reconstruction speed so important?
Answer: A storage system usually has several backup disks, which are called hot spare disks. If a data disk in a RAID group is damaged, the storage system uses a RAID algorithm to rebuild all the data of the damaged disk and writes the data to an available hot spare disk, so that reads and writes continue without interruption. The whole process is called reconstruction.
The reconstruction process is transparent to external applications. For example, if a data disk in a RAID 5 group is damaged, the storage system automatically starts reconstruction. However, if another disk fails during the reconstruction process, the whole RAID 5 group becomes invalid. This is called a dual-disk failure, and it is critical for a storage system.
Two factors determine the reconstruction speed: the amount of data to be reconstructed and the data write speed of the hot spare disk. The reconstruction of 1 TB of data usually requires 10 hours.
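As a rough worked example of the two factors, the sketch below derives the sustained write rate implied by "1 TB in 10 hours" and then reuses it to estimate other rebuild times. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant write rate; both function names are hypothetical.

```python
def implied_write_rate(data_bytes: float, hours: float) -> float:
    """Sustained write rate in MB/s implied by rebuilding data_bytes in the given hours."""
    return data_bytes / (hours * 3600) / 1e6


def rebuild_hours(data_bytes: float, write_mb_per_s: float) -> float:
    """Hours needed to rewrite data_bytes at a constant write rate in MB/s."""
    return data_bytes / (write_mb_per_s * 1e6) / 3600


ONE_TB = 1e12  # assumption: decimal terabyte

rate = implied_write_rate(ONE_TB, hours=10)
print(f"Implied hot spare write rate: {rate:.1f} MB/s")
print(f"Rebuild time for 4 TB at that rate: {rebuild_hours(4 * ONE_TB, rate):.1f} hours")
```

The implied rate is roughly 28 MB/s, far below what a disk can write sequentially, because the disks keep serving application I/O while they rebuild. At that rate the RAID group spends the whole 10-hour window without full redundancy.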
During this period, if another disk becomes faulty, none of the data in the RAID group can be used any more, which is a disaster for the storage applications.
Unfortunately, the disk failure rate usually rises during the reconstruction process. Our testing and analysis records point to two reasons:
1. Disks of the same batch are likely to fail at around the same time: The member disks in a RAID group are usually installed at the same time, and they share the same workloads during system operation. After a period of time, if one disk becomes faulty, the probability that other disks become faulty rises. Therefore, some manufacturers prefer to use disks manufactured in different batches to reduce the failure probability.
2. Disks are prone to fail under heavy workloads: During the reconstruction process, disks are still processing I/O requests; therefore, the workloads on the disks increase and the probability of disk failures rises. If another disk fails at this time, data will be lost.
Therefore, how to shorten the reconstruction process and avoid the simultaneous failure of multiple disks is a key issue to address.

1.4.4 How to shorten the reconstruction process to improve storage reliability?
Answer: There are several methods:
1. Using small-capacity and high-speed disks: During a traditional RAID reconstruction process, multiple disks cooperate to restore data onto a hot spare disk. Therefore, the capacity of these disks is an important factor. If we use small-capacity and high-speed disks, the data volume to be reconstructed is small, and the reconstruction process is shortened.
2. Reducing the disk failure rate: The failure of a disk is usually caused by the failure of a few tracks. Therefore, we can use a specific algorithm to isolate the failed tracks and avoid losing the entire disk.
3. Using a new RAID algorithm: Huawei has developed an innovative RAID algorithm, RAID2.0+. This algorithm virtualizes physical disks and distributes the reconstruction loads (write operations) to tens or even hundreds of disks. In this way, the reconstruction speed is improved by 20 times and the probability of a dual-disk failure is minimized. RAID2.0+ is now well accepted by customers. As disk capacity keeps growing, RAID2.0+ is likely to become more popular in the future.

1.4.5 What is cache? Why is it important in improving storage efficiency?
Answer: An optimal storage configuration is in essence a balance among performance, reliability, and cost. All storage technologies are developed around these three factors.
The disk I/O speed is a bottleneck of the whole storage system. If every read or write operation is processed by disks, severe latency may result. The easiest way to improve performance is to add more disks to the system, but this definitely increases the cost.
The cache technology was developed to resolve this dilemma. It uses high-performance storage media (such as memory) as a temporary buffer to store frequently accessed data (hotspot data). With such storage media, the reads and writes to the hotspot data are processed directly by the cache, and the performance is improved by tens of times.
However, the downside of cache is that it has a small capacity and a high price, so it is only used to store hotspot data. A hotspot data identification and scheduling algorithm is therefore needed. In addition, the data temporarily buffered in the cache is also permanently stored on disks, so we also need an algorithm to ensure the consistency of these two copies of data.
This raises another question: Since the cache is so important in improving the system speed, how can its own reliability be ensured?
For example, a write operation has been processed in the cache and returns a write success message to upper-layer applications, and new data has not yet been flushed to disks. Then an unexpected power failure occurs. Will this cause all the data in the cache be lost? If the controllers in the storage system do not adopt any protection measures, this power failure will cause data loss, which is unaccepted in core applications such as financial applications. Therefore, we usually use two methods to protect the controllers against power failures: 1. Configuring backup battery units (BBUs) for the controllers: Once a power failure occurs, these BBUs can supply power to the controllers and write the cached data to the backup SSDs. When the power supply resumes, data on the SSDs can be restored to the cache. 2. Globally caching data to multiple controllers: If one controller fails, the other controllers can store the cached data. The core of a storage system is the absolute reliability of its data, so we must eliminate the loss of even a bit of data. 1.4.6 What are backup and disaster recovery? Answer: A storage system is only a standalone system. If customers want higher data reliability, they can use backup and disaster recovery to achieve data redundancy.  Issue 01 (2009-04-10) Backup: One or more duplicates can be created for a piece of data. Once the production system becomes unavailable, the backup duplicates can be used to restore the system data. The traditional backup period is one day, that is, the storage system can only retrieve the data generated within one day. What's more, the data recovery period is long and services are interrupted during the period. These two limitations are unaccepted by many mission-critical applications. Huawei Proprietary and Confidential © Huawei Technologies Co., Ltd. 
• Disaster recovery: Two storage systems are deployed in one city or in different cities, and these systems synchronize data with each other in real time or near-real time. Once the primary storage system goes down, the other storage system can take over the services within a short time.
• Backup and disaster recovery can coexist. There is only one disaster recovery data copy, but there can be multiple backup data copies generated at different times. If the original data is damaged by a virus or by human error, an appropriate backup data copy can be used to restore the lost data.

1.4.7 What are RTO and RPO?

Answer: RTO and RPO are two important indexes used to measure the reliability of a disaster recovery system, where:

• Recovery time objective (RTO): The length of time it takes to recover services from an outage to an operational state. This index measures the service recovery capability of a disaster recovery system.
• Recovery point objective (RPO): The amount of data lost when a disaster occurs, usually expressed as the length of time between the last recoverable copy of the data and the disaster. This index measures the data redundancy capability of a disaster recovery system.

Smaller RTO and RPO translate into a higher service protection capability. Therefore, to minimize the impact of a disaster on storage services and to achieve short RTO and RPO, we need to build a highly reliable disaster recovery solution if the budget permits.

1.4.8 What are the common data backup solutions?

Answer: Data backup can be implemented on three layers:

• Application-layer backup: Data is backed up across two or more sites by using host-side applications, databases (such as Oracle and DB2), operating systems (such as UNIX and volume management), and virtualization.
• Network-layer backup: Data is synchronized and backed up by capturing and forwarding I/O operations on the channels between hosts and storage systems.
Typical solutions in this category include the Huawei VIS, EMC VPLEX, and IBM SVC solutions.
• Data-layer backup: Data is backed up on storage systems. The backup technologies in this category include synchronous replication and asynchronous replication.

1.4.9 How to improve the overall reliability of a storage system?

Answer: Reliability is crucial to a storage system and can be improved from three aspects:

1. Component-level reliability
The major measure is to select optimal components and strictly control quality. For example, storage equipment manufacturers usually cooperate with disk manufacturers to strictly control disk quality from the start of the production phase. In addition, storage manufacturers perform strict tests on every batch of disks. For example, Huawei selects 500 to 1000 disks out of each batch to perform drop, vibration, and temperature tests, and makes sure that no batch issue exists.

2. Product-level reliability
• Eliminate single points of failure in design. Configure redundancy for key components such as controllers.
• Install shockproof brackets and connectors for disks, reducing the vibration and resonance caused by disk operations and increasing the disk service life.
• Perform strict tests on the storage system, such as the temperature cycle test.

3. Solution-level reliability
Implement multi-site disaster recovery for key data. For example, after the September 11 attacks, some enterprises located in the World Trade Center were able to keep their services running because their data and services had been backed up for disaster recovery.

Data disaster recovery uses advanced technologies such as snapshot and clone to protect data. If a disaster occurs, the backup data can be used to restore the production data, so that data loss is avoided.
Service disaster recovery adds service switchover on top of data disaster recovery. If a disaster occurs, the backup site manually or automatically takes over the services of the production site. After the production site recovers, the services are seamlessly switched back to it. During this process, services are not adversely affected.

2 Acronyms and Abbreviations

Table 2-1 Acronyms and abbreviations

Acronym   Full Spelling
RAID      Redundant Array of Independent Disks
DAS       Direct Attached Storage
SAN       Storage Area Network
NAS       Network Attached Storage
SAS       Serial Attached SCSI
NL-SAS    Nearline Serial Attached SCSI
SSD       Solid State Disk
OLTP      On-Line Transaction Processing
OLAP      On-Line Analytical Processing
ERP       Enterprise Resource Planning