Reference Architecture
Backup Solution for Epic Caché Database With EMC® NetWorker®, EMC Data Domain®, and EMC VMAX3®
• Faster Backup and Recovery
• Easy Deployment and Centralized Management
• Efficient Deduplication
EMC E-Lab™ Verticals Engineering Group
Abstract
The document depicts the reference architecture of protecting the Epic Caché Database with EMC storage integrated backup solutions enabled by the integration of EMC NetWorker with EMC Data Domain and EMC VMAX3.
October 2015
Copyright © 2015 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All trademarks used herein are the property of their respective owners. Part Number H14597
Backup Solution for Epic Caché Database with EMC NetWorker, EMC Data Domain, and EMC VMAX3 Reference Architecture
Table of contents
1 Reference architecture overview
• Document purpose
• Solution purpose
• Business challenge
• Technology solution
• Audience
• Terminology
2 Key components
• Introduction
• EMC NetWorker
• EMC Data Domain with DD Boost
• VMAX3 product overview
• Epic Caché simulated databases
• VMware vSphere
• Solutions Enabler
• Deduplication
3 Solution architecture
• Architecture
• Hardware resources
• Software resources
• VM configuration
4 Configuration
• Backup data flow
• Restoring data flow
• Storage configuration
• Network configuration
• Integration configuration for DD Boost over IP (Prerequisites, Data Domain settings, NetWorker settings)
• Backup configuration (Proxy Server settings, Backup profile settings)
• Integration configuration for DD Boost over FC
5 Test scenario and methodology
• Methodology for the proxy clones
• Scenario
• Methodology
• Test tools
6 Test results
• Introduction
• Backup time (Observations)
• Restore time (Observations)
• Deduplication (Observations)
7 Conclusion and best practices
• Summary
• Best practices and recommendations (VMAX3, Data Domain, NetWorker, Conclusions)
8 References
List of Tables
Table 1. Terminology
Table 2. Solution hardware
Table 3. Solution software
Table 4. VM configuration
Table 5. Prerequisites
Table 6. Test cases
Table 7. Backup time results
List of Figures
Figure 1. EMC NetWorker
Figure 2. Data Domain DIA
Figure 3. VMAX3 storage array
Figure 4. Caché database Write IO pattern example
Figure 5. Solution architecture
Figure 6. DD Boost over 10GbE
Figure 7. DD Boost over FC
Figure 8. Restoring data flow example
Figure 9. Network configuration example
Figure 10. Integration steps
Figure 11. DD Boost ifgroups
Figure 12. Local Compression Type
Figure 13. Backup configuration steps
Figure 14. Network tuning
Figure 15. Data Domain DFC SCSI processor devices
Figure 16. Clone method
Figure 17. Initial full copy
Figure 18. Production backup process
Figure 19. Backup workload
Figure 20. Environment reset
Figure 21. Testing process
Figure 22. NetWorker Management Console
Figure 23. Symmetrix Performance Analyzer
Figure 24. Backup time (hours:minutes:seconds)
Figure 25. NetWorker restore wizard
Figure 26. Deduplication factor
Chapter 1: Reference architecture overview
Document purpose
The document describes the reference architecture of Epic Caché Database protected by EMC data protection solutions with EMC® NetWorker®, EMC VMAX3® and EMC Data Domain®, which was tested and validated by the EMC E-Lab™ Verticals Engineering Group.
Solution purpose
The reference architecture documents the testing and validation of the advanced backup solutions for functionality and performance. This testing environment was built using a combination of NetWorker, Data Domain, and VMAX3. This reference architecture validates the performance of this solution and provides guidelines to build similar solutions. This document is not a comprehensive guide to every aspect of the solution.
Business challenge
Protecting critical data is a significant challenge for enterprises of all sizes. In particular, as Electronic Medical Records (EMR) data comes under increasing pressure from explosive growth and regulatory compliance requirements, healthcare delivery organizations are looking for a better strategy to protect their important business data, including patient records. In addition, more and more healthcare solutions are moving from traditional infrastructure to virtualization and Cloud platforms, which also calls for a seamless data protection solution.
Now more than ever, to keep pace with these new requirements, the healthcare industry requires a sophisticated backup approach for the Caché Database. Several terabytes of data must be backed up every day, with typical retention periods for Epic data of 30 days. Furthermore, SLAs call for data recovery at no less than 250 GB per hour, preferably faster. Several new technologies are available from EMC to assist in architecting this kind of solution, but organizations need to know how best to use these technologies to maximize the investment, better support service-level agreements, and minimize the TCO.
In addition to the above concerns, the following challenges should also be addressed:
• Strict RTO requirement, with recovery at a minimum of 250 GB per hour
• Zero impact on the production environment when performing backups
• Ability to redirect the restore and to restore single files from within the database structure
• Limited backup window, generally less than 8 hours
• Easy deployment and management
Technology solution
EMC next-generation data protection solutions are applied to satisfy industry backup and recovery requirements. They are well suited to an enterprise's virtualization and Cloud environments and enable DPaaS/BaaS (Data Protection-as-a-Service/Backup-as-a-Service) models. Specifically, Data Domain is integrated with NetWorker to offer deduplicated backup and quick recovery.
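The backup window and recovery-rate targets listed under Business challenge can be turned into rough throughput figures. The following is a minimal sketch of that arithmetic, assuming the 15 TB simulated database described later in this document, decimal units (1 TB = 1000 GB), and a full backup that must fit in the whole 8-hour window. The function names are illustrative and not part of any EMC tool.

```python
# Rough sizing arithmetic for the stated backup and recovery targets.
# Assumptions (not from any EMC tool): decimal units and the helper names below.
GB_PER_TB = 1000


def required_backup_rate_gb_per_hour(size_tb: float, window_hours: float) -> float:
    """Sustained rate needed to move a full backup within the backup window."""
    return size_tb * GB_PER_TB / window_hours


def restore_hours(size_tb: float, rate_gb_per_hour: float) -> float:
    """Time to restore the whole database at a given recovery rate."""
    return size_tb * GB_PER_TB / rate_gb_per_hour


# 15 TB database, 8-hour window, 250 GB/hour minimum recovery rate.
backup_rate = required_backup_rate_gb_per_hour(15, 8)
full_restore_time = restore_hours(15, 250)
```

Under these assumptions, a full backup needs a sustained rate of 1875 GB per hour, and a full restore at the 250 GB per hour minimum would take 60 hours, which illustrates why the SLA treats that rate as a floor.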
Audience
This reference architecture is intended for EMC employees, partners, and customers with an interest in backup and recovery for Epic Caché Database. Readers should already be familiar with Epic Caché Database and EMC technologies.
Terminology
The following table defines terms used in this document.
Table 1. Terminology
Term | Definition
EMR | Electronic medical record
RPO | Recovery point objective. RPO is the point in time (prior to an outage) that systems and data must be restored to
RTO | Recovery time objective. RTO is the period of time after an outage in which the systems and data must be restored to the predetermined RPO
OLTP | Online transaction processing
TCO | Total cost of ownership
SU | Storage unit in Data Domain
SLA | Service level agreement
VM | Virtual machine
FA | VMAX3 front adapter
NW | EMC NetWorker
BCV | Business continuance volume
Chapter 2: Key components
Introduction
This section briefly describes the key components and technology used in the solution, including:
• EMC NetWorker
• EMC Data Domain with DD Boost
• VMAX3 Storage Array
• Epic Caché Database
EMC NetWorker
EMC NetWorker backup and recovery software centralizes, automates, and accelerates data backup and recovery across your IT environment. Boasting record-breaking performance and flexibility, NetWorker protects critical business data in a fast, secure, and easy-to-manage way. Whether your organization is a small office or a large datacenter, you can trust that your data will be protected. NetWorker users know and trust that their data is backed up and recoverable in the event of user error, data loss, system outage, or catastrophic event. All your business applications remain in service, with zero downtime, while data backups are taking place.
Figure 1. EMC NetWorker
NetWorker delivers centralized backup and recovery operations for complete control of data protection across diverse computing and storage environments, including:
• Storage area networks (SANs), network-attached storage (NAS), and direct-attached storage (DAS)
• UNIX, Microsoft Windows, Linux, OpenVMS, and Macintosh operating systems
• Critical business applications including Oracle, Microsoft SQL Server, and Exchange
• SharePoint, Active Directory, IBM DB2, Informix, Lotus, and SAP Sybase
• Virtual environments, including VMware, Hyper-V, Xen, and Solaris Zones
• Backup storage options, including tape drives and libraries, virtual tape libraries, disk arrays, deduplication storage systems, and cloud storage
EMC NetWorker is the only backup software application to provide seamless integration with the industry’s two leading deduplication solutions: EMC Avamar® and EMC Data Domain deduplication storage systems. Deduplication revolutionizes disk-based data protection, significantly reducing backup data and providing network-efficient disaster recovery. NetWorker deduplication supports:
• Integration with Data Domain systems via EMC Data Domain Boost software, driving new levels of speed and efficiency for managing data center workloads.
• Client deduplication, solving critical backup challenges in remote offices and virtual environments.
• Network-efficient replication with deduplication, delivering fast, reliable, bandwidth-efficient disaster recovery protection.
• A common index and media database, ensuring reliability and recoverability.
• Integrated software and hardware deduplication solutions from EMC, delivering unmatched reliability and efficiency.
With NetWorker, users can easily and safely evolve from traditional to next-generation data protection.
NetWorker simplifies installation, configuration, and day-to-day data protection management through an easy-to-use, intuitive interface. Capabilities include:
• A customizable web-based GUI with built-in reporting to simplify administration.
• Wizards to guide setup and modification of device configurations and backup jobs.
• Multi-tenancy enables cloud-based services.
• Common sign-on using LDAP and Active Directory.
• Virtual server auto-discovery and visualization through integration with VMware vCenter.
• Search and sort to reduce recovery time by finding data quickly.
• Centralized software distribution enables easy remote installation of patches and updates.
• Event-based backup delivers flexibility to run backups by condition rather than time.
EMC Data Domain with DD Boost
EMC Data Domain deduplication appliances are the industry’s leading high-performance deduplication storage systems. They provide simple and reliable disk-based data protection and disaster recovery (DR) solutions, which support various backup and archive applications and integrate seamlessly with existing infrastructures. They are designed for optimal performance and operational simplicity, and they scale easily to meet the needs of enterprise environments of all sizes. EMC Data Domain systems use a disk-based inline deduplication method and a leading-edge algorithm, which offers better performance with lower storage space requirements. EMC Data Domain systems also incorporate advanced technology to protect against data loss caused by hardware or software failures. These capabilities are enabled by:
• Stream Informed Segment Layout (SISL) scaling architecture
• Global compression technology
• Data Invulnerability Architecture (DIA)
The foundation for the Data Domain system’s industry-leading performance is its SISL scaling architecture. Unlike post-process deduplication, SISL deduplicates data inline, identifying redundant data in RAM with powerful CPUs rather than with disk-based processing, which minimizes disk usage and achieves better performance. This makes Data Domain a CPU-centric system that leverages successive generations of CPUs to continuously increase performance, rather than a spindle-bound architecture like other deduplication platforms.
Data Domain uses a global compression algorithm to process the incoming data streams. It combines high-performance global deduplication with an efficient local compression technique. Data Domain uses variable-length deduplication methods to provide more efficient deduplication. Duplicate data is first eliminated with a block-based variable-length segment deduplication algorithm. The remaining unique data is then compressed with a local compression algorithm. Together, the two steps dramatically reduce the required disk space.
All data stored on Data Domain systems is protected by DIA, which provides the industry’s best defense against data integrity issues and makes Data Domain ultrasafe storage for reliable recovery. It combines end-to-end data verification, continuous fault detection, and self-healing mechanisms with other resiliency features that are transparent to the application. Unlike other enterprise arrays or file systems, continuous fault detection and self-healing features protect data throughout its lifecycle on all Data Domain systems.
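The two-stage reduction described above (variable-length segment deduplication followed by local compression of the unique segments) can be illustrated with a small sketch. This is not Data Domain's actual algorithm: the Gear-style rolling hash, the segment size limits, SHA-256 fingerprints, zlib compression, and the names split_segments and SegmentStore are all assumptions chosen for illustration.

```python
import hashlib
import zlib

# Hypothetical parameters chosen for illustration only.
MIN_SEG, MAX_SEG, BOUNDARY_BITS = 1024, 8192, 11
_MASK64 = (1 << 64) - 1
# Fixed pseudo-random value per byte for a Gear-style rolling hash.
_GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
         for i in range(256)]


def split_segments(data: bytes):
    """Yield variable-length segments whose boundaries depend on content."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + _GEAR[byte]) & _MASK64
        length = i - start + 1
        at_boundary = length >= MIN_SEG and (h >> (64 - BOUNDARY_BITS)) == 0
        if at_boundary or length >= MAX_SEG:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]


class SegmentStore:
    """Keeps one compressed copy of every unique segment."""

    def __init__(self):
        self.segments = {}  # fingerprint -> locally compressed segment

    def backup(self, data: bytes):
        """Store unseen segments; return (recipe, number of new segments)."""
        recipe, new = [], 0
        for seg in split_segments(data):
            fp = hashlib.sha256(seg).digest()
            if fp not in self.segments:          # global deduplication
                self.segments[fp] = zlib.compress(seg)  # local compression
                new += 1
            recipe.append(fp)
        return recipe, new

    def restore(self, recipe):
        """Rebuild the original stream from its list of fingerprints."""
        return b"".join(zlib.decompress(self.segments[fp]) for fp in recipe)
```

Because segment boundaries are derived from the content rather than from fixed offsets, inserting a few bytes into a stream changes only the segments around the insertion point, so a second backup of mostly unchanged data stores few new segments.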
Figure 2. Data Domain DIA
Taken together, these techniques allow more backups to complete faster while putting less pressure on limited backup windows, making Data Domain a good candidate for backup cases such as database, email, and unstructured data.
EMC Data Domain Boost is a software option available for all Data Domain systems. It significantly increases backup performance and reliability, simplifies operational and disaster recovery, and allows better leverage of current infrastructure investments. DD Boost is made up of two components:
• A DD Boost library (plug-in) that runs on the backup server or client
• A DD Boost server component that runs on the Data Domain system
This technology offloads part of the deduplication process (segment identification and compression) to the backup server or client. This allows Data Domain to focus on determining what data is unique and writing only that data to disk, which improves performance. Furthermore, only unique data is sent to the Data Domain system, enabling more efficient use of the existing LAN or SAN.
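The division of work between the two components can be sketched as follows. This is a conceptual illustration only and does not use the DD Boost API: the class DedupServer, the function client_backup, the fixed 4 KB segment size, and the SHA-256 fingerprints are all hypothetical, and the small cost of sending the fingerprints themselves is ignored.

```python
import hashlib

SEGMENT_SIZE = 4096  # fixed-size segments for brevity; an assumption of this sketch


class DedupServer:
    """Stands in for the server-side component: decides what is unique and stores it."""

    def __init__(self):
        self.store = {}  # fingerprint -> segment

    def missing(self, fingerprints):
        """Return the fingerprints the server has not stored yet."""
        return {fp for fp in fingerprints if fp not in self.store}

    def write(self, segments_by_fp):
        self.store.update(segments_by_fp)


def client_backup(server: DedupServer, data: bytes) -> int:
    """Client-side role: fingerprint locally, then send only unknown segments.

    Returns the number of payload bytes sent over the network.
    """
    segments = [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]
    by_fp = {hashlib.sha256(seg).digest(): seg for seg in segments}
    needed = server.missing(by_fp.keys())
    to_send = {fp: by_fp[fp] for fp in needed}
    server.write(to_send)
    return sum(len(seg) for seg in to_send.values())
```

In this model, a first backup of ten distinct segments sends all ten, while a second backup in which one segment has changed sends only that one segment, which is the network saving the paragraph above describes.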
VMAX3 product overview
The EMC VMAX3 family of storage arrays is built on the strategy of simple, intelligent, modular storage and incorporates a Dynamic Virtual Matrix interface that connects and shares resources across all VMAX3 engines, allowing the storage array to seamlessly grow from an entry-level configuration into the world’s largest storage array. It provides the highest levels of performance and availability, featuring new hardware and software capabilities. The EMC VMAX3 family (VMAX 100K, 200K, and 400K) delivers the latest in Tier-1 scale-out multi-controller architecture with consolidation and efficiency for the enterprise. It offers dramatic increases in floor tile density, high-capacity flash, and hard disk drives in dense enclosures for both 2.5" and 3.5" drives, and supports both block and file (eNAS).
The VMAX3 family of storage arrays comes pre-configured from the factory to simplify deployment at customer sites and minimize time to first I/O. Each array uses Virtual Provisioning to allow the user easy and quick storage provisioning. VMAX3 can ship as an all-flash array with the combination of EFD (Enterprise Flash Drives) and a large persistent cache that accelerates both writes and reads even further. It can also ship as hybrid, multi-tier storage that excels in providing FAST (Fully Automated Storage Tiering) enabled performance management based on Service Level Objectives (SLO). VMAX3’s new hardware architecture comes with more CPU power, a larger persistent cache, and a new Dynamic Virtual Matrix dual InfiniBand fabric interconnect that creates an extremely fast internal memory-to-memory and data-copy fabric.
Figure 3 shows possible VMAX3 components. Refer to EMC documentation and release notes to find the latest supported components.
Figure 3. VMAX3 storage array
Epic Caché simulated databases
An Epic environment contains multiple databases: Caché and Clarity. Caché is a high-performance post-relational database and is utilized as an OLTP database. This document focuses on the Caché database.
Figure 4. Caché database Write IO pattern example
The Caché database has a unique write cycle, using eight write daemons that are triggered every 80 seconds. As a result, huge bursts of I/O hit the storage system at very high speeds.
For testing purposes, EMC used a simulated Caché database created with eight file systems and a separate file system for the journal files. The database was created with GenerateIO, a test utility created by Epic. After the data files were created, a simulated workload was designed to change 10% of the data daily, similar to what would be expected in a production Epic environment.
Everything in the Epic environment is tightly integrated with Caché. The Caché database is typically large, averaging over 2 TB, and a daily full backup is required. In this test, the simulated Caché database was used as the backup workload. Its total size was about 15 TB, evenly distributed.
VMware vSphere
VMware vSphere is the market-leading virtualization platform used across thousands of IT environments around the world for building cloud infrastructures. It is a trusted virtualization platform offering the highest levels of availability and responsiveness. VMware and EMC work together to build solutions that enable healthcare providers to dramatically reduce capital and operating costs and complexity, which maximizes IT efficiency while giving healthcare organizations the agility to respond to new business needs.
Solutions Enabler
Solutions Enabler is an interface that enables the application administrator or other appropriate user to configure storage resources, take snapshots, and perform other operations on the VMAX3 array. Solutions Enabler can also be used by a script or external management entity to perform supported operations on the VMAX3 array, such as taking local data snapshots.
Deduplication
The data deluge makes it difficult to satisfy current and future backup needs using traditional methods. Additionally, keeping data protection costs relatively low while achieving the desired performance is a difficulty that confronts all enterprises. Deduplication is one of the most attractive technologies for addressing this issue. It is a data reduction technique that eliminates redundant copies of repeating data. It offers fast, reliable, and cost-effective backup and recovery by shrinking storage requirements and improving bandwidth efficiency. EMC’s advanced deduplication method enables enterprises to maximize efficiency while minimizing TCO.
Alongside the choice of algorithm and working model, the deduplication ratio is one of the most important indicators of how efficiently storage space is reduced. The key factors contributing to the ratio are the type of data being backed up and the backup policy. Data with a low change rate, internal redundancy, and a small number of large files benefits most from deduplication. This means that user data such as text files, presentations, spreadsheets, documents, most database types, source code, and Exchange is dedup-friendly. For precompressed data types, the first full backup may provide only a low ratio, but subsequent backups generally deduplicate well. Examples include audio, video streams, and scanned images.
The backup policy also plays an important role in the deduplication ratio. Specifically, the frequency of full backups and the length of the retention period affect the result. Typically, longer retention periods and more frequent full backups increase the chance that identical data already exists in the backup store, so better deduplication ratios are achievable. Epic Caché Database is an excellent candidate for deduplication with its file-based structure, multiple copies, and full backup requirement. In addition, the individual file systems consist of a single large file and numerous smaller files, making deduplication more efficient.
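The effect of retention period and change rate on the ratio can be approximated with simple arithmetic. The sketch below is a deliberately simplified model, not the measured deduplication factor reported in the test results: it assumes daily full backups, that each day's changed data is entirely new, that unchanged data deduplicates perfectly, and it ignores local compression. The function name is illustrative.

```python
def estimated_dedup_ratio(full_size_tb: float,
                          daily_change_rate: float,
                          retention_days: int) -> float:
    """Logical data retained divided by unique data stored.

    Simplified model: one full backup per day, every changed byte is new,
    unchanged data deduplicates perfectly, local compression is ignored.
    """
    logical_tb = full_size_tb * retention_days
    unique_tb = full_size_tb + full_size_tb * daily_change_rate * (retention_days - 1)
    return logical_tb / unique_tb


# 15 TB database, 10% daily change, 30-day retention (figures from this document).
ratio_30_days = estimated_dedup_ratio(15, 0.10, 30)
```

With the figures used in this document, 30 daily fulls represent 450 TB of logical data against 58.5 TB of unique data, a ratio of roughly 7.7 to 1 under this model. Doubling the retention period raises the modeled ratio, which matches the observation that longer retention improves deduplication.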
Chapter 3: Solution architecture
Architecture
This solution describes the tests performed to validate the EMC backup and recovery methods for Epic Caché Database, enabled by integrating EMC NetWorker, EMC Data Domain, and EMC VMAX3 technologies. It involves simulating a 15 TB Caché database on EMC VMAX3 storage, mounted on a Red Hat Enterprise Linux (RHEL) virtual machine (VM). In addition, an EMC NetWorker (backup tool) server runs on a Windows 2008 R2 VM, and a second RHEL VM acts as a proxy that mounts the VMAX3 BCV clones of the Caché database production devices. This proxy VM is the NetWorker client for the backups, so that backups are nonintrusive to the application.
Figure 5. Solution architecture
Hardware resources
Table 2 lists all the hardware resources used to build this solution.
Table 2. Solution hardware
Hardware | Quantity | Configuration | Notes
EMC VMAX 200K | 1 | 176 x 300 GB 15K SAS Drives R1 | Primary storage server
EMC Data Domain 4500 | 1 | 60 x 3 TB SAS 7.2K Disks | Deduplication backup storage
Cisco UCS C240 Server | 2 | Intel® Xeon® CPU E5-2690 v2 @3.00GHz, 192 GB Memory, 2 dual port HBAs | For the Caché database, restore servers and NetWorker server
Brocade DCX8510 | 2 | 16 Gb/s FC switches | For dual FC SAN Fabric
Ethernet switch | 1 | 1 Gigabit Ethernet switches | Infrastructure Ethernet switch for management
Software resources
Table 3 lists all the software resources used to build this solution.
Table 3. Solution software
Software | Version | Notes
EMC NetWorker | 8.2.1 | Windows 2008 R2 server
EMC HYPERMAX® OS | 5977.683.676 | Operating environment for primary storage VMAX3
EMC Solutions Enabler | 8.0.3.0 (Edit Level: 2026) | VMAX management CLI
EMC DDOS | 5.5.1.4-464376 | Operating environment for Data Domain
InterSystems Caché | NA | Simulated Epic Database
VMware vCenter | 5.5.0 | vCenter Server appliance
VMware vSphere | 5.5.0 | Server hypervisor
EMC PowerPath® Virtual Edition | Version 5.9 SP 1 (Build 11) | Multipathing and load balancing for block access
RHEL Linux | 6.6 | Operating System for production and restore server environment
VM configuration
Table 4 lists all VM configurations used.
Table 4. VM configuration
VM | Quantity | OS | vMemory (GB) | vCPU
Caché DB server | 1 | Linux RHEL 6.6 | 50 | 16
NetWorker Server | 1 | Windows 2008 R2 | 20 | 8
Virtual proxy | 1 | Linux RHEL 6.6 | 20 | 8
4
Configuration
Backup data flow
A dedicated proxy server is used to mount the Epic production clone. This ensures that no other Epic-specific processes or services are impacted while routine backups are performed.
Figure 6. DD Boost over 10GbE
Figure 7. DD Boost over FC
The backup data was transferred from an EMC VMAX3 array through the proxy server to Data Domain, while the metadata was sent from the proxy server to the NetWorker server. Figure 6 shows an example of backup data flow with DD Boost over 10GbE and Figure 7 with DD Boost over FC.
Restoring data flow
The solution allows your site to offer safe, user-driven, individual file-level restore operations. During recovery to a NetWorker client, the Data Domain system converts the stored data to its original non-deduplicated state. When a restore operation is required, data is retrieved from Data Domain storage, decompressed, verified for consistency, and transferred to the backup servers using Ethernet (for NFS, CIFS, DD Boost), or using Fibre Channel (for VTL and DD Boost). Multiple processes can access a Data Domain system simultaneously. Figure 8 shows a restore data flow through the network.
Figure 8. Restoring data flow example
Storage configuration
This section describes the storage layout used in the VMAX3 storage array for the backup process. To avoid impacting production performance, a virtual proxy server and clone devices are used when backing up the DB. Generally, an Epic Caché DB consists of multiple separate LUNs. To achieve performance and space efficiency, thin volumes configured within a RAID 1 thin pool were used.
Network configuration
This section describes the network layouts used in this scenario. A 10GbE card and link were used on the proxy server, production server, and Data Domain for better performance than 1GbE link aggregation provides. A dedicated VLAN was used to isolate traffic. 1GbE can be used on the NetWorker server, because the small amount of metadata and log files does not require high bandwidth. Jumbo frames are recommended in a backup environment: make sure all components in the data path are capable of handling jumbo frames, and increase the MTU to 9 KB.
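As a rough illustration of why 10GbE and jumbo frames matter for a database of this size, the following Python sketch estimates the theoretical minimum time to move a full copy over one link and the share of each Ethernet frame that carries payload. It is a back-of-the-envelope calculation, not a measurement from this solution: the function names are hypothetical, 1 TB is assumed to be 10^12 bytes, frames are assumed to be untagged, and the header sizes assume IPv4 and TCP without options.

```python
# Per-frame overhead on the wire: preamble/SFD (8) + Ethernet header (14)
# + FCS (4) + inter-frame gap (12). Assumes untagged frames.
ETHERNET_OVERHEAD_BYTES = 38
# IPv4 header (20) + TCP header (20), assuming no options.
IP_TCP_HEADER_BYTES = 40


def payload_efficiency(mtu_bytes):
    """Fraction of wire bandwidth that carries TCP payload at a given MTU."""
    return (mtu_bytes - IP_TCP_HEADER_BYTES) / (mtu_bytes + ETHERNET_OVERHEAD_BYTES)


def line_rate_hours(data_tb, link_gbps, mtu_bytes=1500):
    """Lower bound, in hours, to send data_tb terabytes over a single link."""
    payload_bits_per_second = link_gbps * 1e9 * payload_efficiency(mtu_bytes)
    return data_tb * 1e12 * 8 / payload_bits_per_second / 3600


for gbps, mtu in [(1, 1500), (10, 1500), (10, 9000)]:
    hours = line_rate_hours(15, gbps, mtu)
    print(f"{gbps:>2} GbE, MTU {mtu}: at least {hours:.2f} h for 15 TB")
```

Under these assumptions a single 10GbE link needs more than three hours to carry 15 TB, 1GbE needs ten times longer, and jumbo frames recover a few percent of wire capacity.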
Figure 9. Network configuration example
Integration configuration for DD Boost over IP
The combination of NetWorker and Data Domain improves data protection by increasing performance, simplifying management, and minimizing TCO, which enables a centralized, automated, and accelerated backup and recovery solution for enterprises. NetWorker and Data Domain can be integrated by Advanced File Type Devices (AFTD) or Virtual Tape Library (VTL). Either configuration takes advantage of the deduplication system and easily integrates with a backup environment. These methods do not allow NetWorker full visibility into the properties and capabilities of the Data Domain storage system. Therefore, NetWorker and the Data Domain system have to be managed separately. Currently, another more powerful method is to use Data Domain Boost for the integration. Data Domain Boost allows NetWorker to easily talk to the Data Domain system to better manage and obtain more statistics while enabling advanced disaster recovery strategies. Data Domain Boost further increases backup performance and reduces network traffic by distributing parts of the deduplication process to the
backup server or application hosts directly. This significantly accelerates performance, enhances replication control, and simplifies administration. The following sections introduce the key steps when integrating NetWorker with Data Domain enabled by DD Boost. There are five phases. Three phases are discussed in this section:
• Prerequisites
• Data Domain settings
• NetWorker settings
Figure 10. Integration steps
Under Backup configuration, the following two phases will be discussed:
• Proxy server settings
• Backup profile settings
Prerequisites
Before the integration, the prerequisites and limitations should be understood and noted as below:
Table 5. Prerequisites
• Make sure the DD OS and NetWorker versions meet the minimum requirements, as shown in the above table. Note that some additional features require the latest versions, such as FC DD Boost, direct file access (DFA), etc.
• Obtain all the required licenses. This example used the Data Domain Boost license, which was applied on the Data Domain system, and a license for the Data Domain storage system enabler on the NetWorker server.
• A DNS server or local hosts file is required to ensure that Data Domain, the proxy server (backup client), and the NetWorker server can consistently resolve their hostnames in the network from multiple locations in both directions. An NTP server is required to provide a reliable time service for all the components in the environment.
• Ensure the entire physical link is configured, as shown in the Network configuration section, and that it works properly. In addition, any firewall must allow traffic on the specific ports used for communication between the Data Domain, NetWorker, and NetWorker Management Console (NMC) servers. For details, refer to the NetWorker 8.2 Data Domain Deduplication Devices Integration Guide. For this case, all the components were put into a VLAN.
Data Domain settings
Ensure Data Domain has been properly initialized and the required licenses have been applied. Then log in to the Data Domain system as administrator and run the CLI, as shown in the following steps.
1. Enable NFS service.
You need to enable NFS services on the Data Domain system, even if no users or shares are configured. If NFS is not enabled, DD Boost will not be active.
2. Add and enable a ddboost account.
This step creates a ddboost account, which is used by the NetWorker server.
3. Enable ddboost.
This step enables ddboost so that the NetWorker server can access the Data Domain ddboost device.
4. Enable and configure SNMP traps.
This step allows the NetWorker server to capture Data Domain statistics.
5. Enable distributed segment processing (DSP).
The DD Boost DSP feature allows Data Domain to offload part of the deduplication work to the NetWorker storage nodes and NetWorker clients, which then send only unique data to the Data Domain device.
6. Create a DD Boost interface group and add 10GbE interfaces to it.
Figure 11. DD Boost ifgroups
DD Boost is aware of the ifgroup, so load balancing and failover happen at the application level, unlike with LACP, which is much faster and helps performance.
7. Choose Lempel-Ziv as the local compression type via the Data Domain GUI. This provides a good balance between compression ratio and required backup duration.
Figure 12. Local Compression Type
NetWorker settings
In this test environment, both the NetWorker server and the NetWorker Management Console (NMC) server were installed on a Windows 2008 R2 VM. To launch NetWorker administration and use the NMC wizard to integrate NetWorker with Data Domain, complete the following steps.
1. Add a new Data Domain system via DD Boost, using the “New Device” wizard located in the Devices section.
2. Create a new Data Domain folder and change the NetWorker device associated with it.
3. Create a new media pool of type “Backup” or choose an existing one, and select the option to label and mount the device after creation.
4. Set the virtual proxy server as the storage node. This allows the backup data streams to travel from the proxy server directly to Data Domain, while only metadata is sent to the NetWorker server.
5. Confirm that the created device can be found in the Data Domain GUI. By default, the wizard creates 1 SU (Storage Unit) per NetWorker data zone.
Notes:
• A NetWorker instance can have multiple Data Domain systems integrated and vice versa.
• A NetWorker instance can integrate with a Data Domain system using mixed access modes. A Data Domain system can be added to a NetWorker server via VTL, NAS, and DD Boost with separate interface connections.
• Restrict each media pool to Data Domain devices from a single Data Domain system. This improves management and ensures more exact results.
Backup configuration
Setting up and configuring a NetWorker environment is very simple. The following steps explain how to create a backup task with Data Domain in NetWorker.
Figure 13. Backup configuration steps
Proxy Server settings
Ensure the proxy server has sufficient hardware resources and has been properly configured. If a VMware environment is being used, check whether EMC PowerPath is installed and configured properly on the ESX server. Present the BCV clone devices to the proxy server and leave them unmounted.
1. Install the NetWorker Storage Node and Client agent on the proxy server. This allows the backup data streams to travel from the proxy server directly to Data Domain.
2. Increase the size of the TCP send/receive buffers. Add the following parameters to the /etc/sysctl.conf file and then run the /sbin/sysctl -p command.
Figure 14. Network tuning
This step provides the capability to handle larger TCP packets, resulting in more data being carried per TCP packet. More data is transmitted with less block fragmentation, reducing I/O overhead.
Notes:
• The tuning is only for the backup environment via NFS/CIFS and IP protocol.
• These settings are dependent on the operating system and proxy server.
• Other operating systems with different hardware configurations will have different tuning settings.
• Some of the values may differ when dealing with a 1GbE data path. Make sure all the systems have the same settings in your environment.
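The values in Figure 14 are not reproduced here. As a hedged illustration of why larger TCP buffers help on a fast link, the sketch below computes the bandwidth-delay product, which is the amount of data that must be in flight to keep a link busy and is therefore a common lower bound for socket buffer sizes. The function name and the round-trip times are assumptions for illustration, not values taken from this test environment.

```python
def bandwidth_delay_product_bytes(link_gbps, rtt_ms):
    """Bytes that must be in flight to keep a link of link_gbps full at a given RTT."""
    bytes_per_second = link_gbps * 1e9 / 8
    return round(bytes_per_second * rtt_ms / 1000)


# Hypothetical round-trip times; measure the real value in your environment.
for gbps, rtt_ms in [(1, 1.0), (10, 1.0), (10, 5.0)]:
    bdp = bandwidth_delay_product_bytes(gbps, rtt_ms)
    print(f"{gbps:>2} GbE at {rtt_ms} ms RTT: {bdp / 1024:.0f} KiB in flight")
```

A socket buffer smaller than this value caps a single TCP stream below line rate, which is one reason the appropriate settings differ between 1GbE and 10GbE data paths.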
Backup profile settings
The Device Configuration Wizard enables fast, repeatable operations that provide easy and efficient implementation of Data Domain systems. The following steps describe how to configure a group and client for backup using the wizard.
1. Create a backup group with the NMC wizard. Make sure autostart is disabled, as all the tests are manual, and force a full backup, since this is a requirement.
2. Add a backup client with the NMC wizard and make sure you select the client direct option (DFA) and the correct group and pool. In the saveset field, you need to specify all the Epic Caché DB file systems.
3. Select to back up to a Data Domain device using the IP technology.
4. Select 9 streams, matching the number of Caché DB file systems plus the journal files device.
Some of the options shown above are described in detail in the following sections.
Client Direct (DFA)
DFA allows clients to process (client-side deduplication), send (backup), and receive (recover) data directly to and from Data Domain. Because it bypasses the NetWorker storage node, it distributes some of the workload to those clients, reducing bandwidth usage and storage node overhead and further improving overall backup performance. To enable this feature, the client needs direct network access to the Data Domain systems. It only supports specific application modules and is currently supported for filesystems with NetWorker 8.0 or later. In this case, since the storage node and the backup client are the same server, enabling or disabling this option does not matter. It is useful when dealing with a large backup environment.
Adjusting client parallelism
Client parallelism defines the number of save streams that a client can send simultaneously during backup. This option offers the multi-stream capability to back up the savesets in parallel. If configured properly, it normally means more efficient backups and higher performance. Setting the appropriate stream count for the client is very important. However, a greater stream count does not always provide better performance. Determining the optimum stream count for any environment is difficult. There are many other aspects that need to be considered for sizing (e.g., other parallelism settings, server parallelism, savegrp parallelism, whether multiple save sets are on the same disk, etc.). The testing described in this document focused on client parallelism. In this test, 16/8/4 parallelism stream settings were applied respectively to determine the best option. When modifying the settings, ensure that the client has enough hardware resources, since greater parallelism requires more horsepower. Also note that the maximum number of sessions for DD Boost devices is 60.
Note: In a production environment there will be other file systems that need to be cloned for the backup, including journals, application files, and audit logs, which may increase the number of file systems to 12 (including the 8 primary data directories). In addition, the files may not be uniformly distributed between the main Caché directories. In each main Caché data filesystem, identified as epic/prd01 through prd08 in our testing, there will be a large CACHE.DAT file that accounts for ~60-80% of the total file system size. In practice, a sensible approach to the number of streams is required. Adding too many streams may have a negative impact on backup times due to the I/O characteristics of the server and clone storage infrastructure. Adding a couple of additional streams to address the largest of the CACHE.DAT files may yield the best results. In summary, increasing stream count works well for customers that have more balanced filesystems. In environments with a large variation in filesystem sizes, more streams may hurt performance, especially if those additional streams are not dedicated to the larger filesystems.
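A simplified way to reason about the effect of stream count is to treat each file system as one job and each stream as one worker, then compute when the last worker finishes. The Python sketch below does this with a largest-first assignment. It is an idealized model under stated assumptions, not NetWorker's scheduling algorithm: each stream is assumed to back up one whole file system at a time (no PSS), per-stream throughput is assumed constant regardless of how many streams run, and the file system sizes and the function name are hypothetical.

```python
import heapq


def estimated_hours(fs_sizes_tb, streams, tb_per_hour_per_stream):
    """Finish time when each stream backs up one whole file system at a time."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    if not fs_sizes_tb:
        return 0.0
    # A stream beyond the number of file systems has nothing to do in this model.
    free_at = [0.0] * min(streams, len(fs_sizes_tb))
    heapq.heapify(free_at)
    # Largest file systems first, each given to the stream that frees up earliest.
    for size in sorted(fs_sizes_tb, reverse=True):
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + size / tb_per_hour_per_stream)
    return max(free_at)


balanced = [1.875] * 8                 # hypothetical: 15 TB spread evenly
skewed = [6.0, 3.0] + [1.0] * 6        # hypothetical: 15 TB with one dominant file system
for name, sizes in [("balanced", balanced), ("skewed", skewed)]:
    for streams in (4, 8, 16):
        print(name, streams, "streams:", estimated_hours(sizes, streams, 1.0), "h")
```

In this model, eight equal file systems finish twice as fast with 8 streams as with 4, and 16 streams add nothing. With one dominant file system, 4 and 8 streams give the same estimate, because the largest file system alone sets the finish time.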
Parallel Save Streams (PSS)
Parallel save streams enable a mount point of a Unix or Linux client to be backed up by multiple parallel save streams to one or more destination backup devices. This feature may provide performance gains by splitting the mount point into multiple streams, based on the client parallelism setting, during backup and subsequent recovery. When the parallelism value is equal to or less than the client’s number of mount points, PSS has no effect. In most cases, it is better to set the parallelism value to 2x and up to 4x the number of the client’s mount points. This ensures multiple streams for each mount point (e.g., if the backup client has 8 mount points, enable the PSS feature and set the client parallelism to at least 16). PSS yields the most benefit when the mount points reside on storage that is fast enough and has sufficiently high aggregate throughput for concurrent read streams; avoid using PSS with slow storage that has high disk read latency. Before enabling the PSS feature, make sure that both the NetWorker server and the backup client are at NetWorker 8.1 or later, and ensure that the configuration does not exceed the maximum of 60 concurrent NetWorker sessions.
Integration configuration for DD Boost over FC
NetWorker 8.1 introduces support for DD Boost over FC. DD OS 5.3 or later is also required to support this feature. With DD Boost over FC, the backup data is transferred via Fibre Channel instead of the IP network. It enables faster backup compared to using DD VTL, improves disaster recovery, and simplifies backup management in SAN environments. It is also an alternative way to utilize an existing environment. In this test, 8 Gbps HBA cards were used on the Data Domain side and also in the ESX servers. Note that the NetWorker server also requires Ethernet IP connections to communicate with all clients, storage nodes, and the Data Domain system that are involved in DD Boost operations. The following steps show how to configure DD Boost over FC on Data Domain and NetWorker:
1. Ensure the Data Domain HBA cards have been installed and zoned with the proxy server (NetWorker client and storage node) and also the NetWorker server. The following shows the Data Domain HBA WWNs (2 dual port HBAs).
The following shows the NetWorker Server and Proxy server HBA WWNs.
2. Check if you can list the Data Domain DFC SCSI processor, as shown below. In our test, the NetWorker server is running on a Windows 2008 R2 OS, so if you go to device manager, you can list them under “other devices”.
Figure 15. Data Domain DFC SCSI processor devices
3. Enable DD Boost over FC service and check the status:
4. Create and configure access groups:
In the current test case, an Epic group was created with all the Data Domain and initiator ports.
5. Configure or use the default ‘DFC-server-name’. In this case, the name ‘DD4500-epic’ was used.
From the Data Domain GUI, you can also edit the DFC server name.
6. Edit the NetWorker device associated with the Data Domain DD Boost folder to use DFC.
7. Finally, in order for the client to send the data to the device using the FC technology instead of 10GbE during the backup process, you need to edit the client properties so you can force this option.
To switch between DD Boost 10GbE and DD Boost FC, only two steps are needed: enable FC on the device and then force FC in the client properties.
Note: Before using FC, check whether you can list the DFC SCSI processors on the client and storage node with the inquire -l command for Unix and Linux or the device manager for Windows. Rescan the HBAs, if needed.
5
Test scenario and methodology
Methodology for the proxy clones
The method used to clone the Caché DB production volumes was TimeFinder/Mirror. In a real production environment, the first thing to do is to create a clone/mirror of all the production devices against the same number of BCV devices, using a consistent device group.
Figure 16. Clone method
Once all the production devices are associated with the corresponding BCVs, the next step is to establish the pairs with the “-full” option so that you get a full copy of the production devices on the BCVs. Because the establish process with the “-full” option creates a full copy, it takes considerably longer than later establishes. Once you have this full copy, the subsequent establish sessions without the “-full” option are differential.
Note: The activate process is included in the test results.
Figure 17. Initial full copy
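The establish and split sequence described above can be sketched as follows. This Python helper only builds the command lines and does not run them. It assumes the Solutions Enabler symmir command is used for the TimeFinder/Mirror operations on a device group; the group name epic_dg and the function name are hypothetical, and the exact syntax should be checked against the Solutions Enabler documentation for the installed version.

```python
def clone_refresh_commands(device_group, first_cycle):
    """Return the establish and split command lines for one backup cycle.

    Assumption: symmir syntax of the form
    'symmir -g <group> [-full] establish' and 'symmir -g <group> split'.
    """
    establish = ["symmir", "-g", device_group]
    if first_cycle:
        # Only the first cycle copies every track; later cycles are incremental.
        establish.append("-full")
    establish.append("establish")
    split = ["symmir", "-g", device_group, "split"]
    return [establish, split]


# "epic_dg" is a hypothetical device group name.
for day in (1, 2):
    for command in clone_refresh_commands("epic_dg", first_cycle=(day == 1)):
        print(f"Day {day}:", " ".join(command))
```

Keeping the full and incremental cases in one function makes it harder to run a full establish by accident on a later cycle, which would lengthen the backup window.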
Once the initial setup is done, you can proceed to make regular full backups of the database, as shown in Figure 18. The steps that contribute to the overall backup time are the mirror establish, the split, and the backup itself. The test results reflect the sum of the durations of these processes.
Figure 18. Production backup process
Scenario
Several scenarios with different parallelism stream settings were applied for the testing on the proxy server, as shown in the following table:
Table 6. Test cases
No | Operation | Streams | Interface | Test Cycle
1 | Backup | 8/4 | 10GbE | 5/5
2 | Backup | 8 | FC | 5
3 | Restore | 8/4 | 10GbE | 1/1
4 | Restore | 8/4 | FC | 1/1
Methodology
For testing, the EMC Healthcare Verticals Epic team provided a test suite for Caché database simulation, while the EMC E-Lab Verticals team conducted all the tests and captured the performance metrics.
• The number of Test Cycles for each backup scenario is set to 5 rounds to reflect a typical workweek in a healthcare system setting and to allow a representative deduplication period.
• Backup Workload is generated by scripts and, at about 15 TB, is large enough to be representative. To simulate a real-world case, an initial backup was followed by 4 daily full backups, each taken after a 10% daily change (approximately 1.5 TB) was applied to the database as it stood at the previous backup. For the first backup, a full mirror copy of the production devices is made to the BCVs; after that, incremental updates are performed on the BCVs to bring over the changed data.
Figure 19. Backup workload
• Environment Reset is required before starting another sub-scenario. It ensures the test bed is clean and that all the tests start from the same point. Figure 20 shows an example of an environment reset. Once the Data Domain cleaning process is finished, new NetWorker devices can be created.
Figure 20. Environment reset
• The Testing Process is basically the same for each cycle of every stream setting: the establish command without the “-full” option and the split command allow for an incremental update of the clone. This process is also measured during testing.
Figure 21. Testing process
• The Observation Method relies mainly on the log or operation reports from NetWorker and Data Domain. Backup duration, RTO, and deduplication/compression ratio were measured during the test. The backup time measured includes the establish process and the backup time itself.
Test tools
The following test tools were used.
• EMC NetWorker Management Console delivers a Java-based interface for monitoring, administration, and reporting for NetWorker environments. This option is the essential tool for creating centralized monitoring of all NetWorker events and diagnosing backup problems.
Figure 22. NetWorker Management Console
• EMC Solutions Enabler is an interface that enables the application administrator or other appropriate user to configure storage resources, take snapshots, and perform other operations on the VMAX3 array. Solutions Enabler can also be used by a script or external management entity to perform supported operations on the VMAX3 array, such as taking local data snapshots.
• Symmetrix Performance Analyzer is an automated monitoring and trending tool launched through Symmetrix Management Console to assist in long-term planning, diagnostic drill-down to identify the root causes of performance issues, and real-time monitoring. It provides intuitive analysis of key performance indicators at the application level to optimize performance and improve utilization of your Symmetrix environment.
Figure 23. Symmetrix Performance Analyzer
• Data Domain Enterprise Manager is a web-based, feature-rich application for managing Data Domain systems. DD Enterprise Manager provides management and monitoring of all aspects of Data Domain systems from a single interface, including the filesystem, access protocols, data management, and integrated system control. DD Enterprise Manager’s dashboards provide a high-level overview of system status and allow drill-down into areas of interest.
• Data Domain CLI is a command set that performs all system functions. Commands configure system settings and provide displays of system hardware status, feature configuration, and operation. The command-line interface is available through a serial console or through an Ethernet connection using SSH or Telnet.
6
Test results
Introduction
This section provides detailed information regarding test results and observations. During the test there was no other activity on the environment. Important: These tests were intended for comparison purposes only and are lab based results. Actual durations for backup and restore activities depend on multiple variables both within the data itself and the customer’s environment.
Backup time
The following chart and table show the backup time in hours:minutes:seconds for the five daily backups tested with three different scenarios.
Table 7. Backup time results
Scenario | Day 1 | Day 2 | Day 3 | Day 4 | Day 5
4 Streams 10 GbE | 10h:40m:54s | 4h:04m:36s | 4h:01m:15s | 4h:03m:39s | 3h:58m:12s
8 Streams 10 GbE | 7h:32m:49s | 2h:34m:14s | 2h:25m:02s | 2h:23m:40s | 2h:19m:20s
8 Streams FC | 8h:21m:32s | 2h:50m:03s | 2h:51m:27s | 2h:45m:40s | 2h:44m:08s
Figure 24. Backup time (hours:minutes:seconds)
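To make the comparison in Table 7 easier to read, the following Python sketch converts the reported durations to seconds and derives two ratios: how much faster 8 streams are than 4 streams over 10GbE, and how much longer the Day 1 backup takes than the Day 5 backup. The durations are copied from Table 7; the variable and function names are illustrative only.

```python
# Durations from Table 7 as (hours, minutes, seconds), Day 1 through Day 5.
BACKUP_TIMES = {
    "4 Streams 10 GbE": [(10, 40, 54), (4, 4, 36), (4, 1, 15), (4, 3, 39), (3, 58, 12)],
    "8 Streams 10 GbE": [(7, 32, 49), (2, 34, 14), (2, 25, 2), (2, 23, 40), (2, 19, 20)],
    "8 Streams FC": [(8, 21, 32), (2, 50, 3), (2, 51, 27), (2, 45, 40), (2, 44, 8)],
}


def to_seconds(hms):
    """Convert an (hours, minutes, seconds) tuple to seconds."""
    hours, minutes, seconds = hms
    return hours * 3600 + minutes * 60 + seconds


def ratio(slower, faster):
    """How many times longer the first duration is than the second."""
    return to_seconds(slower) / to_seconds(faster)


four = BACKUP_TIMES["4 Streams 10 GbE"]
eight = BACKUP_TIMES["8 Streams 10 GbE"]
for day, (t4, t8) in enumerate(zip(four, eight), start=1):
    print(f"Day {day}: 8 streams are {ratio(t4, t8):.2f}x faster than 4 streams")

for name, times in BACKUP_TIMES.items():
    print(f"{name}: Day 1 takes {ratio(times[0], times[-1]):.2f}x as long as Day 5")
```

On each of the five days, doubling the stream count from 4 to 8 over 10GbE yields less than a twofold improvement, which is consistent with the caution about stream count given in the configuration chapter.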
The following table shows the time taken by the different phases during the five daily backups for the scenario with 8 streams over 10GbE. For the first backup, a TimeFinder/Mirror establish between the VMAX3 production devices and the BCVs with the “-full” option is needed, while the subsequent backups need only an incremental establish. The split makes the target devices ready so that the user can mount them on the restore server for backup with NetWorker.
Phase | Day 1* | Day 2 | Day 3 | Day 4 | Day 5
Establish | 1h:02m:45s | 17m:13s | 16m:47s | 15m:48s | 16m:23s
Split | 6m:38s | 6m:05s | 6m:50s | 6m:31s | 5m:58s
Backup Time | 7h:32m:49s | 2h:34m:14s | 2h:25m:02s | 2h:23m:40s | 2h:19m:20s
* Day 1: The establish command must be run with the full option in order to create a full mirror.
Observations
• The establish of the mirrors between the production devices and the BCVs with the full option makes a full clone, so it takes much more time than the incremental establishes.
• The same applies to the first backup. Data Domain is empty before the backup, so all the data is unique and needs to be processed.
• The split process that makes the mirrors “ready” on the VMAX3 always takes around the same time.
• With 8 streams, the backups are faster because there are 8 file systems that can be backed up in parallel.
• Distributing the DB across a larger number of smaller VMAX3 devices makes the mirror establish process faster.
Restore time
There are different types of restores that can be done:
• Restore with file granularity. The files can be recovered to the proxy server or to the production host in a different location than the production file systems.
• Full recover to the production host. With the file overwrite option, all the files will be overwritten if they exist; the data will travel from Data Domain directly to the production host.
• Full recover to the proxy server. This is used for testing purposes.
• Full recover to the proxy server. This is followed by a TimeFinder/Mirror recover so the production data will be overwritten at the VMAX3 device level.
The following figure shows that you can select the destination host with the NetWorker recovery wizard.
Figure 25. NetWorker restore wizard
The following table shows the results for the full restore to the proxy server with the file overwrite option. The restore was tested with 8 streams over FC and over 10GbE.
Scenario | Full Restore
8 Streams 10 GbE | 5h:22m:36s
8 Streams FC | 6h:11m:42s
Observations
• The restores with 8 streams are efficient because the 8 DB file systems can be restored in parallel.
• Depending on the selected restore process, more steps may need to be performed.
Deduplication
The following results were obtained for Data Domain deduplication.
Figure 26. Deduplication factor
Observations
• The deduplication ratio does not change with the stream count setting, since the same backup data sets are used.
• The deduplication ratio for the initial backup is low because Data Domain holds no previous data to deduplicate against. The initial day results are derived from Data Domain local compression technology.
• The largest increase in deduplication factor is from the initial to the second full backup. This is the first backup in which data that is redundant with a previous backup passes through the deduplication algorithms.
• The rate of change of the deduplication ratio from the second to the fifth test run is lower, since the amount of redundant data is tied to the 10% growth factor. The deduplication ratio will continue to grow in the following backup cycles, but at a slower rate.
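The shape of this trend can be reproduced with a simple model. The Python sketch below assumes that the first full backup stores the whole database and that each later full backup adds only the changed data; it ignores local compression unless a factor is supplied. The function name, the local_compression parameter, and the model itself are illustrative assumptions, and the output is not the measured data behind Figure 26.

```python
def cumulative_dedup_factors(full_tb, change_rate, fulls, local_compression=1.0):
    """Cumulative logical-to-stored ratio after each of a series of full backups."""
    logical_tb = 0.0
    stored_tb = 0.0
    factors = []
    for backup in range(fulls):
        logical_tb += full_tb
        # The first full is entirely new; later fulls add only the changed data.
        new_unique_tb = full_tb if backup == 0 else full_tb * change_rate
        stored_tb += new_unique_tb / local_compression
        factors.append(logical_tb / stored_tb)
    return factors


# 15 TB database, 10% daily change, five daily full backups (as in the test design).
factors = cumulative_dedup_factors(15, 0.10, 5)
for day, factor in enumerate(factors, start=1):
    print(f"Day {day}: cumulative factor {factor:.2f}")
```

In this model the factor rises on every backup, with the largest step between the first and second full backup and smaller steps afterward, which matches the qualitative observations above.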
7
Conclusion and best practices
Summary
The integration of NetWorker and Data Domain provides a more efficient data protection solution: it speeds up backup and recovery, consumes less storage space, and simplifies daily operations. Our testing shows it to be an effective solution for protecting the Epic Caché database. Specifically, Data Domain reduces the amount of data to back up and dramatically reduces the disk storage needed to retain and protect data, while NetWorker provides a common management interface and backup workflow that simplify the whole process. With Data Domain Boost capabilities, Data Domain can be deployed quickly and fits into existing backup workflows and policies, lowering management overhead and further improving performance.
Best practices and recommendations
Based on this testing, the following best practices and recommendations can help optimize the solution.
VMAX3
• Increasing the number of VMAX3 devices improves overall TimeFinder/Mirror establish and split performance, which reduces the time needed to prepare the mirrors on the backup proxy server for the backup. In this test case, 15 devices of 1 TB each were used.
Data Domain
• Use ifgroup for DD Boost backups. It provides the most efficient load balancing from the client to Data Domain.
• DD Boost is aware of the ifgroup, so load balancing and failover happen at the application level, unlike LACP. This makes failover much faster and improves performance.
• Configuring ifgroup involves less overhead than configuring LACP. Ifgroup is an end-to-end connection, whereas LACP is a point-to-point connection, so every connection point in the data path must be configured for LACP.
• Use physical interfaces as members of the ifgroup.
NetWorker
• Define the number of streams according to the number and size of the file systems and the hardware resources available on the proxy server.
• Use jumbo frames in environments capable of handling them. If the source, the computers, and all equipment in the data path can handle jumbo frames, increase the MTU to 9 KB.
• The minimum required memory for a NetWorker Data Domain-OST device with each device's total streams set to 10 is approximately 160 MB. Each OST stream for DD Boost takes an additional 16 MB of memory.
• For larger databases, install the NetWorker storage node software on the client. For smaller servers or databases, use DFA (Client Direct to Data Domain).
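The memory guidance above lends itself to a quick estimate. The Python sketch below treats the figure as linear, 16 MB per stream, which is consistent with the quoted 160 MB for 10 streams. That linear reading is an interpretation of the guidance, and the function name `device_memory_mb` is hypothetical. Refer to the NetWorker documentation for authoritative sizing.

```python
MB_PER_STREAM = 16  # per-stream figure quoted in the guidance above


def device_memory_mb(total_streams: int) -> int:
    """Approximate minimum memory, in MB, for one DD Boost device.

    Assumes memory scales linearly with the device's total stream count.
    """
    if total_streams < 1:
        raise ValueError("total_streams must be at least 1")
    return total_streams * MB_PER_STREAM


print(device_memory_mb(10))  # the 10-stream case quoted above
print(device_memory_mb(8))   # the 8-stream setting used in this testing
```

With 10 streams the estimate matches the approximately 160 MB quoted above. With the 8 streams used in this testing it comes to 128 MB.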
Conclusions
• Throughput well above the required 250 GB/hour was achieved for both backup and restore with DD Boost over IP and over FC.
• Setting the stream count equal to the number of Caché file systems (8 in our case) was the optimum setting, provided enough proxy server resources are available.
  o EMC recommends considering at least 2 GB RAM for each stream. Always refer to the NetWorker best practices for the latest sizing guidelines.
  o EMC recommends considering 1 vCPU for each stream.
  Note: Increasing the stream count works well for customers that have more balanced file systems. In environments with a large variation in file system sizes, more streams may hurt performance, especially if the additional streams are not dedicated to the larger file systems.
• DD Boost uses between 2% and 40% more CPU time during backup operations than backups without client-side deduplication, but for a much shorter period of time. Overall, the CPU load of a backup to DD Boost is lower than that of traditional mmd-based backups using CIFS/NFS.
• DD Boost over Fibre Channel brings the advantages of the Boost protocol to a SAN infrastructure.
Important: All backup and restore performance levels are subject to a multitude of factors. The performance levels reported here were achieved under the specific laboratory conditions used in this testing. Performance numbers will vary depending on environment size and specifications.
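The per-stream sizing guidance in the conclusions (one stream per Caché file system, at least 2 GB RAM and 1 vCPU per stream) can be turned into a quick check. The Python sketch below is an illustration only: the function name `recommended_streams` and the proxy server figures in the example calls are hypothetical, and the calculation ignores operating system overhead and file system size imbalance.

```python
RAM_GB_PER_STREAM = 2  # "at least 2 GB RAM for each stream"
VCPUS_PER_STREAM = 1   # "1 vCPU for each stream"


def recommended_streams(file_systems: int, proxy_ram_gb: float, proxy_vcpus: int) -> int:
    """Largest stream count that stays within all three limits."""
    by_ram = int(proxy_ram_gb // RAM_GB_PER_STREAM)
    by_cpu = proxy_vcpus // VCPUS_PER_STREAM
    # Never exceed the number of file systems; never return a negative value.
    return max(0, min(file_systems, by_ram, by_cpu))


# Hypothetical proxy servers, for illustration only.
print(recommended_streams(file_systems=8, proxy_ram_gb=32, proxy_vcpus=16))
print(recommended_streams(file_systems=8, proxy_ram_gb=12, proxy_vcpus=16))
```

In the first example the file system count is the limiting factor, so the result is 8 streams. In the second, memory limits the proxy to 6 streams.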
Chapter 8: References
The following documents, located on the EMC online support website at https://support.emc.com, provide additional and relevant information. Access to these documents depends on your login credentials. If you do not have access to a document, contact your EMC representative.
• EMC NetWorker 8.2 and EMC Data Domain Boost deduplication devices
• Data Domain Operating System Administrator Guide 5.5
• NetWorker 8.2 SP1 Performance Optimization Planning Guide