Transcript
Scaling Trading Systems with Coherence
Why multi-core processors and low latency networks are crucial

Brian Oliver, Data Grid Solutions Architect, Tangosol / Oracle
[email protected]
Agenda
• Introduction to Coherence
• Coherence as Trading System infrastructure
• Multi-core Performance
• Network Scale-out Observations
• Conclusions
Introduction to Coherence
Introduction to Coherence
• Data Management & Clustering Solution (In-Memory)
• Native libraries: Java & .NET (C++ soon)
  ‣ UDP-based traffic
• #1: Provides Reliable Clustering Technology
  ‣ Cluster applications without a container / server (see the sketch after this list)
• #2: Provides Automatic Partitioning of Application Data
  ‣ Including partitioning of backups
• #3: Provides Extremely Low Latency Events
• #4: Provides Grid Agents to Process Data (Affinity)
  ‣ Not a client / server processing model
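To make the programming model concrete, here is a minimal sketch of clustered data access, assuming the Coherence 3.x NamedCache API; the cache name "trades" and the key and value used are illustrative, not from the original slides.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class BasicCacheSketch {
    public static void main(String[] args) {
        // No container or server is required: the first CacheFactory
        // call starts a cluster node inside this JVM and joins (or
        // forms) the cluster.
        NamedCache trades = CacheFactory.getCache("trades");

        // Entries are automatically partitioned (with backups) across
        // all cluster members; the application simply uses the Map API.
        trades.put("TRADE-1", Double.valueOf(101.25));
        Double price = (Double) trades.get("TRADE-1");
        System.out.println("TRADE-1 = " + price);

        CacheFactory.shutdown();
    }
}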
Partitioned Topology

[Diagram: a logical partitioned cache spanning four JVMs (JVM 1-4). Each JVM hosts an Application plus a Primary and a Backup set of partitions holding entries A, B, C and D; each get() is routed directly to the JVM owning the primary copy of the requested entry.]
Partitioned Topology
• Partitioned Topology is one of many
  ‣ Others include Near, Replicated, Overflow, *Extend...
• Consistent Data Access Latency
  ‣ One network operation
  ‣ Any size cluster
• Capacity Scales to Available
  ‣ RAM
  ‣ Network
Partitioned Topology

[Diagram: the same four-JVM logical partitioned cache; each put() is routed to the JVM owning the primary copy of the entry, which then updates the backup copy held on another JVM.]
Partitioned Topology
• Consistent Data Update Latency
  ‣ Two network operations (worst case)
  ‣ Any size cluster
• Capacity Scales to Available
  ‣ RAM
  ‣ Network
Partitioned Topology Recovery

[Diagram: one JVM fails while get() operations continue. The backups of its partitions, held on the surviving JVMs, are promoted to primaries and new backups are created elsewhere in the cluster, so entries A, B, C and D remain available throughout.]
Partitioned Topology Recovery
• Dynamic Recovery
  ‣ All cluster members have equal responsibility for data (and recovery)
  ‣ No management console = unattended health maintenance
  ‣ No in-flight operations lost / no stop-the-world recovery
  ‣ Real-time events for cluster membership (see the sketch below)
• More Processes = Faster Recovery
  ‣ Repartitioning occurs in parallel
• There is an opportunity for data loss
  ‣ When the cluster is too small
  ‣ Catastrophic failure (many processes die at the same time)
  ‣ Not cluster loss; the data is often recoverable!
• Like RAID for Application Data, without a Server!
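As an illustration of the membership events above, here is a minimal sketch assuming the Coherence 3.x MemberListener API, registered against the cache's underlying clustered service; the cache name is illustrative.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.MemberEvent;
import com.tangosol.net.MemberListener;
import com.tangosol.net.NamedCache;

public class MembershipSketch {
    public static void main(String[] args) {
        NamedCache cache = CacheFactory.getCache("trades");

        // Real-time callbacks as processes join and leave the cluster.
        // Recovery and repartitioning happen automatically; the
        // listener is purely for observation (e.g. an ops dashboard).
        cache.getCacheService().addMemberListener(new MemberListener() {
            public void memberJoined(MemberEvent evt) {
                System.out.println("Member joined: " + evt.getMember());
            }
            public void memberLeaving(MemberEvent evt) {
                System.out.println("Member leaving: " + evt.getMember());
            }
            public void memberLeft(MemberEvent evt) {
                System.out.println("Member left: " + evt.getMember());
            }
        });
    }
}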
Partitioned Topology Affinity

[Diagram: four JVMs, each running a Spring application over a near cache and a Coherence partitioned cache (primary + backup). An EntryProcessor is invoked on the JVM owning entry D, so the processing runs where the data lives while get() calls continue on the other JVMs.]
Partitioned Topology Affinity
• Affinity dramatically improves performance & scalability
  ‣ Data Affinity = related data is co-located
  ‣ Processing Affinity = processing occurs where the data is located (see the sketch below)
• Significantly reduces network operations
  ‣ e.g. a 14-to-4 reduction in data updates
• Reducing network traffic increases both performance & scalability
• Types of Data Processing (in Parallel)
  ‣ Indexing
  ‣ Updating
  ‣ Aggregating
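Here is a minimal sketch of processing affinity (an in-place update) and a parallel aggregation, assuming the Coherence 3.x InvocableMap API; the "prices" cache and the 1% mark-up are illustrative.

import java.io.Serializable;
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.InvocableMap;
import com.tangosol.util.aggregator.DoubleSum;
import com.tangosol.util.filter.AlwaysFilter;
import com.tangosol.util.processor.AbstractProcessor;

public class AffinitySketch {
    // The processor is serialized to the member that owns the entry
    // and runs there, so the value itself never crosses the network.
    public static class MarkUp extends AbstractProcessor
            implements Serializable {
        public Object process(InvocableMap.Entry entry) {
            double price = ((Double) entry.getValue()).doubleValue();
            entry.setValue(Double.valueOf(price * 1.01));
            return null;
        }
    }

    public static void main(String[] args) {
        NamedCache prices = CacheFactory.getCache("prices");
        prices.put("ORCL", Double.valueOf(20.0));

        // Updating: executed in the owning partition, one network hop.
        prices.invoke("ORCL", new MarkUp());

        // Aggregating: each member sums its own partitions in parallel;
        // only the partial results travel over the network.
        Double total = (Double) prices.aggregate(
                new AlwaysFilter(), new DoubleSum("doubleValue"));
        System.out.println("Sum of prices: " + total);
    }
}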
Coherence as Infrastructure
Coherence Applied to Trading
• Coherence as Infrastructure
  ‣ Reliable In-Memory Data Management
  ‣ Highly Available (even during recovery and scale-out)
  ‣ Known access and update latencies
  ‣ Dynamically Scalable & Fault Tolerant
• Natural Data Partitioning
  ‣ Instruments, Securities, Trades, Prices, Orders, Executions, Portfolios
  ‣ No longer a developer or operational concern
• Natural Data Affinity (see the sketch below)
  ‣ Instruments and Prices
  ‣ Orders and Executions
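Here is a minimal sketch of how an Execution could be co-located with its parent Order, assuming the Coherence 3.x KeyAssociation interface; the key class and field names are illustrative, and it assumes Orders are keyed by their order id String.

import java.io.Serializable;
import com.tangosol.net.cache.KeyAssociation;

// Key for an Execution that declares affinity with its parent Order.
// The partitioned cache hashes on the associated key, so every
// Execution lands in the same partition (same JVM) as its Order.
public class ExecutionKey implements KeyAssociation, Serializable {
    private final String orderId;
    private final String executionId;

    public ExecutionKey(String orderId, String executionId) {
        this.orderId = orderId;
        this.executionId = executionId;
    }

    // Co-locate with the Order keyed by the same orderId.
    public Object getAssociatedKey() {
        return orderId;
    }

    public boolean equals(Object o) {
        if (!(o instanceof ExecutionKey)) {
            return false;
        }
        ExecutionKey that = (ExecutionKey) o;
        return orderId.equals(that.orderId)
                && executionId.equals(that.executionId);
    }

    public int hashCode() {
        return orderId.hashCode() * 31 + executionId.hashCode();
    }
}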
Coherence Applied to Trading
• Natural Processing Affinity (Extreme Parallelism)
  ‣ Trade, Price, Order and Execution updates may occur in parallel
  ‣ Indexing of Portfolios
  ‣ Aggregation of Positions, Risk...
• Low Latency Events (see the sketch below)
  ‣ No messaging infrastructure required
  ‣ Cluster Membership and Data Updates
  ‣ Thousands of Clients (Desktops & Servers)
  ‣ Continuous Views of Data (as it changes)
• WAN and MAN support
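Here is a minimal sketch of a continuous view with low-latency event callbacks, assuming the Coherence 3.x ContinuousQueryCache and MapListener APIs; the "positions" cache and the 1,000,000 threshold are illustrative.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.net.cache.ContinuousQueryCache;
import com.tangosol.util.MapEvent;
import com.tangosol.util.MapListener;
import com.tangosol.util.filter.GreaterFilter;

public class ContinuousViewSketch {
    public static void main(String[] args) {
        NamedCache positions = CacheFactory.getCache("positions");

        // A continuously maintained local view of all positions whose
        // value exceeds 1,000,000; kept current by cluster events, with
        // no separate messaging infrastructure required.
        ContinuousQueryCache bigPositions = new ContinuousQueryCache(
                positions,
                new GreaterFilter("doubleValue", Double.valueOf(1000000.0)));

        // Low-latency callbacks as the underlying data changes anywhere
        // in the cluster.
        bigPositions.addMapListener(new MapListener() {
            public void entryInserted(MapEvent evt) {
                System.out.println("New large position: " + evt.getKey());
            }
            public void entryUpdated(MapEvent evt) {
                System.out.println("Updated: " + evt.getKey());
            }
            public void entryDeleted(MapEvent evt) {
                System.out.println("Dropped below threshold: " + evt.getKey());
            }
        });
    }
}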
Multi-core Comparison
Baseline Multi-core Test
• Xeon
  ‣ 2 x 3.6GHz single-core CPUs / Server
• Woodcrest
  ‣ 2 x 2.6GHz dual-core CPUs / Server
• Common
  ‣ 4 GB RAM
  ‣ 3 x Servers
  ‣ 1Gb Network
  ‣ Linux 2.6 SMP Kernel
  ‣ 3 Processes / Server
• Scale-out Test
  ‣ 100 x Servers, 10 x Processes / Server
Multi-core Load Comparison

Throughput (entries / second), by entry size:

Entry Size | Xeon   | Woodcrest | Woodcrest vs Xeon
2KB        | 28,300 | 46,000    | 1.6x
10KB       | 5,400  | 8,800     | 1.6x
10MB       | 5      | 7         | 1.5x
Multi-core Load Comparison

[Chart: CPU utilisation for Xeon vs Woodcrest at 2KB, 10KB and 10MB entry sizes; readings range from roughly 40% to 69%.]
Network & Scale-out Observations
Network Scale-out

Aggregation throughput and latency (1Gb network), by number of servers:

Servers | Throughput (entries / second) | Latency
3       | 2 million                     | 1.1s
24      | 15 million                    | 1.1s
48      | 29 million                    | 1.1s
96      | 58 million                    | 1.2s
InfiniBand v’s 10GbE 10 GbE
IB
950.0MB/sec 712.5MB/sec 475.0MB/sec 237.5MB/sec 0MB/sec 1968
3968
7968
15968
31968
Packet Size (bytes)
Copyright 2007. Tangosol
22
63968
Conclusions
• All features are important for a solution!
• Reliable Cluster Management is important
  ‣ There is no time to manually manage a cluster, especially if it's large!
  ‣ Recovery & scale-out are automatic: great for large clusters
• Automatic Partitioning is important
  ‣ Removes developer and operational maintenance requirements (cf. a DB)
  ‣ Removes hot-member bottlenecks (unless you have hot data points)
  ‣ Provides a natural mechanism for scaling out
• Data and Processing Affinity is important
  ‣ Reduces network bandwidth requirements
  ‣ Increases scalability and performance
Conclusions
• Multi-core CPU
  ‣ Processing Affinity pushes CPUs harder
  ‣ Pushing a 1Gb+ network requires a lot of CPU
  ‣ Some applications don't use much CPU: increase the number of processes to utilize the CPU and to increase resilience and availability
  ‣ Higher bandwidth = more CPU load for Coherence (traffic management)
Conclusions
• Network
  ‣ Affinity significantly reduces network bandwidth requirements
  ‣ i.e. less copying and context switching in an application
  ‣ But data volumes are increasing... more applications are sharing a single cluster... grids are becoming larger...
  ‣ The network remains the major impediment to scalability and performance
  ‣ As bandwidth increases, CPU utilization increases! (traffic management)
• InfiniBand
  ‣ Ideal for large objects
  ‣ Ideal when packet bundling (Coherence 3.3) is applicable
Thank you