An important requirement for HA, and for providing scalability, is to detect problems and resolve them quickly before user sessions are affected. Oracle RAC and its family of solutions work together cohesively to quickly detect conditions such as unresponsive instances or network issues, and to resolve them by redirecting work to other instances or to redundant network paths.
3. How to Achieve Maximum Availability?
• Quickly detect outage
• Quickly resolve outage with minimum disruption
• Failover to disaster recovery site for site-level failures
5. Oracle Database HA Features* (*not a complete list)
Process, Resource, Instance, and Node Failures
• Detect outage: Oracle Cluster Synchronization Services, Oracle LMS process, Oracle Clusterware Agents, Oracle LMON process, Oracle ASM, Oracle Memory Guard
• Resolve outage: node eviction by CSS/Agents, instance eviction by LMON, resource move by Oracle Clusterware Agents, service shutdown by Memory Guard
Complete Site Outage
• Failover to remote site: Data Guard
7. CSSD Provides Node Membership Services
• CSSD is started by the CSSD Agent
• Runs as the Oracle user
• Sends heartbeats both to the voting disk and, via the private network, to the remote CSSDs
• Evicts the node if heartbeats are missing
11. Oracle Clusterware Agents
• Agents are spawned by OHASD and CRSD and monitor the corresponding resources
• Actions are based on the policy master
• They are persistent processes and therefore perform better than the script-based CRS resource actions of pre-11.2 releases
• For example, the CHECK_INTERVAL of the VIP resource is 1 second starting with 11.2.*
12. Agents in Cluster Startup
• OHASD invokes the following agents
• cssdagent
• orarootagent
• oraagent
• cssdmonitor
• CRSD invokes the following agents
• orarootagent
• oraagent
• orajagent aka Java Agent (new in 12.2)
• Any user defined agents
13. Agent Actions
• START, STOP
• CHECK: If it notices any state change during this action, then the agent
framework notifies Oracle Clusterware about the change in the state of the
specific resource.
• CLEAN: The CLEAN entry point acts whenever there is a need to clean up a
resource. It is a non-graceful operation that is invoked when users must
forcefully terminate a resource. This command cleans up the resource-specific
environment so that the resource can be restarted.
• ABORT: If any of the other entry points hang, the agent framework calls the
ABORT entry point to abort the ongoing action.
14. Resource State Information
• Check returns one of the following values to indicate the
resource state:
• ONLINE
• UNPLANNED_OFFLINE
• PLANNED_OFFLINE
• UNKNOWN
• PARTIAL
• FAILED
• Checks are implicitly called after start, stop, clean.
15. Check Action (CRSD)
• CRSD also has a deep check implementation that runs once every 10 checks
• Deep check verifies that the OCR thread within the CRSD process is not hung
• Deep check also verifies that the Policy Engine module within CRSD is not hung
• The agent ignores the first two consecutive deep check failures before declaring that the daemon has failed
17. HAIP Network Configuration
[Diagram: Node A, Node B, and Node C connected through two switches — SW1 (192.168.0.0/24) and SW2 (10.0.0.0/24)]
• Highly available network providing redundancy and aggregation functions for the private interconnect
• No longer requires OS-level bonding configuration
• Better utilization of the private interfaces configured in the cluster profile
• Used by both Oracle Clusterware components and the database
18. HAIP Implementation Details
• All private networks configured in the cluster profile are used
• Configures HAIP addresses on the private interconnect
• Addresses are created through the Link-Local Address protocol, in the 169.254.0.0/16 subnet
• A maximum of 4 HAIP addresses are configured on any node
• Tolerates interface failures: the HAIP address on a failed interface is dynamically moved to another interface
• Interfaces can be dynamically added to or removed from the cluster profile
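On a running cluster, the interfaces and link-local HAIP addresses each instance is actually using can be observed through the documented GV$CLUSTER_INTERCONNECTS view. A minimal sketch (column availability varies by release):

```sql
-- List the interconnect interfaces in use per instance.
-- With HAIP enabled, IP_ADDRESS shows 169.254.x.x link-local addresses.
SELECT inst_id, name, ip_address, is_public, source
FROM   gv$cluster_interconnects
ORDER  BY inst_id, name;
```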
25. LMS Manages the Global Buffer Cache
[Diagram: four instance SGAs, each containing Buffer Cache, Shared Pool, In-Memory, and Misc areas with their LMS* processes, together forming the total SGA across the cluster]
• LMS ships blocks based on requests by remote clients
• LMS has its own retry mechanism to handle block-shipping failures — very expensive and bad for performance
• LMS can offload work to its slaves (LMS CR slaves) to mitigate outliers
• LMS is monitored by LMHB
26. CR Slaves to Mitigate Performance Outliers
• In previous releases, LMS worked on incoming consistent-read requests in a sequential fashion
• Sessions requesting consistent blocks that require applying a lot of undo may cause LMS to be busy
• Starting with Oracle RAC 12c Release 2, LMS offloads work to "CR slaves" if the amount of undo to be applied exceeds a certain, dynamic threshold
• The default is 1 slave; additional slaves are spawned as needed
Time Account Amount
T 13579 $2500
T+1 13579 $2000
T+2 13579 $1000
T+3 13579 $200
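The amount of consistent-read block traffic LMS handles can be gauged from the standard global cache statistics. A sketch using the documented V$SYSSTAT interface (exact statistic names vary by release):

```sql
-- Consistent-read global cache activity per instance.
SELECT inst_id, name, value
FROM   gv$sysstat
WHERE  name IN ('gc cr blocks received',
                'gc cr blocks served',
                'gc cr block receive time')
ORDER  BY inst_id, name;
```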
28. LMON Has the Final Word on Which Instances Are Part of a Cluster DB
• LMON has its own heartbeat to the other LMONs and to the control file
• If there is a timeout, LMON can evict another instance
• IMR – Instance Membership Recovery
29. IMR – Send Timeouts
• An "IPC send timeout" occurs when a cross-instance message is not acknowledged by the remote instance within 5 minutes (the default timeout), resulting in an ORA-29740
• The ORA-29770 and ORA-29771 error messages were introduced in 11.2+ to take action before an "IPC send timeout" is hit in most cases
30. Review: IPC Send Timeouts
• This example is from a 4 node cluster. The alert log from
instance 1 showed a send timeout and the receiver was on
instance 4:
alert_p599a.log-Mon Apr 17 09:42:10 2006
alert_p599a.log:IPC Send timeout detected. Sender ospid 3859
alert_p599d.log-Mon Apr 17 09:42:11 2006
alert_p599d.log:IPC Send timeout detected. Receiver ospid 9014
alert_p599d.log-Mon Apr 17 09:42:11 2006
31. Solving IPC Send Timeouts
• In 11.2, a non-fatal background process called LMHB (heartbeat monitor) was created to monitor process health via periodic heartbeats
• Processes monitored by LMHB:
• LMON (global enqueue service monitor)
• LMD0 (global enqueue service daemon)
• LMS* (global cache service process)
• LCK0 (lock process)
• DIAG and DIA0 (diagnostic processes)
• RMS0 (Oracle RAC management server)
• Possibly more, depending on the version
32. ORA-29770 and ORA-29771
• Any non-fatal process blocking the monitored processes (e.g. holding latches) will be terminated after a timeout, regardless of system load.
• Fatal processes will only be terminated when load is low.
• Exceptions (no kill) are made when any of the LM*, LCK, or DIA* processes are in the middle of CF enqueue or CF I/O operations, row cache or library cache background operations, or a system state dump.
33. Evict Sick Unresponsive Nodes
• LMS1 (ospid: 22636) has detected no messaging activity from instance 1
LMS1 (ospid: 22636) issues an IMR to resolve the situation
Communications reconfiguration: instance number 1
• Evicting instance 1 from cluster
Waiting for instances to leave: 1
Sat Jul 24 10:38:45 2010
Remote instance kill is issued with system inc 10
Remote instance kill map (size 1) : 1
Waiting for instances to leave: 1
• Analysis: Instance 1 was hanging and not responding, so instance 2 evicted instance 1 and waited for instance 1 to abort.
35. Oracle ASM – Automatic Storage Management
[Diagram: a five-node RAC cluster (Node1–Node5), each node running its own ASM instance — a one-to-one mapping of ASM instances to servers. Database instances (DBA, DBB, DBC) access shared disk groups (Disk Group A, Disk Group B) in the ASM cluster pool of storage, with wide file striping across the ASM disks.]
36. Removal of One-to-One Mapping and HA
Oracle Flex ASM provides even higher HA
[Diagram: the same five-node RAC cluster, but only a subset of nodes runs ASM instances; databases share ASM instances, and Node1 runs as an ASM client to Node2. Disk Group A and Disk Group B remain in the ASM cluster pool of storage.]
37. Removal of One-to-One Mapping and HA (continued)
Oracle Flex ASM provides even higher HA
[Diagram: after an ASM instance failure, the client connection fails over — Node1 now runs as an ASM client to Node4, and the databases continue to share the remaining ASM instances.]
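Which database instances are being served by a given ASM instance — including remote Flex ASM clients — can be checked from the ASM instance with the documented V$ASM_CLIENT view. A minimal sketch:

```sql
-- Run on an ASM instance: lists the connected database clients.
SELECT instance_name, db_name, status, software_version
FROM   v$asm_client;
```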
39. Flex Disk Group
Database-oriented storage management for more flexibility and availability
[Diagram: 12.2 Flex Disk Group organization — the files of DB1, DB2, and DB3 grouped per database within ASM Flex Disk Groups, with a quota applied to DB3's file group.]
• Flex Disk Groups enable:
– Quota management: limit the space databases can allocate in a disk group, improving customers' ability to consolidate databases into fewer disk groups
– Redundancy change: use lower redundancy for less critical databases, and even change redundancies online
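As a sketch of the 12.2 quota-management DDL (the disk group, quota group, and file group names below are made up for illustration; verify the exact syntax against the ASM Administrator's Guide for your release):

```sql
-- Create a quota group with a 10 GB limit in disk group DATA (hypothetical names).
ALTER DISKGROUP data ADD QUOTAGROUP qg_db3 SET 'quota' = 10g;

-- Place database DB3's file group under that quota group.
ALTER DISKGROUP data MODIFY FILEGROUP db3 SET 'quota_group' = 'qg_db3';
```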
40. Hang Manager
Detects and Resolves Hangs and Deadlocks
(Confidential – Oracle Internal/Restricted/Highly Restricted)
41. Overlooked & Underestimated – Hang Manager
Why is a Hang Manager required?
• Customers experience database hangs for a variety of reasons: high system load, workload contention, network congestion, or errors
• Before Hang Manager was introduced with Oracle RAC 11.2.0.2, Oracle required information to troubleshoot a hang, e.g. system state dumps (for RAC: global system state dumps)
• Customers usually had to reproduce the problem with additional events set
42. Hang Manager – Workings
• Always on - Enabled by default
• Reliably detects database hangs
• Autonomically resolves them
• Considers QoS policies during Hang
Resolution
• Logs all detected hangs and their
resolutions
• New SQL interface to configure sensitivity
(Normal/High)
43. Hang Manager Optimizations
• Hang Manager auto-tunes itself by periodically collecting instance- and cluster-wide hang statistics
• Metrics like cluster health and instance health are tracked over a moving average
• This moving average is considered during resolution
• Holders waiting on SQL*Net break/reset are fast-tracked
44. DBMS_HANG_MANAGER.Sensitivity
• Early warning exposed via a V$ view
• Sensitivity can be set higher if the user feels the default level is too conservative
• Hang Manager behavior can be further fine-tuned by setting appropriate QoS policies

Hang Sensitivity Level | Description | Note
NORMAL | Hang Manager uses its default internal operating parameters to try to meet typical requirements for any environment. | Default
HIGH | Hang Manager is more alert to sessions waiting in a chain than when sensitivity is at the NORMAL level. |
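Changing the sensitivity uses the documented DBMS_HANG_MANAGER package; a minimal sketch (requires appropriate privileges):

```sql
-- Switch Hang Manager to HIGH sensitivity.
EXEC DBMS_HANG_MANAGER.set(DBMS_HANG_MANAGER.sensitivity, DBMS_HANG_MANAGER.sensitivity_high);

-- Revert to the default.
EXEC DBMS_HANG_MANAGER.set(DBMS_HANG_MANAGER.sensitivity, DBMS_HANG_MANAGER.sensitivity_normal);
```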
45. Data Guard
Respond to catastrophic site failures
46. Data Guard: Real-time Data Protection
Included with Oracle Database Enterprise Edition
[Diagram: Primary Data Center replicating to a DR Data Center, with failover to the remote site and Automatic Block Repair; managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)]
47. Active Data Guard: Advanced Capabilities
A licensable option to the Oracle Database Enterprise Edition
[Diagram: zero data loss at any distance; Automatic Block Repair; offload fast incremental backups and read-only workload to the open standby database in the DR Data Center; managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)]
48. Active Data Guard: Advanced Capabilities (continued)
Getting the most out of your Active Data Guard DR site
[Diagram: zero data loss at any distance; Automatic Block Repair; DML Redirection; offload fast incremental backups and read-mostly workload to the open standby database; managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)]
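DML Redirection makes the open standby usable for read-mostly workloads that issue occasional DML; in Oracle 19c and later it can be enabled per session using the documented setting — a sketch:

```sql
-- On the Active Data Guard standby (19c+): allow occasional DML,
-- which is transparently redirected to and committed on the primary.
ALTER SESSION ENABLE ADG_REDIRECT_DML;
-- ... run the read-mostly workload's DML ...
ALTER SESSION DISABLE ADG_REDIRECT_DML;
```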
49. Data Guard Standby Redo Apply
• With a typical RAC primary and RAC standby, only one instance of the standby can apply redo
• The other standby instances are typically in waiting mode, even if the apply is CPU-bound
• Another instance takes over redo apply only if the instance applying redo crashes
51. Multi-Instance Redo Apply
• Utilizes all RAC nodes on the standby to apply redo
• Parallel, multi-instance recovery means the standby DB will keep up
• Standby recovery utilizes CPU and I/O across all nodes of the RAC standby
• Up to 3500+ MB/sec apply rate on an 8-node RAC
• Multi-Instance Apply runs on all MOUNTED instances or all OPEN instances
• Exposed in the Broker with the 'ApplyInstances' property on the standby

recover managed standby database disconnect using instances 4;
53. Multi-Instance Redo Apply Performance
Utilize all Oracle RAC instances on the standby database to parallelize recovery
[Chart: standby apply rates in MB/sec running OLTP and batch workloads on Exadata, scaling from 1 to 8 instances — one workload series at 190/380/740/1480 MB/sec and the other at 700/1400/2752/5000 MB/sec for 1/2/4/8 instances]
54. Autonomous Database = RAC on Exadata (& More)
[Diagram: Autonomous Database = Oracle RAC on Exadata plus automated data center operations in the Oracle Cloud]
• Oracle RAC is enabled on the Oracle Autonomous Cloud offering
• Oracle RAC meets and exceeds the stringent Autonomous Transaction Processing Dedicated (ATP-D) requirements
• Successfully providing scalability and availability to the Oracle Database for all
55. Summary
The Oracle RAC Family of Solutions is an integrated set of solutions that work together cohesively to ensure that, regardless of the failure, the stack continues to run with minimal or no interruption to user sessions, in both on-premises and Oracle Cloud environments.