An important requirement for HA, and for providing scalability, is to detect problems and resolve them quickly before user sessions are affected. Oracle RAC and its family of solutions work together cohesively to quickly detect conditions such as unresponsive instances or network issues, and to resolve them by redirecting work to other instances or to redundant network paths.
3. How to Achieve Maximum Availability?
• Quickly detect outage
• Quickly resolve outage with minimum disruption
• Failover to disaster recovery site for site-level failures
5. Oracle Database HA Features* (*not a complete list)
Process, Resource, Instance, and Node Failures
• Detect outage: Oracle Cluster Synchronization Services, Oracle LMS process, Oracle Clusterware Agents, Oracle LMON process, Oracle ASM, Oracle Memory Guard
• Resolve outage: node eviction by CSS/Agents, instance eviction by LMON, resource move by Oracle Clusterware Agents, service shutdown by Memory Guard
Complete Site Outage
• Failover to remote site: Data Guard
7. CSSD Provides Node Membership Services
• CSSD is started by the CSSD Agent
• Runs as the Oracle user
• Sends heartbeats both to the voting disk and, via the private network, to the remote CSSDs
• Evicts the node if heartbeats are missing
11. Oracle Clusterware Agents
• Agents are spawned by OHASD and CRSD and monitor the corresponding resources
• Actions are based on the policy master
• They are persistent processes and therefore perform better than the script-based CRS resource actions of pre-11.2 releases
• For example, the CHECK_INTERVAL of the VIP resource is 1 second starting with 11.2.*
12. Agents in Cluster Startup
• OHASD invokes the following agents
• cssdagent
• orarootagent
• oraagent
• cssdmonitor
• CRSD invokes the following agents
• orarootagent
• oraagent
• orajagent aka Java Agent (new in 12.2)
• Any user defined agents
13. Agent Actions
• START, STOP
• CHECK: If it notices any state change during this action, then the agent
framework notifies Oracle Clusterware about the change in the state of the
specific resource.
• CLEAN: The CLEAN entry point acts whenever there is a need to clean up a
resource. It is a non-graceful operation that is invoked when users must
forcefully terminate a resource. This command cleans up the resource-specific
environment so that the resource can be restarted.
• ABORT: If any of the other entry points hang, the agent framework calls the
ABORT entry point to abort the ongoing action.
14. Resource State Information
• Check returns one of the following values to indicate the
resource state:
• ONLINE
• UNPLANNED_OFFLINE
• PLANNED_OFFLINE
• UNKNOWN
• PARTIAL
• FAILED
• Checks are implicitly called after start, stop, clean.
15. Check Action (CRSD)
• CRSD also has a deep check implementation that runs once every 10 checks
• Deep check verifies that the OCR thread within the CRSD process is not hung
• Deep check also verifies that the Policy Engine module within CRSD is not hung
• The agent ignores the first two consecutive deep check failures before declaring that the daemon has failed
17. HAIP Network Configuration
[Diagram: Node A, Node B, and Node C connected through two switches — SW1 (192.168.0.0/24) and SW2 (10.0.0.0/24)]
• Highly available network providing redundancy and aggregation functions for the private interconnect
• No longer requires OS-level bonding configuration
• Better utilization of the private interfaces configured in the cluster profile
• Used by both Oracle Clusterware components and the database
18. HAIP Implementation Details
• All private networks configured in the cluster profile are used
• Configures HAIP addresses on the private interconnect
• Addresses are created through the Link-Local Address protocol, in the 169.254.0.0/16 subnet
• A maximum of 4 HAIP addresses are configured on any node
• Tolerates interface failures: the HAIP address on a failed interface is dynamically moved to another interface
• Interfaces can be dynamically added to or removed from the cluster profile
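On a running cluster, the interfaces and link-local HAIP addresses each instance is actually using can be observed through the documented GV$CLUSTER_INTERCONNECTS view. A minimal sketch (column availability varies by release):

```sql
-- List the interconnect interfaces in use per instance.
-- With HAIP enabled, IP_ADDRESS shows 169.254.x.x link-local addresses.
SELECT inst_id, name, ip_address, is_public, source
FROM   gv$cluster_interconnects
ORDER  BY inst_id, name;
```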
25. LMS Manages the Global Buffer Cache
[Diagram: four instance SGAs, each containing Buffer Cache, Shared Pool, In-Memory, and Misc areas with their LMS* processes, together forming the total SGA across the cluster]
• LMS ships blocks based on requests by remote clients
• LMS has its own retry mechanism to handle block-shipping failures — very expensive and bad for performance
• LMS can offload work to its slaves (LMS CR slaves) to mitigate outliers
• LMS is monitored by LMHB
26. CR Slaves to Mitigate Performance Outliers
• In previous releases, LMS worked on incoming consistent-read requests in a sequential fashion
• Sessions requesting consistent blocks that require applying a lot of undo may cause LMS to be busy
• Starting with Oracle RAC 12c Release 2, LMS offloads work to "CR slaves" if the amount of undo to be applied exceeds a certain, dynamic threshold
• The default is 1 slave; additional slaves are spawned as needed
Time Account Amount
T 13579 $2500
T+1 13579 $2000
T+2 13579 $1000
T+3 13579 $200
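The amount of consistent-read block traffic LMS handles can be gauged from the standard global cache statistics. A sketch using the documented V$SYSSTAT interface (exact statistic names vary by release):

```sql
-- Consistent-read global cache activity per instance.
SELECT inst_id, name, value
FROM   gv$sysstat
WHERE  name IN ('gc cr blocks received',
                'gc cr blocks served',
                'gc cr block receive time')
ORDER  BY inst_id, name;
```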
28. LMON Has the Final Word on Which Instances Are Part of a Cluster DB
• LMON has its own heartbeat to the other LMONs and to the control file
• If there is a timeout, LMON can evict another instance
• IMR – Instance Membership Recovery
29. IMR – Send Timeouts
• An "IPC send timeout" occurs when a cross-instance message is not acknowledged by the remote instance within 5 minutes (the default timeout), resulting in an ORA-29740
• The ORA-29770 and ORA-29771 error messages were introduced in 11.2+ to take action before an "IPC send timeout" is hit in most cases
30. Review: IPC Send Timeouts
• This example is from a 4 node cluster. The alert log from
instance 1 showed a send timeout and the receiver was on
instance 4:
alert_p599a.log-Mon Apr 17 09:42:10 2006
alert_p599a.log:IPC Send timeout detected. Sender ospid 3859
alert_p599d.log-Mon Apr 17 09:42:11 2006
alert_p599d.log:IPC Send timeout detected. Receiver ospid 9014
alert_p599d.log-Mon Apr 17 09:42:11 2006
31. Solving IPC Send Timeouts
• In 11.2, a non-fatal background process called LMHB (heartbeat monitor) was created to monitor process health via periodic heartbeats
• Processes monitored by LMHB:
• LMON (global enqueue service monitor)
• LMD0 (global enqueue service daemon)
• LMS* (global cache service process)
• LCK0 (lock process)
• DIAG and DIA0 (diagnostic processes)
• RMS0 (Oracle RAC management server)
• Possibly more, depending on the version
32. ORA-29770 and ORA-29771
• Any non-fatal process blocking the monitored processes (e.g. holding latches) will be terminated after a timeout, regardless of system load.
• Fatal processes will only be terminated when load is low.
• Exceptions (no kill) are made when any of the LM*, LCK, or DIA* processes are in the middle of CF enqueue or CF I/O operations, row cache or library cache background operations, or a system state dump.
33. Evict Sick Unresponsive Nodes
• LMS1 (ospid: 22636) has detected no messaging activity from instance 1
LMS1 (ospid: 22636) issues an IMR to resolve the situation
Communications reconfiguration: instance number 1
• Evicting instance 1 from cluster
Waiting for instances to leave: 1
Sat Jul 24 10:38:45 2010
Remote instance kill is issued with system inc 10
Remote instance kill map (size 1) : 1
Waiting for instances to leave: 1
• Analysis: Instance 1 was hanging and not responding, so instance 2 evicted instance 1 and waited for instance 1 to abort.
35. Oracle ASM – Automatic Storage Management
[Diagram: a five-node RAC cluster (Node1–Node5), each node running its own ASM instance — a one-to-one mapping of ASM instances to servers. Database instances (DBA, DBB, DBC) access shared disk groups (Disk Group A, Disk Group B) in the ASM cluster pool of storage, with wide file striping across the ASM disks.]
36. Removal of One-to-One Mapping and HA
Oracle Flex ASM provides even higher HA
[Diagram: the same five-node RAC cluster, but only a subset of nodes runs ASM instances; databases share ASM instances, and Node1 runs as an ASM client to Node2. Disk Group A and Disk Group B remain in the ASM cluster pool of storage.]
37. Removal of One-to-One Mapping and HA (continued)
Oracle Flex ASM provides even higher HA
[Diagram: after an ASM instance failure, the client connection fails over — Node1 now runs as an ASM client to Node4, and the databases continue to share the remaining ASM instances.]
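Which database instances are being served by a given ASM instance — including remote Flex ASM clients — can be checked from the ASM instance with the documented V$ASM_CLIENT view. A minimal sketch:

```sql
-- Run on an ASM instance: lists the connected database clients.
SELECT instance_name, db_name, status, software_version
FROM   v$asm_client;
```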
39. Flex Disk Group
Database-oriented storage management for more flexibility and availability
[Diagram: 12.2 Flex Disk Group organization — the files of DB1, DB2, and DB3 grouped per database within ASM Flex Disk Groups, with a quota applied to DB3's file group.]
• Flex Disk Groups enable:
– Quota management: limit the space databases can allocate in a disk group, improving customers' ability to consolidate databases into fewer disk groups
– Redundancy change: use lower redundancy for less critical databases, and even change redundancies online
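As a sketch of the 12.2 quota-management DDL (the disk group, quota group, and file group names below are made up for illustration; verify the exact syntax against the ASM Administrator's Guide for your release):

```sql
-- Create a quota group with a 10 GB limit in disk group DATA (hypothetical names).
ALTER DISKGROUP data ADD QUOTAGROUP qg_db3 SET 'quota' = 10g;

-- Place database DB3's file group under that quota group.
ALTER DISKGROUP data MODIFY FILEGROUP db3 SET 'quota_group' = 'qg_db3';
```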
40. Hang Manager
Detects and Resolves Hangs and Deadlocks
(Confidential – Oracle Internal/Restricted/Highly Restricted)
41. Overlooked & Underestimated – Hang Manager
Why is a Hang Manager required?
• Customers experience database hangs for a variety of reasons: high system load, workload contention, network congestion, or errors
• Before Hang Manager was introduced with Oracle RAC 11.2.0.2, Oracle required information to troubleshoot a hang, e.g. system state dumps (for RAC: global system state dumps)
• Customers usually had to reproduce the problem with additional events set
42. Hang Manager – Workings
• Always on - Enabled by default
• Reliably detects database hangs
• Autonomically resolves them
• Considers QoS policies during Hang
Resolution
• Logs all detected hangs and their
resolutions
• New SQL interface to configure sensitivity
(Normal/High)
43. Hang Manager Optimizations
• Hang Manager auto-tunes itself by periodically collecting instance- and cluster-wide hang statistics
• Metrics like cluster health and instance health are tracked over a moving average
• This moving average is considered during resolution
• Holders waiting on SQL*Net break/reset are fast-tracked
44. DBMS_HANG_MANAGER.Sensitivity
• Early warning exposed via a V$ view
• Sensitivity can be set higher if the user feels the default level is too conservative
• Hang Manager behavior can be further fine-tuned by setting appropriate QoS policies

Hang Sensitivity Level | Description | Note
NORMAL | Hang Manager uses its default internal operating parameters to try to meet typical requirements for any environment. | Default
HIGH | Hang Manager is more alert to sessions waiting in a chain than when sensitivity is at the NORMAL level. |
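Changing the sensitivity uses the documented DBMS_HANG_MANAGER package; a minimal sketch (requires appropriate privileges):

```sql
-- Switch Hang Manager to HIGH sensitivity.
EXEC DBMS_HANG_MANAGER.set(DBMS_HANG_MANAGER.sensitivity, DBMS_HANG_MANAGER.sensitivity_high);

-- Revert to the default.
EXEC DBMS_HANG_MANAGER.set(DBMS_HANG_MANAGER.sensitivity, DBMS_HANG_MANAGER.sensitivity_normal);
```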
45. Data Guard
Respond to catastrophic site failures
46. Data Guard: Real-time Data Protection
Included with Oracle Database Enterprise Edition
[Diagram: Primary Data Center replicating to a DR Data Center, with failover to the remote site and Automatic Block Repair; managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)]
47. Active Data Guard: Advanced Capabilities
A licensable option to the Oracle Database Enterprise Edition
[Diagram: zero data loss at any distance; Automatic Block Repair; offload fast incremental backups and read-only workload to the open standby database in the DR Data Center; managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)]
48. Active Data Guard: Advanced Capabilities (continued)
Getting the most out of your Active Data Guard DR site
[Diagram: zero data loss at any distance; Automatic Block Repair; DML Redirection; offload fast incremental backups and read-mostly workload to the open standby database; managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)]
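DML Redirection makes the open standby usable for read-mostly workloads that issue occasional DML; in Oracle 19c and later it can be enabled per session using the documented setting — a sketch:

```sql
-- On the Active Data Guard standby (19c+): allow occasional DML,
-- which is transparently redirected to and committed on the primary.
ALTER SESSION ENABLE ADG_REDIRECT_DML;
-- ... run the read-mostly workload's DML ...
ALTER SESSION DISABLE ADG_REDIRECT_DML;
```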
49. Data Guard Standby Redo Apply
• With a typical RAC primary and RAC standby, only one instance of the standby can apply redo
• The other standby instances are typically in waiting mode, even if the apply is CPU-bound
• Another instance takes over redo apply only if the instance applying redo crashes
51. Multi-Instance Redo Apply
• Utilizes all RAC nodes on the standby to apply redo
• Parallel, multi-instance recovery means the standby DB will keep up
• Standby recovery utilizes CPU and I/O across all nodes of the RAC standby
• Up to 3500+ MB/sec apply rate on an 8-node RAC
• Multi-Instance Apply runs on all MOUNTED instances or all OPEN instances
• Exposed in the Broker with the 'ApplyInstances' property on the standby

recover managed standby database disconnect using instances 4;
53. Multi-Instance Redo Apply Performance
Utilize all Oracle RAC instances on the standby database to parallelize recovery
[Chart: standby apply rates in MB/sec running OLTP and batch workloads on Exadata, scaling from 1 to 8 instances — one workload series at 190/380/740/1480 MB/sec and the other at 700/1400/2752/5000 MB/sec for 1/2/4/8 instances]
54. Autonomous Database = RAC on Exadata (& More)
[Diagram: Autonomous Database = Oracle RAC on Exadata plus automated data center operations in the Oracle Cloud]
• Oracle RAC is enabled on the Oracle Autonomous Cloud offering
• Oracle RAC meets and exceeds the stringent Autonomous Transaction Processing Dedicated (ATP-D) requirements
• Successfully providing scalability and availability to the Oracle Database for all
55. Summary
The Oracle RAC Family of Solutions is an integrated set of solutions that work together cohesively to ensure that, regardless of the failure, the stack continues to run with minimal or no interruption to user sessions, in both on-premises and Oracle Cloud environments.