Benchmarking Oracle I/O Performance with Orion by Alex Gorbachev
- 2. Alex Gorbachev
• CTO, The Pythian Group
• Blogger
• OakTable Network member
• Oracle ACE Director
• BattleAgainstAnyGuess.com
• IOUG, Director of Communities
2 © 2011-2012 Pythian
- 3. Why Pythian
Recognized Leader:
• Global industry-leader in remote database administration services and consulting for Oracle,
Oracle Applications, MySQL and SQL Server
• Work with over 150 multinational companies such as Forbes.com, Fox Sports, Nordion and
Western Union to help manage their complex IT deployments
Expertise:
• One of the world’s largest concentrations of dedicated, full-time DBA expertise. Employ 7
Oracle ACEs/ACE Directors
• Hold 7 Specializations under Oracle Platinum Partner program, including Oracle Exadata, Oracle
GoldenGate & Oracle RAC
Global Reach & Scalability:
• 24/7/365 global remote support for DBA and consulting, systems administration, special
projects or emergency response
We are a managed services, consulting and solution provider of elite database and system administration skills in Oracle, MySQL and
Microsoft SQL Server environments.
- 4. Apply at hr@pythian.com
- 5. ORION - ORacle I/O Numbers
Generate I/O workload similar
to database patterns
&
measure I/O performance
Use Orion before moving or deploying databases to a new platform
- 10. You know what you need
and want to
ensure you have it
or
You have no idea what you need
and want to ensure you get
the best you can
The first one is based on capacity planning.
The second you can call infrastructure tuning.
- 11. Infrastructure tuning - what’s the goal?
• When you don’t know how much you need you
try at least to ensure you take all you can
• Assess what your possible bottlenecks are
• 1 Gbit Ethernet => 100+ MBPS or 10,000+ IOPS (8K)
• 15K RPM disk
• will easily serve 100-150 IOPS with average resp. time <10 ms
• can get to 200-250 IOPS but response time increases to 20 ms
• SSD - see vendor specs
• reads: random vs sequential... small vs large... doesn't matter
• writes: pattern matters
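The Ethernet numbers above are easy to sanity-check; a minimal sketch of the raw 1 GbE ceilings (before protocol overhead, which is why the slide says "100+ MBPS" and "10,000+ IOPS"):

```shell
# Raw ceilings of a 1 Gbit/s link; protocol overhead reduces these in practice
mbps=$(awk 'BEGIN { printf "%d", 10^9 / 8 / 10^6 }')   # 1 Gbit/s = 125 MB/s
iops=$(awk 'BEGIN { printf "%d", 10^9 / 8 / 8192 }')   # at 8 KiB per IO
echo "raw ceilings: $mbps MB/s, $iops x 8K IOPS"
```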
- 12. Orion
• Uses a code base similar to the Oracle database kernel
• Standalone binary, or part of the Oracle home since 11.2.0.1
• The standalone downloadable Orion version is only 11.1
• Tests only I/O subsystem
• Minimal CPU consumption
• Async I/O is used to submit concurrent I/O requests
• Each run includes multiple data points / tests
• Scaling concurrency of small and large I/Os
- 13. Controlling Orion
• Workload patterns
• Small random I/O size and scale
• Large I/O size, scale and pattern (random vs
sequential)
• Write percentage
• Cache warming
• Duration of each test (data point)
• Data layout (concatenation vs striping)
- 14. Data Points Matrix
Large/Small, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9...
0, x, x, x, x, x, x, x, x, x, x...
1, x, x, x, x, x, x, x, x, x, x...
2, x, x, x, x, x, x, x, x, x, x...
3, x, x, x, x, x, x, x, x, x, x...
4, x, x, x, x, x, x, x, x, x, x...
5, x, x, x, x, x, x, x, x, x, x...
6, x, x, x, x, x, x, x, x, x, x...
7, x, x, x, x, x, x, x, x, x, x...
8, x, x, x, x, x, x, x, x, x, x...
9, x, x, x, x, x, x, x, x, x, x...
10, x, x, x, x, x, x, x, x, x, x...
11, x, x, x, x, x, x, x, x, x, x...
..............................
..............................
..............................
Each Orion run performs several tests and collects metrics for each test. The set of metrics for one test is a data
point. Based on the run configuration, Orion collects several data points scaling concurrency of small random IOs
and concurrency of large IOs.
Each data point is defined by the number of concurrent small I/O requests and the number of concurrent large IO
streams.
Orion iterates through concurrency of large I/Os from minimum to maximum (which can be a single level, depending on the run configuration), and for each large IO concurrency level it iterates through concurrency levels of small IOs from minimum to maximum (which can likewise be a single level). We will see how these ranges are selected later.
If you look at the matrix, you can imagine this process as running the tests row by row from top to bottom, and within each row from left to right - just like English writing.
As Orion performs the tests, it writes the results to the trace file, and at the end of the run it produces several matrix files with the collected metrics.
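The row-by-row, left-to-right order described above can be sketched as two nested shell loops (the maximums here are hypothetical):

```shell
# Outer loop: large IO concurrency (rows); inner loop: small IO concurrency (columns)
max_large=2
max_small=3
order=""
for large in $(seq 0 $max_large); do
  for small in $(seq 0 $max_small); do
    order="$order ($large,$small)"   # one data point per (large, small) pair
  done
done
echo "test order:$order"
```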
- 15. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run advanced -matrix detailed
# of tests = (Xlarge + 1) * (Xsmall + 1)
There are several types of runs. Let's first look into "advanced" mode; the rest of the runs are simpler versions which preset some of the parameters for you - you can think of them as wizard modes.
The matrix type defines which data points Orion collects. The detailed matrix is the most time consuming to run: Orion tests every combination of large and small I/O workload, iterating from zero to maximum concurrency for both large and small IOs.
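The test-count formulas on these matrix slides can be checked with shell arithmetic (the maximum concurrency levels here are hypothetical):

```shell
Xlarge=10   # hypothetical maximum large IO concurrency
Xsmall=25   # hypothetical maximum small IO concurrency
detailed=$(( (Xlarge + 1) * (Xsmall + 1) ))
row=$(( Xsmall + 1 ))
col=$(( Xlarge + 1 ))
basic=$(( Xlarge + Xsmall + 1 ))   # same count for -matrix max
echo "detailed=$detailed row=$row col=$col basic=$basic point=1"
```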
- 16. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run advanced -matrix row -num_large 2
# of tests = Xsmall + 1
Matrix row fixes number of concurrent large I/O streams to a configurable number (can be zero) and iterates
through concurrency of small IOs.
- 17. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run advanced -matrix col -num_small 3
# of tests = Xlarge + 1
Matrix col fixes number of concurrent small IOs to a configurable number (can be zero) and iterates through
concurrency of large IO streams.
- 18. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run advanced -matrix basic
# of tests = Xlarge + Xsmall + 1
Matrix basic performs tests of non-mixed small and large workloads.
First, Orion iterates through different concurrency levels of small IOs without any large IO streams.
Then, Orion iterates through concurrency of large IO streams without any small IOs.
- 19. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run advanced -matrix max
# of tests = Xlarge + Xsmall + 1
Matrix max is similar to basic, but instead of performing no large IO activity while iterating through small IOs, Orion runs the maximum number of large IO streams. The same when iterating through large IO stream concurrency: Orion runs the maximum number of concurrent small I/Os.
- 20. Data Points Matrix
[data point matrix as on slide 14, with the tested cell highlighted]
-run advanced -matrix point -num_large 2 -num_small 3
# of tests = 1
Matrix point is the fastest run as it performs exactly the one test defined.
- 21. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run simple
Non-advanced runs automatically define the matrix type as well as most of the other parameters.
- 22. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run normal
- 23. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run oltp
- 24. Data Points Matrix
[data point matrix as on slide 14, with the tested cells highlighted]
-run dss
- 25. Orion I/O Performance Metrics
• Small IOs
• iops - average number of IOs per second
• {test name}_{date}_{time}_iops.csv
• lat - average IO response time
• {test name}_{date}_{time}_lat.csv
• Large IOs
• mbps - throughput MB per second
• {test name}_{date}_{time}_mbps.csv
- 26. Sample for -matrix detailed
iops
Large/Small, 1, 2, 3, 4, 5
0, 58, 114, 117, 127, 84
1, 11, 29, 49, 63, 81
2, 12, 23, 30, 24, 31
lat (us)
Large/Small, 1, 2, 3, 4, 5
0, 17184.84, 17487.14, 25594.11, 31505.73, 59205.26
1, 88272.75, 66781.92, 60642.59, 62514.76, 61699.40
2, 80854.55, 83085.06, 99019.72, 155528.65, 156500.44
mbps
Large/Small, 0, 1, 2, 3, 4, 5
1, 18.35, 12.14, 15.99, 16.99, 16.48, 16.37
2, 29.74, 27.07, 25.19, 21.18, 13.04, 13.33
Orion 11.1.0.7 and earlier reports response time in ms.
11.2.0.1+ reports latency in us (microseconds)
Note how the matrices are slightly different:
- the iops and lat matrices exclude the column with zero small IOs
- the mbps matrix excludes the row with zero large IOs
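One way to use the iops and lat matrices together is to join them column by column and pick the highest IOPS that still meets a response-time target. A sketch over the num_large=0 row of the sample above, with an arbitrary 50 ms budget:

```shell
# Sample data copied from the slide (11.2+ reports latency in microseconds)
cat > iops.csv <<'EOF'
Large/Small, 1, 2, 3, 4, 5
0, 58, 114, 117, 127, 84
EOF
cat > lat.csv <<'EOF'
Large/Small, 1, 2, 3, 4, 5
0, 17184.84, 17487.14, 25594.11, 31505.73, 59205.26
EOF
best=$(awk -F', *' '
  FNR == 2 && NR == FNR { for (i = 2; i <= NF; i++) iops[i] = $i; next }
  FNR == 2              { for (i = 2; i <= NF; i++)
                            if ($i / 1000 < 50 && iops[i] + 0 > best) best = iops[i] }
  END { print best }' iops.csv lat.csv)
echo "best IOPS under 50 ms: $best"
```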
- 27. Sample for -matrix basic
iops
Large/Small, 1, 2, 3, 4, 5
0, 80, 153, 165, 163, 197
1
2
lat (us)
Large/Small, 1, 2, 3, 4, 5
0, 12370.09, 13060.23, 18112.16, 24448.27, 25250.33
1
2
mbps
Large/Small, 0, 1, 2, 3, 4, 5
1, 31.84
2, 29.87
- 28. Trace file content
ran (small):
VLun = 0 Size = 10737418240
ran (small):
Index = 0 Avg Lat = 22996.61 us Count = 431
ran (small):
Index = 1 Avg Lat = 23825.39 us Count = 417
ran (small):
nio=848 nior=652 niow=196 req w%=25 act w%=23
ran (small):
my 2 oth 1 iops 65 lat 26081 us, bw = 0.51 MBps
dur 9.96 s size 8 K, min lat 932 us, max lat 227524 us READ
ran (small): my 2 oth 1 iops 19 lat 14499 us, bw = 0.15 MBps
dur 9.96 s size 8 K, min lat 1422 us, max lat 120529 us WRITE
ran (small): my 2 oth 1 iops 85 lat 23404 us, bw = 0.66 MBps
dur 9.96 s size 8 K, min lat 932 us, max lat 227524 us TOTAL
seq (large):
VLun = 0 Size = 10737418240
seq (large):
Index = 0 Avg Lat = 22038.99 us Count = 450
seq (large):
Stream = 0 VLun = 0 Start = 2675965952 End = 3152019456
seq (large):
Stream = 0 Avg Lat = 22038.99 us CIO = 1 NIO Count = 450
seq (large):
nio=450 nior=450 niow=0 req w%=25 act w%=0
seq (large):
my 1 oth 2 iops 45 lat 22039 us, bw = 45.22 MBps dur 9.95 s
size 1024 K, min lat 9976 us, max lat 223534 us READ
seq (large): my 1 oth 2 iops 0 lat 0 us, bw = 0.00 MBps dur 9.95 s
size 1024 K, min lat 18446744073709551614 us, max lat 0 us WRITE
seq (large): my 1 oth 2 iops 45 lat 22039 us, bw = 45.22 MBps dur 9.95 s
size 1024 K, min lat 9976 us, max lat 223534 us TOTAL
Separate read and write statistics.
Actual write percentage is important for sequential large I/O
because it assigns streams to write or read.
IOPS, LAT and MBPS are actually calculated for all types of IO, but the matrix files don't report them all. You can parse the trace file to extract all available statistics.
Note: the write stats for large sequential IO are bogus since no writes were done.
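Since the matrix files don't report everything, here is a small awk sketch of pulling iops and latency out of the trace-file summary lines (the sample line is copied from the excerpt above):

```shell
# A sample READ/WRITE/TOTAL summary line copied from the trace above
cat > sample.trc <<'EOF'
ran (small): my 2 oth 1 iops 85 lat 23404 us, bw = 0.66 MBps dur 9.96 s size 8 K, min lat 932 us, max lat 227524 us TOTAL
EOF
summary=$(awk '
  /iops/ { latset = 0
           for (i = 1; i <= NF; i++) {
             if ($i == "iops") iops = $(i + 1)
             # take only the first "lat" on the line (min/max lat come later)
             if ($i == "lat" && !latset) { lat = $(i + 1); latset = 1 }
           }
           print $NF ": iops=" iops " lat_us=" lat }' sample.trc)
echo "$summary"
```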
- 29. Concurrent I/O requests = number of outstanding I/Os
Separate process for large and small I/Os
For each task, Orion forks 2 separate processes performing large
and small IOs. If only large or only small IOs are performed then
only one process is forked.
- 30. Setting Scale of Concurrent I/Os
• Range of concurrency is {0..max}
• unless specified with -num_small or -num_large or fixed by run type
• max for small IOs
• num_disks * 5 for advanced, simple and normal runs
• num_disks * 20 for OLTP run
• max for large IOs
• num_disks * 2 for advanced, simple and normal runs
• num_disks * 15 for DSS run
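These defaults can be turned into concrete ranges for a given .lun file; a sketch for a hypothetical five-disk configuration:

```shell
num_disks=5   # hypothetical: five devices in the .lun file
max_small=$(( num_disks * 5 ))     # advanced, simple and normal runs
max_large=$(( num_disks * 2 ))     # advanced, simple and normal runs
oltp_small=$(( num_disks * 20 ))   # OLTP run
dss_large=$(( num_disks * 15 ))    # DSS run
echo "small 0..$max_small, large 0..$max_large, oltp small up to $oltp_small, dss large up to $dss_large"
```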
- 31. OLTP and DSS runs are impractical*
• Range of concurrency is {0..max}
• unless specified with -num_small or -num_large or fixed by run type
• max for small IOs
• num_disks * 5 for advanced, simple and normal runs
• num_disks * 20 for OLTP run: 20 steps with interval num_disks, {num_disks..num_disks*20} - too much concurrency
• max for large IOs
• num_disks * 2 for advanced, simple and normal runs
• num_disks * 15 for DSS run: 15 steps with interval num_disks, {num_disks..num_disks*15}
* 11.2.0.3 behavior
- 32. Orion command-line syntax
required arguments: -testname & -run
orion -testname {testname}
      -run advanced | normal | simple | oltp | dss
      -matrix detailed | col | row | basic | max | point
      -duration {seconds}
      -num_disks {disks}
      -num_large {num}
      -num_streamIO {num}
      -size_large {Kb}
      -type rand|seq
      -num_small {num}
      -size_small {Kb}
      -simulate concat|raid0
      -stripe {Mb}
      -write {%}
      -cache_size {MB}
      -verbose

-testname defines the input file with the list of disks, {testname}.lun, in the current directory:
# cat mytest.lun
/dev/sdc
/dev/sdd
/dev/sde
This is the full command-line syntax. The two parameters that are always required are -testname and -run.
-testname identifies the only input file that Orion needs - the list of disks, one path per line. The file name must be the testname with a .lun extension added, and the file must be in the current directory. Orion also prefixes the output files with the testname.
-run defines the type of Orion run, and the rest of the parameters depend on it.
- 33. Orion command-line syntax
-run normal
[full syntax listing as on slide 32, with preset values highlighted: -duration 60, -type rand, -size_large 1024, -size_small 8, -simulate concat, -stripe 1, -write 0]
For -run normal, Orion sets most of the parameters to predefined values and you can only specify -num_disks, -cache_size and -verbose.
- 34. Orion command-line syntax
-run simple
[full syntax listing as on slide 32, with preset values highlighted: -duration 60, -type rand, -size_large 1024, -size_small 8, -simulate concat, -stripe 1, -write 0]
-run simple has identical settings except that -matrix is basic.
- 35. Orion command-line syntax
-run oltp
[full syntax listing as on slide 32, color-coded to show which parameters are preset, fixed, or settable]
-run oltp (make sure it's lower case) lets you specify most of the other parameters, but you really only need to care about the parameters affecting small IOs. Defaults are used if you don't set a specific value.
- 36. Orion command-line syntax
-run dss
[full syntax listing as on slide 32, color-coded to show which parameters are preset, fixed, or settable]
-run dss (make sure it's lower case) lets you specify most of the other parameters (except switching to sequential large IO streams), but the parameters controlling small IOs don't matter. Defaults are used if you don't set a specific value.
- 37. Orion command-line syntax
-run advanced -matrix detailed | basic | max
[full syntax listing as on slide 32, color-coded to show which parameters are preset, fixed, or settable]
-run advanced is the most flexible mode: depending on the matrix type selected, most of the parameters can be specified. When selecting -matrix detailed, basic or max, Orion selects concurrency ranges for large and small IOs based on -num_disks, so -num_large and -num_small cannot be set explicitly.
- 38. Orion command-line syntax
-run advanced -matrix col
[full syntax listing as on slide 32, color-coded to show which parameters are preset, fixed, or settable]
When selecting -matrix col (for column), you must specify -num_small to define the column of data points to collect, while -num_large is not relevant.
- 39. Orion command-line syntax
-run advanced -matrix row
[full syntax listing as on slide 32, color-coded to show which parameters are preset, fixed, or settable]
-matrix row is the reverse of col: you must specify -num_large to define the row of data points to collect, while -num_small is not relevant.
- 40. Orion command-line syntax
-run advanced -matrix point
[full syntax listing as on slide 32, color-coded to show which parameters are preset, fixed, or settable]
To use -matrix point, you need to explicitly set both -num_small and -num_large to identify the data point to collect.
- 41. Orion command-line syntax
-simulate raid0
[full syntax listing as on slide 32; -simulate concat|raid0 highlighted: a great way to simulate ASM striping]
Parameter -simulate controls how Orion treats multiple disks and has two options:
1. "concat" - all disks are concatenated sequentially into one single virtual disk against which Orion submits IO requests.
2. "raid0" - Orion builds a single virtual disk by striping across all disks defined in the testname.lun file, using the stripe size set by the -stripe parameter (default 1 MB). This is the best way to simulate ASM striping.
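Assuming plain round-robin striping, the raid0 virtual-disk layout for the three-disk example above with the default 1 MB stripe can be sketched as:

```shell
ndisks=3      # e.g. /dev/sdc /dev/sdd /dev/sde from the .lun file
stripe_mb=1   # default -stripe
map=""
for off_mb in 0 1 2 3 4 5; do
  # stripe s of the virtual disk lives on disk (s mod ndisks)
  map="$map ${off_mb}MB->disk$(( (off_mb / stripe_mb) % ndisks ))"
done
echo "virtual offset map:$map"
```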
- 42. Orion command-line syntax
-type seq
[full syntax listing as on slide 32; -type rand|seq highlighted; -num_streamIO {num} defaults to 4]
Parameter -type controls the large IO pattern:
1. "rand" - Orion performs large IOs, randomly selecting the offset for each IO request across the whole virtual disk.
2. "seq" - Orion establishes multiple sequential IO streams starting from predefined offsets of the virtual disk (produced by concatenation or striping). The starting offsets are selected at the beginning of each test by splitting the virtual disk into equal chunks, one per concurrent stream.
- 43. Orion Sequential I/O
[diagram lost in export: each stream repeatedly "schedules an IO request and waits" per chunk]
-num_streamIO 1: one request at a time walks chunks 1, 2, 3, ... 19
-num_streamIO 4: four concurrent requests walk the stream together (chunks grouped 1 1 1 1, 2 2 2 2, ...)
Each stream can also have multiple simulated IO threads (by default 4). Thus, when you are testing sequential large IO, your real number of concurrent IO requests might actually be much higher than you think because of the default -num_streamIO of 4.
- 44. What I/O in Oracle behaves like -num_streamIO 4?
• Some examples (*needs verification):
• serial direct parallel read
• ARCH reads of redo logs
• some operations with temporary segments
• How do you verify/know?
• Enable 10046 trace and OS trace (strace/truss/tusc)
- 45. Orion Flexibility (Inflexibility?)
• Single Orion run is enough to assess scalability at defined
settings
• Need several separate Orion runs to vary
• write %
• large IO pattern
• IO size
• striping
• Need multiple concurrent runs to
• simulate more complex IO patterns
• simulate RAC
Orion has lots of flexibility in its settings. However, for a single run there is very limited control over the data points collected; varying any setting other than concurrency requires separate Orion runs.
When simulating more complex scenarios, you also need to combine multiple runs and make sure they run in sync. To simplify synchronization, use -matrix point. Otherwise, syncing different data points is a nightmare, especially since Orion can't collect the same data point over and over in one run while another run (or runs) iterates through other data points.
- 46. Scenarios: OLTP traffic
• -run advanced -matrix row -num_large 0
• Shadow processes’ “db file sequential reads”
• DBWR’s “db file parallel write”
• Optionally several runs with different settings like -write %
• Analyze IOPS & response time
Instead of using "-run oltp", use advanced run settings. This run simulates the random reads that foreground processes do as well as the background random writes performed by DBWR.
One almost universally useful variation to drill into is write percentage - this lets you assess how well the I/O subsystem handles random writes as opposed to random reads. These tests usually show that, no matter what storage vendors claim about their super smart storage arrays and caching algorithms, sustained random writes ruin parity-based RAID.
- 47. Scenarios: OLTP traffic visualization
Oracle Database Appliance example
ODA: Small IOPS scalability / HDDs
[chart: Throughput, IOPS (up to 5,000) and IO Response Time, ms (up to 25) vs concurrent IOs from 1 to 100]
This is an example of the first Orion run on an Oracle Database Appliance to assess OLTP traffic scalability for a read-only workload.
- 48. Scenarios: OLTP traffic variation analysis
Varying write percentage in ODA
Small IOPS by writes percentage: Oracle Database Appliance / OLTP / whole HDDs
[chart: Throughput, IOPS and IO Response Time, ms vs concurrent IO requests (20-400), with series for 0%, 10%, 20%, 40% and 60% writes]
Now let's introduce a variable write percentage and assess the impact. Because ODA doesn't use any parity-based RAID, we see almost no degradation.
However, since ASM does host-based triple mirroring (for this purpose comparable to RAID1), these IOPS metrics are from the disks' perspective, not the database's perspective. We need to adjust IOPS and write percentage to see the numbers from the database perspective after ASM mirroring.
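The adjustment can be sketched as follows: assuming each database write becomes three disk writes under triple mirroring, disk_w = 3*db_w / (100 - db_w + 3*db_w); inverting gives db_w = 100*disk_w / (300 - 2*disk_w), which reproduces the write percentages shown on the next slide:

```shell
# db-level write percentage for each disk-level write percentage, triple mirroring
out=""
for disk_w in 10 20 40 60; do
  db_w=$(awk -v d="$disk_w" 'BEGIN { printf "%.0f", 100 * d / (300 - 2 * d) }')
  out="$out $disk_w%->$db_w%"
done
echo "disk->db writes:$out"
```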
- 49. Scenarios: OLTP traffic variation analysis
Write percentage adjusted for ASM mirroring
Small IOPS by writes percentage: Oracle Database Appliance / OLTP / whole HDDs
[chart as on slide 48, with write percentages restated from the database perspective: 0%, 4%, 8%, 18% and 33%]
This is the adjusted IOPS and percentage values.
- 50. Impact of writes on RAID5 is huge
40% writes => 4 times lower IOPS
Here is an explicit example of RAID5 shortcomings.
- 51. Same disks reconfigured as RAID1+0
40% writes => less than 50% hit
Writes have much less impact with RAID1+0, and the impact only becomes noticeable closer to the saturation point anyway.
- 52. Scenarios: Data Warehouse queries
• -run advanced -matrix col -num_small 0
• Keep read only (-write 0)
• Concurrent users environment
• -type rand
• Single dedicated user performance
• -type seq
• -num_streamIO 1
• Most reads in the DB are synchronous
• Analyze MBPS
To simulate data warehousing workload from concurrent users, use a read-only workload with random large reads. Even though individual queries might scan tables more sequentially, the high concurrency level makes the IO look random. Environments with low concurrency will probably look more like multiple sequential scan streams.
For data warehouse performance you are normally interested in scan throughput, measured in MB per second.
- 53. Scenarios: Data Warehouse IO visualization
Large IOs throughput
[chart: Throughput, MBPS (0-300) vs concurrent threads (1-32)]
A simple way to visualize. You could also add throughput per reading stream to see, for example, the performance each user doing serial scans will get.
- 54. Scenarios: RMAN backup
• -run advanced -matrix col -num_small 0 -type seq
-num_streamIO 1
• Backup source only => -write 0
• Backup destination only => -write 100
• Database and backup destination combined => -write 50
• Watch for actual write percentage
• 1 thread => 0% actual writes
• 2 threads => 50% actual writes
• 3 threads => 33% actual writes
• 4 threads => 50% actual writes, etc.
• Analyze MBPS
No backup compression overhead is accounted for.
Orion will actually be more aggressive sending IO requests, because it keeps either writing non-stop or reading non-stop, while an RMAN process needs to alternate: read and write, read and write, read...
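The actual-write progression above is consistent with Orion assigning whole streams to writes: with -write 50 and n streams, floor(n/2) streams end up writing. This is an assumption that happens to match the slide's numbers:

```shell
# actual write percentage = floor(n/2) write streams out of n total streams
out=""
for n in 1 2 3 4; do
  pct=$(awk -v n="$n" 'BEGIN { printf "%.0f", int(n / 2) * 100 / n }')
  out="$out ${n}=>${pct}%"
done
echo "threads => actual writes:$out"
```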
- 55. Scenarios: LGWR writes
• -run advanced -matrix point -num_small 0 -type seq
-num_streamIO 1 -write 100 -num_large 1 -size_large 5
• -size_large should be set to average LGWR write size which is often
about 5-20k for OLTP systems
• -num_large n
• multiple instances
• multiple LGWR threads in RAC
• redo logs multiplexing
• Analyze IOPS and response time
• Gather from Orion run’s trace file
- 56. Scenarios: LGWR writes visualization
ODA SSD sequential 32K IO streams (triple mirroring)
[chart: Writes per second (0-8000) and Average Response Time, ms (0-1.00) vs concurrent threads (2-16)]
Because you can't throttle individual threads, each thread goes as fast as it can, so you are always pushing some limit: you will be throttled by the maximum the I/O subsystem can deliver, or by CPU - but Orion consumes very little CPU, so you can ignore it.
- 57. Combining different workloads
• Start multiple parallel Orion runs
• OLTP -matrix point -num_large 0 -num_small X
• LGWR -matrix point -num_large 1 -num_small 0 -write 100
• ARCH -matrix point -num_small 0 -write {0 | 50}
• RMAN - matrix point -num_small 0 -write {0 | 50}
• Add batch data load with large parallel writes
• Add batch reporting (DW-like) with large reads
Cannot schedule a run with repetitive data points
- must schedule multiple consecutive runs
Cannot throttle IO other than controlling the number
of outstanding IOs
57 © 2011-2012 Pythian
Combining multiple runs is only reliable if using -matrix point.
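Since Orion cannot repeat a data point within one run, a small wrapper has to schedule the consecutive runs itself. A sketch, with the orion path and run parameters as assumptions (commands are collected and echoed, not executed):

```shell
# Repeat the same OLTP point run back to back to collect repetitive
# data points. ORION path, repeat count and parameters are assumptions.
ORION=${ORION:-orion}
REPEATS=3
REP_CMDS=()
for i in $(seq 1 "$REPEATS"); do
  REP_CMDS+=("$ORION -testname oltp_rep${i} -run advanced -duration 60 -matrix point -num_large 0 -num_small 10 -write 20")
done
printf '%s\n' "${REP_CMDS[@]}"   # echo only; eval each string to execute a pass
```

A distinct -testname per pass keeps each run's summary and trace files separate for later comparison.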
- 58. EC2 large 5 EBS disks: first run to test scalability
Initial OLTP test with 5 disks and 20% writes: IOPS and average
response time against the number of concurrent IOs (1-25).

Concurrent IOs      1    2    3    4    5    6    7    8      9      10     11    12     13     14     15     16     17     18     19     20     21     22     23     24     25
IOPS                156  326  178  411  532  729  928  1,103  1,023  1,070  964   1,202  1,285  1,232  1,204  1,245  1,352  1,338  1,360  1,149  1,379  1,327  1,334  1,362  1,363
Response time, ms   6.4  6.1  10.2 9.7  9.3  8.2  7.5  7.2    8.8    9.3    11.4  10     10.1   11.3   12.4   12.8   12.6   13.4   14     17.3   15.2   16.5   17.2   17.6   18.2
58 © 2011-2012 Pythian
My initial run gives me a general idea of how my subsystem would
scale under different OLTP load with 20% writes. If I'm curious, I
might go further and perform a few runs with different write
percentages and visualize the difference.
- 59. Let’s mix in additional I/O workloads
DUR=60
# OLTP test of scalability - original first run
# /root/orion11203/orion -testname baseoltp -run advanced -duration $DUR \
#   -matrix row -num_large 0 -write 20
# OLTP point
/root/orion11203/orion -testname oltp -run advanced -duration $DUR -matrix point \
  -num_large 0 -num_small 10 -write 20 &
# Adding LGWR
/root/orion11203/orion -testname lgwr -run advanced -duration $DUR -matrix point \
  -num_large 1 -num_small 0 -type seq -num_streamIO 1 -size_large 5 -write 100 &
# Adding ARCH
/root/orion11203/orion -testname arch -run advanced -duration $DUR -matrix point \
  -num_large 2 -num_small 0 -type seq -num_streamIO 1 -size_large 1024 -write 50 &
# Backup in 1 channel
# /root/orion11203/orion -testname backup -run advanced -duration $DUR -matrix point \
#   -num_large 1 -num_small 0 -type seq -num_streamIO 1 -size_large 1024 -write 0 &
# Backup in 4 channels
# /root/orion11203/orion -testname backup -run advanced -duration $DUR -matrix point \
#   -num_large 4 -num_small 0 -type seq -num_streamIO 1 -size_large 1024 -write 0 &
wait
59 © 2011-2012 Pythian
The first, commented-out command is the one I used to assess initial
scalability and build the run visualized on the previous slide.
I then convert it to the "OLTP point" run and execute it.
Next I add the LGWR run in parallel, then ARCH, collecting another
data point each time.
Note that all the runs start at the same time in the background, and
the script waits for all background jobs to complete at the end using
the "wait" command.
- 60. EC2... visualizing combined workload impact
[Two charts: LGWR writes per second with LGWR write latency (top),
and OLTP IOPS with response time (bottom), for each workload
combination]

Scenario          OLTP IOPS  Response time, ms  LGWR writes  LGWR write, ms
OLTP only              1306                7.7
OLTP+LGWR              1239                8.1          139             7.1
OLTP+LGWR+ARCH          576               17.4           17            56.0
OLTP+LGWR+RMAN1         778               12.8           38            26.1
OLTP+LGWR+RMAN4         571               17.5           49            20.3
60 © 2011-2012 Pythian
I can then record how my OLTP traffic is affected in different
scenarios including LGWR performance.
- 61. The best Orion 11.2 new feature
Histograms!
Bucket              LGWR, no ARCH  LGWR, with ARCH
0 - 128                         0                0
128 - 256                       0                0
256 - 512                       0                0
512 - 1024                   1085                1
1024 - 2048                  3376                8
2048 - 4096                   395                1
4096 - 8192                   845                0
8192 - 16384                 1406                2
16384 - 32768                1115              161
32768 - 65536                 161              699
65536 - 131072                  4              169
131072 - 262144                 0               17
262144 - 524288                 1               10
524288 - 1048576                0                2
1048576 - 2097152               0                1
61 © 2011-2012 Pythian
- 62. Got RAC? Schedule parallel runs on each node
HP blades
HP Virtual Connect
Flex10
Big NetApp box
100 disks
62 © 2011-2012 Pythian
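The parallel per-node runs can be launched with the same background-and-wait pattern as the slide-59 script, pushed out over ssh. A sketch, assuming hypothetical node names and orion path; the ssh commands are echoed rather than executed:

```shell
# Launch a simultaneous Orion run on every RAC node and wait for all of
# them to finish. Node list and ORION path are assumptions.
NODES="rac1 rac2 rac3"
ORION=/u01/app/11.2.0/grid/bin/orion
PIDS=()
for node in $NODES; do
  # In a real run, drop the leading 'echo' so ssh actually executes orion.
  echo ssh "$node" "$ORION -testname ${node}_oltp -run advanced -matrix point -num_large 0 -num_small 20 -write 20" &
  PIDS+=($!)
done
wait   # block until every node's run completes, like the slide-59 script
```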
- 63. Example of Failed Expectations
NetApp NAS, 1 Gbit Ethernet, 42 disks
[Two charts of IOPS and latency against 1-100 concurrent IOs:
a read-only run (IOPS scale to 5,000; latency scale to 30 ms) and
a read-write run (latency scale to 50 ms)]
63 © 2011-2012 Pythian
- 64. Tune-Up Results
Switched from Intel to Broadcom NICs and disabled snapshots
[Two charts after the tune-up, IOPS and latency against 1-100
concurrent IOs: the read-only run (IOPS scale to 10,000; latency
scale to 12 ms) and the read-write run (IOPS scale to 15,000;
latency scale to 8 ms)]
64 © 2011-2012 Pythian
- 65. Possible “What-If” scenarios
• Impact of a failed disk in a RAID group
• Different block size
• Different ASM allocation unit size (-stripe)
• Assess foreign workload impact (shared SAN with other
servers)
• Test impact of configuration / infrastructure changes
• Impact of backup or a batch job
• Impact of decreased MTTR target (higher -write %)
• Platform stability test (repeating the same data point for
many days)
• Impact of CPU starvation
65 © 2011-2012 Pythian
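The "impact of CPU starvation" what-if from the list above can be staged with simple busy loops. A sketch, assuming bash on Linux, with the orion invocation itself left as a commented placeholder:

```shell
# Saturate every core with a pure-CPU busy loop while an Orion point run
# executes, then stop the burners. The orion path below is a placeholder.
NCPU=$(getconf _NPROCESSORS_ONLN)
BURNERS=()
for i in $(seq 1 "$NCPU"); do
  ( while :; do :; done ) &   # one busy-loop subshell per core
  BURNERS+=($!)
done
# /path/to/orion -testname cpu_starved -run advanced -matrix point ...
kill "${BURNERS[@]}" 2>/dev/null   # stop the burners once the run finishes
```

Comparing this run against an unstarved baseline shows how much of the measured I/O capability depends on having CPU to spare.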
- 66. Concurrent IOs on axis X is not always the best...
ODA: Small IOPS scalability and data placement / HDDs
[Chart: throughput (IOPS, scale to 6,000) and IO response time (scale
to 25 ms) against 1-100 concurrent IOs, for three data placements:
whole disk, outside 40%, inside 60%]
66 © 2011-2012 Pythian
- 67. Smarter presentation
50% more IOPS at the same response time
ODA: Improving IO throughput by data placement
[Chart: IOPS (scale to 6,000) against IO response time (0-25 ms) for
the whole-disk, outside-40% and inside-60% placements]
67 © 2011-2012 Pythian
- 68. Storage types
• Anything as long as ASYNC IO is supported
• Local storage (LUNs or filesystem)
• NAS via NFS
• iSCSI / FC devices (any block or raw device)
• Cluster filesystem should work just fine
68 © 2011-2012 Pythian
- 69. Beware of thin provisioning and other NAS magic
• Smart storage technologies can play bad jokes - e.g. reads of never-
written, thin-provisioned or all-zero blocks may be served without
touching disk at all
• If in doubt, "initialize" the disks with non-zero data first
69 © 2011-2012 Pythian
- 70. Orion 11.2
Included in
• Database
• Grid home
• Client (tested with the Administrator install type)
Dependencies
                    11.2.0.1  11.2.0.2  11.2.0.3
libcell11.so            x         x         x
libclntsh.so.11.1                 x         x
libskgxp11.so                     x         x
libnnz11.so                       x         x
70 © 2011-2012 Pythian
- 71. Orion with SLOB (Silly Little Oracle Benchmark)
• Orion gives more control
• Orion is easier to set up
• Orion uses very little CPU - it doesn't do anything with the data
• Easier to saturate the IO subsystem without CPU starvation
• Less realistic if you want to account for database CPU use for
LIOs and processing the data
• Less realistic for multi-process orchestration
• SLOB is more realistic but more difficult to control
71 © 2011-2012 Pythian
- 73. Thank you and Q&A
To contact us…
sales@pythian.com or hr@pythian.com
1-866-PYTHIAN
gorbachev@pythian.com
To follow us…
http://www.pythian.com/news/
http://www.facebook.com/pages/The-Pythian-Group/
http://twitter.com/pythian
http://www.linkedin.com/company/pythian
73 © 2011-2012 Pythian