Mondrian MDX requests can be slow in production for several reasons: large schemas with many dimensions and measures add processing overhead, and queries against large database tables take time to execute. To address these issues, this document covers profiling Mondrian requests, optimizing the JVM and the database, using caching, and tuning the Mondrian schema. It also describes a Mondrian schema pool that reuses schema objects and periodically flushes unused schemas to free memory.
9. Black Mondrian Magic
[Diagram: Mondrian with its schema and segment cache on top of the DB schema, which contains dimension tables (Dimension1-Dimension4) and a measure (fact) table]
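For example, the following MDX query: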
SELECT {[Measures].[Store Sales],
[Measures].[Store Cost],
[Measures].[Unit Sales] } ON COLUMNS,
[Customers].[State Province].Members ON ROWS
FROM [Sales]
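makes Mondrian first run a SQL query to read the [State Province] dimension members (SqlTupleReader.readTuples in the debug log below):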
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`
group by `d_customers`.`country`, `d_customers`.`state_province`
order by ISNULL(`d_customers`.`country`) ASC, `d_customers`.`country` ASC,
ISNULL(`d_customers`.`state_province`) ASC, `d_customers`.`state_province` ASC
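and then a segment-load SQL query that aggregates the measures for those members (Segment.load in the debug log below):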
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`,
sum(`sales`.`store_sales`) as `m0`, sum(`sales`.`store_cost`) as `m1`,
sum(`sales`.`unit_sales`) as `m2`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`, `eazybi_development_dwh_20`.`sales` as `sales`
where `sales`.`customer_id` = `d_customers`.`id` and `d_customers`.`country` = 'USA'
group by `d_customers`.`country`, `d_customers`.`state_province`
10. Debugging in development
log4j.rootLogger=DEBUG, MONDRIAN
log4j.appender.MONDRIAN=org.apache.log4j.ConsoleAppender
log4j.appender.MONDRIAN.layout=org.apache.log4j.PatternLayout
log4j.appender.MONDRIAN.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss Z} %-5p [%c] %m%n
log4j.category.mondrian.mdx=DEBUG
log4j.category.mondrian.sql=DEBUG
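With the mondrian.mdx and mondrian.sql categories at DEBUG, every executed MDX statement and each SQL statement Mondrian generates is logged together with execution and fetch timings, as the next slide shows.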
11. Debugging in development
2018-11-22 16:32:02 +0000 DEBUG [mondrian.mdx] 308: select {[Measures].[Store Sales], [Measures].
[Store Cost], [Measures].[Unit Sales]} ON COLUMNS,
[Customers].[State Province].Members ON ROWS
from [Sales]
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 7: SqlTupleReader.readTuples [[Customers].[State
Province]]: executing sql [select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as
`c1` from `eazybi_development_dwh_20`.`d_customers` as `d_customers` group by `d_customers`.`country`,
`d_customers`.`state_province` order by ISNULL(`d_customers`.`country`) ASC, `d_customers`.`country`
ASC, ISNULL(`d_customers`.`state_province`) ASC, `d_customers`.`state_province` ASC]
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 7: , exec 0 ms
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 7: , exec+fetch 3 ms, 3 rows
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 8: Segment.load: executing sql [select
`d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`, sum(`sales`.`store_sales`) as
`m0`, sum(`sales`.`store_cost`) as `m1`, sum(`sales`.`unit_sales`) as `m2` from
`eazybi_development_dwh_20`.`d_customers` as `d_customers`, `eazybi_development_dwh_20`.`sales` as
`sales` where `sales`.`customer_id` = `d_customers`.`id` and `d_customers`.`country` = 'USA' group by
`d_customers`.`country`, `d_customers`.`state_province`]
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 8: , exec 0 ms
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 8: , exec+fetch 7 ms, 3 rows
2018-11-22 16:32:02 +0000 DEBUG [mondrian.mdx] 308: exec: 68 ms
12. Query Timing and Profiling
/**
* Provides hooks for recording timing information of components of Query
* execution.
*
* <p>NOTE: This class is experimental and subject to change/removal
* without notice.
*
* <p>Code that executes as part of a Query can call
* {@link QueryTiming#markStart(String)}
* before executing, and {@link QueryTiming#markEnd(String)} afterwards, or can
* track execution times manually and call
* {@link QueryTiming#markFull(String, long)}.
*
* <p>To read timing information, add a handler to the statement using
* {@link mondrian.server.Statement#enableProfiling} and implement the
* {@link mondrian.spi.ProfileHandler#explain(String, QueryTiming)} method.
*
* @author jbarnett
*/
public class QueryTiming {
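A minimal usage sketch based on the Javadoc above (assuming a mondrian.server.Statement, for example obtained by unwrapping an olap4j statement; the handler here simply prints the plan and the timing summary):

import mondrian.olap.QueryTiming;
import mondrian.server.Statement;
import mondrian.spi.ProfileHandler;

public class ProfilingExample {
    // Attach a profiling handler before executing an MDX statement.
    static void enableProfiling(Statement statement) {
        statement.enableProfiling(new ProfileHandler() {
            public void explain(String plan, QueryTiming timing) {
                System.out.println(plan);       // the query execution plan
                if (timing != null) {
                    System.out.println(timing); // per-component timing summary
                }
            }
        });
    }
}

After the query completes, the handler receives the timing summary in the "... invoked N times for total of Nms" format shown in the SQL logging results below.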
16. SQL logging in a string buffer
# Capture Mondrian's SQL debug log in an in-memory string buffer
sql_logger = Java::org.apache.log4j.Logger.getLogger('mondrian.rolap.RolapUtil')
sql_logger.setAdditivity(false)
sql_log_buffer = StringIO.new
sql_log_stream = sql_log_buffer.to_outputstream
log_layout = org.apache.log4j.PatternLayout.new("%m%n")
log_appender = org.apache.log4j.WriterAppender.new(log_layout, sql_log_stream)
class_synchronize { sql_logger.addAppender(log_appender) }
sql_logger.setLevel(org.apache.log4j.Level::DEBUG)

# After the query has run, read the captured log lines from the buffer.
# (class_synchronize, log_table_prefix and format_log_sql_query are
# application-specific helpers.)
log_lines = sql_log_buffer.string.lines.map(&:strip)
# Always show the last log line as the last SQL probably didn't complete
last_log_line = log_lines.pop
# Filter SQL queries only for the current account using the table schema prefix
from_regexp = /^from #{Regexp.escape log_table_prefix}/
log_lines.grep(/done executing sql/).concat(Array(last_log_line)).map do |log_line|
  formatted_sql_query = format_log_sql_query(log_line)
  formatted_sql_query if formatted_sql_query =~ from_regexp
end.compact
17. SQL logging results
SqlTupleReader.readTuples [[Customers].[State Province]]: done executing sql [
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`
group by `d_customers`.`country`, `d_customers`.`state_province`
order by ISNULL(`d_customers`.`country`) ASC, `d_customers`.`country` ASC,
ISNULL(`d_customers`.`state_province`) ASC, `d_customers`.`state_province` ASC
], exec+fetch 3 ms, 3 rows, ex=8, close=8, open=[]
Segment.load: done executing sql [
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`,
sum(`sales`.`store_sales`) as `m0`, sum(`sales`.`store_cost`) as `m1`, sum(`sales`.`unit_sales`) as
`m2`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`, `eazybi_development_dwh_20`.`sales`
as `sales`
where `sales`.`customer_id` = `d_customers`.`id` and `d_customers`.`country` = 'USA'
group by `d_customers`.`country`, `d_customers`.`state_province`
], exec+fetch 7 ms, 3 rows, ex=9, close=9, open=[]
SqlStatement-Segment.load invoked 1 times for total of 7ms. (Avg. 7ms/invocation)
SqlStatement-SqlTupleReader.readTuples [[Customers].[State Province]] invoked 1 times for total of 3ms. (Avg. 3ms/invocation)
22. Mondrian schema pool
/**
* A collection of schemas, identified by their connection properties
* (catalog name, JDBC URL, and so forth).
*/
class RolapSchemaPool {
    private final Map<SchemaKey, ExpiringReference<RolapSchema>> mapKeyToSchema =
        new HashMap<SchemaKey, ExpiringReference<RolapSchema>>();
/**
* An expiring reference is a subclass of {@link SoftReference}
* which pins the reference in memory until a certain timeout
* is reached. After that, the reference is free to be garbage
* collected if needed.
*
* <p>The timeout value must be provided as a String representing
* both the time value and the time unit. For example, 1 second is
* represented as "1s". Valid time units are [d, h, m, s, ms],
* representing respectively days, hours, minutes, seconds and
* milliseconds.
*/
public class ExpiringReference<T> extends SoftReference<T> {
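To illustrate the mechanism (a simplified sketch, not Mondrian's actual implementation): an expiring reference can hold an extra hard reference to the referent, which keeps the object strongly reachable, and clear it once the timeout elapses so that only the soft reference remains:

import java.lang.ref.SoftReference;
import java.util.Timer;
import java.util.TimerTask;

// Sketch only: the hard reference pins the referent in memory; after the
// timeout it is cleared, and the GC may then reclaim the object under
// memory pressure via the remaining soft reference.
public class ExpiringReferenceSketch<T> extends SoftReference<T> {
    private volatile T hardRef; // its only purpose is to keep the referent reachable

    public ExpiringReferenceSketch(T referent, long timeoutMillis) {
        super(referent);
        this.hardRef = referent;
        Timer timer = new Timer(true); // daemon timer thread
        timer.schedule(new TimerTask() {
            @Override public void run() {
                hardRef = null; // unpin; object becomes softly reachable only
            }
        }, timeoutMillis);
    }
}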
23. Flush unused Mondrian schemas
• Checkout / checkin Mondrian connections from a pool
• Store the last-used timestamp for each schema
• Periodically flush unused schemas (a sketch of the approach follows this list)
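A rough illustration of the idea (hypothetical names and intervals, not eazyBI's actual implementation): record the last-used time per schema key on every checkout or checkin, and run a scheduled sweep that flushes schemas idle longer than a TTL.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchemaFlusher {
    private final Map<String, Long> lastUsed = new ConcurrentHashMap<>();
    private final long ttlMillis = TimeUnit.MINUTES.toMillis(30);

    // Call on every connection checkout / checkin to mark the schema as used.
    public void touch(String schemaKey) {
        lastUsed.put(schemaKey, System.currentTimeMillis());
    }

    // Start a background sweep every 10 minutes.
    public void start() {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::flushUnused, 10, 10, TimeUnit.MINUTES);
    }

    private void flushUnused() {
        long now = System.currentTimeMillis();
        int total = lastUsed.size();
        int flushed = 0;
        for (Iterator<Map.Entry<String, Long>> it = lastUsed.entrySet().iterator();
             it.hasNext();) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() > ttlMillis) {
                flushSchema(entry.getKey()); // hypothetical release of the schema
                it.remove();
                flushed++;
            }
        }
        System.out.println("flushed " + flushed + " schemas from total " + total + " schemas");
    }

    private void flushSchema(String schemaKey) {
        // Placeholder: a real implementation would evict the RolapSchema
        // from the schema pool and close the associated resources.
    }
}

In production, such periodic flushing keeps the number of pooled schemas bounded, producing log entries like these: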
2018-11-23 07:43:18 +0000 INFO: flushed 11 schemas from total 122 schemas (0.080 sec)
2018-11-23 07:53:27 +0000 INFO: flushed 15 schemas from total 134 schemas (0.433 sec)
2018-11-23 08:03:38 +0000 INFO: flushed 20 schemas from total 137 schemas (0.138 sec)
2018-11-23 08:15:06 +0000 INFO: flushed 20 schemas from total 136 schemas (0.135 sec)
2018-11-23 08:25:27 +0000 INFO: flushed 21 schemas from total 134 schemas (0.196 sec)
2018-11-23 08:36:20 +0000 INFO: flushed 9 schemas from total 135 schemas (0.069 sec)
2018-11-23 08:47:00 +0000 INFO: flushed 11 schemas from total 144 schemas (0.176 sec)
2018-11-23 08:57:14 +0000 INFO: flushed 13 schemas from total 153 schemas (0.139 sec)
2018-11-23 09:08:31 +0000 INFO: flushed 22 schemas from total 160 schemas (0.163 sec)
2018-11-23 09:19:16 +0000 INFO: flushed 11 schemas from total 156 schemas (0.069 sec)