Mondrian MDX requests can be slow in production for several reasons: large schemas with many dimensions and measures add processing overhead, and queries against large database tables take time to execute. To address these issues, this document covers profiling Mondrian requests, optimizing the JVM and the database, using caching, and tuning the Mondrian schema. It also describes a Mondrian schema pool that reuses schema objects and periodically flushes unused schemas to free memory.
9. Black Mondrian Magic
[Diagram: Mondrian with its schema and segment cache on top of the DB schema, which contains dimension tables (Dimension1-Dimension4) and a measure (fact) table]
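For example, the following MDX query: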
SELECT {[Measures].[Store Sales],
[Measures].[Store Cost],
[Measures].[Unit Sales] } ON COLUMNS,
[Customers].[State Province].Members ON ROWS
FROM [Sales]
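makes Mondrian first run a SQL query to read the [State Province] dimension members (SqlTupleReader.readTuples in the debug log below):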
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`
group by `d_customers`.`country`, `d_customers`.`state_province`
order by ISNULL(`d_customers`.`country`) ASC, `d_customers`.`country` ASC,
ISNULL(`d_customers`.`state_province`) ASC, `d_customers`.`state_province` ASC
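and then a segment-load SQL query that aggregates the measures for those members (Segment.load in the debug log below):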
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`,
sum(`sales`.`store_sales`) as `m0`, sum(`sales`.`store_cost`) as `m1`,
sum(`sales`.`unit_sales`) as `m2`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`, `eazybi_development_dwh_20`.`sales` as `sales`
where `sales`.`customer_id` = `d_customers`.`id` and `d_customers`.`country` = 'USA'
group by `d_customers`.`country`, `d_customers`.`state_province`
10. Debugging in development
log4j.rootLogger=DEBUG, MONDRIAN
log4j.appender.MONDRIAN=org.apache.log4j.ConsoleAppender
log4j.appender.MONDRIAN.layout=org.apache.log4j.PatternLayout
log4j.appender.MONDRIAN.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss Z} %-5p [%c] %m%n
log4j.category.mondrian.mdx=DEBUG
log4j.category.mondrian.sql=DEBUG
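With the mondrian.mdx and mondrian.sql categories at DEBUG, every executed MDX statement and each SQL statement Mondrian generates is logged together with execution and fetch timings, as the next slide shows.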
11. Debugging in development
2018-11-22 16:32:02 +0000 DEBUG [mondrian.mdx] 308: select {[Measures].[Store Sales], [Measures].
[Store Cost], [Measures].[Unit Sales]} ON COLUMNS,
[Customers].[State Province].Members ON ROWS
from [Sales]
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 7: SqlTupleReader.readTuples [[Customers].[State
Province]]: executing sql [select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as
`c1` from `eazybi_development_dwh_20`.`d_customers` as `d_customers` group by `d_customers`.`country`,
`d_customers`.`state_province` order by ISNULL(`d_customers`.`country`) ASC, `d_customers`.`country`
ASC, ISNULL(`d_customers`.`state_province`) ASC, `d_customers`.`state_province` ASC]
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 7: , exec 0 ms
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 7: , exec+fetch 3 ms, 3 rows
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 8: Segment.load: executing sql [select
`d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`, sum(`sales`.`store_sales`) as
`m0`, sum(`sales`.`store_cost`) as `m1`, sum(`sales`.`unit_sales`) as `m2` from
`eazybi_development_dwh_20`.`d_customers` as `d_customers`, `eazybi_development_dwh_20`.`sales` as
`sales` where `sales`.`customer_id` = `d_customers`.`id` and `d_customers`.`country` = 'USA' group by
`d_customers`.`country`, `d_customers`.`state_province`]
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 8: , exec 0 ms
2018-11-22 16:32:02 +0000 DEBUG [mondrian.sql] 8: , exec+fetch 7 ms, 3 rows
2018-11-22 16:32:02 +0000 DEBUG [mondrian.mdx] 308: exec: 68 ms
12. Query Timing and Profiling
/**
* Provides hooks for recording timing information of components of Query
* execution.
*
* <p>NOTE: This class is experimental and subject to change/removal
* without notice.
*
* <p>Code that executes as part of a Query can call
* {@link QueryTiming#markStart(String)}
* before executing, and {@link QueryTiming#markEnd(String)} afterwards, or can
* track execution times manually and call
* {@link QueryTiming#markFull(String, long)}.
*
* <p>To read timing information, add a handler to the statement using
* {@link mondrian.server.Statement#enableProfiling} and implement the
* {@link mondrian.spi.ProfileHandler#explain(String, QueryTiming)} method.
*
* @author jbarnett
*/
public class QueryTiming {
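A minimal usage sketch based on the Javadoc above (assuming a mondrian.server.Statement, for example obtained by unwrapping an olap4j statement; the handler here simply prints the plan and the timing summary):

import mondrian.olap.QueryTiming;
import mondrian.server.Statement;
import mondrian.spi.ProfileHandler;

public class ProfilingExample {
    // Attach a profiling handler before executing an MDX statement.
    static void enableProfiling(Statement statement) {
        statement.enableProfiling(new ProfileHandler() {
            public void explain(String plan, QueryTiming timing) {
                System.out.println(plan);       // the query execution plan
                if (timing != null) {
                    System.out.println(timing); // per-component timing summary
                }
            }
        });
    }
}

After the query completes, the handler receives the timing summary in the "... invoked N times for total of Nms" format shown in the SQL logging results below.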
16. SQL logging in a string buffer
# Capture Mondrian's SQL debug log in an in-memory string buffer
sql_logger = Java::org.apache.log4j.Logger.getLogger('mondrian.rolap.RolapUtil')
sql_logger.setAdditivity(false)
sql_log_buffer = StringIO.new
sql_log_stream = sql_log_buffer.to_outputstream
log_layout = org.apache.log4j.PatternLayout.new("%m%n")
log_appender = org.apache.log4j.WriterAppender.new(log_layout, sql_log_stream)
class_synchronize { sql_logger.addAppender(log_appender) }
sql_logger.setLevel(org.apache.log4j.Level::DEBUG)

# After the query has run, read the captured log lines from the buffer.
# (class_synchronize, log_table_prefix and format_log_sql_query are
# application-specific helpers.)
log_lines = sql_log_buffer.string.lines.map(&:strip)
# Always show the last log line as the last SQL probably didn't complete
last_log_line = log_lines.pop
# Filter SQL queries only for the current account using the table schema prefix
from_regexp = /^from #{Regexp.escape log_table_prefix}/
log_lines.grep(/done executing sql/).concat(Array(last_log_line)).map do |log_line|
  formatted_sql_query = format_log_sql_query(log_line)
  formatted_sql_query if formatted_sql_query =~ from_regexp
end.compact
17. SQL logging results
SqlTupleReader.readTuples [[Customers].[State Province]]: done executing sql [
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`
group by `d_customers`.`country`, `d_customers`.`state_province`
order by ISNULL(`d_customers`.`country`) ASC, `d_customers`.`country` ASC,
ISNULL(`d_customers`.`state_province`) ASC, `d_customers`.`state_province` ASC
], exec+fetch 3 ms, 3 rows, ex=8, close=8, open=[]
Segment.load: done executing sql [
select `d_customers`.`country` as `c0`, `d_customers`.`state_province` as `c1`,
sum(`sales`.`store_sales`) as `m0`, sum(`sales`.`store_cost`) as `m1`, sum(`sales`.`unit_sales`) as
`m2`
from `eazybi_development_dwh_20`.`d_customers` as `d_customers`, `eazybi_development_dwh_20`.`sales`
as `sales`
where `sales`.`customer_id` = `d_customers`.`id` and `d_customers`.`country` = 'USA'
group by `d_customers`.`country`, `d_customers`.`state_province`
], exec+fetch 7 ms, 3 rows, ex=9, close=9, open=[]
SqlStatement-Segment.load invoked 1 times for total of 7ms. (Avg. 7ms/invocation)
SqlStatement-SqlTupleReader.readTuples [[Customers].[State Province]] invoked 1 times for total of 3ms. (Avg. 3ms/invocation)
22. Mondrian schema pool
/**
* A collection of schemas, identified by their connection properties
* (catalog name, JDBC URL, and so forth).
*/
class RolapSchemaPool {
    private final Map<SchemaKey, ExpiringReference<RolapSchema>> mapKeyToSchema =
        new HashMap<SchemaKey, ExpiringReference<RolapSchema>>();
/**
* An expiring reference is a subclass of {@link SoftReference}
* which pins the reference in memory until a certain timeout
* is reached. After that, the reference is free to be garbage
* collected if needed.
*
* <p>The timeout value must be provided as a String representing
* both the time value and the time unit. For example, 1 second is
* represented as "1s". Valid time units are [d, h, m, s, ms],
* representing respectively days, hours, minutes, seconds and
* milliseconds.
*/
public class ExpiringReference<T> extends SoftReference<T> {
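To illustrate the mechanism (a simplified sketch, not Mondrian's actual implementation): an expiring reference can hold an extra hard reference to the referent, which keeps the object strongly reachable, and clear it once the timeout elapses so that only the soft reference remains:

import java.lang.ref.SoftReference;
import java.util.Timer;
import java.util.TimerTask;

// Sketch only: the hard reference pins the referent in memory; after the
// timeout it is cleared, and the GC may then reclaim the object under
// memory pressure via the remaining soft reference.
public class ExpiringReferenceSketch<T> extends SoftReference<T> {
    private volatile T hardRef; // its only purpose is to keep the referent reachable

    public ExpiringReferenceSketch(T referent, long timeoutMillis) {
        super(referent);
        this.hardRef = referent;
        Timer timer = new Timer(true); // daemon timer thread
        timer.schedule(new TimerTask() {
            @Override public void run() {
                hardRef = null; // unpin; object becomes softly reachable only
            }
        }, timeoutMillis);
    }
}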
23. Flush unused Mondrian schemas
• Checkout / checkin Mondrian connections from a pool
• Store the last-used timestamp for each schema
• Periodically flush unused schemas (a sketch of the approach follows this list)
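A rough illustration of the idea (hypothetical names and intervals, not eazyBI's actual implementation): record the last-used time per schema key on every checkout or checkin, and run a scheduled sweep that flushes schemas idle longer than a TTL.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchemaFlusher {
    private final Map<String, Long> lastUsed = new ConcurrentHashMap<>();
    private final long ttlMillis = TimeUnit.MINUTES.toMillis(30);

    // Call on every connection checkout / checkin to mark the schema as used.
    public void touch(String schemaKey) {
        lastUsed.put(schemaKey, System.currentTimeMillis());
    }

    // Start a background sweep every 10 minutes.
    public void start() {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::flushUnused, 10, 10, TimeUnit.MINUTES);
    }

    private void flushUnused() {
        long now = System.currentTimeMillis();
        int total = lastUsed.size();
        int flushed = 0;
        for (Iterator<Map.Entry<String, Long>> it = lastUsed.entrySet().iterator();
             it.hasNext();) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() > ttlMillis) {
                flushSchema(entry.getKey()); // hypothetical release of the schema
                it.remove();
                flushed++;
            }
        }
        System.out.println("flushed " + flushed + " schemas from total " + total + " schemas");
    }

    private void flushSchema(String schemaKey) {
        // Placeholder: a real implementation would evict the RolapSchema
        // from the schema pool and close the associated resources.
    }
}

In production, such periodic flushing keeps the number of pooled schemas bounded, producing log entries like these: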
2018-11-23 07:43:18 +0000 INFO: flushed 11 schemas from total 122 schemas (0.080 sec)
2018-11-23 07:53:27 +0000 INFO: flushed 15 schemas from total 134 schemas (0.433 sec)
2018-11-23 08:03:38 +0000 INFO: flushed 20 schemas from total 137 schemas (0.138 sec)
2018-11-23 08:15:06 +0000 INFO: flushed 20 schemas from total 136 schemas (0.135 sec)
2018-11-23 08:25:27 +0000 INFO: flushed 21 schemas from total 134 schemas (0.196 sec)
2018-11-23 08:36:20 +0000 INFO: flushed 9 schemas from total 135 schemas (0.069 sec)
2018-11-23 08:47:00 +0000 INFO: flushed 11 schemas from total 144 schemas (0.176 sec)
2018-11-23 08:57:14 +0000 INFO: flushed 13 schemas from total 153 schemas (0.139 sec)
2018-11-23 09:08:31 +0000 INFO: flushed 22 schemas from total 160 schemas (0.163 sec)
2018-11-23 09:19:16 +0000 INFO: flushed 11 schemas from total 156 schemas (0.069 sec)