Prof. Dr. Wil van der Aalst, Distinguished University Professor, Eindhoven University of Technology
Recently, process mining emerged as a new scientific discipline on the interface between process models and event data. Conventional Business Process Management (BPM) and Workflow Management (WfM) approaches and tools are mostly model-driven with little consideration for event data. Data Mining (DM), Business Intelligence (BI), and Machine Learning (ML) focus on data without considering end-to-end process models. Process mining aims to bridge the gap between BPM and WfM on the one hand and DM, BI, and ML on the other hand. The challenge is to turn torrents of event data ("Big Data") into valuable insights related to performance and compliance. Process mining is one of the few mature approaches that can indeed be used to identify, understand, and predict bottlenecks, inefficiencies, deviations, and risks. Process mining helps organizations to "mine their own business": they are enabled to discover, monitor and improve real processes by extracting knowledge from event logs. In his talk, Wil van der Aalst will provide an overview of this exciting field. Moreover, he will focus on the predictive value of process mining. All of this will be illustrated using many real-life examples demonstrating the unique capabilities of today’s process mining tools.
2. PAGE 1
How many
students will
take my class
next year?
What is the
real
curriculum?
When, where
and why do
students drop
out?
When, where and
why do students
deviate?
Will I
graduate or
drop out?
When will I
graduate?
Should I take this
course if I want to
graduate in two
years from now?
4. The data scientist …
PAGE 3
Data
Science
Probability
and Statistics
Stochastic
Networks
Data Mining
Process
Mining
Visualization
Large-Scale
Distributed
SystemsData-
Intensive
Algorithms
Data-Driven
Operations
Management
Data-Driven
Innovation and
Business
Human and
Social Analytics
Privacy, Security,
Ethics, and
Governance
Internet of
Things
5. Data Science Center Eindhoven (DSC/e)
PAGE 4
Context: Why are we using data science,
does it have the intended effect, and will
people accept it?
Analysis: How to turn data into real value
(models, answers/decisions, and
visualizations/insights)?
Enabling technologies: How to get the
data and deal with computational/
infrastructural challenges (big data and
hard questions)?
Probability and Statistics
Stochastic Networks
Data Mining
Process Mining
Visualization
Large-Scale Distributed
Systems
Data-Intensive Algorithms
Data-Driven Operations
Management
Data-Driven Innovation and
Business
Human and Social Analytics
Privacy, Security, Ethics, and
Governance
Internet of Things
28 research
groups are
involved
6. Data Science Center Eindhoven (DSC/e)
PAGE 5
Context: Why are we using data science,
does it have the intended effect, and will
people accept it?
Analysis: How to turn data into real value
(models, answers/decisions, and
visualizations/insights)?
Enabling technologies: How to get the
data and deal with computational/
infrastructural challenges (big data and
hard questions)?
Probability and Statistics
Stochastic Networks
Data Mining
Process Mining
Visualization
Large-Scale Distributed
Systems
Data-Intensive Algorithms
Data-Driven Operations
Management
Data-Driven Innovation and
Business
Human and Social Analytics
Privacy, Security, Ethics, and
Governance
Internet of Things
systems
infrastructures
cities
organizations
people
[RP1] Process Analytics:
Improving Service While Cutting Costs
[RP2] Customer Journey:
Correlating Events to Learn and Influence
Customer Behavior
[RP3] Smart Maintenance & Diagnostics:
Safeguarding Availability
[RP4] Quantified Self:
Improving Performance and Well-Being
[RP5] Data Value and Privacy:
Economic and Legal Aspects of Data Science
[RP6] Smart Cities:
Ensuring Safety and Convenience for Citizens
[RP7] Smart Grids:
Data Intensive Infrastructures
7 research programs
12. PAGE 11
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
Case Activity Timestamp Resource
432 register travel request (a) 18-3-2014:9.15 John
432 get support from local manager (b) 18-3-2014:9.25 Mary
432 check budget by finance (d) 19-3-2014:8.55 John
432 decide (e) 19-3-2014:9.36 Sue
432 accept request (g) 19-3-2014:9.48 Mary
Play-In
Play-Out
Replay
Let's play
13. PAGE 12
Play-Out
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
Case Activity Timestamp Resource
432 register travel request (a) 18-3-2014:9.15 John
432 get support from local manager (b) 18-3-2014:9.25 Mary
432 check budget by finance (d) 19-3-2014:8.55 John
432 decide (e) 19-3-2014:9.36 Sue
432 accept request (g) 19-3-2014:9.48 Mary
14. Play Out: A possible scenario
PAGE 13
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
AND-
split
XOR-
split
XOR-
join
XOR-
join
AND-
join
XOR-
split
XOR-
join
a b d e g
Case Activity Timestamp Resource
432 register travel request (a) 18-3-2014:9.15 John
432 get support from local manager (b) 18-3-2014:9.25 Mary
432 check budget by finance (d) 19-3-2014:8.55 John
432 decide (e) 19-3-2014:9.36 Sue
432 accept request (g) 19-3-2014:9.48 Mary
15. Play Out: Process model allows for many
more scenarios
PAGE 14
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
abdeg
abcefbdeh
adceg adbeh
acbefbdeh
acdefcdefbdeh
adcefcdefbdefbdeg
abdegadcehadbeh
acbefbdegacdefcdefbdeh
adcefcdefbdefbdeg
abdeg
adceh
adbeh
acbefbdegacdefcdefbdeh
adcefcdefbdefbdeg
abdeg
16. PAGE 15
Play-In
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
Case Activity Timestamp Resource
432 register travel request (a) 18-3-2014:9.15 John
432 get support from local manager (b) 18-3-2014:9.25 Mary
432 check budget by finance (d) 19-3-2014:8.55 John
432 decide (e) 19-3-2014:9.36 Sue
432 accept request (g) 19-3-2014:9.48 Mary
17. Play In:
Process allowing for more traces
PAGE 16
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
abdeg
abcefbdeh
adcegadbeh
acbefbdehacdefcdefbdehadcefcdefbdefbdeg abdeg
adcehadbeh
acbefbdeg
acdefcdefbdeh
adcefcdefbdefbdeg
abdeg
adceh
adbehacbefbdeg
acdefcdefbdeh
adcefcdefbdefbdeg
abdeg
22. PAGE 21
Replay
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
Case Activity Timestamp Resource
432 register travel request (a) 18-3-2014:9.15 John
432 get support from local manager (b) 18-3-2014:9.25 Mary
432 check budget by finance (d) 19-3-2014:8.55 John
432 decide (e) 19-3-2014:9.36 Sue
432 accept request (g) 19-3-2014:9.48 Mary
23. Replay
PAGE 22
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a c d e g
24. Replay
PAGE 23
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a c
check budget
(d) is missing!
e g
?
25. Alignments: Relating reality and model
PAGE 24
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a c » e g
a c d e g
check budget (d) did not happen but
should have according to the model
26. Replay
PAGE 25
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a c d e gh
reject request (h)
is impossible
?
27. Alignments: Relating reality and model
PAGE 26
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a c h d e g
a c » d e g
reject request (h) happened but could
not happen according to the model
29. Any trace in reality can be related to a
path in the model
PAGE 28
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a c e f d d b c e h
30. Any trace in reality can be related to a
path in the model
PAGE 29
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a c » e f d d b c e h
a c d e f d » b » e h
check is
missing
one check
too many
cannot both
be done
32. Replay with timestamps
PAGE 31
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
a9.15 c9.20 d9.35 e10.15 g11.30
9.15
9.20
9.35
10.15
11.30
5 55
20 40
75
33. Replay with timestamps for many traces
PAGE 32
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
frequencies
of activities
frequencies
of paths
durations of
activities
waiting times and
other delays between
activities
34. PAGE 33
• conformance checking to diagnose deviations
• squeezing reality into the model to do model-based
analysis
Alignments are essential!
35. Example: BPI Challenge 2012
(Dutch financial institute, doi:10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f)
PAGE 34
“O_DECLINED” and “W_Wijzigen
contractgegevens” are often skipped
Many moves on log of
“O_CANCELLED”,
”O_CREATED”,
”O_SELECTED”,
“O_SENT” occurred
with the same
frequency value (i.e.
60) before parallel
branch
Many moves on log of
“W_Afhandelen
leads” ( > 2200 times)
occurred in the end of
traces
Loops of “W_Completeren aanvraag” and
“W_Nabellen offertes” are often performed
Work of Arya Adriansyah (Replay project)
36. PAGE 35
“O_DECLINED” and “W_Wijzigen
contractgegevens” are often skipped
Many moves on log of
“O_CANCELLED”,
”O_CREATED”,
”O_SELECTED”,
“O_SENT” occurred
with the same
frequency value (i.e.
60) before parallel
branch
Many moves on log of
“W_Afhandelen
leads” ( > 2200 times)
occurred in the end of
traces
Loops of “W_Completeren aanvraag” and
“W_Nabellen offertes” are often performed
Synchronous moves of
“Completeren aanvraag”
Move on log of “Completeren aanvraag”
Moves on model towards end of traces
Move on log of “O_CANCELLED” and “A_CANCELLED”
Auditor's toolbox
37. PAGE 36
“O_ACCEPTED” has average sojourn time of 27.07 minutes,
while “A_REGISTERED”, ”A_ACTIVATED”, and
“A_APPROVED” have average sojourn time of 29.56 minutes
Activity “W_Wijzigen contractgegevens” is the
bottleneck, but it occured rarely (only 4 times)
The average waiting time for the input place of
“W_Nabellen offertes+START” is very long (2.83 days)
compares to the average waiting time of other places
Business analyst's
toolbox
40. PAGE 39
models are like maps, their usefulness
is determined by the intended use,
i.e., there is not a single "perfect map"
41. PAGE 40
traffic jams are
projected on map
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
42. Good process models can be used to
predict events and recommend actions
PAGE 41
register travel
request (a)
get detailed
motivation
letter (c)
get support
from local
manager (b)
check budget
by finance (d)
decide (e)
accept
request (g)
reject
request (h)
reinitiate
request (f)
start end
43. PAGE 42
Demand TomTom!
Do not settle for restrictive
information systems and
static process models
predict: when
will I be home
recommend:
turn right
adapt: use real-
time traffic
information
48. Process Mining: Relating event data and
models (normative or descriptive)
PAGE 47
software
system
(process)
model
event
logs
models
analyzes
discovery
records
events, e.g.,
messages,
transactions,
etc.
specifies
configures
implements
analyzes
supports/
controls
enhancement
conformance
“world”
people machines
organizations
components
business
processes
Play-Out
Play-In
Replay
52. Process Mining: Data Science in Action
https://www.coursera.org/course/procmin
data
mining
process
mining
visualization
data
science
behavioral/
social
sciences
domain
knowledge
machine
learning
large scale
distributed
computing
statistics industrial
engineering
databases
business
process
intelligence
stochastics
privacy
algorithms
visual
analytics
First Massive Open
Online Course (MOOC)
on Process Mining
Editor's Notes
Vision: recommend (do not force), predict, real-time, adaptive,