Kristian Kersting, Associate Professor for Computer Science, TU Dortmund University, Germany at MLconf SEA - 5/20/16

Declarative Programming for Statistical ML
Kristian Kersting
Martin Mladenov (TUD)
Babak Ahmadi (PicoEgo)
Amir Globerson (HUJI)
Martin Grohe (RWTH)
Sriraam Natarajan (U. Indiana)
Leonard Kleinhans (TUD)
Danny Heinrich (TUD)
Pavel Tokmakov (INRIA Grenoble)
and many more…

Is there a “-O1” flag for Statistical ML?

Kristian Kersting - Declarative Programming for Statistical ML

There is an arms race to “deeply”
understand data

Take your spreadsheet (objects × features) …
… and apply some ML:
Latent Dirichlet Allocation
Matrix Factorization
Gaussian Processes
Decision Trees/Boosting
Autoencoder/Deep Learning
Support Vector Machines
and many more …

IS IT REALLY THAT SIMPLE?

[Figure: random variables card(1,d2), card(1,d3), …, card(1,pAce), …, card(52,d2), card(52,d3), …, card(52,pAce). Credit: Guy van den Broeck, UCLA]

No independencies. Fully connected. 2^2704 states.
[Figure: random variables card(1,d2), …, card(52,pAce), fully connected. Credit: Guy van den Broeck, UCLA]
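The state count can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the slide's figure reads 2^2704 and that each card(position, card) node is a binary variable; the variable names are illustrative and not from the talk.

```python
# Assumption: one binary variable card(p, c) per (position, card) pair.
positions = 52
cards = 52

num_variables = positions * cards   # 52 * 52 ground variables
num_states = 2 ** num_variables     # joint states of a fully connected model

print(num_variables)                # number of binary variables
print(len(str(num_states)))         # number of decimal digits in the state count
```

With no independencies to exploit, a naive table over all joint states is out of reach, which is the point of the slide.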

A machine will not solve the problem
[Figure: random variables card(1,d2), …, card(52,pAce), fully connected. Credit: Guy van den Broeck, UCLA]

Faster modelling
Faster inference and learning

Symmetry-Aware Message Passing
(1) Compress the model (big model → small model)
(2) Run message passing inference on the smaller model
[Singla, Domingos AAAI’08; Kersting, Ahmadi, Natarajan UAI’09; Ahmadi, Kersting, Mladenov, Natarajan MLJ’13;
Mladenov, Globerson, Kersting AISTATS’14, UAI’14; Mladenov, Kersting UAI’15; ...]
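One common way to find the symmetries used for compression is color refinement: nodes start with a color and are repeatedly re-colored by the multiset of their neighbors' colors until the grouping stops changing. The sketch below is a plain-graph version for illustration only; the function name and data layout are assumptions, and the cited papers work on factor graphs with additional bookkeeping.

```python
def color_refinement(adjacency, initial_colors):
    """Group nodes that cannot be told apart by iterated neighborhood colors.

    adjacency: dict mapping each node to a list of its neighbors.
    initial_colors: dict mapping each node to an integer color.
    Returns a dict mapping each node to its final color class.
    """
    colors = dict(initial_colors)
    while True:
        # A node's signature is its own color plus the sorted colors of its neighbors.
        signatures = {
            node: (colors[node], tuple(sorted(colors[nb] for nb in adjacency[node])))
            for node in adjacency
        }
        # Relabel distinct signatures with small integers.
        palette = {sig: k for k, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {node: palette[signatures[node]] for node in adjacency}
        # Classes are only ever split, so an unchanged count means a fixed point.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors


# Path graph 0 - 1 - 2 - 3: the two end nodes and the two inner nodes pair up.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
classes = color_refinement(path, {node: 0 for node in path})
num_classes = len(set(classes.values()))
```

Nodes in the same class would send and receive identical messages, so a compressed model needs only one representative per class.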

Statistical Relational AI
“… the study and design of intelligent agents that act in noisy
worlds composed of objects and relations among the objects”
De Raedt, Kersting, Natarajan, Poole,
“Statistical Relational Artificial Intelligence”, 2016
[Figure labels: Scaling; Uncertainty; Logic, Graphs, Trees; Mining and Learning]
[Getoor, Taskar MIT Press ’07; De Raedt, Frasconi, Kersting, Muggleton, LNCS’08; Domingos, Lowd Morgan Claypool
’09; Natarajan, Kersting, Khot, Shavlik Springer Brief’15; Russell CACM 58(7): 88-97 ’15]

BUT WAIT A MINUTE! WE WANT
TO USE SOME ML, NOT JUST
GRAPHICAL MODELS!
Latent Dirichlet Allocation
Matrix Factorization
Gaussian Processes
Decision Trees/Boosting
Autoencoder/Deep Learning
and many more …

Let’s say we want to classify
publications that cite each other

Standard ML approach:
This is a quadratic program. If you replace the
l2-norm by the l1- or l∞-norm, you get a linear program.
[Vapnik ’79; Bennett ’99; Mangasarian ’99; Zhou, Zhang, Jiao ’02, ... ]
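The formulation shown on the slide did not survive extraction. As a reminder, and assuming the slide refers to the usual soft-margin support vector machine, the two programs can be written as follows. The symbols w, b, ξ, u and C are the textbook ones and are not taken from the talk.

```latex
% Quadratic program (l2-norm regularizer)
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert_2^2 + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n.
\end{aligned}

% Linear program (l1-norm regularizer, with u_j bounding |w_j|)
\begin{aligned}
\min_{w,\,b,\,\xi,\,u}\quad & \sum_{j=1}^{d} u_j + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n, \\
& -u_j \le w_j \le u_j,\qquad j = 1,\dots,d.
\end{aligned}
```

For the l∞-norm, the sum over u_j is replaced by a single bound t with -t ≤ w_j ≤ t for all j, which again leaves only linear terms.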

Statistical Machine Learning via
Declarative Programming
Write down the problem in “paper form”. The machine
then compiles it automatically into algebraic solver form.
[Code annotations on the slide:]
Logically parameterized variable (set of ground variables)
Logically parameterized constraint
Logically parameterized objective
Data stored externally
[Kersting, Mladenov, Tokmakov AIJ’15, Mladenov, Heinrich, Kleinhans, Gonsio, Kersting DeLBP’16]
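To make the idea of compiling “paper form” into solver form concrete, here is a minimal grounding sketch in Python. It is not the RLP language or its compiler: the function name, the fact layout (dicts standing in for externally stored data) and the variable naming are all hypothetical. It grounds one logically parameterized constraint, "for every labeled paper I: label(I) * (sum over J of weight(J) * attribute(I, J) + b) + slack(I) >= 1", into rows of a constraint matrix.

```python
def ground_svm_constraints(attribute, label):
    """Ground the margin constraint over all labeled papers.

    attribute: dict mapping (paper, feature) to a float value.
    label: dict mapping paper to +1 or -1.
    Returns (variables, rows, rhs), meaning rows @ x >= rhs.
    """
    features = sorted({feature for (_, feature) in attribute})
    papers = sorted(label)
    # Ground variables: one weight per feature, one bias, one slack per paper.
    variables = (
        [("weight", f) for f in features]
        + [("b",)]
        + [("slack", p) for p in papers]
    )
    index = {var: k for k, var in enumerate(variables)}

    rows, rhs = [], []
    for p in papers:
        row = [0.0] * len(variables)
        for f in features:
            row[index[("weight", f)]] = label[p] * attribute.get((p, f), 0.0)
        row[index[("b",)]] = float(label[p])
        row[index[("slack", p)]] = 1.0
        rows.append(row)
        rhs.append(1.0)
    return variables, rows, rhs


# The same program can be grounded against different data sets.
attribute = {("p1", "ml"): 1.0, ("p2", "db"): 1.0}
label = {"p1": 1, "p2": -1}
variables, rows, rhs = ground_svm_constraints(attribute, label)
```

The program text stays fixed while the data changes, so the size of the ground matrix is determined entirely by the facts that are passed in.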

Program1
Data1
Program2
Data2
Program3
Data3
...
Captures the essence of a problem and
can be reused for several problems

MP1
Declarative
Program
MP2 MPn
Data1 Data2 Datan
...
Captures the essence of a problem and
can be reused for several problems

But wait, publications are citing
each other. OMG, I have to use
graph kernels!
REALLY?

Simply program some additional constraints

http://www-ai.cs.uni-dortmund.de/weblab/static/RLP/html/
…
Loops and relations get intertwined, and models
can refer to each other.
DBMS interface.
Using a probabilistic programming language, we
can even get stochastic relational mathematical
programs.

Finally, the “-O1” flag
[Kersting, Mladenov, Tokmakov AIJ 2015, Mladenov, Kleinhans, Kersting 2016]
(1) Reduce the QP via symmetries
(2) Run any solver on the reduced QP

… and the “-O2” flag
Algebraic Decision Diagrams
+ Formulae parse trees
+ Matrix-Free Optimization
= Optimization with 60 million non-zeros,
at 12 minutes per log-barrier iteration, and
actually sublinear in the number of non-zeros
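“Matrix-free” means the solver only ever asks for matrix-vector products, so the matrix never has to be stored explicitly. The sketch below shows the idea with a conjugate gradient solver that takes a product function instead of a matrix. It is a generic illustration, not the method from the talk: there the products come from compressed representations, whereas here a closed-form tridiagonal operator stands in, and all names are hypothetical.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    p = list(r)                      # search direction
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def tridiagonal_matvec(v):
    """Apply A = tridiag(-1, 3, -1) to v without ever building A."""
    n = len(v)
    return [
        3.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


# Build a right-hand side with a known solution of all ones, then recover it.
b = tridiagonal_matvec([1.0] * 5)
solution = conjugate_gradient(tridiagonal_matvec, b)
```

Any representation that can compute the product cheaply, including a compressed one, can be plugged in as `matvec` without touching the solver.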

Conclusions
HIGH-LEVEL LANGUAGES
FOR MACHINE LEARNING
AND OPTIMIZATION ARE A
STEP TOWARDS THE …

DEMOCRATIZATION OF
MACHINE LEARNING
• Reduce the level of expertise necessary to
build optimization applications, and make models
faster to write and easier to communicate
• Facilitate the construction of sophisticated
models with rich domain knowledge
• Speed up solvers by exploiting language
properties, compression, and compilation
