Legal Analytics - Introduction to the Course - Professor Daniel Martin Katz + Professor Michael J Bommarito II

Legal Analytics
Professor Daniel Martin Katz
Professor Michael J Bommarito II
legalanalyticscourse.com
Class 1
Introduction to the Course

This Course
is Called
“Legal Analytics”

< As We Define it ...
“Legal Analytics” is about
deriving substantively
meaningful insight from some
sort of legal data >

< Let us start with a general
description of the overall
landscape >

Statistical Models for
Causal Inference
versus
Statistical Models for
Prediction

A Few Words About Causal Inference
Causal Inference is at the core of the “empirical turn”
that has taken hold in law as well as the social sciences

Such approaches are best suited to problems and
questions where identifying and linking cause and effect
is key

Here are just some of the methods/topics
associated with causal inference:
Instrumental Variables, Propensity Score Matching, Rubin Causal
Model, Regression Discontinuity, Difference in Differences, etc.

However, the methods
associated with Causal
Inference will not be the
focus of this course

We are focused
upon prediction

We are focused
upon machine learning

We are focused
upon data science

We are going to learn
data management skills

SKILLS TO BE TAUGHT:
Collecting, cleaning and processing data
Exploring and analyzing data to produce knowledge and
insights, including:
  Machine learning (i.e., classification, regression, and clustering)
  Visualization
  Natural language processing (time permitting)
Communicating data and knowledge to clients,
colleagues, or courts.

Books For This Class

Deduction versus Induction

“Long Before Machine Learning came into
existence philosophers knew that
generalizing from particular cases to
general rules is not a well posed problem”
Flach Page 20

David Hume

Two Core Issues:
Black Swans
Uniformity of Nature?

Black Swan Problem
Even If We Observe White Swan
after White Swan we cannot
induce that all swans are white

“[T]here are known knowns; there are things we know we know.
We also know there are known unknowns; that is to say we
know there are some things we do not know.
But there are also unknown unknowns – there are things we do
not know we don't know. ”
United States Secretary of Defense
Donald Rumsfeld

Uniformity of Nature
It is a mistake to presuppose
that a sequence of events in
the future will occur as it
always has in the past

< Regression as a Prediction Tool >

Standard Linear Regression
Can Be Used to
Predict a Quantity

Task = Predict the Expected Hourly
Rate of a Lawyer
[Figure: f( ) takes a lawyer and/or binary data (010 101 001) as input and outputs a number (#) answering “Cost?”]

Build a (Regression) Model
from Existing Billing Data

Y = β0 ± β1 (X1) ± β2 (X2) ± β3 (X3) ± β4 (X4) ± β5 (X5) + ε
Y = $151 + $15 (per 100 lawyers) + $161 (if Tier 1 market is true) + $95 (if partner status is true) + $34 (per 10 years) ± β5 (practice area) + ε
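The fitted model above can be turned into a small prediction function. This is a minimal sketch, not the course's own code. It assumes the coefficients pair with the labels in slide order ($15 per 100 lawyers, $161 for a Tier 1 market, $95 for partner status, $34 per 10 years) and that "years" means years of experience. The slide gives no value for the practice-area term β5, so the hypothetical parameter `practice_area_adjustment` defaults to 0, and the example lawyer is invented for illustration.

```python
def predict_hourly_rate(firm_lawyers, tier1_market, is_partner,
                        years_experience, practice_area_adjustment=0.0):
    """Expected hourly rate in dollars under the slide's fitted coefficients.

    The pairing of coefficients to features is an assumption read off the
    slide labels; practice_area_adjustment stands in for the unspecified
    beta5 term.
    """
    return (151.0                                   # intercept
            + 15.0 * (firm_lawyers / 100)           # per 100 lawyers
            + 161.0 * (1 if tier1_market else 0)    # Tier 1 market indicator
            + 95.0 * (1 if is_partner else 0)       # partner status indicator
            + 34.0 * (years_experience / 10)        # per 10 years
            + practice_area_adjustment)             # beta5 (unknown, assumed 0)


# Hypothetical new lawyer: partner at a 500-lawyer firm in a Tier 1 market,
# with 20 years of experience.
rate = predict_hourly_rate(500, True, True, 20)
print(f"Predicted hourly rate: ${rate:.2f}")
```

The error term ε is omitted because a prediction reports the expected value; the residual for any real lawyer is whatever the model fails to explain.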

Turn Around and
Use This Model To Predict the Rate of a
New Lawyer (also Matters, etc.)

This Requires a Method to Deal
With Changes in Dynamics

This Requires a Method to Update
the Model as Time Moves Forward

Must Deal With
Underfitting / Overfitting
the Existing Data

< Machine Learning:
A High-Level Overview >

See Flach Textbook Page 11
Here we have the main ingredients of machine learning:
tasks, models and features.

“A task (red box) requires an
appropriate mapping – a model
– from data described by
features to outputs.”
“Obtaining such a mapping
from training data is what
constitutes a learning problem
(blue box).”

Key Point: “tasks are
addressed by models,
whereas learning
problems are solved by
learning algorithms that
produce models”
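One way to see the distinction between a model and a learning algorithm is to write both as functions. In the sketch below, `learn_threshold` is the learning algorithm: it takes training data and returns a model. The returned `model` is the mapping from a feature to an output. The single-threshold rule, the function names, and the training data (years of experience paired with partner status) are all hypothetical choices made for illustration; they do not come from Flach or the slides.

```python
def learn_threshold(examples):
    """Learning algorithm: consume labeled (feature, label) pairs, return a model.

    The candidate thresholds are midpoints between adjacent sorted feature
    values; the one with the fewest training misclassifications is chosen.
    """
    values = sorted(x for x, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def training_errors(threshold):
        # Count examples where the rule "x > threshold" disagrees with the label.
        return sum((x > threshold) != label for x, label in examples)

    best = min(candidates, key=training_errors)

    def model(x):
        """Model: maps a feature value to a predicted label."""
        return x > best

    return model


# Hypothetical training data: (years of experience, is partner?)
training_data = [(2, False), (4, False), (6, False),
                 (9, True), (12, True), (15, True)]

model = learn_threshold(training_data)   # solving the learning problem
print(model(3), model(10))               # using the model to address the task
```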

< The Family of ML Methods >

http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html

Adapted from Slides By
Victor Lavrenko and Nigel Goddard
@ University of Edinburgh
Take a Look at These 12 Agents

Age  Gender  Species
72   Female  Human
3    Female  Horse
36   Male    Human
21   Male    Human
67   Male    Human
29   Female  Human
54   Male    Human
44   Male    Human
50   Male    Human
42   Female  Human
6    Male    Dog
7    Female  Human

Task = Determine the Gender
of the Respective Agents
Binary Classification (Supervised Learning)
[Figure: f( ) takes an agent and/or binary data (010 101 001) as input and answers “Gender?” with female or male; a decision boundary separates the two classes]

Task = Determine Whether the Agents
Will Obtain Employment?
Binary Classification (Supervised Learning)
[Figure: f( ) answers “Job?” with Yes or No; a decision boundary separates the two classes]

Task = Determine Whether the Agents
Will Obtain a Loan?
Multi Class Classification (Supervised Learning)
[Figure: f( ) answers “Loan?” with one of three classes: Yes, Perhaps (also labeled Maybe), or No]
Multiclass = Hyperplane

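Task = Determine the Age of the
Respective Agents
Regression (Supervised Learning)
[Figure: f( ) answers “Age?” with a number (#). Each agent is labeled with an age: 72, 3, 21, 36, 67, 54, 29, 42, 44, 50, 7, 6. A second build of the slide adds a second set of numbers alongside them: 27, 44, 53, 37, 68, 22, 48, 10, 6, 74, 3, 44]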

Task = Can We Determine to Which
Group the Agent Belongs?
Clustering (Unsupervised Learning)
Relies upon some notion of “similarity”
[Figure: f( ) answers “Group?” by assigning each agent to a cluster]
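To make the idea of similarity concrete, here is a minimal sketch of k-means clustering applied to the twelve ages listed earlier in the deck. The slides do not name a clustering method, so k-means is a swapped-in illustration. The choice of k = 3, the use of absolute age difference as the similarity measure, and the evenly spaced starting centroids are all assumptions.

```python
def kmeans_1d(values, k, max_iter=100):
    """Plain k-means on one-dimensional data (requires k >= 2).

    Similarity is absolute difference. Starting centroids are spaced evenly
    between the minimum and maximum value, which keeps the run deterministic.
    """
    lo, hi = min(values), max(values)
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Ages of the twelve agents from the slide.
ages = [72, 3, 36, 21, 67, 29, 54, 44, 50, 42, 6, 7]
centroids, clusters = kmeans_1d(ages, k=3)
for center, members in zip(centroids, clusters):
    print(f"centroid {center:.1f}: {sorted(members)}")
```

No labels are supplied anywhere, which is what makes this unsupervised: the groups emerge only from how close the ages are to one another.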

< Loss Functions >

“In statistics, typically a loss function is used
for parameter estimation, and the event in
question is some function of the difference
between estimated and true values for an
instance of data.”

Take a Set of Predictor X’s and some response Y
Obtain a function f (X) to make predictions of Y
from those input variables
In order to identify f (X) we need another
function to penalize errors in prediction
This is called a loss function L(Y, f (X))

Once Again Remember
Linear Regression

[Figure: scatter plot of Y against X with a line of fitted values; both axes run from 0 to 20]

Notice that the
prediction line does not
really pass through the
middle of any particular
observation
There is an error term called “epsilon” which attempts to capture the amount
of error in the model
Y = α + βX + ε
A Large Error Term Means that the Regression Line Does not Really “Fit” the
Data Particularly Well

Standard Linear Regression =
minimize the sum of squared residuals
A residual is the
difference between
the observed value
and the fitted value

Regression Analysis
Involves a
Loss Function

Linear Regression
Squared Error Loss Function
L(Y, f(X)) = Σ (y − f(X))²

Linear Regression
Y = α + βX
where α and β are both in the reals
Minimizing our SSE loss function helps us
identify the "best" alpha and beta that define an
actual function out of the family defined above.
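As a sketch of what minimizing the SSE loss means in practice, the code below uses the standard closed-form least-squares solution for a single predictor to pick α and β, then evaluates the loss. The data points are hypothetical, and the function names `fit_line` and `sse` are chosen only for this illustration.

```python
def fit_line(xs, ys):
    """Return (alpha, beta) minimizing the sum of squared errors for Y = alpha + beta * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least squares: beta = cov(x, y) / var(x).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def sse(xs, ys, alpha, beta):
    """Squared error loss: sum over observations of (y - f(x))^2."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical observations, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta = fit_line(xs, ys)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, SSE = {sse(xs, ys, alpha, beta):.4f}")

# Any other member of the family Y = alpha + beta * X has a loss at least as large.
print(sse(xs, ys, alpha + 0.5, beta) >= sse(xs, ys, alpha, beta))
```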

Why is the Squared Error Loss
Function Correct?

There are many other
loss functions

Note:
Many models are defined by a functional
form or family, e.g., logistic regression, linear
regression, SVM+kernel.
Most often, the geometric category that Flach
discusses is tied to these forms, and the "loss"
functions are essentially "distance" or "spatial"
metrics.

Misclassification is one
common loss function

“Imagine we are trying to predict a binary outcome (0,1)
Now swap the (0,1) for [-1,1]
L(Y, f(X)) = Σ I(y ≠ sign(f))
I is the indicator function where we are summing up
misclassifications”
Example drawn from
Michael Clark @ Notre Dame
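The misclassification loss can be computed directly from its definition. The sketch below assumes labels already recoded to -1 and +1 and treats a score of exactly 0 as positive, which is a convention chosen here rather than one stated in the source. The labels and scores are hypothetical.

```python
def sign(value):
    """Return +1 for non-negative scores and -1 otherwise (sign(0) = +1 is an assumption)."""
    return 1 if value >= 0 else -1


def misclassification_loss(labels, scores):
    """Sum of the indicator I(y != sign(f)) over all observations."""
    return sum(1 for y, f in zip(labels, scores) if y != sign(f))


# Hypothetical outcomes originally coded 0/1, recoded to -1/+1.
original = [1, 0, 1, 0]
labels = [2 * y - 1 for y in original]   # 0 -> -1, 1 -> +1
scores = [0.8, -0.3, -0.5, 0.2]          # hypothetical model outputs f(X)

print(misclassification_loss(labels, scores))
```

Unlike squared error, this loss ignores how far a score is from the boundary: a score of -0.5 and a score of -5 for a positive case each count as one error.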

< Okay A Few Words About Implementation >

< We Will Use >

< Review These As Needed >
http://computationallegalstudies.com/quantitative-methods-for-lawyers-course/

< More to Come in the Next Class >
