3. < As We Define it ...
“Legal Analytics” is about
deriving substantively
meaningful insight from some
sort of legal data >
access more at legalanalyticscourse.com
4. < Let us start with a general
description of the overall
landscape >
5. Statistical Models for
Causal Inference
versus
Statistical Models for
Prediction
6. A Few Words About Causal Inference
Causal Inference is at the core of the “empirical turn”
that has taken hold in law as well as the social sciences
7. A Few Words About Causal Inference
Such approaches are best suited to problems and
questions where identifying and linking cause and effect
is key
8. A Few Words About Causal Inference
Here are just some of the methods/topics
associated with causal inference:
Instrumental Variables, Propensity Score Matching, Rubin Causal
Model, Regression Discontinuity, Difference in Differences, etc.
9. However, the methods
associated with Causal
Inference will not be the
focus of this course
13. We are going to learn
data management skills
14. SKILLS TO BE TAUGHT:
Collecting, cleaning, and processing data
Exploring and analyzing data to produce knowledge and
insights, including:
  Machine learning (i.e., classification, regression, and clustering)
  Visualization
  Natural language processing (time permitting)
Communicating data and knowledge to clients,
colleagues, or courts
15. Books For This Class
18. “Long Before Machine Learning came into
existence philosophers knew that
generalizing from particular cases to
general rules is not a well posed problem”
Flach Page 20
20. Two Core Issues:
Black Swans
Uniformity of Nature?
21. Black Swan Problem
Even If We Observe White Swan
after White Swan we cannot
induce that all swans are white
22. “[T]here are known knowns; there are things we know we know.
We also know there are known unknowns; that is to say we
know there are some things we do not know.
But there are also unknown unknowns – there are things we do
not know we don't know. ”
United States Secretary of Defense
Donald Rumsfeld
23. Uniformity of Nature
It is a mistake to presuppose
that a sequence of events in
the future will occur as it
always has in the past
24. < Regression as a Prediction Tool >
33. < Machine Learning
A High-Level Overview >
34. < Machine Learning High-Level Overview >
See Flach Textbook Page 11
35. See Flach Textbook Page 11
Here we have the main ingredients of machine learning:
tasks, models and features.
36. “A task (red box) requires an
appropriate mapping – a model
– from data described by
features to outputs.”
“Obtaining such a mapping
from training data is what
constitutes a learning problem
(blue box).”
37. Key Point: “tasks are
addressed by models,
whereas learning
problems are solved by
learning algorithms that
produce models”
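The distinction can be sketched in a few lines of Python. This is only an illustration, not Flach's notation: the function name `learn_threshold_model`, the toy training data, and the midpoint rule are all assumptions made for the example. The learning algorithm takes training data and returns a model; the model is what then addresses the task by mapping a feature to an output.

```python
from typing import Callable, List, Tuple

# Hypothetical training data: (feature value, label) pairs.
training_data: List[Tuple[float, int]] = [(1.0, 0), (2.0, 0), (6.0, 1), (7.0, 1)]


def learn_threshold_model(data: List[Tuple[float, int]]) -> Callable[[float], int]:
    """Learning algorithm: solves the learning problem and produces a model."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    # Assumed rule: split at the midpoint between the two class means
    # (this presumes class 1 has the larger feature values).
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

    def model(x: float) -> int:
        """Model: addresses the task by mapping a feature to an output."""
        return 1 if x > threshold else 0

    return model


model = learn_threshold_model(training_data)
predictions = [model(x) for x in (1.5, 6.5)]
```

Swapping in a different learning algorithm would produce a different model for the same task, which is the point of keeping the two ideas separate.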
38. < The Family of ML Methods >
42. Task = Determine the Gender
of the Respective Agents
[Figure: agents and/or binary codes (010, 101, 001) → f( ) → Gender?: female or male]
43. Task = Determine the Gender
of the Respective Agents
Binary Classification (Supervised Learning)
[Figure: agents and/or binary codes (010, 101, 001) → f( ) → Gender?: female or male]
46. Task = Determine Whether the Agents
Will Obtain Employment?
Binary Classification (Supervised Learning)
[Figure: agents → f( ) → Job?: Yes or No]
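A binary classifier of this kind can be sketched with a one-nearest-neighbor rule. Everything specific here is an assumption for illustration: the two features (years of experience, interviews attended), the four labeled agents, and the choice of nearest neighbor as the method. The slides do not prescribe any of them.

```python
from typing import List, Tuple

Features = Tuple[float, float]

# Hypothetical labeled agents: (years of experience, interviews attended) -> Job?
labeled_agents: List[Tuple[Features, str]] = [
    ((0.0, 1.0), "No"),
    ((1.0, 0.0), "No"),
    ((5.0, 4.0), "Yes"),
    ((6.0, 3.0), "Yes"),
]


def squared_distance(a: Features, b: Features) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def classify(features: Features) -> str:
    """f( ): return the label of the closest labeled agent."""
    _, label = min(labeled_agents, key=lambda pair: squared_distance(pair[0], features))
    return label


job_prediction_a = classify((5.5, 3.5))
job_prediction_b = classify((0.5, 0.5))
```

Because the labels ("Yes"/"No") are supplied with the training examples, this is supervised learning; with exactly two possible labels it is binary classification.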
49. Task = Determine Whether the Agents
Will Obtain a Loan?
Multi-Class Classification (Supervised Learning)
[Figure: agents → f( ) → Loan?: Yes, Perhaps, or No]
50. Multi-Class Classification (Supervised Learning)
[Figure: agents → f( ) → Loan?: Yes, Perhaps, or No]
51. Multi-Class Classification (Supervised Learning)
[Figure: agents → f( ) → Loan?; labels shown: Yes, No, Maybe and Yes, Perhaps, No]
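With three possible outcomes instead of two, the same supervised setup becomes multi-class classification. A minimal sketch using a nearest-centroid rule follows; the features (credit score in hundreds, income in tens of thousands of dollars), the six applicants, and the choice of method are all hypothetical and serve only to make the idea concrete.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Features = Tuple[float, float]

# Hypothetical applicants: (credit score / 100, income / $10k) -> Loan?
applicants: List[Tuple[Features, str]] = [
    ((7.5, 9.0), "Yes"), ((8.0, 8.0), "Yes"),
    ((6.0, 5.0), "Perhaps"), ((6.5, 5.5), "Perhaps"),
    ((4.0, 2.0), "No"), ((4.5, 3.0), "No"),
]


def fit_centroids(data: List[Tuple[Features, str]]) -> Dict[str, Features]:
    """Learning step: average the feature vectors of each class."""
    grouped: Dict[str, List[Features]] = defaultdict(list)
    for features, label in data:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*points))
        for label, points in grouped.items()
    }


def predict(centroids: Dict[str, Features], features: Features) -> str:
    """f( ): pick the class whose centroid is closest to the applicant."""
    return min(
        centroids,
        key=lambda label: sum((c - f) ** 2 for c, f in zip(centroids[label], features)),
    )


centroids = fit_centroids(applicants)
loan_decisions = [predict(centroids, x) for x in [(7.8, 8.5), (6.2, 5.2), (4.2, 2.5)]]
```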
56. Task = Can We Determine to Which
Group the Agent Belongs?
Clustering (Unsupervised Learning)
Relies upon some notion of “similarity”
[Figure: agents → f( ) → Group? (cluster)]
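One way to make "similarity" concrete is Euclidean distance, with k-means as the clustering method. Both are assumptions chosen for this sketch; the slide does not commit to a particular similarity measure or algorithm. The agents' coordinates are made up, and the starting centroids are simply the first k points so that the run is deterministic.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Dissimilarity: smaller means the two agents are more alike."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def k_means(points: Sequence[Point], k: int, max_iterations: int = 100) -> List[int]:
    """Return a cluster index for each point. No labels are used."""
    centroids = list(points[:k])  # deterministic start: the first k points
    assignments: List[int] = []
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: nothing moved
        assignments = new_assignments
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return assignments


# Hypothetical agents described by two numeric features.
agents = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
clusters = k_means(agents, k=2)
```

Unlike the classification examples, no "correct" group is ever supplied; the groups emerge only from how close the agents are to one another.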
60. “In statistics, typically a loss function is used
for parameter estimation, and the event in
question is some function of the difference
between estimated and true values for an
instance of data.”
61. Take a set of predictors X and some response Y
Obtain a function f(X) to make predictions of Y
from those input variables
In order to identify f(X) we need another
function to penalize errors in prediction
This is called a loss function L(Y, f(X))
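As a rough sketch, a loss function can be written as ordinary code that takes the observed Y, the inputs X, and a candidate f, and returns a single penalty. Squared error is used here only as one common choice; the data and the two candidate functions `f_good` and `f_poor` are invented for the example.

```python
from typing import Callable, Sequence


def squared_error_loss(
    y: Sequence[float], x: Sequence[float], f: Callable[[float], float]
) -> float:
    """L(Y, f(X)): sum of squared differences between Y and the predictions f(X)."""
    return sum((yi - f(xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data in which Y is exactly twice X.
x_values = [1.0, 2.0, 3.0, 4.0]
y_values = [2.0, 4.0, 6.0, 8.0]


def f_good(x: float) -> float:
    return 2.0 * x


def f_poor(x: float) -> float:
    return x + 1.0


loss_good = squared_error_loss(y_values, x_values, f_good)
loss_poor = squared_error_loss(y_values, x_values, f_poor)
```

The candidate with the smaller loss is the one we prefer, which is how the loss function helps identify f(X).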
63. [Figure: Y and fitted values plotted against X; both axes run from 0 to 20]
64. Notice that the
prediction line does not
really pass through the
middle of any particular
observation
Y = α + βX + ε
There is an error term called “epsilon” (ε) which attempts to capture the amount
of error in the model
A large error term means that the regression line does not really “fit” the
data particularly well
[Figure: Y and fitted values plotted against X; both axes run from 0 to 20]
65. Standard Linear Regression =
minimize the sum of squared residuals
A residual is the
difference between
the observed value
and the fitted value
68. Linear Regression
Y = α + βX
where α and β are both in the reals
69. Y = β0 +/- β1 (X1) +/- β2 (X2) +/- β3 (X3) +/- β4 (X4) +/- β5 (X5) + ε
Y = $151 + $15 (per 100 lawyers) + 161 (if Tier 1 market is true) + 95 (if partner status is true) + 34 (per 10 years) +/- β5 (practice area) + ε
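To see how such an equation produces a prediction, the coefficients on this slide can be plugged into a small function. This is a worked-arithmetic sketch under stated assumptions: it assumes the labels attach to the terms in the order they appear (per 100 lawyers, Tier 1 market, partner status, per 10 years, practice area), it ignores ε, and because the slide gives no value for β5 the practice-area term defaults to zero. The function name and the example inputs are hypothetical.

```python
def predict_y(
    lawyers: int,
    tier1_market: bool,
    partner: bool,
    years: float,
    practice_area_effect: float = 0.0,  # assumption: β5 is not given on the slide
) -> float:
    """Evaluate the fitted equation for one set of inputs (error term omitted)."""
    return (
        151                      # intercept
        + 15 * (lawyers / 100)   # per 100 lawyers
        + 161 * int(tier1_market)
        + 95 * int(partner)
        + 34 * (years / 10)      # per 10 years
        + practice_area_effect
    )


# Hypothetical case: 300 lawyers, Tier 1 market, partner, 20 years.
# 151 + 15*3 + 161 + 95 + 34*2 = 520
example_prediction = predict_y(lawyers=300, tier1_market=True, partner=True, years=20)
```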
70. Linear Regression
Y = α + βX
where α and β are both in the reals
Minimizing our SSE loss function helps us
identify the "best" alpha and beta that define an
actual function out of the family defined above.
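For a single predictor, the alpha and beta that minimize the sum of squared errors have a well-known closed form, sketched below in plain Python. The data points are made up, and the helper names `fit_line` and `sse` are ours rather than anything from the course materials.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (alpha, beta) minimizing the sum of squared errors of Y = alpha + beta*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    beta = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    alpha = mean_y - beta * mean_x
    return alpha, beta


def sse(alpha: float, beta: float, x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of squared residuals (observed minus fitted)."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical observations.
x_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
y_obs = [1.0, 3.0, 5.0, 7.0, 10.0]

alpha, beta = fit_line(x_obs, y_obs)
best_sse = sse(alpha, beta, x_obs, y_obs)
# Any other member of the family has a larger SSE on this data.
nudged_sse = sse(alpha + 0.5, beta, x_obs, y_obs)
```

On this data the fit is alpha = 0.8 and beta = 2.2; nudging either parameter away from those values raises the SSE, which is what "best" means here.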
71. Why is the Squared Error Loss
Function Correct?
72. There are many other
loss functions
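Two widely used alternatives are absolute error and the Huber loss, shown next to squared error below. They are picked here purely as examples and are not named on the slide; the residuals are invented, with one deliberately large value to show how differently the three losses treat an outlier.

```python
from typing import List


def squared_loss(residual: float) -> float:
    return residual ** 2


def absolute_loss(residual: float) -> float:
    return abs(residual)


def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Quadratic for small residuals, linear beyond delta."""
    if abs(residual) <= delta:
        return 0.5 * residual ** 2
    return delta * (abs(residual) - 0.5 * delta)


# Hypothetical residuals; the last one is an outlier.
residuals: List[float] = [0.5, -1.0, 8.0]

total_squared = sum(squared_loss(r) for r in residuals)
total_absolute = sum(absolute_loss(r) for r in residuals)
total_huber = sum(huber_loss(r) for r in residuals)
```

The outlier dominates the squared total far more than the other two, which is one reason the choice of loss function is a modeling decision rather than a given.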
73. Note:
Many models are defined by a functional
form or family, e.g., logistic regression, linear
regression, SVM+kernel.
Most often, the geometric category that Flach
discusses is tied to these forms, and the "loss"
functions are essentially "distance" or "spatial"
metrics.
75. “Imagine we are trying to predict a binary outcome (0,1)
Now swap the (0,1) for [-1,1]
L(Y, f(X)) = Σ I(y ≠ sign(f))
I is the indicator function where we are summing up
misclassifications”
Example drawn from
Michael Clark @ Notre Dame
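The misclassification count described above can be sketched directly. The labels and model scores are invented for illustration, and treating a score of exactly zero as +1 is our own tie-breaking assumption.

```python
from typing import List, Sequence


def sign(value: float) -> int:
    """Return +1 or -1; a score of exactly 0 is treated as +1 (assumption)."""
    return 1 if value >= 0 else -1


def zero_one_loss(y: Sequence[int], scores: Sequence[float]) -> int:
    """Sum of the indicator I(y != sign(f)): the number of misclassifications."""
    return sum(1 for yi, fi in zip(y, scores) if yi != sign(fi))


# Hypothetical outcomes coded 0/1, then recoded to -1/+1.
y_binary: List[int] = [0, 1, 1, 0]
y_signed = [2 * v - 1 for v in y_binary]

# Hypothetical real-valued model outputs f(X) for the same four cases.
scores = [-0.7, 0.4, -0.2, 1.3]

misclassified = zero_one_loss(y_signed, scores)
```

Recoding to -1/+1 is what lets the sign of f(X) be compared with the label directly.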
76. < Okay A Few Words About Implementation >
77. < We Will Use >
78. < Review These As Needed >
http://computationallegalstudies.com/quantitative-methods-for-lawyers-course/
80. < More to Come in the Next Class >
81. Legal Analytics
Class 1 - Introduction to the Course
daniel martin katz
blog | ComputationalLegalStudies
corp | LexPredict
twitter | @computational
site | danielmartinkatz.com
michael j bommarito
blog | ComputationalLegalStudies
corp | LexPredict
twitter | @mjbommar
site | bommaritollc.com
more content available at legalanalyticscourse.com