Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Introduction to Legal Technology, lecture 5 (2015)


Published on

Slides for lecture 5 of the course Introduction to Legal Technology at the University of Turku Law School, presented Feb 10 2015.

This lecture is the first of three lectures on specific legal technology applications: information retrieval, knowledge management, and e-discovery.

Published in: Law
  • Login to see the comments

  • Be the first to like this

Introduction to Legal Technology, lecture 5 (2015)

  1. 1. TLS0070 Introduction to Legal Technology Lecture 5 Applications I: Information retrieval, knowledge management, e-discovery University of Turku Law School 2015-02-10 Anna Ronkainen @ronkaine
  2. 2. First a little digression (guess why...)
  3. 3. Google Flu Trends -  predicting the timing and strength of influenza epidemics based on the relative frequency of certain keywords in searches -  values for the model in black (dotted lines 95% confidence intervals for predicted values), actual CDC influenza figures in red
  4. 4. But...
  5. 5. Performance after the initial period
  6. 6. Lessons worth learning (also for legal applications) -  transparency and replicability -  use big data for understanding the unknown -  study the algorithm -  it’s not just about the size of the data (from Lazer et al 2014)
  7. 7. Applications (general)
  8. 8. Application lectures overview Applications I (this week): -  information retrieval -  e-discovery (e-disclosure) -  knowledge management Applications II (next week, 1st half): -  case management -  online dispute resolution -  access to justice solutions Applications III (next week, 2nd half): -  decision support -  prediction -  automation -  self-service
  9. 9. Legal tech applications not covered here -  general-purpose applications (like Office®/ office software) -  legislative drafting applications -  docket management (and other applications for use within the judiciary) -  courtroom visualization (etc.) software -  ... and probably a ton of other things I don’t even know existed
  10. 10. Information retrieval
  11. 11. Information retrieval (IR) -  the granddaddy of legal tech applications -  the only form of legal tech available in all (industrial) countries at least in some form -  making different types of static legal content available for human consumption -  statute law (+ commentaries) -  case law -  doctrine: journal articles and books
  12. 12. Information retrieval users -  types of users: -  lawyers in general -  subgroups of lawyers (e.g. IP lawyers) -  legal/admin support staff (e.g. tax administrators, paralegals, informaticians) -  other non-law professionals -  ordinary citizens -  different users have different needs in terms of -  type and quantity of content required -  terminology used -  user interface in general
  13. 13. First-generation information retrieval -  take whatever text you have (on paper) and put it into a database -  full-text search (exact match or wildcards) -  structured search (in whatever fields are available) -  Boolean search with AND, OR, NOT -  some metadata enhancements like keywords (typically same as on paper)
  14. 14. Present-day Boolean search example: TMview
  15. 15. Further developments -  hypertext (links) -  better search capabilities with language technology (try searching for “back” as a noun) -  relevancy ranking -  recommendations for further reading -  morebetter metadata
  16. 16. An example: WestlawNext -  natural-language and Boolean search -  relevancy ranking of sources of law, using (among others) a network of links between cases -  (commercial break, text version:
  17. 17. On the horizon -  natural-language query interfaces and advanced text understanding (think Watson/ Siri) -  merging relevancy ranking with predictive legal analytics (like a certain trademark platform) -  even more polarization between biggest markets (esp. US) and others (e.g. Finland, let alone developing countries)
  18. 18. Knowledge management
  19. 19. Knowledge management -  taking (and improving upon!) the knowledge (explicit and tacit!) of an organization and putting it into optimal use -  by no means just tech: creating and developing processes within the organization is equally important -  can take different forms: -  internal: e.g. making work product (memos, contracts etc.) electronically searchable -  external: creating digital legal content for use by law firm customers
  20. 20. Knowledge management advantages -  higher efficiency -> better service -  higher quality (better dissemination of expertise) -  makes life easier for lawyers (increased productivity, reduced stress) -  keeps knowledge in the firm even if individuals leave -  helps with the training of new lawyers -  necessary for good risk management (after Kay 2003)
  21. 21. One knowledge management example: contract management -  the default solution that’s still used by many (most?) companies: paper + binders -  low overhead; manageable with low volumes -  doesn’t scale (cope with large volumes) well, e.g. finding information becomes difficult -  particularly kludgy when documents needed externally (due diligence, anyone?) -  error-prone and fragile -  still need to manage templates somewhere (lack of central storage leads to inconsistencies)
  22. 22. Low-tech electronic contract management -  establish a central organization-wide repository for signed contracts and official templates -  doesn’t need proprietary software, any LAN or cloud based (private) file sharing solution works -  electronically searchable, at least if word processing documents and scans are kept together -  works well (enough) if there are good processes (e.g. regarding file naming and organization of files) and they are (always!) consistently adhered to -  ...which this solution obviously cannot enforce -  no built-in workflow management
  23. 23. Dedicated contract lifecycle management (CLM) solutions -  hundreds of providers, including two from Finland (that I know of: M-Files and Sopima) -  functionalities of varying sophistication for different stages in the contract lifecycle: -  contract and clause template libraries -  platform and history for internal review -  platform and history for negotiations and external review -  electronic signing / import of scanned definitive paper originals -  archiving, retrieval etc. -  workflow management, managing access privileges etc.
  24. 24. Exhibit A: Sopima
  25. 25. Exhibit B: M-Files
  26. 26. Electronic signing -  real electronic signing not widespread (outside Estonia, anyway), to a great deal due to a lack of standards internationally (and esp. for identifying legal persons) -  pseudo-electronic signing (images manually written signatures stored electronically) now quite widespread, dedicated solutions and support in CLM systems also available -  the latter raises some obvious questions about probative value
  27. 27. Heck, even Apple does it:
  28. 28. In summary: Levels of contract management adoption (via Juntunen 2013)
  29. 29. Another knowledge management example: Fondia’s Virtual Lawyer
  30. 30. Fondia’s Virtual Lawyer -  a collection of ~1700 short documents made by Fondia staff describing the legal aspects of particular situations -  for external use (self-help by Fondia clients etc.), AFAIK also used internally in an enhanced version -  not for total novices -  available at for free, registration required, document template library additionally available for a fee
  31. 31. Electronic discovery (disclosure)
  32. 32. Discovery in electronically stored information (e-discovery) -  emerged out of nowhere a dozen years ago -  now a multi-billion-dollar industry (mostly US), hundreds of providers -  roots in more general-purpose language tech (outside the AI & law community) -  Enron corpus, Sedona Conference, TREC, DESI -  storage requirements for e-mail etc. introduced (US) by amendments to Federal Rules of Civil Procedure in 2006
  33. 33. ...and now* it’s already this much widespread (in the US, anyway): *: actually this book is from 2009
  34. 34. Zubulake v. UBS Warburg -  employment law case in District Court for Southern NY, heard 2003–2005 -  led to four groundbreaking rulings which set the basic standards for e-discovery (before 2006 FRCP revisions), widely referred to as Zubulake I, III, IV, V
  35. 35. Zubulake I and III -  what data is considered accessible ESI -  yes: online data/hard disks, optical disks, offline magnetic tapes -  no: backup tapes, damaged/deleted/... data -  no -> yes if considerable evidentiary value can be demonstrated, for which a 7-factor test was introduced: -  The extent to which the request is specifically tailored to discover relevant information; -  The availability of such information from other sources; -  The total cost of production, compared to the amount in controversy; -  The total cost of production, compared to the resources available to each party; -  The relative ability of each party to control costs and its incentive to do so; -  The importance of the issues at stake in the litigation; and -  The relative benefits to the parties of obtaining the information.
  36. 36. Zubulake IV -  some backups no longer available -  relevant emails (created after the start of the proceedings) had been deleted -  defendant had a duty to preserve evidence (since relevant for ongoing/future litigation) -  plaintiff got access to the information -  however, plaintiff couldn’t show adverse interference (at this stage) and was ordered to pay the costs
  37. 37. Zubulake V -  upon the plaintiff’s motion, the court concluded that the defendant (and defence counsel) had failed to safeguard and produce evidence in an adequate manner -  defendant sanctioned and ordered to pay plaintiff’s costs for producing evidence (witness re-examination etc.) necessary due to plaintiff’s late (or non-)production of relevant evidence
  38. 38. Outcome -  active interference (intentional destruction or hiding of evidence) ruled by the judge -  jury found in favour of the plaintiff, compensatory and punitive damages -  reimbursement of even more costs to the plaintiff (generally a lot more unusual in US)
  39. 39. E-discovery workflow -  establish an ESI retention policy, stick to it when creating and storing data -  identify relevant ESI, create authentic snapshot and collect it for further processing -  process and filter ESI (e.g. removal of duplicates) -  review and analyze ESI for privileged information -  produce ESI after filtering out irrelevant, duplicated or privileged materials -  possibly clawback if too much produced in error -  present at trial (if it ever goes that far)
  40. 40. First-generation e-discovery -  based on lists of specific search terms (or phrases) proposed by the plaintiff and approved or modified by the judge -  a bit sketchy, not even real consensus about whether keywords cover all inflections? -  no longer considered acceptable by many of the most influential US judges for this field
  41. 41. Predictive coding -  based on coding a (very) small subset of the relevant document mass as responsive or not (should/n’t be released) -  then using that as the teaching set for a machine learning algorithm -  performance comparable to (or better than) human reviewers at a fraction of the cost
  42. 42. E-discovery output -  native (original) formats (e.g.: .docx) -  usually better for the plaintiff: electronically searchable -  native file formats for proprietary software not necessarily openable without that software -  “petrified” formats (tiff, pdf) -  often better for the defendant: almost the same as handing out the data on paper -  general-purpose tools enough for viewing -  easier to redact
  43. 43. What’s the status with e-discovery -  very widespread in the US (because it’s the law!) -  gaining popularity in the rest of Anglophonia (because common law; tech readily available for English) -  some providers also support major European and Asian languages (mostly for international companies operating in the US) -  rest of the world: is there even a word for this? (then again: discovery in the common-law sense doesn’t exist in most civil-law countries (incl. Finland) in general)
  44. 44. No concrete examples -  (because, frankly, I understand neither the field nor the legal issue well enough) -  but e-discovery in itself is an interesting example of legal tech for many reasons -  first real big data application for law -  came out of nowhere in the early 2000s -  now a multi-billion-dollar industry (US) -  many startups, some notable exits (e.g. Cataphora’s e-discovery ops to EY) -  also continuously new funding rounds (even $100M+) to more and more companies
  45. 45. Questions?