Episode Summary: When one thinks through important industry applications of machine learning, legal apps are not usually the first to jump to mind, but there’s certainly a need. Richard Downe, PhD, is vice president of Data Science at Casetext, a startup working on improving search and natural language processing and democratizing legal information. In this episode, he speaks about the current bottlenecks for people trying to get more out of legal case documents, as well as how the Casetext team is applying ML in the legal industry to help humans parse through thousands of documents for useful and meaningful information.





Expertise: Data science and computer engineering, machine learning in law

Recognition in Brief: Richard received his PhD in Electrical and Computer Engineering from the University of Iowa, where he researched the predictive analysis of the progression of cardiovascular disease. During graduate school he worked at IBM’s TJ Watson Research Center in New York and IBM’s Almaden, California lab, where he conducted natural language analysis of medical literature.

Current Affiliations: Vice President of Data Science at Casetext


Interview Highlights:

(1:32) What do you see as the biggest challenges in the legal space in terms of people understanding the connection between documents, the meaning between documents, and search in general?

(3:27) Teasing all that out via machine learning (ML) or other data science methods seems like quite a challenge. In layman’s terms, how are you working on this today?

(6:10) Have you thought about how you would crack that (sorting through cases with ML)? With enough citations from crowdsourcing, would that be possible with today’s technology?

(12:17) How do you summarize a legal text with any degree of confidence? How are you approaching that question, which is obviously a big time saver and value driver in your space?

(16:24) In that long tail where you don’t have nice, succinct summaries, what sorts of approaches do you feel confident might, in the coming two to five years, get us to the point of much better summarization and swifter search?


Big Ideas:

  • The biggest challenge in the legal and related industries today is access to the law, and the greatest need is democratization of access – for the general public – to a deluge of information.
  • Machine learning in law is helping researchers mine relevant cases and other legal documents from a vast sea of legal information, offering the potential to build stronger cases and streamline the processing of the millions of cases brought to court each year in the U.S. alone.
  • Cognitive computing applied to unstructured data (like that found in many legal documents), an approach being pioneered in business by companies like Coseer, could be one way to help make meaningful sense of the information packed into thousands of relevant documents that are typically inaccessible to human beings because of their volume and technical language.


[This article has been updated as of December 2016.]