Skip to main content

Scoring Systems: At the Extreme of Interpretable Machine Learning

Speaker: Cynthia Rudin

With widespread use of machine learning, there have been serious societal consequences from using black box models for high-stakes decisions, including flawed bail and parole decisions in criminal justice, flawed models in healthcare, and black box loan decisions in finance. Interpretability of machine learning models is critical in high stakes decisions.

In this talk, I will focus on one of the most fundamental and important problems in the field of interpretable machine learning: optimal scoring systems. Scoring systems are sparse linear models with integer coefficients. Such models first started to be used ~100 years ago. Generally, such models are created without data, or are constructed by manual feature selection and rounding logistic regression coefficients, but these manual techniques sacrifice performance; humans are not naturally adept at high-dimensional optimization. I will present the first practical algorithm for building optimal scoring systems from data. This method has been used for several important applications to healthcare and criminal justice.

I will mainly discuss work from three papers:
Learning Optimized Risk Scores. Journal of Machine Learning Research, 2019. http://jmlr.org/papers/v20/18-615.html
The Age of Secrecy and Unfairness in Recidivism Prediction. Harvard Data Science Review, 2020.
https://hdsr.mitpress.mit.edu/pub/7z10o269
Struck et al. Association of an Electroencephalography-Based Risk Score With Seizure Probability in Hospitalized Patients. JAMA Neurology, 2017.

𝗛𝗼𝘄 𝗰𝗮𝗻 𝘆𝗼𝘂 𝗷𝗼𝗶𝗻?
Register here.
(Registration closes 2 hours before the beginning of the seminar).

Speaker bio

Cynthia Rudin is a professor of computer science, electrical and computer engineering, statistical science, and biostatistics & bioinformatics at Duke University, and directs the Prediction Analysis Lab, whose main focus is in interpretable machine learning. Previously, Prof. Rudin held positions at MIT, Columbia, and NYU. She holds an undergraduate degree from the University at Buffalo, and a PhD from Princeton University. She is a three-time winner of the INFORMS Innovative Applications in Analytics Award, was named as one of the ""Top 40 Under 40"" by Poets and Quants in 2015, and was named by Businessinsider.com as one of the 12 most impressive professors at MIT in 2015. She is a fellow of the American Statistical Association and a fellow of the Institute of Mathematical Statistics.

Some of her (collaborative) projects are: (1) she has developed practical code for optimal decision trees and sparse scoring systems, used for creating models for high stakes decisions. Some of these models are used to manage treatment and monitoring for patients in intensive care units of hospitals. (2) She led the first major effort to maintain a power distribution network with machine learning (in NYC). (3) She developed algorithms for crime series detection, which allow police detectives to find patterns of housebreaks. Her code was developed with detectives in Cambridge MA, and later adopted by the NYPD. (4) She solved several well-known previously open theoretical problems about the convergence of AdaBoost and related boosting methods. (5) She is a co-lead of the Almost-Matching-Exactly lab, which develops matching methods for use in interpretable causal inference.

 

 

Share this: