Applied Statistics and Data Science Seminar on Monday, September 18, 2023
September 18, 2023 @ 2:00 pm - 3:30 pm
Title: The Generalized Variable Importance Metric: A model agnostic method to identify predictor outcome relationship
Presenter:
Kaviul Anam Khan
PhD candidate in Biostatistics at the Dalla Lana School of Public Health, University of Toronto
Assistant Professor, Department of Statistical Sciences, University of Toronto
Abstract:
The aim of my research is to define the importance of predictors for black-box machine learning methods, where the prediction function can be highly non-additive and cannot be represented by statistical parameters. In this paper, we define a “Generalized Variable Importance Metric (GVIM)” using the true conditional expectation function for a continuous or binary response variable. We further show that the GVIM can be represented as a function of the squared Conditional Average Treatment Effect (CATE) for multinomial and continuous predictors. We then propose how the metric can be estimated using any machine learning model. Finally, we demonstrate the properties of the estimator using multiple simulations. While the estimators for the GVIM are consistent, they have small-sample biases. We propose an efficient influence function-based approach that, under some regularity conditions, performs a one-step correction of the bias. This research is expected to have a significant impact on public and clinical health sciences, since it opens the door to the effective use of modern machine learning methods in real-life health science applications.
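To make the idea of a squared-CATE importance measure concrete, here is a minimal sketch of a plug-in estimate for a single binary predictor. It is not the speaker's estimator: the abstract does not give the GVIM formula, so the target used here (the average of the squared CATE over the sample) is an assumption chosen for illustration. The simulated data, the function names `fit_line`, `plug_in_squared_cate`, and `simulate`, and the use of two separate linear fits are likewise hypothetical. Any regression learner could stand in for the linear fits, and the one-step bias correction mentioned in the abstract is not implemented.

```python
import random


def fit_line(zs, ys):
    """Ordinary least squares for y = a + b*z; returns (a, b)."""
    n = len(zs)
    z_bar = sum(zs) / n
    y_bar = sum(ys) / n
    sxx = sum((z - z_bar) ** 2 for z in zs)
    sxy = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys))
    b = sxy / sxx
    return y_bar - b * z_bar, b


def plug_in_squared_cate(xs, zs, ys):
    """Plug-in estimate of the average squared CATE of a binary predictor x.

    Assumed target (illustrative only, not taken from the talk):
        mean over the sample of (E[Y | X=1, Z] - E[Y | X=0, Z])^2
    """
    # Fit one outcome model per level of the predictor.
    z1 = [z for x, z in zip(xs, zs) if x == 1]
    y1 = [y for x, y in zip(xs, ys) if x == 1]
    z0 = [z for x, z in zip(xs, zs) if x == 0]
    y0 = [y for x, y in zip(xs, ys) if x == 0]
    a1, b1 = fit_line(z1, y1)
    a0, b0 = fit_line(z0, y0)

    # Predict both potential outcomes for every unit and average the
    # squared difference.
    cate = [(a1 + b1 * z) - (a0 + b0 * z) for z in zs]
    return sum(c * c for c in cate) / len(cate)


def simulate(n, seed=0):
    """Hypothetical data: Y = Z + X * (1 + Z) + noise, so CATE(z) = 1 + z."""
    rng = random.Random(seed)
    xs, zs, ys = [], [], []
    for _ in range(n):
        z = rng.uniform(-1.0, 1.0)
        x = 1 if rng.random() < 0.5 else 0
        y = z + x * (1.0 + z) + rng.gauss(0.0, 0.5)
        xs.append(x)
        zs.append(z)
        ys.append(y)
    return xs, zs, ys


xs, zs, ys = simulate(5000)
estimate = plug_in_squared_cate(xs, zs, ys)
print(estimate)
```

In this simulated setup the true value is E[(1 + Z)^2] = 4/3 for Z uniform on (-1, 1), so the printed estimate should land near that figure. Because the fitted CATE is squared, estimation noise tends to inflate a plug-in estimate of this kind in small samples, which is the sort of bias the abstract says the one-step correction is meant to address.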