Abstract
Most scientific questions are about establishing cause-effect relationships. Machine learning has received a lot of attention in recent years, especially in settings where traditional statistical models are difficult to apply, but the prediction tools from the machine learning literature cannot be readily used for causal inference. In the last decade, major innovations have incorporated machine learning tools into estimators of causal quantities, which helps with model misspecification and variable selection. In this talk, I will review some of these developments. I will illustrate why “direct” machine learning estimation methods should be avoided. I will then outline a strategy for constructing statistical estimators that yield valid inference even when machine learning is used.