Dear all,
This is a reminder of tomorrow's CWI Machine Learning seminar, with a *new and improved zoom URL*.
Speaker: Christian Hennig (University of Bologna)
Title: A spotlight on statistical model assumptions
Date: Friday 27 November, 15:00
Location: https://cwi-nl.zoom.us/j/87252095652?pwd=aXI0K1VUdlNlbGlReEE3WGMyWXd6QT09
Please find the abstract below.
Hope to see you then.
Best wishes,
Wouter
Details:
https://portals.project.cwi.nl/ml-reading-group/events/a-spotlight-on-statis...
============
A spotlight on statistical model assumptions
Christian Hennig (University of Bologna)
Many statistics teachers tell their students something like "In order to apply the t-test, we have to assume that the data are i.i.d. normally distributed, and therefore these model assumptions need to be checked before applying the t-test." This statement is highly problematic in several respects. There is no good reason to believe that any real data truly are drawn i.i.d. normally. Furthermore, quite relevant aspects of these model assumptions cannot be checked. For example, I will show that data generated from a normal distribution with a correlation of $\rho\neq 0$ between any two observations cannot be distinguished from i.i.d. normal data. On top of this, passing a model through a model-checking test automatically invalidates it; much of the literature investigating the performance of specific procedures that run model-based tests conditionally on passing a model misspecification test comments very critically on this practice.
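The indistinguishability claim can be illustrated with a small simulation. This is only a sketch under stated assumptions and not necessarily the argument given in the talk: it covers positive correlation only ($0 \le \rho < 1$) and builds the data as $X_i = \mu + \sigma(\sqrt{\rho}\,Z_0 + \sqrt{1-\rho}\,Z_i)$ with independent standard normal $Z_0, Z_1, \dots, Z_n$. Given the shared $Z_0$, one observed sample is i.i.d. normal with a shifted mean and variance $(1-\rho)\sigma^2$, so nothing within that single sample reveals $\rho$. The function name `equicorrelated_normal` and all parameter values are illustrative choices.

```python
import numpy as np


def equicorrelated_normal(n, rho, rng, mu=0.0, sigma=1.0):
    """Draw one sample of n normal observations with pairwise correlation rho.

    Illustrative helper (not from the talk); only handles 0 <= rho < 1.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("this sketch only covers 0 <= rho < 1")
    shared = rng.standard_normal()   # one shock common to all observations
    own = rng.standard_normal(n)     # independent per-observation noise
    return mu + sigma * (np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own)


rng = np.random.default_rng(0)
rho, n, reps = 0.5, 50, 4000

# Each row is one "dataset" of n equicorrelated observations.
samples = np.array([equicorrelated_normal(n, rho, rng) for _ in range(reps)])

# Across many repeated datasets the dependence is visible:
# the correlation between two fixed observations is close to rho.
across_corr = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]

# Within a single dataset it is not: the data behave like i.i.d. normal
# draws with variance (1 - rho) * sigma^2, so the average within-sample
# variance is close to 1 - rho rather than 1.
within_var = samples.var(axis=1, ddof=1).mean()

print(f"correlation across repetitions: {across_corr:.2f}")
print(f"mean within-sample variance:    {within_var:.2f}")
```

In practice only one dataset is observed and its mean and variance are unknown, so a sample generated this way looks exactly like an i.i.d. normal sample with a different mean and a smaller variance.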
Despite all these issues, I will defend interpreting and using statistical models in a frequentist manner, by advocating an understanding of models that never forgets that models are essentially different from reality (and in this sense can never be "true"). Model assumptions specify idealised conditions under which methods work well; in reality they do not need to be fulfilled. However, situations in which the data will mislead a method need to be distinguished from situations in which a method does what it is expected to do. This defines a more appropriate task for model checking. Doing this job properly requires conditions that some model checking currently in use does not fulfill. For better "model checking" it will be helpful to understand that this is not about "finding out whether the model assumptions hold", but about something quite different.