Dear all,
It is my pleasure to announce the following CWI Machine Learning
seminar.
Speaker: Christian Hennig (University of Bologna)
Title: A spotlight on statistical model assumptions
Date: Friday 27 November, 15:00
Location: https://us02web.zoom.us/j/82596062334?pwd=OTMwU2JmYUFRK0NLYW42OTExWDRyUT09
Please find the abstract below.
Hope to see you then.
Best wishes,
Wouter
Details:
https://portals.project.cwi.nl/ml-reading-group/events/a-spotlight-on-statistical-model-assumptions-christian-hennig
============
A spotlight on statistical model assumptions
Christian Hennig (University of Bologna)
Many statistics teachers tell their
students something like "In order to apply the t-test, we have to
assume that the data are i.i.d. normally distributed, and
therefore these model assumptions need to be checked before
applying the t-test." This statement is highly problematic in
several respects. There is no good reason to believe that any real
data truly are drawn i.i.d. normally. Furthermore, quite relevant
aspects of these model assumptions cannot be checked. For example,
I will show that data generated from a normal distribution with a
correlation of $\rho\neq 0$ between any two observations cannot be
distinguished from i.i.d. normal data.
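To see why, here is one standard construction (my sketch of the
argument, for $\rho \ge 0$; not necessarily the one used in the talk).
Let $Z, \varepsilon_1, \dots, \varepsilon_n$ be i.i.d. $N(0,1)$ and set
$$X_i = \mu + \sigma\sqrt{\rho}\,Z + \sigma\sqrt{1-\rho}\,\varepsilon_i,
\qquad i = 1, \dots, n.$$
Then $\mathrm{Var}(X_i) = \sigma^2$ and $\mathrm{Corr}(X_i, X_j) = \rho$
for all $i \neq j$, yet conditionally on $Z = z$ the $X_i$ are i.i.d.
$N(\mu + \sigma\sqrt{\rho}\,z,\ \sigma^2(1-\rho))$. An observed dataset
is a single draw of $(Z, \varepsilon_1, \dots, \varepsilon_n)$, so it
looks exactly like an i.i.d. normal sample with a shifted mean and
rescaled variance, and no test applied to that one sample can tell the
two models apart.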
On top of this, passing a
model-checking test will automatically invalidate the model:
conditionally on passing, the data are no longer distributed as the
model specifies. Much of the literature investigating the performance
of procedures that run model-based tests conditionally on passing a
model misspecification test comments very critically on this practice.
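As a minimal simulation sketch of this conditioning effect (my own
illustration, not material from the talk; the Shapiro-Wilk pretest,
n = 10, and alpha = 0.05 are arbitrary assumed choices):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, reps, alpha = 10, 20000, 0.05
    kurt_all, kurt_passed = [], []
    for _ in range(reps):
        x = rng.normal(size=n)     # data truly i.i.d. N(0, 1)
        k = stats.kurtosis(x)      # sample excess kurtosis
        kurt_all.append(k)
        w, p = stats.shapiro(x)    # normality pretest
        if p > alpha:              # "model check passed"
            kurt_passed.append(k)

    # Conditioning on passing screens out the extreme-looking samples
    # that i.i.d. normal sampling legitimately produces, so the
    # surviving data follow the model distribution truncated to the
    # acceptance region, not the model itself.
    print("mean excess kurtosis, all samples:     %.3f" % np.mean(kurt_all))
    print("mean excess kurtosis, passing samples: %.3f" % np.mean(kurt_passed))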
Despite all these issues, I will defend interpreting and using
statistical models in a frequentist manner, by advocating an
understanding of models that never forgets that models are
essentially different from reality (and in this sense can never be
"true"). Model assumptions specify idealised conditions under
which methods work well; in reality they do not need to be
fulfilled. However, situations in which the data will mislead a
method need to be distinguished from situations in which a method
does what it is expected to do. This defines a more appropriate
task for model checking. Doing this job properly requires conditions
that some model checking currently in use does not
fulfil. For better "model checking" it will be helpful to
understand that this is not about "finding out whether the model
assumptions hold", but about something quite different.