The goal is to add the Bayesian bootstrap as a first-class alternative to leave-one-out cross validation in this package. I believe the Bayesian bootstrap should provide the major advantage of letting users plot a full posterior distribution, rather than just having a point estimate and standard error. Aside from the actual informational difference, I've noticed that plotting posteriors is a good way to help people intuitively understand that the point estimates aren't special. Show someone a point estimate and a standard error and they will usually either ignore the standard error or construct a 95% normal confidence interval.
Mentioning @topipa because Aki suggested I talk to you about this, and said you've built something similar before. I know that the bootstrap tends to underestimate the bias caused by overfitting, because bootstrap resamples will be more similar to the data than a new sample from the original distribution would be -- did your own implementation use any corrections for this bias?
My own thoughts on how to correct this:

1. Iterated bootstrap techniques, and
2. Adding random noise to resamples -- instead of assigning a random Dirichlet-distributed probability to every observation `x`, we can draw random observations `x + N`, where `N ~ Normal(0, Σ / n)`, and then assign a random Dirichlet probability to each of these resamples. I've seen Tim Hesterberg suggest this in his textbook, but Aki seemed to suggest it would be a bad idea. Intuitively I'd expect this to reduce the bias caused by resampling, since we'd at least be getting the variance of the underlying distribution correct, but I could be wrong.
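To make the two variants concrete, here's a minimal sketch in Python/NumPy (not the linked implementation; the function name and interface are my own, and I've specialized to the univariate mean so `Σ / n` becomes `var(x) / n`). Each posterior draw assigns `Dirichlet(1, ..., 1)` weights to the observations, à la Rubin (1981); with `smooth=True` the observations are first jittered with the Normal noise described in point 2:

```python
import numpy as np

def bayesian_bootstrap_mean(x, n_draws=4000, smooth=False, rng=None):
    """Posterior draws of the mean under the Bayesian bootstrap.

    Each draw weights the observations with Dirichlet(1, ..., 1)
    probabilities. With smooth=True, the observations are first
    jittered with Normal(0, var(x) / n) noise -- the Hesterberg-style
    smoothing discussed above (univariate case of N ~ Normal(0, Σ / n)).
    """
    rng = rng or np.random.default_rng()
    x = np.asarray(x, dtype=float)
    n = len(x)
    noise_sd = np.sqrt(x.var(ddof=1) / n)

    draws = np.empty(n_draws)
    for i in range(n_draws):
        xs = x + rng.normal(0.0, noise_sd, size=n) if smooth else x
        w = rng.dirichlet(np.ones(n))  # random probability for each obs
        draws[i] = w @ xs              # weighted mean = one posterior draw
    return draws
```

The returned array can be plotted directly as a posterior, which is the point of the whole exercise: the user sees a full distribution instead of a point estimate and standard error.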
I have an initial implementation of a basic BB here, although it's not quite working yet -- the estimates seem to be slightly off, but I'm not sure why.