I see, so at the end of the day you assign a number representing how productive the day was, and you're considering predicting that number the day before? If that rating is based on your feeling about the day rather than on more objective, predefined criteria, the "predictions affect outcomes" issue might indeed be somewhat larger here than described in the post: the prediction could affect not only your behavior but also the rating itself, which could decouple the metric from reality to a degree.
If you end up doing this, I'd be very interested in how things go. May I message you in a month or so?
Good point, I also make predictions about quarterly goals (which I update twice a month) as well as my plans for the year. I find the latter especially difficult, as quite a lot can change within a year including my perspective on and priority of the goals. For short term goals you basically only need to predict to what degree you will act in accordance with your preferences, whereas for longer term goals you also need to take potential changes of your preferences into account.
It does appear to me that calibration can differ between the different time frames. I seem to be well calibrated regarding weekly plans, decently calibrated on the quarter level, and probably less so on the year level (I don't yet have any data for the latter). Admittedly that weakens the "calibration can be achieved quickly in this domain" claim to a degree, as calibration on "behavior over the next year" might still take a year or two to improve significantly.
I personally tend to stick to the following system:
While this may sound complex when explaining it, I added the time estimates to the list above in order to demonstrate that all of these steps are pretty quick and easy. Spending these 10 minutes[4] each week seems like a fair price for the benefits it brings.
An example would be “make check-up appointment with my dentist”, where I call during the week and learn that the dentist is on vacation, so no appointment can be made. Given that there’s no time pressure and I prefer making an appointment there later to calling a different dentist, the task itself was not achieved, yet my behavior was as desired. As there are arguments for evaluating this as either true or false, I often just drop such cases entirely from my evaluation. ↩︎
I once had the task “sign up for library membership” on my list, but then during the week realized that membership was more expensive than I had thought, and thus decided to drop that goal; here too, you could either argue “the goal is concluded” (no todo remains open at the end of the week) or “I failed the task” (as I didn’t do the formulated action), so I usually ignore those cases instead of evaluating them arbitrarily ↩︎
One could argue that a 5% and a 95% prediction should really end up in the same bucket, as they entail the same level of certainty; my experience with this particular forecasting domain however is that the symmetry implied by this argument is not necessarily given here. The category of things you’re very likely to do seems highly different in nature from the category of things you’re very unlikely to do. This lack of symmetry can also be observed in the fact that 90% predictions are ~10x more frequent for me in this domain than 10% predictions. ↩︎
It’s 30 minutes total, but the first 20 are just the planning process itself, whereas the 3+2+5 afterwards are the actual forecasting & calibration training. ↩︎
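The evaluation described in the footnotes above could be sketched roughly as follows. This is only a minimal illustration, not the system actually used: the function name `calibration_table` and the example data are made up, ambiguous cases are represented as `None` and dropped, and 10% and 90% predictions are deliberately kept in separate buckets rather than folded together.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Group (probability, outcome) pairs by stated probability.

    `outcome` is True/False, or None for ambiguous cases, which are
    dropped from the evaluation (an assumption based on the footnotes).
    Returns {probability: (number_of_forecasts, observed_hit_rate)}.
    """
    buckets = defaultdict(list)
    for probability, outcome in forecasts:
        if outcome is None:
            continue  # ambiguous case: neither clearly true nor false
        buckets[probability].append(outcome)
    return {
        p: (len(outcomes), sum(outcomes) / len(outcomes))
        for p, outcomes in sorted(buckets.items())
    }


# Hypothetical weekly data: (stated probability, did it happen?)
example = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True),
    (0.1, False),
    (0.5, None),  # e.g. the dentist case, dropped
]
table = calibration_table(example)
for p, (n, hit_rate) in table.items():
    print(f"{p:.0%} bucket: {n} forecasts, {hit_rate:.0%} came true")
```

Keeping the low and high buckets apart makes any asymmetry between "things I'm very likely to do" and "things I'm very unlikely to do" visible in the counts.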
"Before January 1st" in any particular time zone? I'll probably (85%) publish something within the next ~32h at the time of writing this comment. In case you're based in e.g. Australia or Asia that might then be January 1st already. Hope that still qualifies. :)
Indeed, thank you. :) I haven't started the other, forecasting related one, but intend to spend some time on it next week and hopefully come up with something publishable before the end of the year.
My thoughts on how to best prepare for the workshop (as mentioned in the post):
Sure. Those I can mention without providing too much context:
If anybody actually ends up planning to write a post on any of these, feel free to let me know so I'll make sure to focus on something else.
Good timing and great idea. Considering I've just read this: https://forum.effectivealtruism.org/posts/8Nwy3tX2WnDDSTRoi/announcing-the-forecasting-innovation-prize I'll gladly commit to submitting at least one forum post to the forecasting innovation prize (precise topic remains to be determined), which entails writing and publishing a post here or on LessWrong before the end of the year.
I further commit to publishing a second post (which I've already been working on for a while) before the end of the year.
If anybody would like to hold me accountable, feel free to contact me around December 20th and be very disappointed if I haven't published a single post by then.
Thanks for the prompt, Neel!
Nice! In the few minutes of reading this post I came up with five ideas for related things I could (and maybe should) write a post on. My only issue is that there are only 6 weeks of time for this, and I'm not sure if that'll be enough for me to finish even one given my current schedule. But I'll see what I can do. It may even be the right kind of pressure, as otherwise I'd surely be following Parkinson's law and working on a post for way too long.
(The many examples you posted were very helpful by the way, as without them I would have assumed I don't have much to contribute here)
"More than 5 people attended 80% of the EA Münster local group's meetups in 2021" - how will cancelled meetups (due to lack of attendees, if that ever happens) count towards this? Not at all, or as <=5 attendees? (This kind of reminds me of how Deutsche Bahn decided not to count cancelled trains as delayed.)
Also, coming from EA Bonn where our average attendance is ~4 people, I find the implications of this question impressive. :D