crossposted on LessWrong
I'm interested in questions of the form, "I have a bit of metadata/structure about the question, but I know very little about its content (or alternatively, I'm too worried about biases or hacks in how I think about the problem, or in which pieces of information I pay attention to). In those situations, what prior should I start with?"
I'm not sure if there is a more technical term than "low-information prior."
Some examples of what I found useful recently:
1. Laplace's Rule of Succession, for when the underlying mechanism is unknown.
2. Percentage of binary questions that resolve "yes" on Metaculus. It turns out that of all binary (Yes-No) questions asked on the prediction platform Metaculus, ~29% resolved yes. This means that even if you know nothing about the content of a question, a reasonable starting point for answering a randomly selected binary Metaculus question is 29%.
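As a rough illustration of the first example, here is a minimal sketch of Laplace's Rule of Succession: after observing s successes in n trials, the estimated probability of success on the next trial is (s + 1) / (n + 2). The function name `rule_of_succession` and the sample numbers are my own, chosen only for illustration.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Estimate P(next trial succeeds) as (s + 1) / (n + 2).

    This is the posterior mean under a uniform prior on the unknown
    success rate. The function name is hypothetical, for illustration.
    """
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# With no observations at all, the estimate is 1/2.
print(rule_of_succession(0, 0))
# Ten trials with no successes still leaves a nonzero estimate, 1/12.
print(rule_of_succession(0, 10))
```

Note that the estimate never reaches exactly 0 or 1, however lopsided the observations are.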
In both cases, there are obviously reasons to override the prior, in practice and in theory (for example, you could arbitrarily add a "not" to every question on Metaculus, so that your prior becomes 71%). However (I claim), having a decent prior is nonetheless useful in practice, even if it's theoretically unprincipled.
I'd be interested in seeing something like 5-10 examples of low-information priors as useful as the rule of succession or the Metaculus binary prior.
Thanks a lot for the pointers! Greaves' example seems to suffer the same problem, though, doesn't it?
We have information about the set and distribution of colors, and assigning 50% credence to the color red does not use that information.
The cube factory problem does suffer less from this, cool!
I wonder if one should simply model this hierarchically, assigning equal credence to the idea that the relevant measure in cube production is side length or volume. For example, we might have information about cube bottle customers who want to fill their cubes with water. Because the customers vary in how much water they want to fit in their cube bottles, it seems to me that we should put more credence in partitioning by volume. Or if we had some information that people often want to glue the cubes under their shoes to appear taller, the relevant measure would be the side length. Currently, we have no information like this, so we should assign equal credence to both measures.
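To make the hierarchical idea concrete, here is a rough sketch. It assumes the common version of the cube factory setup (side lengths between 0 and 1, asking for the chance that the side is at most 0.5); those numbers are my assumption, not something fixed above. The function name `p_side_at_most` and the `weight_on_side` parameter are hypothetical.

```python
def p_side_at_most(x: float, max_side: float = 1.0,
                   weight_on_side: float = 0.5) -> float:
    """P(side <= x) under a mixture of two indifference priors.

    With credence weight_on_side the side length is uniform on
    [0, max_side]; otherwise the volume is uniform on [0, max_side**3].
    All names and default numbers here are illustrative assumptions.
    """
    if not 0 <= x <= max_side:
        raise ValueError("need 0 <= x <= max_side")
    if not 0 <= weight_on_side <= 1:
        raise ValueError("weight_on_side must be a probability")
    p_if_uniform_side = x / max_side
    # side <= x is the same event as volume <= x**3.
    p_if_uniform_volume = (x / max_side) ** 3
    return (weight_on_side * p_if_uniform_side
            + (1 - weight_on_side) * p_if_uniform_volume)


# Equal credence in both measures: 0.5 * 0.5 + 0.5 * 0.125
print(p_side_at_most(0.5))
```

Information like the water-bottle or shoe-lift examples would then show up as a different `weight_on_side`, rather than as a forced choice between the two partitions.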