casebash

Comments

The Phil Torres essay in Aeon attacking Longtermism might be good

I think you're underestimating the impact bad faith criticism can have. Lots of people just copy their takes from someone else.

An ML safety insurance company - shower thoughts

Insurance seems like it could encourage companies to take more risks.

We're Redwood Research, we do applied alignment research, AMA

What's the main way that you think resources for onboarding people have improved?

We're Redwood Research, we do applied alignment research, AMA

Okay, "How alignment research might look different five or ten years from now?"

SIA > SSA, part 3: An aside on betting in anthropics

What do we mean by probability in the sense of credences? I would suggest that these kinds of claims only make sense from within the model and that we aren't making a literal ontological claim. Here's an example of what would count as an ontological claim about probability from quantum mechanics: probability distributions correspond to quantum states. In contrast, when we're talking about credences, almost none of our uncertainty is due to uncertainty inherent in physics itself, such as that which comes from quantum uncertainty. That is, this uncertainty is present in our model of the universe rather than in the universe itself.

Following this logic seems to suggest that anthropic claims not related to quantum mechanics shouldn't be taken literally either. My take on probability is (unsurprisingly) essentially the same as my take on counterfactuals:

a) That probability is a partially-constructed, partially-intrinsic frame that we impose on the world and use to organise our experiences

b) However, that it's possible that this frame may be subsumed into another in the same way that Einstein subsumed space and time into space-time.

The partially constructed nature of probability means that we can't really talk about what it means outside of a particular context or set of goals. Betting provides one such way in which we can define a context or task, although I suspect it'd be more fruitful to frame this in terms of a scoring system rather than betting behaviour.

You identified that, for example, an altruistic, halfer EDT agent bets in a way that is different from its credences. This is a problem if we try to equate betting and credences. On the other hand, if we think about credences in terms of adopting a particular scoring system, then this problem disappears. If the scoring system was constructed for non-altruistic agents, then it's hardly surprising if an altruistic agent acts in a way that appears strange on the face of it given the score that it assigned. The altruistic agent may still be able to make use of the scoring system by making appropriate adjustments. Indeed, my preferred approach to anthropics is that, insofar as it is possible, our value system should be built on top of our anthropics, instead of being integrated into it (1).

It is worth noting that adopting a scoring system doesn't inherently indicate a value commitment, since one scoring system may make sense in one context and another in a different context. Nonetheless, if there is to be a standard scoring system, it has to be one or the other.

So the question of anthropic probability breaks down as follows:

a) If we are going to have a standard scoring system called probability, what convention should we adopt in relation to anthropics?

b) In what circumstances and for what purposes do different scoring systems make sense?
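To make the scoring-system framing a bit more concrete, here is a minimal sketch. It assumes a Sleeping Beauty style setup (fair coin, one awakening on heads, two on tails) and a logarithmic scoring rule; neither is specified above, and the function names are purely illustrative. The only thing that changes between the two runs is the convention for how often the agent is scored.

```python
import math

def expected_log_score(p_heads, awakenings_if_tails):
    """Expected total log score for reporting credence p_heads in heads.

    Assumed setup: a fair coin, one scored report on heads, and
    `awakenings_if_tails` scored reports on tails.
    """
    heads_term = 0.5 * math.log(p_heads)
    tails_term = 0.5 * awakenings_if_tails * math.log(1.0 - p_heads)
    return heads_term + tails_term

def best_credence(awakenings_if_tails, steps=999):
    """Grid-search the credence in heads that maximises the expected score."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(grid, key=lambda p: expected_log_score(p, awakenings_if_tails))

# Convention 1: score every awakening (two scored reports on tails).
per_awakening = best_credence(awakenings_if_tails=2)

# Convention 2: score once per experiment, however many awakenings occur.
per_experiment = best_credence(awakenings_if_tails=1)

print(f"scored per awakening:  best credence in heads ~ {per_awakening:.3f}")
print(f"scored per experiment: best credence in heads ~ {per_experiment:.3f}")
```

Under these assumptions, scoring per awakening favours a credence near 1/3 and scoring per experiment favours 1/2, which is one way of seeing that the choice of convention, rather than a further fact about the world, is doing the work.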

Even though you make legitimate criticisms of framing the problem in terms of betting, this core insight from the betting perspective survives untouched.

(1) This may not be possible insofar as choosing axioms is a value judgement rather than a judgement of fact, but it should be possible to construct a decision theory that is useful to both altruistic and non-altruistic agents.

[PR FAQ] Improving tag notifications

"A tag can be considered roughly analogous to a subreddit" - Tags are not analogous to sub-reddits except in the broadest sense. The key aspect of sub-reddits that makes them so powerful is that they are structured to allow the formation of distinct communities with their own community norms.

[PR FAQ] Tagging users in posts and comments

I'm a bit skeptical tbh. It feels as though it might lead to a lot of spammy comments consisting of either just a tag or a tag plus something like "Check this out".

Seeking social science students / collaborators interested in AI existential risks

I've written a post on this topic here - https://www.lesswrong.com/posts/9rtWTHsPAf2mLKizi/counterfactuals-as-a-matter-of-social-convention.

BTW, I should be clear that my opinions on this topic aren't necessarily a mainstream position.

Seeking social science students / collaborators interested in AI existential risks

I am of the belief that counterfactuals are socially constructed to an extent and so it might be useful for someone from a social science background to investigate this - at least if you think there's value in MIRI's research agenda.
