keith_wynroe

Comments

I think this response misses the forest for the trees. It's true that you can fit some utility function to behaviour if you construct a more fine-grained outcome-space on which the preferences come out coherent, etc. But this removes basically all of the predictive content that Eliezer et al. assume when invoking these theorems.

In particular, the use of these theorems in doomer arguments absolutely does implicitly care about "internal structure" stuff - e.g. one major premise is that non-EU-maximising AIs will reflectively iron out the "wrinkles" in their preferences to better approximate an EU-maximiser, since they will notice that, e.g., their incompleteness leaves them exploitable. The OP's argument shows that an agent with incomplete preferences will be inexploitable by its own lights. The fact that there's some completely different way to refactor the outcome-space such that, from the outside, it looks like an EU-maximiser is just irrelevant.
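As a toy sketch of the kind of policy I have in mind (my own illustrative construction with made-up labels, not the OP's actual argument): an agent with a strict preference for A+ over A but preferential gaps between B and both of them can simply refuse any trade into something strictly dispreferred to its starting point, and the classic "A+ for B, then B for A" pump never leaves it worse off by its own lights.

```python
# Toy illustration (my own construction, not the OP's): an agent with a strict
# preference for A+ over A, but preferential gaps between B and both A+ and A.
# Policy: never accept a trade into something strictly dispreferred (by the
# agent's own incomplete preferences) to what it started with.

STRICT_PREFS = {("A+", "A")}  # (x, y) means x is strictly preferred to y; every other pair is a gap


def strictly_prefers(x, y):
    return (x, y) in STRICT_PREFS


def run_trades(endowment, offers):
    """Accept each offered swap unless it lands somewhere strictly worse than the endowment."""
    holding = endowment
    for offered in offers:
        if not strictly_prefers(endowment, offered):
            holding = offered
    return holding


# The classic money pump against incompleteness: swap A+ for B, then B for A.
print(run_trades("A+", ["B", "A"]))  # -> "B": the second swap is refused, so the agent
                                     # never ends up strictly worse off by its own lights.
```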

>If describing a system as a decision theoretic agent is that cumbersome, it's probably better to look for some other model to predict its behaviour

This also seems to be begging the question - if I have something I think I can describe as a non-EU-maximising decision-theoretic agent, but which can only be described as an EU-maximiser with an incredibly cumbersome utility function, why not conclude that EU-maximisation is the wrong way to model the agent, rather than throwing out the belief that it should be modelled as an agent at all? If I have a preferential gap between A and B, and you have to jump through some ridiculous hoops to make this look EU-coherent ("he prefers [A and Tuesday and feeling slightly hungry and saw some friends yesterday and the price of blueberries is <£1 and....] to [B and Wednesday and full and at a party and blueberries >£1 and...]"), it seems like the correct conclusion is not to stop treating me as a decision-theoretic agent, but to stop treating me as well-modelled by an EU-maximiser.
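To make that concrete, here's a toy sketch (my own, with made-up contexts) of how any choice history can be "rationalised" this way: treat (option, context) pairs as distinct outcomes and the fitted utility function reproduces past behaviour by construction, while predicting nothing about unobserved contexts.

```python
# Toy sketch (illustrative only): fit a "utility function" to any choice history
# by fine-graining outcomes into (option, context) pairs. It reproduces the
# observed behaviour perfectly, but has no predictive content for any context
# not already observed.

def fit_trivial_utility(choice_history):
    """choice_history: list of (context, chosen_option, rejected_options)."""
    utility = {}
    for context, chosen, rejected in choice_history:
        utility[(chosen, context)] = 1.0      # whatever was picked gets high utility
        for option in rejected:
            utility[(option, context)] = 0.0  # whatever was passed over gets low utility
    return utility


history = [
    ("Tuesday, slightly hungry, blueberries < £1", "A", ["B"]),
    ("Wednesday, at a party, blueberries > £1", "B", ["A"]),
]
print(fit_trivial_utility(history))
# Perfectly "EU-coherent" on this fine-grained outcome-space, and perfectly
# useless for predicting what the agent does next Tuesday.
```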

>The less coherent and smart a system acts, the longer the utility function you need to specify...

These are two very different concepts? (Equating "coherent" with "smart" is again kinda begging the question.) Re: coherence, it's just tautologous that the more finely you have to partition the outcome-space to make things look coherent, the more complex the resulting utility function will be. Re: smartness, if we're operationalising this as "ability to steer the world towards states of higher utility", then smartness and utility-function-complexity are by definition independent. Unless you mean something more like "ability to steer the world in a way that seems legible to us", in which case it's again just tautologous.

The coherence theorem part seems particularly egregious to me given how load-bearing it is for a lot of his major claims. A personal frustration is that he often claims no one ever comes to him with good object-level objections to his arguments, but then when they do, like in that thread, he just refuses to engage.

>that makes extremely bright people with math PhDs make simple dumb mistakes that any rando can notice

Bright math PhDs who have already been selected for largely buying into Eliezer's philosophy/worldview, which changes how you should weigh this evidence. Personally I don't think FDT is so much wrong as talking past the other theories while being confused about that, and that's a much more subtle mistake that very smart math PhDs could quite understandably make.

I think it's very plausible that the reputational damage to EA from this - if it's as bad as it's looking to be - will outweigh the good the Future Fund has done, tbh.

Agreed, lots of kudos to the Future Fund people though.

Thanks for writing this, really great post. 

I don't think this is super important, but when it comes to things like FTX it's also worth keeping in mind that, besides the crypto volatility, a lot of the assets EA funding is being marked against aren't publicly traded, so the numbers should probably be taken with an even bigger pinch of salt than usual.

For example, the numbers for FTX here are presumably backed out of the implied valuation from its last equity raise, but AFAIK that was at the end of January this year. Since then Coinbase (probably the best publicly traded comparator) stock has fallen ~62% in value, whereas FTX's nominal valuation hasn't changed in the interim since there hasn't been a capital raise. But presumably, were FTX to raise money today, the implied valuation would reflect a somewhat similar move.
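As a rough sketch of the kind of adjustment I mean (all inputs below are made up for illustration, including the crude assumption that the private valuation moves one-for-one with the comparator):

```python
# Rough BOTEC sketch with made-up inputs: marking a stale private valuation to a
# public comparator's move since the last equity raise.

last_raise_valuation_bn = 30.0  # hypothetical: valuation implied by the last raise, in $bn
comparator_drawdown = 0.62      # comparator stock's fall since that raise
beta_to_comparator = 1.0        # crude assumption: the private company moves 1:1 with it

marked_valuation_bn = last_raise_valuation_bn * (1 - beta_to_comparator * comparator_drawdown)
print(f"Marked-down valuation: ~${marked_valuation_bn:.1f}bn "
      f"(vs the stale ${last_raise_valuation_bn:.0f}bn headline number)")
# With these made-up inputs the headline figure overstates the mark by ~2.6x,
# which is why such numbers deserve a big pinch of salt in funding BOTECs.
```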

Not a huge point, and these kinds of numbers are always very rough proxies anyway since the assets aren't liquid, but maybe worth keeping in mind when doing BOTECs for EA funding.

This looks wonderful, congrats. Dumb question on my end - there seems to be a lot of overlap in some areas with causes that Openphil targets. My impression was that Openphil wasn't funding-constrained in these areas and had more money to deploy than projects to put it into (maybe that's not accurate though).

If it is - what do you see as the marginal add of the Future Fund? E.g. will it have a different set of criteria from Openphil, such that it funds things Openphil has seen but wouldn't fund, or are you expecting a different pool of candidate projects that Openphil wouldn't be seeing?

I know you allude to it briefly and dismiss it, but it does seem like semantic externalism is maybe a better basis for a lot of the intuitions Indexical Dogmatism is getting at. It seems like you're saying the Putnam BIV-style externalism argument is too strong because actual BIVs will use it and be wrong to do so, so if we use it to dismiss the problem there's a chance we're making a mistake too? But the fact that we laugh at them doesn't mean they're not stating something True in BIV-English, right? I'm not sure it follows from something like that that it's a possibility we have to consider.

If the point is more: “We could wake up the BIV after he (it?) said that and it’d immediately do whatever the brain version of blushing is and admit the error of its ways”

or maybe

“We could start talking to the BIVs through their simulated environment and convince them of their situation and give them an existential crisis”

Then that’s true but not really inconsistent with their earlier utterances, since presumably whatever kind of theory you’re working with that gives you semantic externalism would also tell you that the Brain-out-of-a-Vat is now speaking a different language.