Actually, computer science conferences are peer reviewed. They play a similar role as journals in other fields. I think it's just a historical curiosity that it's conferences rather than journals that are the prestigious places to publish in CS!
Of course, this doesn't change the overall picture of some AI work and much AI safety work not being peer reviewed.
Thanks, this back and forth is very helpful. I think I've got a clearer idea about what you're saying.
I think I disagree that it's reasonable to assume that there will be a fixed N = 10^35 future lives, regardless of whether it ends up Malthusian. If it ends up not Malthusian, I think I'd expect the number of people in the future to be far less than whatever the maximum imposed by resource constraints is, i.e. much less than 10^35.
So I think that changes the calculation of E[saving one life], without much changing E[preventing extinction], because you need...
Ah nice, thanks for explaining! I'm still not following all the calculations, but that's on me, and I think they're probably right.
But I don't think your argument is actually that relevant to what we should do, even if it's right. That's because we don't care about how good our actions are as a fraction/multiple of what our other options are. Instead, we just want to do whatever leads to the best expected outcomes.
Suppose there was a hypothetical world where there was a one in ten chance the total future population was a billion, and 90% chance the p...
I think your calculations must be wrong somewhere, although I can't quite follow them well enough to see exactly where.
If you have a 10% credence in Malthusianism, then the expected badness of extinction is 0.1 × 10^35, or 0.1 times whatever value you think a big future has. That's still a lot closer to 10^35 times the badness of one death than to 10^10 times.
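To spell out the arithmetic, here's a minimal sketch, assuming (purely for illustration) that a non-Malthusian future has on the order of 10^10 people:

```latex
% Illustrative expected-value sketch (the 10^{10} non-Malthusian figure is an assumption):
% 10% credence in a Malthusian future of ~10^{35} lives,
% 90% credence in a non-Malthusian future of ~10^{10} lives.
\[
\mathbb{E}[\text{badness of extinction}]
  \approx 0.1 \times 10^{35} + 0.9 \times 10^{10}
  \approx 10^{34}
\]
% 10^{34} is within one order of magnitude of 10^{35} times the badness of
% one death, but about 10^{24} times larger than 10^{10} times it.
```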
Does that seem right?
I agree with this comment, but I interpreted your original comment as implying a much greater degree of certainty of extinction assuming ASI is developed than you might have intended. My disagree vote was meant to disagree with the implication that it's near certain. If you think it's not near certain it'd cause extinction or equivalent, then it does seem worth considering who might end up controlling ASI!
You're stating it as a fact that "it is" a game of chicken, i.e. that it's certain or very likely that developing ASI will cause a global catastrophe because of misaligned takeover. It's an outcome I'm worried about, but it's far from certain, as I see it. And if it's not certain, then it is worth considering what people would do with aligned AI.
Thanks Vasco! :)
I agree that thinking about other moral theories is useful for working out what utilitarianism would actually recommend.
That's an interesting point re increasing the total amount of killing, I hadn't considered that! But I was actually picking up on your comment which seemed to say something more general - that you wouldn't intrinsically take into account whether an option involved (you) killing people, you'd just look at the consequences (and killing can lead to worse consequences, including in indirect ways, of course). But it sounds like...
For what it's worth, although I do think we are clueless about the long-run (and so overall) consequences of our actions, the example you've given isn't intuitively compelling to me. My intuition wants to say that it's quite possible that the cat vs dog decision ends up being irrelevant for the far future / ends up being washed out.
Sorry, I know that's probably not what you want to hear! Maybe different people have different intuitions.
I don't think OpenAI's near term ability to make money (e.g. because of the quality of its models) is particularly relevant now to its valuation. It's possible it won't be in the lead in the future, but I think OpenAI investors are betting on worlds where OpenAI does clearly "win", and the stickiness of its customers in other worlds doesn't really affect the valuation much.
So I don't agree that working on this would be useful compared with things that contribute to safety more directly.
How much do you think customers having 0 friction to switching away fro...
Interesting!
I think my worry is people who don't think they need advice about what the future should look like. When I imagine them making the bad decision despite having lots of time to consult superintelligent AIs, I imagine them just not being that interested in making the "right" decision? And therefore their advisors not being proactive in telling them things that are only relevant for making the "right" decision.
That is, assuming the AIs are intent aligned, they'll only help you in the ways you want to be helped:
I agree that the text an LLM outputs shouldn't be thought of as communicating with the LLM "behind the mask" itself.
But I don't agree that it's impossible in principle to say anything about the welfare of a sentient AI. Could we not develop some guesses about AI welfare by getting a much better understanding of animal welfare? (For example, we might learn much more about when brains are suffering, and this could be suggestive of what to look for in artificial neural nets.)
It's also not completely clear to me what the relationship between the sentient being ...
Why does "lock-in" seem so unlikely to you?
One story:
Good question! I share that intuition that preventing harm is a really good thing to do, and I find striking the right balance between self-sacrifice and pursuing my own interests difficult.
> I think if you argue that that leads to anything close to a normal life you are being disingenuous
I think this is probably wrong for most people. If you make yourself unhappy by trying to force yourself to make sacrifices you don't want to make, I think most people will be much less productive. And I think that most people actually need a fairly normal social life etc. ...
I think misaligned AI values should be expected to be worse than human values, because it's not clear that misaligned AI systems would care about, e.g., their own welfare.
Inasmuch as we expect misaligned AI systems to be conscious (or to have whatever property is needed for us to care about them) and also to be good at looking after their own interests, I agree that it's not clear from a total utilitarian perspective that the outcome would be bad.
But the "values" of a misaligned AI system could be pretty arbitrary, so I don't think we should expect that.
I'm also confused by this. The use of "and" (instead of, say, "in that", "because", or "to the extent that") suggests that they've verified counterfactuality in some stronger way than just "the money won't go to us this season if you don't donate", but then they should be telling us how they know this.
Hi Isaac, this is a good question! I can elaborate more in the Q&A tomorrow but here are some thoughts:
Ultimately a lot depends on your personal fit and comparative advantage. I think people should do the things they excel at. While I do think you can have a more scalable impact on the groups team, the groups team would have very little to no impact without the organizers working on the ground!
I can share some of the reasons that led me to prefer working at CEA over working on the ground:
To be fair, I think I'm partly making wrong assumptions about what exactly you're arguing for here.
On a slightly closer read, you don't actually argue in this piece that it's as high as 90% - I assumed that because I think you've argued for that previously, and I think that's what "high" p(doom) normally means.
Makes sense. To be clear, I think global health is very important, and I think it's a great thing to devote one's life to! I don't think it should be underestimated how big a difference you can make improving the world now, and I admire people who focus on making that happen. It just happens that I'm concerned the future might be an even higher priority that many people could be in a good position to address.
On your last point, if you believe that the EV from an "effective neartermism -> effective longtermism" career change is greater than a "somewhat harmful career -> effective neartermism" career change, then the downside of using a "somewhat harmful career -> effective longtermism" example is that people might think the "stopped doing harm" part is more important than the "focused on longtermism" part.
More generally, I think your "arguments for the status quo" seem right to me! I think it's great that you're thinking clearly about the considerations on both sides, and my guess is that you and I would just weight these considerations differently.
Another thing on my mind is that we should beware surprising and suspicious convergence - it would be surprising and suspicious if the same intervention (present-focused WAW work) was best for improving animals' lives today and also happened to be best for improving animals' lives in the distant future.
I worry about people interested in animal welfare justifying maintaining their existing work when they switch their focus to longtermism, when actually it would be better if they worked on something different.
Thanks for your reply! I can see your perspective.
On your last point, about future-focused WAW interventions, I'm thinking of things that you mention in the tractability section of your post:
...Here is a list of ways we could work on this issue (directly copied from the post by saulius[9]):
“To reduce the probability of humans spreading wildlife in a way that causes a lot of suffering, we could:
- Directly argue about caring about WAW if humans ever spread wildlife beyond Earth
- Lobby to expand the application of an existing international law that tries to protect
For the kinds of reasons you give, I think it could be good to get people to care about the suffering of wild animals (and other sentient beings) in the event that we colonise the stars.
I think that the interventions that decrease the chance of future wild animal suffering are only a subset of all WAW things you could do, though. For example, figuring out ways to make wild animals suffer less in the present would come under "WAW", but I wouldn't expect it to make any difference to the more distant future. That's because if we care about wild animals, we'll fi...
I've only skimmed this, but just want to say I think it's awesome that you're doing your own thinking trying to compare these two approaches! In my view, you don't need to be "qualified" to try to form your own view, which depends on understanding the kinds of considerations you raise. This decision matters a lot, and I'm glad you're thinking carefully about it and sharing your thoughts.
Would you be eligible for the graduate visa? https://www.gov.uk/graduate-visa
If so, would that meet your needs?
The Superalignment team's goal is "to build a roughly human-level automated alignment researcher".
Human-level AI systems sound capable enough to cause a global catastrophe if misaligned. So is the plan to make sure that these systems are definitely aligned (if so, how?), or to make sure that they are deployed in such a way that they would not be able to take catastrophic actions even if they wanted to (if so, what would that look like?)?
Thanks David, that's just the kind of reply I was hoping for! Those three goals do seem to me like three of the most important. It might be worth adding that context to your write-up.
I'm curious whether there's much you did specifically to achieve your third goal - inspiring people to take action based on high quality reasoning - beyond just running an event where people might talk to others who are doing that. I wouldn't expect so, but I'd be interested if there was.
Interesting results, thanks for sharing! I think getting data from people who attend events is an important source of information about what's working and what's not.
I do worry a bit about what's best for the world coming apart from what people report as being valuable to them. (This comment ended up a bit rambly, sorry.)
Two main reasons that might be the case:
Do you think that most of GWWC's impact will come from money moved, or from introducing people to EA who then change their career paths, or something else? (I can't immediately tell from your strategy, which mentions both.)
Even if it's true that it can be hard to agree or disagree with a post as a whole, I do get the impression that people sometimes feel like they disagree with posts as a whole, and so simply downvote the post.
Also, I suspect it is possible to disagree with a post as a whole. Many posts are structured like "argument 1, argument 2, argument 3, therefore conclusion". If you disagree with the conclusion, I think it's reasonable to say that that's disagreeing with the post as a whole. If you agree with the arguments and the conclusion, then you agree with the po...
I wasn't sure if I was, but reading the guidelines matched my guess of what they would say, so I think I was familiar with them.