All of Joshc's Comments + Replies

The competition was cancelled. I think the funding for it was cut, though @Oliver Z can say more. I was not involved in this decision.

1
Oliver Z
1y
Yup, due to the FTX Collapse, the competition was no longer funded.

Thanks for this! I think it would be helpful to plot the median changes in extinction probabilities against the number of words in the article/video. I'm noticing a correlation as I click through the links and would be curious how strong it is (so this effect can be disentangled from the style of the source).
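For concreteness, a minimal sketch of the analysis I have in mind (the file name and column names below are placeholders, not the survey's actual schema):

```python
# Sketch only: "survey_results.csv", "word_count", and "median_delta_p_extinction"
# are hypothetical names standing in for however the raw data is actually stored.
import pandas as pd
from scipy.stats import spearmanr

df = pd.read_csv("survey_results.csv")

# Correlation between source length and the median shift in extinction probability
rho, p_value = spearmanr(df["word_count"], df["median_delta_p_extinction"])
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")

# Scatter plot to eyeball the relationship
ax = df.plot.scatter(x="word_count", y="median_delta_p_extinction")
ax.set_xlabel("Words in article / video transcript")
ax.set_ylabel("Median change in extinction probability")
ax.figure.savefig("length_vs_shift.png")
```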

1
Otto
1y
Hi Joshc, thanks, and sorry for the slow reply. It's a good idea! Unfortunately we don't really have time right now, but we might do something like this in the future. Also, if you're interested in receiving the raw data, let us know. Thanks again for the suggestion!

Yep, I have some ideas. Please DM me and give some info about yourself if you are interested in hearing them :)

Thanks for the referral. I agree that the distinction between serial time and parallel time is important and that serial time is more valuable. I'm not sure if it is astronomically more valuable though. There are two points we could have differing views on:

- the amount of expected serial time a successful (let's say $10 billion) AI startup is likely to counterfactually burn. In the post I claimed that this seems unlikely to be more than a few weeks. Would you agree with this?
- the relative value of serial time to money (which is exchangeable with pa... (read more)

1
D0TheMath
1y
No, see my comment above. It's the difference between a super duper AGI and only a super-human AGI, which could be years or months (but very, very critical months!). Plus whatever you add to the hype, plus worlds where you somehow make $10 billion from this are also worlds where you've had an inordinate impact, which makes me more suspicious that the $10 billion company world is the one where someone decided to just make the company another AGI lab.

Definitely not! Alignment is currently talent and time constrained, and very much not funding constrained. I don't even know what we'd buy that'd be worth $10 billion. Maybe some people have some good ideas. Perhaps we could buy lots of compute? But we can already buy lots of compute. I don't know why we aren't, but I doubt it's because we can't afford it. Maybe I'd trade a day for $10 billion? I don't think I'd trade 2 days for $20 billion though. Maybe I'm just not imaginative enough. Any ideas yourself?

If your claim is that 'applying AI models for economically valuable tasks seems dangerous, i.e. the AIs themselves could be dangerous', then I agree. A scrappy applications company might be more likely to end the world than OpenAI/DeepMind... it seems like it would be good, then, if more of these companies were run by safety-conscious people.

A separate claim is the one about capabilities externalities. I basically agree that AI startups will have capabilities externalities, even if I don't expect them to be very large. The question, then, is how much expected money we would be trading for expected time, and what the relative value between these two currencies is.

It's unclear to me that having EA people starting an AI startup is more tractable than convincing other people that the work is worth funding

Yeah, this is unclear to me too. But you can encourage lots of people to pursue earn-to-give paths (maybe a few will succeed). Not many are in a position to persuade people, and more people having this as an explicit goal seems dangerous.

Also, as an undergraduate student with short timelines, the startup path seems like a better fit.

I don't see how the flexibility of money makes any difference? Isn't it frustratingly d

... (read more)

That's a good point. Here's another possibility:
Require that students go through a 'research training program' before they can participate in the research program. It would have to actually help prepare them for technical research, though. Relabeling AGISF as a research training program would be misleading, so you would want to add a lot more technical content (reading papers, coding assignments, etc.). It would probably be pretty easy to gauge how much the training program participants care about X-risk / safety and factor that in when deciding whethe... (read more)

Oo exciting. Yeah, the research program looks like it is closer to what I'm pitching. 

Though I'd also be excited about putting research projects right at the start of the pipeline (if they aren't already). It looks like AGISF is still at the top of your funnel and I'm not sure if discussion groups like these will be as good for attracting talent.

5
juliakarbing
1y
I really appreciate this kind of post :) Agree that no one has AIS field-building figured out and that more experimentation with different models would be great!

One of my main uncertainties about putting these kinds of research projects early in the pipeline (and indeed one of the main reasons the Oxford group has been putting them after a round of AGISF) is that having one early on makes it much harder to filter for people who are actually motivated by safety. Because there is such demand among ML students for getting to do research projects, we worried that if we didn't filter by having them do AGISF first, we might get lots of people who are actually mainly interested in capabilities research, and then be putting our efforts and resources towards potentially furthering capabilities rather than safety (by giving 'capabilities students' skills and experience in ML research). Do you have any thoughts on this? In particular, is there a particular reason that you don't worry about this?

If there's a way of running what you describe without this being a significant risk, I'd be very excited about some groups trying this approach! And as Charlie mentions, the AI Safety Hub would be very happy to support them in running the research project (even at this early stage of the funnel). :))

Late to the party here, but I was wondering why these organizations need aligned engineering talent. Anthropic seems like the kind of org that talented, non-aligned people would be interested in...

These are reasonable concerns, thanks for voicing them. As a result of unforeseen events, we became responsible for running this iteration only a couple of weeks ago. We thought that getting the program started quickly — and potentially running it at a smaller scale as a result — would be better than running no program at all or significantly cutting it down.

The materials (lectures, readings, homework assignments) are essentially ready to go and have already been used for MLSS last summer. Course notes are supplementary and are an ongoing project.

We are pu... (read more)

Yeah, I would be in favor of interaction in simulated environments -- others might disagree, but I don't think this influences the general argument very much, as I don't think leaving some matter for computers will reduce the number of brains by more than an order of magnitude or so.

1
Guy Raveh
2y
That's not what I meant. What I tried to say is that the universe is full of beautiful things, like galaxies, plants, hills, dogs... More generally, complex systems with so many interesting things happening on so many scales. When I imagine a utopia, I picture a thriving human society in "harmony", or at least at peace, with nature. Converting all of it into simulated brains sounds like a dystopian nightmare to me. Since I first thought about my intrinsic values, I knew there's some divergence between e.g. valuing beauty and valuing happiness singularly. But I've never managed to imagine a scenario where increasing one goes so much against the other, until now. I think a large part of any hypothetical world being a utopia is that people would like to live in it. I'm not sure that, if you asked people about this scenario, they would find it favourable.

Having a superintelligence aligned to normal human values seems like a big win to me! 


Not super sure what this means, but the 'normal human values' outcome, as I've defined it, hardly contributes to EV calculations at all compared to the utopia outcome. If you disagree with this, please look at the math and let me know if I made a mistake.

8
Larks
2y
Sure. The math is clearly very handwavy, but I think there are basically two issues.

Firstly, the mediocre outcome supposedly involves a superintelligence optimising for normal human values, potentially including simulating people. Yet it only involves 10 billion humans per star, fewer than we are currently forecast to support on a single un-optimised planet using no simulations, no AGI help and relatively primitive technology. At the very least I would think we should be having massive terraforming and efficient food production to support much higher populations, if not full Dyson spheres and simulations. It's not going to be as many people as the other scenario, but it'll hopefully be more than Earth2100.

Secondly, I think the utilitarian outcome is over-valued on anything but purely utilitarian criteria. A world of soma-brains, without love, friendship, meaningful challenges etc. would strike many people as quite undesirable. It seems like it would be relatively easy to make this world significantly better by conventional lights at relatively low utilitarian cost. For example, giving the simulated humans the ability to turn themselves off might incur a positive but small overhead (as presumably very few happy people would take this option), but be a significant improvement by the standards of a conventional ethics which values consent.

Yep, I didn't initially understand you. That's a great point!

This means the framework I presented in this post is wrong. I agree now with your statement:

the EV of partly utilitarian AI is higher than that of fully utilitarian AI.


I think the framework in this post can be modified to incorporate this and the conclusions are similar. The quantity that dominates the utility calculation is now the expected representation of utilitarianism in the AGI's values.

The two handles become:
(1) The probability of misalignment.
(2) The expected representation of utilitaria... (read more)
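To make the two handles concrete, here is a toy version of the calculation. The functional form and numbers are my own illustrative assumptions, not necessarily the exact framework in the post, and Mau's reply below notes why the linear dependence on representation may be too strong:

```python
# Toy EV sketch; both handles and the linearity assumption are illustrative only.
p_misalignment = 0.4            # handle (1): probability of misalignment
expected_representation = 0.2   # handle (2): expected share of the AGI's values that is utilitarian
u_max = 1.0                     # value of a fully utilitarian outcome, normalized to 1

# Assumes realized value scales roughly linearly with utilitarian representation.
expected_value = (1 - p_misalignment) * expected_representation * u_max
print(expected_value)  # 0.12
```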

4
Mau
2y
On second thought, another potential wrinkle, re: the representation of utilitarianism in the AI's values. Here are two ways that could be defined:
* In some sort of moral parliament, what % of representatives are utilitarian?
* How good are outcomes relative to what would be optimal by utilitarian lights?
Arguably the latter definition is the more morally relevant one. The former is related but maybe not linearly. (E.g., if the non-utilitarians in a parliament are all scope-insensitive, maybe utilitarianism just needs 5% representation to get > 50% of what it wants. If that's the case, then it may make sense to be risk-averse with respect to expected representation, e.g., maximize the chances that some sort of compromise happens at all.)
2
Mau
2y
Thanks! From the other comment thread, now I'm less confident in the moral parliament per se being a great framework, but I'd guess something along those lines should work out.

Yep, thanks for pointing that out! Fixed it.

...I haven't seen much discussion about the downsides of delaying

I'm not sure how your first point relates to what I was saying in this post, but I'll take a guess. I said something about how investing in capabilities at Anthropic could be good. An upside to this would be increasing the probability that EAs end up controlling the super-intelligent AGI in the future. The downside is that it could shorten timelines, but hopefully this can be mitigated by keeping all of the research under wraps (which is what they ... (read more)

1
Ryan Beck
2y
Sorry, what I said wasn't very clear. Attempting to rephrase: I was thinking more along the lines of what the possible future for AI might look like if there were no EA interventions in the AI space. I haven't seen much discussion of the possible downsides there (for example, slowing down AI research by prioritizing alignment, resulting in delays in AI advancement and delays in good things brought about by AI advancement). But this was a less-than-half-baked idea; thinking about it some more, I'm having trouble thinking of scenarios where that could produce a lower expected utility. Thanks, I follow this now and see what you mean.

I agree with Zach Stein-Perlman. I did some BOTECs to justify this (see 'evaluating outcome 3'). If a reasonable candidate for a 'partially-utilitarian AI' leads to an outcome where there are 10 billion happy humans on average per star, then an AI that is using every last joule of energy to produce positive experiences would produce at least ~10^15 times more utility.
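As a rough illustration of where a factor on the order of 10^15 can come from (the inputs below are generic back-of-envelope figures, not necessarily the exact ones in 'evaluating outcome 3'):

```python
# Back-of-envelope sketch; all inputs are rough, illustrative figures.
solar_luminosity_w = 3.8e26    # power output of a Sun-like star, in watts
watts_per_human_mind = 20      # rough power budget of a human brain
humans_per_star = 1e10         # the 'partially-utilitarian' outcome: 10 billion humans per star

power_used = humans_per_star * watts_per_human_mind  # ~2e11 W
ratio = solar_luminosity_w / power_used              # ~2e15
print(f"Energy-limited outcome is ~{ratio:.0e} times larger")
```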

[This comment is no longer endorsed by its author]
3
Mau
2y
Why would a reasonable candidate for a 'partially-utilitarian AI' lead to an outcome that's ~worthless by utilitarian lights? I disagree with that premise--that sounds like a ~non-utilitarian AI to me, not a (nontrivially) partly utilitarian AI. (Maybe I could have put more emphasis on what kind of AI I have in mind. As my original comment mentioned, I'm talking about "a sufficiently strong version of 'partly-utilitarian.'" So an AI that's just slightly utilitarian wouldn't count. More concretely, I have in mind something like: an agent that operates via a moral parliament in which utilitarianism has > 10% of representation.) [Added] See also my reply to Zach, in which I write:

These are great! I'll add that you should be careful not to overbook yourself. I would leave an hour and a half in the middle of the day open in case you want to take a nap.

This could be helpful. Maybe posting questions on the EA forum and allowing the debate to happen in the comments could be a good format for this.

3
Harrison Durland
2y
The problem with using forums/comment chains is that the debate can become difficult to navigate and contribute to, due to the linear presentation of nested and parallel arguments. 2-dimensional/tree formats like Kialo seem to handle the problem much more efficiently, in my experience.

Got it! I edited the point about in-person reading so that it provides a more accurate portrayal of what you all are doing.