All of Jonas Hallgren's Comments + Replies

I appreciate you putting out a post in support of someone who might have some EA leanings that would be good to pick up on. I may or may not have done something similar in the past and then removed the post because people absolutely shat on it on the forum 😅 so respect.

2
PeterSlattery
2d
Thanks, Jonas! I appreciate the support :P 

I guess I felt that a lot of the post was arguing from a utilitarian frame, which I think is generally fair. When it comes to "not leaving a footprint on the future," what I'm referring to is epistemic humility about the correct moral theories. I'm quite uncertain myself about what is correct when it comes to morality, with extra weight on utilitarianism. From this, we should be worried about being wrong and therefore try our best not to lock in whatever we're currently thinking. (The classic example being that if we did this 200 years ago we might still ha...

Yes, I was on my phone, and you can't link things there easily; that was what I was referring to. 

I feel like this goes against the principle of not leaving your footprint on the future, no?

Like, a large part of what I believe to be the danger with AI is that we don't have any reflective framework for morality. I also don't believe the standard path to AGI is one of moral reflection. To me, that means we would be leaving the value of the future up to market dynamics, which doesn't seem good given all the traps in such a situation (Moloch, for example).

If we want a shot at a long reflection or similar, I don't think full sending AGI is the best thing to do.

6
Matthew_Barnett
17d
A major reason that I got into longtermism in the first place is that I'm quite interested in "leaving a footprint" on the future (albeit a good one). In other words, I'm not sure I understand the intuition for why we wouldn't deliberately try to leave our footprints on the future, if we want to have an impact. But perhaps I'm misunderstanding the nature of this metaphor. Can you elaborate? I also think it's worth being more specific about why you think AGI will not do moral reflection. In the post, I carefully consider arguments about whether future AIs will be alien-like and have morally arbitrary goals, in the respect that you seem to be imagining. I think it's possible that I addressed some of the intuitions behind your argument here.

How will you address the conflict of interest allegations raised against your organisation? It feels like the two organisations are awfully intertwined. For God's sake, the CEOs are sleeping with each other! I bet they even do each other's taxes!

I'm joining the other EA. 

This was a dig at interpretability research. I'm pro-interpretability research in general, so if you feel personally attacked by this, it wasn't meant to be too serious. Just be careful, ok? :)

It makes sense for the dynamics of EA to naturally go this way (not that I endorse it). It is just applying the intentional stance plus the free energy principle to the community as a whole. I find myself generally agreeing with the first post at least, and I notice the large regularization pressure being applied to individuals in the space.

I often feel the bad vibes associated with trying hard to get into an EA organisation. As a consequence, I'm doing for-profit entrepreneurship for AI safety adjacent to EA, and it is very enjoyable. (And more impactful...

I did enjoy the discussion here in general. I hadn't heard of the "illusionist" stance before, and it sounds quite interesting, yet I also find it quite confusing.

I generally find there to be a lot of confusion about the relation of the self to what "consciousness" is. I went down this rabbit hole of thinking about it a lot, and I realised I had to probe the edges of my "self" to figure out how it truly manifested. A thousand hours into meditation, some of the existing barriers have fallen down.

The complex attractor state can actually be experienced in m...

Thank you for this post! I will make sure to read the 5/5 books that I haven't read yet. I'm especially excited about Joseph Henrich's book from 2020; I had read The Secret of Our Success before but not that one.

I actually come at moral progress from an AI Safety interest. For me, the question is to some extent how we can set up AI systems so that they continuously improve "moral progress," as we don't want to leave our fingerprints on the future.

In my opinion, the larger AI Safety dangers come from "big data hell" like the ones descr...

2
Rafael Ruiz
4mo
Hi Jonas! Henrich's 2020 book is very ambitious, but I thought it was really interesting. It has lots of insights from various disciplines, attempting to explain why Europe became the dominant superpower from the Middle Ages (starting to take off around the 13th century) to modernity.

Regarding AI, I think it's currently beyond the scope of this project. Although I mention AI at some points regarding the future of progress, I don't develop anything in-depth, so sadly I don't have any new insights regarding AI alignment. I do think theories of cultural evolution and processes of mutation and selection of ideas could play a key role in predicting and shaping the long-term future, whether it's for humans or AI. So I'm excited for some social scientists or computer modellers to try to take this kind of work in a direction applied to making AI values dynamic and evolving (rather than static). But again, it's currently outside the scope of my work and area of expertise.

The number of applications will affect the counterfactual value of applying. Now, sharing your expected number might lower the number of people who apply, but I would still appreciate having a range of expected applicants for the AI Safety roles.

What is the expected number of people applying for the AI Safety roles?

6
Ajeya
6mo
It's hard to project forward of course, but currently there are ~50 applicants to the TAIS team and ~100 to the AI governance team (although I think a number of people are likely to apply close to the deadline).

I'm getting the vibe that your priors are, to some extent, on the world being in a multipolar scenario in the future. I'm interested more specifically in your predictions for multipolarity versus a singleton given shard-theory thinking, as it seems unlikely that recursive self-improvement would happen in the way described, given what I understand of your model.

Great post; I enjoyed it.

I've got two things to say. The first is that GPT is a very nice brainstorming tool: it generates many more ideas than you could on your own, which you can then prune.

Secondly, I've been doing "peer coaching" with some EA people, using reclaim.ai (not sponsored) to automatically book meetings each week where we take turns being the mentor and the mentee, answering the following five questions:

- What's on your mind?
- When would today's setting be a success?
- Where are you right now?
- How do you ge...

Isn't expected value calculated as probability times utility, and as a consequence isn't the higher-risk part wrong if one simply looks at it like this? (Going from 20% to 10% would be 10x the impact of going from 2% to 1%.)

(I could be missing something here, please correct me in that case)

5
rileyharris
7mo
My understanding is that, at a high level, this effect is counterbalanced by the fact that a high rate of extinction risk means the expected value of the future is lower. In this example, we only reduce the risk this century to 10%; next century it will be 20%, the one after that 20%, and so on. So the risk is 10x higher than in the 2%-to-1% scenario, and in general, higher risk lowers the expected value of the future. In this simple model, these two effects perfectly counterbalance each other for proportional reductions of existential risk. In fact, in this simple model the value of reducing risk is determined entirely by the proportion of the risk reduced and the value of future centuries. (The model is very simplified, and Thorstad explores more complex scenarios in the paper.)
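To make the counterbalancing concrete, here is a minimal numerical sketch of that kind of simple model. This is my own toy reconstruction for illustration, not Thorstad's exact formulation: assume a constant extinction risk r per century, a value v realized in each century humanity survives, and an intervention that lowers the risk in the current century only, from r to r1.

```python
# Toy model (illustrative reconstruction, not Thorstad's exact setup):
# constant per-century extinction risk r, value v per surviving century.
# Baseline expected value of the future: W = sum_{t>=1} v*(1-r)^t = v*(1-r)/r.
# Reducing risk in the first century only, from r to r1, gives
# W' = (1-r1)*(v + W), so the gain W' - W = v*(r - r1)/r depends only on
# the *proportion* of risk removed, not on the absolute percentage points.

def gain_from_one_century_reduction(r, r1, v=1.0):
    baseline = v * (1 - r) / r            # expected value with constant risk r
    improved = (1 - r1) * (v + baseline)  # risk r1 this century, r afterwards
    return improved - baseline

print(gain_from_one_century_reduction(0.20, 0.10))  # ~0.5: halving a 20% risk
print(gain_from_one_century_reduction(0.02, 0.01))  # ~0.5: halving a 2% risk
```

Both interventions halve the risk, and in this toy model both add the same expected value (about half a century's worth), which is why the 20%-to-10% case is not 10x the 2%-to-1% case.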

I didn't mean it in that sense. I think the lesson you drew from it is fair in general; I was just reacting to the things I felt you swept under the rug, if that makes sense.

Sorry, Pablo, I meant that I became a lot more epistemically humble; I should have thought more about how I phrased it. It was more that I went from the opinion that many worlds is probably true to "oh man, there are some weird answers to the Wigner's friend thought experiment, and I shouldn't give major weight to any of them." So I'm more like maybe 20% on many worlds?

That being said, I am overconfident from time to time, and it's fair to point that out as well. Maybe you were being overconfident in saying that I was overconfident? :D

I will say that I thought the consciousness/p-zombie distinction was very interesting and a good example of overconfidence, as this didn't come across in my previous comment.

Generally, some good points across the board that I agree with. Talking with some physicist friends helped me debunk the many worlds thing Yud has going. Similarly, his animal consciousness stuff seems a bit crazy as well. I will also say that I feel you're coming off way too confident and inflammatory when it comes to the general tone. The AI Safety argument you provided was just dismissal without much explanation. Also, when it comes to the consciousness stuff, I honestly just get kind of pissed reading it, as I feel you're to some extent hard pandering...

4
Omnizoid
8mo
I don't think I really overgeneralized from limited data.  Eliezer talks about tons of things, most of which I don't know about.  I know a lot about maybe 6 things that he talks about and expresses strong views on.  He is deeply wrong about at least four of them. 
3
Omnizoid
8mo
Eliezer talks about lots of topics that I don't know anything about. So I can only write about the things that I do know about. There are maybe five or six examples of that, and I think he has utterly crazy views in perhaps all except one of those cases. I can't fact-check him on physics or nanotech, for instance.

> Talking with some physicist friends helped me debunk the many worlds thing Yud has going.

Yudkowsky may be criticized for being overconfident in the many-worlds interpretation, but to feel that you have “debunked” it after talking to some physicist friends shows excessive confidence in the opposite direction. Have you considered how your views about this question would have changed if e.g. David Wallace had been among the physicists you talked to?

Also, my sense is that "Yud" was a nickname popularized by members of the SneerClub subreddit (one of the most i...

Maybe frame it more as if you're talking to a child. Yes, you can tell the child to follow something, but how can you be certain that it will do it?

Similarly, how can we trust the AI to actually follow the prompt? To trust it we would fundamentally have to understand the AI or safeguard against problems if we don't understand it. The question then becomes how your prompt is represented in machine language, which is very hard to answer.

To reiterate, ask yourself, how do you know that the AI will do what you say?

John Wentworth has a post on Godzilla strategies where he claims that tasking an AGI with solving the alignment problem is like asking Godzilla to make a larger Godzilla behave. How will you ensure you don't overshoot the intelligence of the agent you're using to solve alignment and fall into the "Godzilla trap"?

3
Jonas Hallgren
9mo
(Leike responds to this here if anyone is interested)

TL;DR: I totally agree with the general spirit of this post: we need people to solve alignment, and we're not on track. Go and work on alignment, but before you do, try to engage with the existing research; there are reasons why it exists. There are a lot of things not getting worked on within AI alignment research, and I can almost guarantee that within six months to a year you can find things that people haven't worked on.

So go and find these underexplored areas, but in a way that engages with what people have done before you!

> There’s no secret e...

Great tool; I've enjoyed it and used it for two years. I (a random EA) would recommend it.

Thank you for this! I'm hoping that this enables me to spend a lot less time on hiring in the future. I feel that this is a topic that could easily have taken me 3x the effort to understand if I hadn't gotten some very good resources from this post, so I will definitely check out the book. Again, awesome post!

1
Richard Möhn
2y
Thanks! I'm glad you liked it!

That makes sense, and I would tend to agree that the framing of contingency invokes more of a "what if I were to do this?" feeling, which might be more conducive to people choosing to do more entrepreneurial thinking, which in turn seems to have higher impact.

Good post; interesting point that the impact of the founder effect is probably higher in longtermism, and I would tend to agree that starting a new field can have a big impact. (Such as wild animal suffering in space. NO FISH ON MARS!)

Not to be the guy that points something out, but I will be that guy; why not use the classic EA jargon of counterfactual impact instead of contingent impact?

2
Richard Ren
2y
Thanks a ton for your kind response (and for being the guy that points something out). :) "Counterfactual" & "replaceability" work too and essentially mean the same thing, so I'm really choosing which beautiful fruit I prefer in this instance (it doesn't really matter). I slightly prefer the word contingent because it feels less hypothetical and more like you're pulling a lever for impact in the future, which reflects the spirit I want to create in community building. It also seems to reflect uncertainty better: e.g. the ability to shift the path dependence of institutions, the ability to shape long-term trends. Contingency captures how interventions affect the full probability spectrum and time-span, rather than just envisioning a hypothetical alternate history world with and without an intervention in x years. Thus, despite hearing the other phrases, it was the first word that clicked for me, if that makes sense.

Essentially, that the epistemics of EA are better than in previous longtermist movements. EA's frameworks are a lot more advanced, with things such as thinking about the tractability of a problem, not Goodharting on a metric, forecasting calibration, RCTs, and so on: techniques that other movements didn't have.

1
ChristianKleineidam
2y
Whether or not AI risk is tractable is in doubt. Eliezer argued that it's likely not tractable but that we should still invest in it. The longtermist arguments about the value of the far future suggest that even if there's only a 0.1% chance that AI risk is tractable, we should still fund it as the most important cause.

Thank you! I was looking for this one but couldn't find it.

> The ones who aimed at the distant future mostly failed. The longtermist label seems mostly unneeded and unhelpful - and I'm far from the first to think so.

Firstly, in my mind, you're trying to say something akin to: we shouldn't advertise longtermism because it hasn't worked in the past. Yet this is a claim about the tractability of the philosophy, not necessarily about the idea that future people matter.

Don't confuse the philosophy with the instrumentals: longtermism matters, but the implementation method is still up for debate.

> But I don’t vi...
4
[anonymous]
2y
What do you mean by this? 
3
Lizka
2y
Related: Hero Licensing (the title of the first section is "Outperforming the outside view"). 

This is completely unrelated to the great point you made with the comment, but I felt I had to share a classic(?) EA tip that worked well for me (uncertain how much this counts as a classic). I got to the nice nihilistic bottom of realising that my moral system is essentially based on evolution, but I reversed that within a year by reading a bunch of Buddhist philosophy and by meditating. Now it's all nirvana over here! (Try it out now...)

https://www.lesswrong.com/posts/Mf2MCkYgSZSJRz5nM/a-non-mystical-explanation-of-insight-meditation-and-the

https://www.less...

If you feel like booking a meeting to help me out in some way, here's my calendly: https://calendly.com/jonas-hallgren

Also, it is more focused on easier projects that university groups and people new to EA can manage on their own. For example, the EA Hotel is a cool idea, but only a select few people could pull it off; it's more like running a conference in your city or helping with EA resources by creating a website.

Hey Aaron, I missed your message here, so sorry for the late response. I should probably relabel this to "Community Project Voting" or something; the idea was to have a category specifically for individual projects, like a system where the highest-rated posts are the ones most "recommended" by the community. The plan for the website was to give pathways to starting community projects, such as a smaller conference or some other smaller, scalable thing that a lot of communities can do under some other organisation. The category would be used for clarity on which individual projects are currently ranked best.

I find it quite ironic that I might have missed this post because of a high workload in school. I'm dropping a class next term thanks to this post, so very much appreciated!

One of the bigger parts is probably that it would have a public prize attached to it. I get the feeling from people outside EA that altruism is charity and not something you can actually build a career in. A person has a certain threshold of motivation to get through before digging into EA, and I believe this threshold would be easier to get through if there were a potential explicit reward at the end of it (a carrot on a stick). It might also generate some interesting ideas that could be tried out. Essentially, the idea is that it would turbocharge the fellowships, as they would have something to apply the ideas of EA to.

To use EA terms, this was an absolute banger of a post. As someone who's two weeks from starting up a club from scratch, I will certainly use the tips and tricks in this post. Great fricking job, man!

On a side note, I've had some ideas about running an EA-related competition on campus; have you thought about something similar before? I was thinking it would be like an essay competition with the goal of coming up with an idea that saves the most lives. You give the people interested information that they can evaluate their ideas wit...

3
kuhanj
3y
That's very sweet, thank you Jonas! I have been in some conversations about EA essay/idea competitions similar to what you've mentioned, but haven't thought much about it. I think we're also thinking about ideas like hackathons as experimental outreach mechanisms to try out. How do you think something like what you're proposing would compare to the more standard intro EA programming (like intro talks and fellowships)?

Great to "see" you, Sean! I do remember our meeting during the conference; an interesting chat for sure.
Thank you for the long and deliberate answer. I checked out the stuff you sent, and it of course sent me down a rabbit hole of EA motivation, which was quite cool. Other than that, it makes sense to modify my working process and goals a bit in order to get motivation from sources other than altruism. I think the two main things I take from the advice here are to have a more written account of why I do things, but most importantly I need to get in...

First and foremost, I really liked the engineering pandemics video. I actually saw it independently a month ago, and it reminds me a lot of good informational YouTube content (which is good). I wanted to state my support for this because I think what you're trying to do is great! Other than that, I'm a person with time on my hands and YouTube experience, so I would love to help in whatever way I can, especially with demographics and outreach. (I've sent you a DM :) ).