Co-Director of Equilibria Network: https://eq-network.org/
I try to write as if I were having a conversation with you in person.
I would like to claim that my current safety beliefs are a mix of Paul Christiano's, Andrew Critch's, and Def/Acc's.
Some people might find that this post is written from a place of agitation, which is fully okay. Even if you do, there are two things I would want to point out as really good points:
I think there's a very, very interesting project in democratizing the EA community in a way that makes it more effective. There are lots of institutional designs we could apply to ourselves, and I would be very excited to see more work in this direction!
Edit:
Clarification on why I believe it causes some agitation for some people:
This isn’t just a technical issue. This is a design philosophy — one that rewards orthodoxy, punishes dissent, and enforces existing hierarchies.
I liked the post, I think it made a good point, and I strong-upvoted it, but I wanted to mention this as a caveat.
I felt that this post might be relevant to longtermism and person-affecting views, so I had Claude write up a quick report on that:
In short: Rejecting the SWWM 💸11% pledge's EV calculation logically commits you to person-affecting views, effectively transforming you from a longtermist into a neartermist.
Example: Bob rejects investing in a $500 ergonomic chair despite the calculation showing 10^50 * 1.2*10^-49 = 12 lives saved due to "uncertainty in the probabilities." Yet Bob still identifies as a longtermist who believes we should value future generations. This is inconsistent, as longtermism fundamentally relies on the same expected value calculations with uncertain probabilities that SWWM uses.
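For concreteness, here is the arithmetic the example leans on, as a minimal sketch in Python (the population size and probability are the satirical figures from the example above, not real estimates):

```python
# Expected value of Bob's chair purchase, using the example's satirical numbers.
future_lives = 10 ** 50        # assumed far-future population from the example
p_chair_is_decisive = 1.2e-49  # assumed probability that the chair makes the difference

expected_lives_saved = future_lives * p_chair_is_decisive
print(expected_lives_saved)    # ~12.0 lives saved in expectation
```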
The 🔮 Badge
If you've rejected the SWWM 💸11% Pledge while maintaining longtermist views, we'd appreciate it if you could add the 🔮 "crystal ball" emoji to your social media profiles to signal your epistemic inconsistency.
FAQ
Why can't I reject SWWM but stay a longtermist?
Both longtermism and SWWM rely on the same decision-theoretic framework of accepting tiny probabilities of affecting vast future populations. Our analysis shows the error bars in SWWM calculations (±0.0000000000000000000000000000000000000000000001%) are actually narrower than the error bars in most longtermist calculations.
What alternatives do I have?
According to our comprehensive Fermi estimate, maintaining consistency between your views on SWWM and longtermism is approximately 4.2x more philosophically respectable.
First and foremost, I'm low confidence here.
I will focus on x-risk from AI, and I will challenge the premise that this is the right way to ask the question.
What is the difference between x-risk and s-risk/increasing the value of futures? When we mention x-risk with regard to AI, we think of humans going extinct, but I believe that to be shorthand for wise, compassionate decision-making (at least in the EA sphere).
Personally, I think that x-risk and good decision-making in terms of moral value might be coupled to each other. We can think of our current governance structures a bit like correction systems for individual errors. If errors pile up, we go off the rails and increase x-risk as well as the chances of a bad future.
So a good decision-making system should account for both x-risk and value estimation; therefore the solution is the same, and it is a false dichotomy?
(I might be wrong and I appreciate the slider question anyway!)
First and foremost, I agree with the point. I think looking at this through the lens of transformative AI in particular might be interesting. (Coincidentally, this is something I'm currently doing using agent-based models (ABMs) with LLMs.)
You probably know this one but here's a link to a cool project: https://effectiveinstitutionsproject.org/
Dropping some links below. I've been working on this with a couple of people in Sweden for the last two years; we're building an open-source platform for better democratic decision-making using prediction markets:
https://digitaldemocracy.world/flowback-the-future-of-democracy/
The people I'm working with there are also working on:
I know the general space here, so if anyone is curious, I'm happy to link to people doing different things!
You might also want to check out:
I guess a random thought I have here is that you would probably want video, and you would probably want it to be pretty spammable so that you have many shots at it. Looking at Twitter, we already see large numbers of bots commenting on things, which is essentially a text deepfake.
I can see that in a year or so, when Sora is good enough that creating a short-form, stable video is easy, we will see a lot more manipulation of voters through deepfakes on various social media.
(I don't think the tech is easy enough to use yet for this to be painless, even though it is possible. I spent a couple of hours trying to set this up for a showcase once; you had to do some fine-tuning and training, and there was no plug-and-play option, which is probably a bottleneck for now.)
FWIW, I find that if you analyze places where we've successfully aligned things in the past (social systems, biology, etc.), the 1st and 2nd types of alignment really don't break down in that way.
After doing Agent Foundations for a while, I'm just really against the alignment frame, and I'm personally hoping that more research in this direction will happen so that we get more evidence that other types of solutions are needed (e.g. alignment of complex systems, as has happened in biology and social systems in the past).
FWIW, I completely agree with what you're saying here. I think that if you go seriously into consciousness research, especially into what we Westerners would label a sense of self rather than anything else, it quickly becomes infeasible to hold the position that the direction we're taking AI development, e.g. towards AI agents, will not lead to AIs having self-models.
For all intents and purposes, this encompasses most physicalist or non-dual theories of consciousness, which are the only feasible ones unless you want to bite some really sour apples.
There's a classic "what are we getting wrong?" question in EA, and I think it's extremely likely that we will look back in 10 years and say, "wow, what were we doing here?"
I think it's a lot better to think in terms of systemic alignment: look at the properties we want for the general collective intelligences we participate in, such as our information networks or our institutional decision-making procedures, and think about how we can optimise these for resilience and truth-seeking. If certain AIs deserve moral patienthood, then that truth will naturally arise from such structures.
(hot take) Individual AI alignment might honestly be counter-productive towards this view.
I'm not a career counsellor, so take everything with a grain of salt, but you did publicly post this inviting unsolicited advice, so here you go!
More directly: if you're thinking of EA as a community that needs specific skills and you're wondering what to do, your people-management, strategy, and general leadership skills are likely to be in high demand from other organisations: https://forum.effectivealtruism.org/posts/LoGBdHoovs4GxeBbF/meta-coordination-forum-2024-talent-need-survey
Someone else mentioned that enjoyment can be highly organisation-specific, and even specific to the stage of the organisation.
My thought is something like:
Those are some random thoughts, best of luck to you!
So I'll just report on a vibe I've been feeling on the forum.
I feel a lot more comfortable posting on LessWrong compared to the EA Forum because it feels like there's a lot more moral outrage here. If I go back three or four years, I felt that the forum was a lot more open to discussing and exploring new ideas. There have been some controversies recently around the meat-eater problem and similar topics, and I can't help but feel uncomfortable posting with how people have started to react.
I like the different debate weeks, as I think they set up a specific context for creating more content, which is quite great. Maybe it's a vibe thing, maybe it's something else, but I feel that the virtue of open-hearted truth-seeking is missing compared to a couple of years back, and it makes me want to avoid posting.
I do believe that the bar for posting should be lowered at least a bit and that things should become more exploratory again. So, uhm, more events that invite community writing and engagement?
This is very nice!
I've been thinking that there's a nice, generalisable analogy between Bayesian updating and forecasting. (It's fairly obvious when you think about it, but it feels like people aren't exploiting it?)
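To make the analogy concrete, here's a minimal sketch (the numbers are illustrative, not from any real forecasting question): a forecaster revising their probability in light of new evidence is just performing a Bayesian update in odds form.

```python
# A forecaster revising a probability, written as a Bayesian update in odds form.
def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability from a prior and a likelihood ratio
    P(evidence | hypothesis) / P(evidence | not hypothesis)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

forecast = 0.30                       # initial forecast that some event happens
forecast = bayes_update(forecast, 4)  # new evidence is 4x likelier if it does
print(round(forecast, 3))             # -> 0.632
```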
I'm doing a project simulating a version of this idea, but in a way that uses democratic decision-making, called Predictive Liquid Democracy (PLD), and I would love to hear any thoughts you have on the general setup. It is model parameterization, but within a specific democratic framing.
PLD is basically saying the following:
What if we could set up a trust-based, meritocratic voting network built on predictions about how well a candidate will perform? It is futarchy with some twists.
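As a toy illustration of the kind of mechanism I mean (the aggregation rule, weights, and names below are my own simplifying assumptions for the sketch, not the actual PLD/FlowBack specification):

```python
# Toy sketch of Predictive Liquid Democracy: voters either forecast candidate
# performance themselves or delegate to a forecaster, and each forecaster's
# influence combines delegated votes with their historical accuracy.
# All numbers and the aggregation rule are illustrative assumptions.

track_record = {"ana": 0.9, "bo": 0.6}  # e.g. 1 - mean Brier score, in [0, 1]

# Each voter delegates their single vote to a forecaster (possibly themselves).
delegations = {"v1": "ana", "v2": "ana", "v3": "bo", "ana": "ana", "bo": "bo"}

# Forecasters' predicted performance of each candidate, in [0, 1].
predictions = {
    "ana": {"candidate_x": 0.8, "candidate_y": 0.4},
    "bo": {"candidate_x": 0.3, "candidate_y": 0.7},
}

def score_candidates(delegations, predictions, track_record):
    """Weight each forecaster by delegated votes times track record, then
    aggregate their predictions into a score per candidate."""
    weight = {forecaster: 0.0 for forecaster in predictions}
    for _voter, delegate in delegations.items():
        weight[delegate] += track_record[delegate]
    total_weight = sum(weight.values())
    candidates = next(iter(predictions.values())).keys()
    return {
        c: sum(weight[f] * predictions[f][c] for f in predictions) / total_weight
        for c in candidates
    }

print(score_candidates(delegations, predictions, track_record))
# -> candidate_x scores higher, because the better-calibrated forecaster with
#    more delegated weight predicts stronger performance for it.
```

The design choice in this sketch is that a delegation is only worth as much as the delegate's demonstrated forecasting accuracy, which is the "trust-based, meritocratic" part; the real system presumably handles scoring and delegation chains in a more principled way.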
As for the generalised framing in terms of graphs that I'm thinking of: I'm writing a paper setting up the variational mathematics behind it right now. I'm also writing a paper on some more specific simulations of this to run, so I'm very grateful for any thoughts you might have on this setup!