Will the AI alignment Slack continue to run?
Thanks JJ and everyone who has worked on AISS for all your great work!
Peter Singer and Tse Yip Fai were doing some work on animal welfare relating to AI last year: https://link.springer.com/article/10.1007/s43681-022-00187-z It looks like Fai at least is still working in this area. But I'm not sure whether they have considered or initiated outreach to AGI labs; that seems like a great idea.
I place significant weight on the possibility that when labs are in the process of training AGI or near-AGI systems, they will be able to see alignment opportunities that we can't from a more theoretical or distanced POV. In this sense, I'm sympathetic to Anthropic's empirical approach to safety. I also think there are a lot of really smart and creative people working at these labs.
Leading labs also employ some people focused on the worst risks. For misalignment risks, I am most worried about deceptive alignment, and Anthropic recently hired one of the peo...
A key part of my model right now relies on who develops the first AGI and on how many AGIs are developed.
If the first AGI is developed by OpenAI, Google DeepMind or Anthropic - all of whom seem relatively cautious (perhaps some more than others) - I put the chance of massively catastrophic misalignment at <20%.
If one of those labs is first and somehow able to prevent other actors from creating AGI after this, then that leaves my overall massively catastrophic misalignment risk at <20%. However, while I think it's likely one of these labs would be fir...
You're right - I wasn't very happy with my word choice calling Google the 'engine of competition' in this situation. The engine was already in place and involves the various actors working on AGI and the incentives to do so. But these recent developments with Google doubling down on AI to protect their search/ad revenue are revving up that engine.
It's somewhat surprising to me the way this is shaking out. I would expect DeepMind and OpenAI's AGI research to be competing with one another*. But here it looks like Google is the engine of competition, motivated less by any future-focused ideas about AGI and more by the fact that their core search/ad business model appears to be threatened by OpenAI's AGI research.
*And hopefully cooperating with one another too.
I think it's not quite right that low trust is costlier than high trust. Low trust is costly when things are going well. There's kind of a slow burn of additional cost.
But high trust is very costly when bad actors, corruption or mistakes arise that a low trust community would have preempted. So the cost is lumpier, cheap in the good times and expensive in the bad.
(I read fairly quickly so may have missed where you clarified this.)
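The "slow burn versus lumpy" framing above can be put into a toy cost comparison. This is only a sketch: the per-period overhead, the number of periods, and the size of the loss are made-up illustrative numbers, and the function names are hypothetical.

```python
def low_trust_cost(periods: int, overhead: float) -> float:
    """Low trust: a steady overhead is paid in every period."""
    return periods * overhead


def high_trust_cost(shocks: list[bool], loss: float) -> float:
    """High trust: nothing in good periods, a large loss in each bad period."""
    return sum(loss for shock in shocks if shock)


# Assumed numbers: 10 periods, overhead of 1 per period under low trust,
# and a single bad period at the end under high trust.
shocks = [False] * 9 + [True]
low = low_trust_cost(len(shocks), overhead=1.0)   # spread evenly over time
high = high_trust_cost(shocks, loss=8.0)          # concentrated in one period
```

With these assumed numbers, high trust is cheaper in total (8 versus 10), but raising the loss to 15 flips the comparison. Either way the high-trust cost arrives all at once, which is the "lumpier" part of the argument.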
This is quite interesting and reminds me, as a former hedge fund manager, of a short option position: you earn time decay or option premium when things are going well or stable, and then once in a while you take a big hit (and a lot of people/orgs do not survive the hit). This is not a strategy I follow from a risk-adjusted return point of view over the longer term. I would rather not be short a put option but instead be long a call option and try to minimise my time decay or option premium. The latter is more work and time consuming but I ha...
To re-frame this:
High-trust assumes both good motivations and competence. High trust is nice because it makes things go smoother. But if there are any badly motivated or incompetent actors, insisting on high trust creates conditions for repeated devastating impacts. To further insist on high trust after significant shocks means people who no longer trust good motivations and competence leave.
FTX was a high-trust/bad act...
I think of it in terms of "Justified trust". What we want is a high degree of justified trust.
If a group shouldn't be trusted, but is, then that would be unjustified trust.
We want to maximize justified trust and minimize unjustified trust.
If trust isn't justified, then you would want correspondingly low levels of trust.
Generally, [unjustified trust] < [low-trust, not justified] < [justified trust]
No, I didn't talk about this. I agree that you can frame low-trust as a trade where you exchange lower catastrophe risk for higher ongoing costs.
Through that lens a decent summary of my argument is:
If anyone consults a lawyer about this or starts the process with FTXrepay@ftx.us, it could be very useful to many of us if you followed up here and shared what your experience of the process was like.
I'm a long-time fan of renting over buying. I've been happily renting apartments since I started living on my own around 2006. I've never owned a place and don't have any wishes or plans to. I skimmed the John Halstead post you linked to - a lot of his overall points have been motivations for me as well.
Last time I really looked into this (it's been a few years), the price-to-rent ratio varied a lot depending on the kind of place you live in. Generally if you lived in a major city, the ratio greatly favored renters. But in some lower-populated / suburban ...
I'm surprised you were putting such high odds on it being a mistake at this point (even before the arrest). From my understanding (all public info), FTX's terms of service stated that they would not touch customer funds. But then FTX loaned those funds to Alameda, who made risky bets with them.
IANAL but this seems to me like a pretty clear case of fraud from FTX. I didn't think any of those aspects of the story were really disputed, but I have not been following the story as closely in the past week or so.
Likewise. At some point I had looked into using FTX for crypto. I didn't because it raised several red flags for me, but let me share my recollections about the historical ToS, fwiw:
Wayne, I see in another post you aren't sure what about its terms of service was violated. I can't pull up the historical ToS, and I'm not sure if they changed, but back when I looked into using it, I specifically looked for language that they were keeping the funds safe and not lending them out and I found that language. If the language somehow wasn't completely...
Will all the results of the survey be shared publicly on the EA Forum? I couldn't find any mention of this in the couple of announcements I've seen for this survey.
It looks like at least some of the 2020 survey results were shared publicly. [1, 2, 3] But I can't find 2021 survey results. (Maybe there was no 2021 EA Survey?)
Thanks for the link and highlights!
...Sam claims that he donated to Republicans: "I donated to both parties. I donated about the same amount to both parties (...) That was not generally known (...) All my Republican donations were dark (...) and the reason was not for regulatory reasons - it's just that reporters freak the fuck out if you donate to Republicans [inaudible] they're all liberal, and I didn't want to have that fight". If true, this seems to fit the notion that Sam didn't just donate to look good (i.e. he donated at least partly because of his per
I read somewhere (sorry, can’t remember where) that he only donated to Republicans who were pushing longtermist things like pandemic preparedness.
I am a big fan of gratitude practice. I try to write a little in a gratitude journal most nights, which has helped my overall state of mind since I started doing it. I would recommend anybody to try it, including people involved in EA. And I'm glad you suggested it, as a little gratitude during a crisis like this can be especially helpful.
I have some reservations about posting things I'm grateful for publicly on this forum though. Gratitude can be a bit vulnerable, and this forum has more eyes on it than usual lately. Posting to a community about why you'r...
Ultimately this was a failure of the EA ideas more so than the EA community. SBF used EA ideas as a justification for his actions. Very few EAs would condone his amoral stance w.r.t. business ethics, but business ethics isn't really a central part of EA ideas. Ultimately, I think the main failure was EAs failing to adequately condemn naive utilitarianism.
So I disagree with this because:
Thanks for clarifying. That helps me understand your concern about the unilateralist's curse with funders acting independently. But I don't understand why the OP proposal of evaluating/encouraging funding diversification for important cause areas would exacerbate it. Presumably those funders could make risky bets regardless of this evaluation. Is it because you think it would bring a lot more funders into these areas or give them more permission to fund projects that they are currently ignoring?
Was it this post by chance? https://forum.effectivealtruism.org/posts/AbohvyvtF6P7cXBgy/brainstorming-ways-to-make-ea-safer-and-more-inclusive This one seems to be on a very similar topic. It has a different name, so it's probably not the same one, though it's possible Richard revised the title at some point.
Thanks for explaining, but who are you considering to be the "regulator" who is "captured" in this story? I guess you are thinking of either OpenPhil or OpenAI's board as the "regulator" of OpenAI. I've always heard the term "regulatory capture" in the context of companies capturing government regulators, but I guess it makes sense that it could be applied to other kinds of overseers of a company, such as its board or funder.
I've also been very upset since the FTX scandal began, and I love this community too. I think you're right that EA will lose some people. But I am not so worried the community will collapse (although it's possible that ending the global EA community could be a good thing). People's memories are short, and all things pass. In one year, I would be willing to bet you there will still be lots of (and still not enough!) good people working on and donating to important, tractable, and neglected causes. There will still be an EA Forum with lively debates ha...
Can you clarify which "public hearings" were demanded? Not sure if you're talking about how quickly the bankruptcy process has been moving at FTX, or about the reactions from people on the EA Forum since the news about FTX broke.
I followed this link, but I don't understand what it has to do with regulatory capture. The linked thread seems to be about nepotistic hiring and conflicts of interest at/around OpenAI.
OpenPhil recommended a $30M grant to OpenAI in a deal that involved the OP (then-CEO of OpenPhil) becoming a board member of OpenAI. This occurred no later than March 2017. Later, OpenAI appointed both the OP's then-fiancée and the fiancée’s sibling to VP positions. See these two LinkedIn profiles and the "Relationship disclosures" section in this OpenPhil writeup.
It seems plausible that there was a causal link between the $30M grant and the appointment of the fiancée and her sibling to VP positions. OpenAI may have made these appointments while hoping to ...
I have thoughts on other points you made but just wanted to comment on this one bit for the moment:
I think it is very strange that SBF had so many inconsistencies between his claimed moral positions and his behavior and nobody noticed it.
Habryka noticed. His full comment is at that link but here are some key excerpts (emphasis mine):
...Like, to be clear, I think the vast majority of EAs had little they could have or should have done here. But I think that I, and a bunch of people in the EA leadership, had the ability to actually do something about this.
I sent
Thanks for the quality summary. I finally opened this post after a couple days of ignoring it in my doomscrolling, because I thought there would be nothing new in it vs. other recent posts on FTX. But I found this critique actually gave me some new things to think about.
I think it's a good question, but I'm unsure whether a public discussion calling out names is the right way to go. And I think it might be net negative.
On the one hand, publicly calling people out could raise red flags about potential bad actors in the EA community early and in a transparent way. On the other hand, it could lead to witch hunts, false positives, etc. You can also imagine that whatever bad actors are in the community would start retaliating and listing good actors here who are their enemies, so it could cause a lot of confusion and infightin...
Commending Habryka for being willing to share about these things. It takes courage and I think reflections/discussions like this could be really valuable (perhaps essential) to the EA community having the kind of reckoning about FTX that we need.
Maybe, but not so clear. He may have cared about consequentialist utilitarianism but just faked the deontology part.
There was a post Linch did earlier this year about their experience as a junior grantmaker in EA. It had an interesting part about conflicts of interest:
...4. Conflicts of Interests are unavoidable if you want to maximize impact as a grantmaker in EA.
a. In tension with the above point, the EA community is just really small, and the subcommunities of fields within it (AI safety, or forecasting, or local community building) are even smaller.
b. Having a CoI often correlates strongly with having enough local knowledge to make an inform
That's true, but hiding karma is important both for reducing bias (e.g. people being inclined to upvote something because it's already upvoted) and for mental health. So it would be a great thing to have as a built-in feature (and perhaps even the default) so that it's accessible for many more users.
Karma scores are still important for sorting content by quality/popularity when they're hidden; it's just more of a behind-the-scenes thing.
(Thanks for all your great work on these forums.)
I personally think that voting towards the karma score you think a comment should have, instead of voting independently, produces a better aggregate judgement (i.e. I downvote 200 karma posts if I think they should only have 150 karma, and I upvote comments at -5 if I think they should only be at -2). This of course requires seeing the current vote count, and so I am a bit hesitant to have too large a share of the population do a different thing instead (and to have a UI that makes it feel like people are supposed to vote independently).
That said, we don't have any super clear guidance on this topic, and there is a decent amount of disagreement in the user base on what style of voting (independent or relative) is better.
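The mechanical difference between the two voting styles can be illustrated with a toy model. This is a minimal sketch under strong assumptions: every vote is worth exactly one point (real forum vote weights vary), voters act one after another, and each voter has a single "target" score in mind. The function names are hypothetical, and the sketch says nothing about which style produces better judgements.

```python
def relative_voting(targets: list[int], start: int = 0) -> int:
    """Each voter, in turn, nudges the visible score one point toward their target."""
    score = start
    for target in targets:
        if score < target:
            score += 1
        elif score > target:
            score -= 1
        # A voter who sees their target score already reached abstains.
    return score


def independent_voting(targets: list[int], start: int = 0) -> int:
    """Each voter ignores the current score and votes on the sign of their target."""
    score = start
    for target in targets:
        if target > 0:
            score += 1
        elif target < 0:
            score -= 1
    return score


# Six voters who all think the comment deserves a score of about 3.
targets = [3, 3, 3, 3, 3, 3]
relative_score = relative_voting(targets)
independent_score = independent_voting(targets)
```

In this toy case the relative voters stop at 3, while the independent voters push the score to 6, since none of them can see that the score they had in mind was already reached.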
LessWrong has no such feature built-in that I'm aware of. Though there are external tools people have created to hide karma.
The extension for blinding karma and author names has been a game changer for me. Massively improves my forum experience. Strong upvote, it'd be great to have these as native features so that they are much more accessible and others can enjoy the debiasing and mental health benefits.
I tend to prefer blinding karma instead of the author name. But they're both useful at different times. I think adding both and making them independently controllable would be a huge step forward. Then the community can experiment with favorite configurations for different contexts....
Strongly agree about the vote counts; I've been using a browser plugin to hide them for a couple of months now. I think it should be a forum option and probably the default.
Stuart Russell is probably the most prominent example.
I think Dan Hendrycks is doing good work in this area too, as are a bunch of people on the AI alignment team at DeepMind.
But yea, it'd be great if a lot more ML researchers/engineers engaged with the AI x-risk arguments and alignment research.
The U.S. National Institute of Standards and Technology (“NIST”) is developing an AI Risk Management Framework.
Just a sidenote for anyone interested in this. There is an existing effort from some folks in the AI safety community to influence the development of this framework in a positive direction. See Actionable Guidance for High-Consequence AI Risk Management (Barrett et al. 2022).
I see what you mean, but if you value cause prioritization seriously enough, it is really stifling to have literally no place to discuss x-risks in detail. Carefully managed private spaces are the best compromise I've seen so far, but if there's something better then I'd be really glad to learn about it.
That is high-value work. Holden Karnofsky's list of "important, actionable research questions" about AI alignment and strategy includes one about figuring out what should be done in deployment of advanced AI and leading up to it (1):
...How do we hope an AI lab - or government - would handle various hypothetical situations in which they are nearing the development of transformative AI, and what does that mean for what they should be doing today?
Luke Muehlhauser and I sometimes refer to this general sort of question as the “AI deployment problem”: the questio
You could avoid such infohazards by drawing up the scenarios in a private message or private doc that's only shared with select people.
I love the idea of reading silently together, but I don't think it would have occurred to me without this post.
In particular, the analogy with alchemy seems apropos given that concepts like sentience are very ill posed.
I took another look at that section, interesting to learn more about the alchemists.
I think most AI alignment researchers consider 'sentience' to be unimportant for questions of AI existential risk - it doesn't turn out to matter whether or not an AI is conscious or has qualia or anything like that. [1] What matters a lot more is whether AI can model the world and gain advanced capabilities, and AI systems today are making pretty quick progress along...
That's a very good point.
With the assumption of longtermist ethics which I mentioned in the post, I think the difference in likelihoods has to be very large to make a difference though. Because placing equal value on future human lives to present ones makes extinction risks astronomically worse than catastrophic non-extinction risks.
(I don't 100% subscribe to longtermist ethics, but that was the frame I was taking for this post.)
You may have better luck getting responses to this posting on LessWrong with the 'AI' and 'AI Governance' (https://www.lesswrong.com/tag/ai-governance) tags, and/or on the AI Alignment Slack.
I skimmed the article. IMO it looks like a piece from circa 2015 dismissive of AI risk concerns. I don't have time right now to go through each argument, but it looks pretty easily refutable esp. with all that we've continued to learn about AI risk and the alignment problem in the past 8 years.
Was there a part of that link you found particularly compelling?
Agreed. The trend of writing "Epistemic status" as one of the first things in a post without a definition or explanation (kudos to Lizka for including one) has bothered me for some time. It immediately and unnecessarily alienates readers by making them feel like they need to be familiar with the esoteric word "epistemic", which usually has nothing to do with the rest of the post.
Would be happy to see this frequent jargon replaced with something like "How much you should trust me", "Author confidence" or "Post status" (maybe there's a better phrase, just some examples that come to mind).
Welcome to the field! Wow, I can imagine this post would be an intense crash course! :-o
There are some people who spend time on these questions. It's not something I've spent a ton of time on, but I think you'll find interesting posts related to this on LessWrong and AI Alignment Forum, e.g. using the value learning tag. Posts discussing 'ambitious value learning' and 'Coherent Extrapolated Volition' should be pretty directly related to your two questions.
Really interesting observations.
I would say the conversion rate is actually shockingly low. Maybe CEA has more information on this, but I would be surprised if more than 5% of people who do Introductory EA fellowships make a high impact career change.
Do you have any sense of how many of those people are earning to give or end up making donations to effective causes a significant part of their lives? I wonder if 5% is at least a little pessimistic for the "retention" of effective altruists if it's not accounting for people who take this path to making an impact.
My first 2 posts for this project went live on the Alignment Forum today:
1. Introduction to the sequence: Interpretability Research for the Most Important Century
2. (main post) Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios
I learned a lot from reading this post and some of the top comments, thanks for the useful analysis.
Throughout the post and comments people are tending to classify AI safety as a "longtermist" cause. This isn't wrong, but for anyone less familiar with the topic, I just want to point out that there are many of us who work in the field and consider AI to be a near-to-medium term existential risk.
Just in case "longtermism" gave anyone the wrong impression that AI x-risk is something we definitely won't be confronted with for 100+ years. Many of us think it wi...
Open Phil claims that campaigns to make more Americans go vegan and vegetarian haven't been very successful. But does this analysis account for immigration?
If people who already live in the US are shifting their diets, but new immigrants skew omnivore, a simple analysis could easily miss the former shift because immigration is fairly large in the US.
Source of Open Phil claim at https://www.openphilanthropy.org/research/how-can-we-reduce-demand-for-meat/ :
...
Although the cited Gallup report doesn't explicitly distinguish on immigrant status or ethnicity, it does say that "[a]lmost all segments of the U.S. population have similar percentages of vegetarians" while noting a larger difference in marital status.
Even if one assumes that almost no immigrants are vegetarian, the rate of immigration isn't so high as to really move a low percentage very much. As of 2018, there were ~45M people in the US who were born in another country. [https://www.pewresearch.org/short-reads/2020/08/20/key-findings-about-u-s-immigrant...
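The size of this dilution effect can be checked with some quick arithmetic. The sketch below is illustrative only: the ~45M foreign-born figure comes from the comment above, while the total US population (roughly 327M in 2018) and the 5% overall vegetarian rate are assumptions chosen for the example, and the function name is hypothetical.

```python
def native_born_rate(
    overall_rate: float,
    total_pop: float,
    foreign_born: float,
    foreign_born_rate: float = 0.0,
) -> float:
    """Back out the native-born vegetarian rate implied by the overall rate."""
    native_pop = total_pop - foreign_born
    vegetarians_total = overall_rate * total_pop
    vegetarians_foreign = foreign_born_rate * foreign_born
    return (vegetarians_total - vegetarians_foreign) / native_pop


TOTAL_POP = 327e6      # assumption: approximate US population in 2018
FOREIGN_BORN = 45e6    # from the comment above
OVERALL_RATE = 0.05    # assumption: illustrative overall vegetarian share

# Extreme case: no foreign-born residents are vegetarian at all.
rate = native_born_rate(OVERALL_RATE, TOTAL_POP, FOREIGN_BORN)
```

Under these assumptions, even the extreme case of zero vegetarian immigrants implies a native-born rate of only about 5.8% against an overall 5%, so immigration alone could hide a shift of less than one percentage point.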