All posts


Monday, 20 May 2024

Frontpage Posts

adekcz · 3m read

Quick takes

Linch
Do we know if @Paul_Christiano or other ex-lab people working on AI policy have non-disparagement agreements with OpenAI or other AI companies? I know Cullen doesn't, but I don't know about anybody else. I know NIST isn't a regulatory body, but it still seems like standards-setting should be done by people who have no unusual legal obligations. And of course, some other people are or will be working at regulatory bodies, which may have more teeth in the future.

To be clear, I want to differentiate between non-disclosure agreements, which are perfectly sane and reasonable in at least a limited form as a way to prevent leaking trade secrets, and non-disparagement agreements, which prevent you from saying bad things about past employers. The latter seems clearly bad to have for anybody in a position to affect policy. Doubly so if the existence of the non-disparagement agreement is itself secret.
Draft guidelines for new topic tags (feedback welcome)

Topics (AKA wiki pages[1] or tags[2]) are used to organise Forum posts into useful groupings. They can be used to give readers context on a debate that happens only intermittently (see Time of Perils), collect news and events which might interest people in a certain region (see Greater New York City Area), collect the posts by an organisation, or, perhaps most importantly, collect all the posts on a particular subject (see Prediction Markets).

Any user can submit and begin using a topic. They can do this most easily by clicking "Add topic" on the topic line at the top of any post. However, before being permanently added to our list of topics, all topics are vetted by the Forum facilitation team. This quick take outlines some requirements and suggestions for new topics to make that process more transparent. Similar, more polished advice will soon be available on the 'add topic' page. Please give feedback if you disagree with any of these requirements.

When you add a new topic, ensure that:

1. The topic, or a very similar topic, does not already exist. If a very similar topic already exists, consider adding detail to that topic's wiki page rather than creating a new topic.
2. You have used your topic to tag at least three posts by different authors (not including yourself). You will have to do this after creating the topic. The topic must describe a central theme in each post. If you cannot yet tag three relevant posts, the Forum probably doesn't need this topic yet.
3. You've added at least a couple of sentences to define the term and explain how the topic tag should be used.

Not fulfilling these requirements is the most likely cause of a topic rejection. In particular, many topics are written with the aim of establishing a new term or idea, rather than collecting terms and ideas which already exist on the Forum. Other examples of rejected topics include:

* Topic pages created for an individual. In certain cases, we permit these tags, for example, if the person is associated with a philosophy or set of ideas that is often discussed (see Peter Singer) and which can be clearly picked out by their name. However, in most cases, we don't want tags for individuals because there would be far too many, and posts about individuals can generally be found through search without using tags.
* Topics which are applicable to posts on the EA Forum, but which aren't used by Forum users. For example, many posts could technically be described as "Risk Management". However, EA Forum users use other terms to refer to risk management content.

1. ^ Technically there can be a wiki page without a topic tag, i.e. a wiki page that cannot be applied to a post. However, we don't really use these, so in practice the terms are interchangeable.
2. ^ This term is used more informally: it is easier to say "I'm tagging this post" than "I'm topic-ing this post".
I spent way too much time organizing my thoughts on AI loss-of-control ("x-risk") debates without any feedback today, so I'm publishing perhaps one of my favorite snippets/threads:

A lot of debates seem to boil down to under-acknowledged and poorly framed disagreements about questions like "who bears the burden of proof?" For example, some skeptics say "extraordinary claims require extraordinary evidence" when dismissing claims that the risk is merely "above 1%", whereas safetyists argue that having >99% confidence that things won't go wrong is the "extraordinary claim that requires extraordinary evidence."

I think that talking about "burdens" might be unproductive. Instead, it may be better to frame the question as "what should we assume by default, in the absence of definitive 'evidence' or arguments, and why?" "Burden" language is super fuzzy (and seems a bit morally charged), whereas this framing at least forces people to acknowledge that some default assumptions are being made and to consider why.

To address that framing, I think it's better to ask and answer questions like "What reference class does 'building AGI' belong to, and what are the base rates of danger for that reference class?" This framing pushes people to make explicit claims about which reference class building AGI belongs to, which should make it clearer that it doesn't belong in your "all technologies ever" reference class.

In my view, the "default" estimate should not be "roughly zero until proven otherwise," especially given that there isn't consensus among experts, and given the overarching narrative that intelligence proved really powerful in humans, that misalignment even among humans is quite common (and is already often observed in existing models), and that we often don't get technologies right on the first few tries.
Working questions

A mental technique I've been starting to use recently: "working questions." When tackling a fuzzy concept, I've heard of people using "working definitions" and "working hypotheses." Those terms help you move forward on understanding a problem without locking yourself into a frame, allowing you to focus on other parts of your investigation.

Often, it seems to me, I know I want to investigate a problem without being quite clear on what exactly I want to investigate, even though the exact question I want to answer matters a lot. Instead of needing to be precise about the question from the beginning, I've found it helpful to start with a "working question" that I then refine into a more precise question as I move forward.

An example: "something about the EA Forum's brand/reputation" -> "What do potential writers think about the costs and benefits of posting on the Forum?" -> "Do writers think they will reach a substantial fraction of the people they want to reach if they post on the EA Forum?"
I find it encouraging that EAs have quickly pivoted to viewing AI companies as adversaries, after a long period of uneasily viewing them as necessary allies (cf. Why Not Slow AI Progress?). Previously, I worried that social/professional entanglements and image concerns would lead EAs to align with AI companies even after receiving clear signals that those companies are not interested in safety. I'm glad to have been wrong about that. Caveat: we've only seen this kind of scrutiny applied to OpenAI, and it remains to be seen whether Anthropic and DeepMind will get the same scrutiny.