(Post 3/N with some rough notes on AI governance field-building strategy.
Posting here for ease of future reference, and in case anyone else thinking
about similar stuff finds this helpful.)
SOME HOT TAKES ON AI GOVERNANCE FIELD-BUILDING STRATEGY
* More people should consciously upskill as ‘founders’, i.e. people who form
and lead new teams/centres/etc. focused on making AI go well
* A case for more founders: plausibly in crunch time there will be many more
people/teams within labs/govs/think-tanks/etc. that will matter for how AI
goes. Would be good if those teams were staffed with thoughtful and
risk-conscious people.
* What I think is required to be a successful founder:
* Strong in strategy (to steer their team in useful directions), management
(for obvious reasons) and whatever object level work their team is doing
* Especially for teams within existing institutions, starting a new team
requires skill in stakeholder management and consensus building.
* Concrete thing you might consider doing: if you think you might want to be
a founder, and you agree with the above list of skills, think about how to
close your skill gaps
* More people should consciously upskill for the “AI endgame” (aka “acute risk
period” aka “crunch time”). What might be different in the endgame and what
does this imply about what people should do now?
* Lots of ‘task force-style advising’ work
* → people should practise it now
* Everyone will be very busy, especially senior people, so it won’t work as
well to just defer
* → build your own models
* More possible to mess things up real bad
* → start thinking harder about worst-case scenarios, red-teaming, etc.
now, even if it seems a bit silly to e.g. spend time tightening up your
personal infosec
* The world may well be changing scarily fast
* → practice decision-making under pressure and uncertainty. Strategy might
(Post 4/N with some rough notes on AI governance field-building strategy.
Posting here for ease of future reference, and in case anyone else thinking
about similar stuff finds this helpful.)
SOME EXERCISES FOR DEVELOPING GOOD JUDGEMENT
I’ve spent a bit of time over the last year trying to form better judgement.
Dumping some notes here on things I tried or considered trying, for future
reference.
* Jump into the mindset of “the buck stops at me” for working out whether some
project takes place, as if you were the grantmaker having to make the
decision. Ask yourself: “wait, should this actually happen?”[1]
* (Rather than “does anything jump out as incorrect” or “do I have any random
comments/ideas”—which are often helpful mindsets to be in when giving
feedback to people, but don’t really train the core skill of good
judgement.)
* I think forecasting trains a similar skill to this. I got some value from
making some forecasts in the Metaculus Beginners’ Tournament.
* Find Google Docs where people (whose judgement you respect) have left
comments and an overall take on the promisingness of the idea. Hide their
comments and form your own take. Compare. (To make this a faster process,
pick a doc/idea where you have enough background knowledge to answer without
looking up loads of things).
* Ask people/orgs for things along the lines of [minimal trust investigations |
grant reports | etc.] that they’ve written up. Do it yourself. Compare.
* Do any of the above with a friend; write your timeboxed answers then compare
reasoning.
1. ^
I think this framing of the exercise might have been mentioned to me by
Michael Aird.
A small exercise to inspire empathy/gratitude for people who grew up with access
to healthcare:
If you'd lived 150 years ago, what might you have died of as a child?
I got pneumonia when I was four and it probably would have killed me without
modern medicine.
(Post 6/N with some rough notes on AI governance field-building strategy.
Posting here for ease of future reference, and in case anyone else thinking
about similar stuff finds this helpful.)
SOME HEURISTICS FOR PRIORITISING BETWEEN TALENT PIPELINE INTERVENTIONS
Explicit backchaining is one way to do prioritisation. I sometimes forget that
there are other useful heuristics, like:
* Cheap to pilot
* E.g. doesn't require new infrastructure or making a new hire
* Cost is easier to estimate than benefit, so lower cost things tend to be more
likely to actually happen
* Visualise that some person or org has actually been convinced to trial the thing.
Imagine the conversation with that decision-maker. What considerations
actually matter for them?
* Is there someone else who would do most of the heavy lifting?
(Post 2/N with some rough notes on AI governance field-building strategy.
Posting here for ease of future reference, and in case anyone else thinking
about similar stuff finds this helpful.)
MISC THINGS IT SEEMS USEFUL TO DO/FIND OUT
* To inform talent development activities: talk with relevant people who have
skilled up. How did they do it? What could be replicated via talent pipeline
infrastructure? Generally talk through their experience.
* Kinds of people to prioritise: those who are doing exceptionally well;
those who have grown quite recently (might have better memory of what they
did)
* To inform talent search activities: talk with relevant people—especially
senior folks—about what got them involved. This could feed into earlier stage
talent pipeline activities
* Case studies of important AI governance ideas (e.g. model evals, importance
of infosec) and/or pipeline wins. How did they come about? What could be
replicated?
* How much excess demand is there for fellowship programs? Look into the
strength of applications over time. This would inform how much value there is
in scaling fellowships.
* Figure out whether there is a mentorship bottleneck.
* More concretely: would it be overall better if some of the more established
AI governance folk spent a few more hours per month on mentorship?
* Thing to do: very short survey asking established AI governance people how
many hours per month they spend on mentorship.
* Benefits of mentorship:
* For the mentee: fairly light touch involvement can go a long way towards
bringing them up to speed and giving them encouragement.
* For the mentor: learn about fit for mentorship/management. Can be helpful
for making object-level progress on work.
* These benefits are often illegible and delayed in time, so a priori likely
to be undersupplied.
* If there’s a mentorship bottleneck, it might be important to solve ~now.
The nu
Reflecting on the question of CEA's mandate, I think it's challenging that CEA
has always tried to be both:
1) a community org
2) a talent recruitment org
This has not worked out well.
When you're 1) you need to think about the individual's journey in the movement.
You invest in things like community health and universal groups support. It's
important to have strong lines of communication and accountability to the
community members you serve. You think about the individual's journey
[https://forum.effectivealtruism.org/posts/PbtXD76m7axMd6QST/the-funnel-or-the-individual-two-approaches-to-understanding#An_individual_approach]
and how to help address those issues. (Think your local Y, community center or
church)
When you're 2) you care about finding and supporting only the top talent (and by
extension actors that aid you in this mission). You care about having a healthy
funnel
[https://forum.effectivealtruism.org/posts/PbtXD76m7axMd6QST/the-funnel-or-the-individual-two-approaches-to-understanding#The_Funnel_Model]
of individuals who are at the top of their game. You care about fostering an
environment that is attractive (potentially elite), prestigious and high status.
(Think Y Combinator, Fulbright or Emergent Ventures Fellows).
I think these goals are often overlapping and self-reinforcing, but also at odds
with each other.
It is really hard to thread that needle well - it requires a lot of nuanced,
high-fidelity communication - which in turn requires a lot of capacity
(something historically in short supply in this movement).
I don't think this is a novel observation, but I can't remember seeing it
explicitly stated in conversation recently.
I didn't learn about Stanislav Petrov until I saw announcements about Petrov Day
a few years ago on the EA Forum. My initial thought was "what is so special
about Stanislav Petrov? Why not celebrate Vasily Arkhipov?"
I had known about Vasily Arkhipov for years, but the reality is that I don't
think one of them is more worthy of respect or idolization than the other. My
point here is more about something like founder effects, path dependency, and
cultural norms. You see, at some point someone in EA (I'm guessing) arbitrarily
decided that Stanislav Petrov was more worth knowing and celebrating than Vasily
Arkhipov, and now knowledge of Stanislav Petrov is widespread (within this very
narrow community). But that seems pretty arbitrary. There are other things like
this, right? Things that people hold dear or believe that are little more than
cultural norms, passed on because "that is the way we do things here."
I think a lot about culture and norms, probably as a result of studying other
cultures and then living in other countries (non-anglophone countries) for most
of my adult life. I'm wondering what other things exist in EA that are like
Stanislav Petrov: things that we do for no good reason other than that other
people do them.
Rational Animations has a subreddit:
https://www.reddit.com/r/RationalAnimations/
I hadn't advertised it until now because I had to find someone to help moderate
it.
I want people here to be among the first to join since I expect having EA Forum
users early on would help foster a good epistemic culture.
Suing people nearly always makes you look like the assholes, I think.
As for Torres, it is fine for people to push back against specific false things
they say. But fundamentally, even once you get past the misrepresentations,
there is a bunch of stuff that they highlight that various prominent EAs really
do believe and say that genuinely does seem outrageous or scary to most people,
and no amount of pushback is likely to persuade most of those people otherwise.
In some cases, I think that outrage fairly clearly isn't justified once you
think things through carefully: for example, the quote from Nick Beckstead
about saving lives being, all things equal, higher value in rich countries
because of flow-through effects, a quote Torres always says makes Beckstead a
white supremacist. But in other cases, well, it's hardly news that
utilitarianism has a bunch of implications that strongly contradict moral
commonsense, or that EAs are sympathetic to utilitarianism. And 'oh, but I don't
endorse [outrageous sounding view], I merely think there is like a 60% chance it
is true, and you should be careful about moral uncertainty' does not sound very
reassuring to a normal outside person.
For example, take Will on double-or-nothing gambles
(https://conversationswithtyler.com/episodes/william-macaskill/) where you do
something that has a 51% chance of destroying everyone, and a 49% chance of
doubling the number of humans in existence (now and in the future). It's a
little hard to make out exactly what Will's overall position on this is, but he
does say it is hard to justify not taking those gambles:
'Then, in this case, it’s not an example of very low probabilities, very large
amounts of value. Then your view would have to argue that, “Well, the future, as
it is, is like close to the upper bound of value,” in order to make sense of the
idea that you shouldn’t flip 50/50. I think, actually, that position would be
pretty hard to defend, is my guess. My thought is that,
Note: this sounds like it was written by chatGPT because it basically
[https://audiopen.ai] was (from a recorded ramble)🤷
I believe the Forum could benefit from a Shorterform page, as the current
Shortform forum, intended to be a more casual and relaxed alternative to main
posts, still seems to maintain high standards. This is likely due to the
impressive competence of contributors who often submit detailed and
well-thought-out content. While some entries are just a few well-written
sentences, others resemble blog posts in length and depth.
As such, I find myself hesitant to adhere to the default filler text in the
submission editor when visiting this page. However, if it were more informal and
less intimidating in nature, I'd be inclined to post about various topics that
might otherwise seem out of place. To clarify, I'm not suggesting we resort to
jokes or low-quality "shitposts," but rather encourage genuine sharing of
thoughts without excessive analysis.
Perhaps adopting an amusing name like "EA Shorterform" would help create a more
laid-back atmosphere for users seeking lighter discussions. By doing so, we may
initiate a preference falsification cascade where everyone feels comfortable
enough admitting their desire for occasional brevity within conversations. Who
knows? Maybe I'll start with posting just one sentence soon!
The Met (a major art museum in NYC) is returning $550K in FTX-linked donations;
article below includes link to the court filing. 100% return, donations were
outside of 90 days. This is the first court filing of this nature I'm aware of,
although I haven't been watching comprehensively.
A smart move for the Met, I think. I doubt it had any viable defenses, it
clearly has $550K to return without causing any hardship, that's enough money
for the FTX estate to litigate over, and it avoids bad PR by agreeing to turn
100% of the money over without litigation. Perhaps it could have negotiated a
small discount, but saving $50K or whatever just wouldn't have been worth it in
light of PR/optics concerns. (Plus, I think the Met was very likely obliged to
return the whole $550K from an ethical perspective anyway . . . . { edit:
perhaps with a small deduction for its legal expenses })
https://www.coindesk.com/policy/2023/06/05/new-yorks-met-museum-agrees-to-return-550k-in-ftx-donations/
I wrote up my nutrition notes here, from my first year of being vegan:
http://www.lincolnquirk.com/2023/06/02/vegan_nutrition.html
Scattered and rambly note I jotted down in a slack in February 2023, and didn't
really follow up on
--------------------------------------------------------------------------------
thinking of jotting down some notes about "what AI pessimism funding ought to
be", taking into account forecasting and values disagreements. The premises:
* threat models drive research. This is true on LessWrong, where everyone knows
  it and agonizes over "am I splitting my time between hard math/cs and
  forecasting or thinking about theories of change correctly?", and it's true in
  academia, where people half-ass a "practical applications" paragraph in their
  paper.
* people who don't really buy into the threat model they're ostensibly working
on do research poorly
* social pressures like funding and status make it hard to be honest about what
threat models motivate you.
* I don't overrate democracy or fairness as terminal values, and I'm bullish on
  a lot of deference and technocracy (whatever that means), but I may be feeling
  some virtue-ethicsy attraction toward "people feeling basically represented
  by the governance bodies that represent them", which I think is tactically
  useful for researchers because of the above point about research outputs being
  more useful when the motivation is clearheaded and honest.
* fact-value orthogonality, additionally the binary is good and we don't need a
secret third thing if we confront uncertainty well enough
The problems I want to solve:
* thinking about inclusion and exclusion (into "colleagueness" or stuff that
  funders care about like "who do I fund") is fogged by tribal conflict where
  people pathologize each other (salient in "AI ethics vs. AI alignment";
  Twitter is the mindkiller but occasionally I'll visit, and I always feel like
  it makes me think less clearly)
* no actual set of standards for disagreement to take place in; instead we have
  wishy-washy stuff like "the purple hats undervalue standpoint
I've generally been quite optimistic that the increased awareness AI xRisk has
received recently can lead to some actual progress in reducing the risks and
harms from AI. However, I've become increasingly sad at the ongoing rivalry
between the AI 'Safety' and 'Ethics' camps[1] 😔 Since the CAIS Letter was
released, there seems to have been an increasing level of hostility on Twitter
between the two camps, though my impression is that the hostility is mainly
one-directional.[2]
I dearly hope that a coalition of some form can be built here
[https://forum.effectivealtruism.org/posts/Q4rg6vwbtPxXW6ECj/we-are-fighting-a-shared-battle-a-call-for-a-different#Why_this_is_a_shared_battle],
even if it is an uneasy one, but I fear that it might not be possible. It
unfortunately seems like a textbook case of mistake vs conflict theory
[https://slatestarcodex.com/2018/01/24/conflict-vs-mistake/] approaches at work?
I'd love someone to change my mind, and say that Twitter amplifies the loudest
voices,[3] and that in the background people are making attempts to build
bridges. But I fear instead that the centre cannot hold, and that there
will be not just simmering resentment but open hostility between the two camps.
If that happens, then I don't think those involved in AI Safety work can afford
to remain passive in response to sustained attack. I think that this has already
damaged the prospects of the movement,[4] and future consequences could be even
worse. If the other player in your game is constantly defecting, it's probably
time to start defecting back.
Can someone please persuade me that my pessimism is unfounded?
1. ^
FWIW I don't like these terms, but people seem to intuitively grok what is
meant by them
2. ^
I'm open to being corrected here, but I feel like those sceptical of the AI
xRisk/AI Safety communities have upped the ante in terms of the amount of
criticism and its vitriol - though I am open to the explanation that I've
be
I mostly haven't been thinking about what the ideal effective altruism community
would look like, because it seems like most of the value of effective altruism
might just be approximated by the impact it has on steering the world towards
better AGI futures. But I think even in worlds where AI risk wasn't a problem,
the effective altruism movement seems lackluster in some ways.
I am thinking especially of the effect that it often has on university students
and younger people. My sense is that EA sometimes influences those people to be
closed-minded or at least doesn't contribute to making them as ambitious or
interested in exploring things outside "conventional EA" as I think would be
ideal. Students who come across EA often become too attached to specific EA
organisations or paths to impact suggested by existing EA institutions.
In an EA community that was more ambitiously impactful, there would be a higher
proportion of folks at least strongly considering doing things like starting
startups that could be really big, traveling to various parts of the world to
form a view about how poverty affects welfare, having long Google Docs with
their current best guesses for how to get rid of factory farming, looking at
non-"EA" sources to figure out what more effective interventions GiveWell might
be missing (perhaps because they're somewhat controversial), doing more
effective science/medical research, writing something on the topic of better
thinking and decision-making that could be as influential as Eliezer's
sequences, expressing curiosity about the question of whether charity is even
the best way to improve human welfare, or trying to fix science.
And a lower proportion of these folks would be applying to jobs on the 80,000
Hours job board or choosing to spend more time within the EA community rather
than interacting with the most ambitious, intelligent, and interesting people
amongst their general peers.
Quick updates:
* Our next critique (on Conjecture) will be published in 2 weeks.
* The critique after that will be on Anthropic. If you'd like to be a reviewer,
or have critiques you'd like to share, please message us or email
anonymouseaomega@gmail.com.
Confusion
I get why I and others give to GiveWell rather than catastrophic risk - sometimes
it's good to know your "impact account" is positive even if all the catastrophic
risk work was useless.
But why do people not give to animal welfare in this case? Seems higher impact?
And if it's just that we prefer humans to animals that seems like something we
should be clear to ourselves about.
Also I don't know if I like my mental model of an "impact account". Seems like
my giving has maybe once again become about me rather than impact.
ht @Aaron Bergman
[https://forum.effectivealtruism.org/users/aaronb50?mention=user] for surfacing
this
I remember being very confused by the idea of an unconference
[https://en.wikipedia.org/wiki/Unconference]. I didn't understand what it was
and why it had a special name distinct from a conference. Once I learned that it
was a conference in which the talks/discussions were planned by participants, I
was a little bit less confused, but I still didn't understand why it had a
special name. To me, that was simply a conference. The conferences and
conventions I had been to had involved participants putting on workshops. It was
only when I realized that many conferences lack participative elements that I
realized my primary experience of conferences was non-representative of
conferences in this particular way.
I had a similar struggle understanding the idea of Software as a Service
[https://en.wikipedia.org/wiki/Software_as_a_service] (SaaS). I had never had
any interactions with old corporate software that required people to come and
install it on your servers. The first time I heard the term SaaS as someone
explained to me what it meant, I was puzzled. "Isn't that all software?" I
thought. "Why call it SaaS instead of simply calling it software?" All of the
software I had experienced and was aware of was in the category of SaaS.
I'm writing this mainly just to put my own thoughts down somewhere, but if
anyone is reading this I'll try to put a "what you can take from this" spin on
it:
1. If your entire experience of X falls within X_type1, and you are barely even
aware of the existence of X_type2, then you will simply think of X_type1 as
X, and you will be perplexed when people call it X_type1.
2. If you are speaking to someone who is confused by X_type1, don't
automatically assume they don't know what X_type1 is. It might be that they
simply don't know why you are using such an odd name for (what they view as
X).
Silly example: Imagine growing up in the USA, never travelling outside of the
USA, and telling people that you speak "American Englis
I vaguely remember reading something about buying property from a longtermist
perspective, but I can't remember the justification against doing it. This is
basically using people's inclination to choose immediate rewards over rewards
that come later in the future. The scenario was (very roughly) something like
this:
This feels like a very naïve question, but if I had enough money to support
myself and I also had excess funds outside of that, why not do something like
this as a step toward building an enormous pool of resources for the future?
Could anyone link me to the original post?
I imagine that it has cost, and still costs, 80k to push for AI safety stuff -
both back when it was weird and now that it seems mainstream.
Like, I think an interesting metric is when people say something which shifts
some kind of group vibe. And sure, catastrophic risk folks are into it, but many
EAs aren't and would have liked a more holistic approach (I guess).
So it seems a notable tradeoff.
Embed Interactive Metaculus Forecasts on Your Website or Blog
Now you can share interactive Metaculus forecasts for all question types,
including Question Groups and Conditional Pairs.
Just click the 'Embed' button at the top of a question page, customize the plot
as needed, and copy the iframe HTML.
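If you're adding the embed to a page you control by hand (rather than pasting
into a CMS), a minimal sketch of what that could look like is below. The src and
dimensions are placeholders: use whatever the Embed dialog on the question page
actually gives you.

```typescript
// Minimal sketch of dropping a Metaculus forecast embed into a page.
// NOTE: the src below is a placeholder; copy the real iframe URL (and any
// styling parameters) from the 'Embed' button on the question page.
const PLACEHOLDER_EMBED_SRC = "https://www.metaculus.com/questions/<question-id>/";

function embedForecast(container: HTMLElement, src: string = PLACEHOLDER_EMBED_SRC): void {
  const iframe = document.createElement("iframe");
  iframe.src = src;            // URL copied from the Embed dialog
  iframe.width = "550";        // adjust to your layout
  iframe.height = "430";
  iframe.style.border = "none";
  container.appendChild(iframe);
}

// Usage (assumes a <div id="forecast"></div> exists on the page):
// embedForecast(document.getElementById("forecast")!);
```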
IMPROVED FORECAST PREVIEWS
Metaculus has also made it easier to share forecast preview images for more
question types, on platforms like Twitter, Substack, Slack, and Facebook.
Just paste the question URL to generate a preview of the forecast plot on any
platform that supports them.
To learn more about embedding forecasts & preview images, click here
[https://www.metaculus.com/questions/17313/flexible-forecast-embedding--sharing/].
On Twitter and elsewhere, I've seen a bunch of people argue that AI company
execs and academics are only talking about AI existential risk because they want
to manufacture concern to increase investments and/or as a distraction away from
near-term risks and/or regulatory capture. This is obviously false.
However, there is a nearby argument that is likely true: that incentives
drive how people talk about AI risk, as well as which specific
regulations or interventions they ask for. This is likely to happen both
explicitly and unconsciously. It's important (as always) to have extremely solid
epistemics, and understand that even apparent allies may have (large) degrees of
self-interest and motivated reasoning.
Safety-washing
[https://forum.effectivealtruism.org/posts/f2qojPr8NaMPo2KJC/beware-safety-washing]
is a significant concern; similar things have happened a bunch in other fields,
it likely has already happened a bunch in AI, and will likely happen again in
the months and years to come, especially if/as policymakers and/or the general
public become increasingly uneasy about AI.
I guess African, Indian and Chinese voices are underrepresented in the AI
Governance discussion. And in the unlikely case we die, we all die, and I think
it's weird that half the people who will die have no one loyal to them in the
discussion.
We want AI that works for everyone, and it seems likely you want people who can
represent the billions who don't currently have a loyal representative.
Social Change Lab [https://www.socialchangelab.org/] is hosting a webinar on
Monday 5th of June around our previous research
[https://forum.effectivealtruism.org/posts/LXj4cs5dLqDHwJynp/radical-tactics-can-increase-support-for-more-moderate]
that radical tactics can increase support for more moderate groups. If you want
to hear more about our research, learn about some slightly updated findings, and
ask questions, now is your time!
It’ll be on June 5th, 6-7pm BST and you can sign up here
[https://www.eventbrite.com/e/the-radical-flank-effect-of-just-stop-oil-tickets-637950255387].
I wish there was a library of sorts for different base models of TAI economic
growth that weren't just some form of the Romer model where TFP goes up because
PASTA automates science.
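(For concreteness, the kind of baseline I mean is roughly a Jones-style
semi-endogenous setup; the notation here is just a sketch, not taken from any
particular paper:)

$$Y = A\,K^{\alpha}L^{1-\alpha}, \qquad \dot{A} = \delta\,L_A^{\lambda}A^{\phi}$$

where the "PASTA automates science" move amounts to swapping the human research
input $L_A$ for something like $L_A + g(\text{AI compute})$, so that compute
growth feeds straight into TFP growth $\dot{A}/A$.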