(comment crossposted from LW)
While the coauthors broadly agree about points listed in the post, I wanted to stick my neck out a bit more and assign some numbers to one of the core points. I think on present margins, voluntary restraint slows down capabilities progress by at most 5% while probably halving safety progress, and this doesn't seem like a good trade. [The numbers seem like they were different in the past, but the counterfactuals here are hard to estimate.] I think if you measure by the number of people involved, the effect of restraint is substa...
Request for advice from animal suffering EAs (and, to a lesser extent, climate EAs?): is there an easy win over getting turkeys from Mary's Turkeys? (Also, how much should I care about getting the heritage variety?)
Background: I routinely cook for myself and my housemates (all of whom are omnivores), and am on a diet that requires animal products for health reasons. Nevertheless, I'd rather impose fewer costs than more costs on others; I stopped eating chicken and chicken eggs in response to this post and recently switched from consuming lots of grass-fini...
So you would have lost 40 hours of productive time? Respectfully: so what? You have sources actively claiming you are about to publish directly false information about them, and asking for time to provide evidence that the information is directly false.
Also, I think it is worth Oli/Ben estimating how many productive hours were lost to the decision to not delay; it would not surprise me if much of the benefit here was illusory.
Hmm, there are a bunch of rhetorical components like "she told me not to talk to Ben about it" that I think almost any reader would interpret as disconfirmation of this being the case.
I think if this is a summary of Kat's experiences with Ben, then that section would be pretty misleading (which is relevant, and not pre-empted by its being a reductio ad absurdum, since the level of misleadingness is meant to parallel the original Nonlinear post).
(I'm Matthew Gray)
Inflection is a late addition to the list, so Matt and I won’t be reviewing their AI Safety Policy here.
My sense from reading Inflection's response now is that they say the right things about red teaming and security and so on, but I am pretty worried about their basic plan / they don't seem to be grappling with the risks specific to their approach at all. Quoting from them in two different sections:
...Inflection’s mission is to build a personal artificial intelligence (AI) for everyone. That means an AI that is a trusted partner: an advisor
I'm thinking about the matching problem of "people with AI safety questions" and "people with AI safety answers". Snoop Dogg hears Geoff Hinton on CNN (or wherever), asks "what the fuck?", and then tries to find someone who can tell him what the fuck.
I think normally people trust their local expertise landscape--if they think the CDC is the authority on masks they adopt the CDC's position, if they think their mom group on Facebook is the authority on masks they adopt the mom group's position--but AI risk is weird because it's mostly unclaimed territory in ...
I think the 'traditional fine dining' experience that comes closest to this is Peking Duck.
Most of my experience has been with either salt-drenched cooked fat or honey-dusted cooked fat; I'll have to try smoking something and then applying honey to the fat cap before I eat it. My experience is that it is really good but also quickly becomes unbalanced / no longer good; some people, on their first bite, already consider it too unbalanced to enjoy. So I do think there's something interesting here where there is a somewhat subtle taste mechanism (not just opt...
...When people make big and persistent mistakes, the usual cause (in my experience) is not something that comes labeled with giant mental “THIS IS A MISTAKE” warning signs when you reflect on it.
Instead, tracing mistakes back to their upstream causes, I think that the cause tends to look like a tiny note of discord that got repeatedly ignored—nothing that mentally feels important or action-relevant, just a nagging feeling that pops up sometimes.
To do better, then, I want to take stock of those subtler upstream causes, and think about the flinch reactions
Good point! Currently, I think the "pry more" lesson is supposed to account for a bunch of this.
Since making this update, I have in fact pried more into friends' lives. In at least one instance I found some stuff that worried me, at which point I was naturally like "hey, this worries me; it pattern-matches to some bad situations I've seen; I feel wary and protective; I request an opportunity to share and/or put you in touch with people who've been through putatively-analogous situations (though I can also stfu if you're sick of hearing people's triggered t...
Can we all just agree that if you’re gonna make some funding decision with horrendous optics, you should be expected to justify the decision with actual numbers and plans?
Justify to who? I would like to have an EA that has some individual initiative, where people can make decisions using their resources to try to seek good outcomes. I agree that when actions have negative externalities, external checks would help. But it's not obvious to me that those external checks weren't passed in this case*, and if you want to propose a specific standard we should try...
From The Snowball, dealing with Warren Buffett's son's stint as a director and PR person for ADM:
The second the FBI agents left, Howie called his father, flailing, saying, I don't know what to do, I don't have the facts, how do I know if these allegations are true? My name is on every press release. How can I be the spokesman for the company worldwide? What should I do, should I resign?
...Buffett refrained from the obvious response, which was that, of his three children, only Howie could have wound up with an FBI agent in his living room after taking his firs
Can you explain the "same upsides" part?
Yeah; by default people have entangled assets which will be put at risk by starting or investing in a new project. Limiting the liability that originates from that project to just the assets held by that project means that investors and founders can do things that seem to have positive return on their own, rather than 'positive return given that you're putting all of your other assets at stake.'
[Like I agree that there's issues where the social benefit of actions and the private benefits of actions don't line up, and...
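To make that asymmetry concrete, here's a toy expected-value sketch; all of the numbers are invented for illustration:

```python
# Toy payoffs, all figures assumed for illustration.
p_success = 0.3
gain_if_success = 400_000    # net profit on a 100_000 stake
stake = 100_000              # the investor's committed capital
debts_if_failure = 300_000   # what the failed project owes

# Limited liability: the worst case is losing the stake.
ev_limited = p_success * gain_if_success - (1 - p_success) * stake
print(ev_limited)            # +50,000 -> worth doing on its own

# Full liability: the investor's other assets cover the project's debts.
ev_full = p_success * gain_if_success - (1 - p_success) * debts_if_failure
print(ev_full)               # -90,000 -> not worth doing with everything at stake
```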
This reminds me a lot of limited liability (see also Austin's comment, where he compares it to the for-profit startup market, which, because of limited liability for corporations, bounds prices below by 0).
This is a historically unusual policy (full liability came first), and seems to me to have basically the same downsides (people do risky things, profiting if they win and walking away if they lose), and basically the same upsides (according to the theory supporting LLCs, there's too little investment and support of novel projects).
Can you say more ab...
I'm interested in fleshing out "what you're looking for"; do you have some examples of things written in the past which changed your minds, which you would have awarded prizes to?
For example, I thought about my old comment on patient long-termism, which observes that in order to say "I'm waiting to give later" as a complete strategy you need to identify the conditions under which you would stop waiting (as otherwise, your strategy is to give never). On the one hand, it feels "too short" to be considered, but on the other hand, it seems long enough to conve...
Random personal examples:
And if this is just a one-off, then it seems a lot less concerning, and taking action seems much less pressing. (Though it seems much easier to verify that this is a pattern, by finding other people in a similar situation to yours, than to verify that it isn't, since there are incentives to be quiet about this sort of thing).
Is this the case? Often the reaction to the 'first transgression' will determine whether or not to do future ones--if people let it slide, then probably they don't care that much, whereas if they react strongly, it's important to repen...
What I'm saying is that if you believe that x-risk is 0.1%, then you think we're in at least a one-in-a-million century.
I think you're saying "if you believe that x-risk this century is 0.1%, then survival probability this century is 99.9%, and for total survival probability over the next trillion years to be 0.01%, there can be at most ~9200 centuries with risk that high over the next trillion years (0.999^9200 ≈ 0.0001), which means we're in (most generously) a one-in-a-million century: a trillion years is 10 billion centuries, and 10 billion divided by roughly ten thousand is a million." Does that seem right?
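As a sanity check on that arithmetic, here's a quick sketch in Python (the 0.1% and 0.01% figures are the ones from the comment above; nothing else is assumed):

```python
import math

per_century_risk = 0.001     # 0.1% x-risk per century
total_survival = 0.0001      # 0.01% survival over a trillion years
centuries = 1e12 / 100       # a trillion years = 10 billion centuries

# Solve 0.999^n = 0.0001 for n, the max number of centuries this risky:
n = math.log(total_survival) / math.log(1 - per_century_risk)
print(n)              # ~9206, i.e. roughly ten thousand

print(n / centuries)  # ~9.2e-07, i.e. about one century in a million
```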
Then, if the expected cost-effectiveness of the best opportunities varies substantially over time, there will be just one point in time at which your philanthropy will have the most impact, and you should try to max out your giving at that point, donating everything then if you can.
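For concreteness, a sketch of that logic (the return rate and the cost-effectiveness forecasts are invented; the point is just that compounding returns get weighed against each period's effectiveness, and one period wins):

```python
# Sketch: pick the single period that maximizes impact per dollar donated now.
# All figures are invented for illustration.
annual_return = 1.05                       # investment growth while waiting
cost_effectiveness = [10, 8, 14, 6, 5]     # forecast impact per dollar in years 0..4

# Value of waiting t years and then donating: growth compounds, then
# multiplies by that year's cost-effectiveness.
values = [annual_return**t * c for t, c in enumerate(cost_effectiveness)]
best_year = max(range(len(values)), key=values.__getitem__)
print(best_year, values[best_year])        # year 2: 1.05^2 * 14 ≈ 15.4
```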
Tho I note that the only way one would ever take such opportunities, if offered, is by developing a view of what sorts of opportunities are good that is sufficiently motivating to actually take action at least once every few decades.
For example, wh...
Now that the world has experienced COVID-19, everyone understands that pandemics could be bad
I found it somewhat surprising how quickly the pandemic was polarized politically; I am curious whether you expect this group to be partisan, and whether that would be a positive or negative factor.
[A related historical question: what were the political party memberships of members of environmental groups in the US across time? I would vaguely suspect that it started off more even than it is today.]
I felt confused about why I was presented with a fully general argument for something I thought I indicated I already considered.
In my original comment, I was trying to resolve the puzzle of why something would have to appear edgy instead of just having fewer filters, by pointing out the ways in which having unshared filters would lead to the appearance of edginess. [On reflection, I should've been clearer about the 'unshared' aspect of it.]
you didn't want to voice unambiguous support for the view that the comment wordings were in fact not easy to improve on given the choice of topic.
I'm afraid this sentence has too many negations for me to clearly point one way or the other, but let me try to restate it and say why I made a comment:
The mechanistic approach to avoiding offense is to keep track of the ways things you say could be interpreted negatively, and search for ways to get your point across while not allowing for any of the negative interpretations. This is a tax on saying a...
Comparing trolley accidents to rape is pretty ridiculous for a few reasons:
I think you're missing my point; I'm not describing the scale, but the type. For example, suppose we were discussing racial prejudice, and I made an analogy to prejudice against the left-handed; it would be highly innumerate of me to claim that prejudice against the left-handed is as damaging as racial prejudice, but it might be accurate of me to say both are examples of prejudice against inborn characteristics, are perceived as unfair by the victims, and so on.
And so if y...
I'm a bit puzzled why it has to be edgy on top of just talking with fewer filters.
Presumably every filter is associated with an edge, right? Like, the 'trolley problem' is a classic of philosophy, and yet it is potentially traumatic for the victims of vehicular violence or accidents. If that's a group you don't want to upset or offend, you install a filter to catch yourself before you do, and when seeing other people say things you would've filtered out, you perceive them as 'edgy'. "Don't they know they ...
Now, I'm not saying Hanson isn't deliberately edgy; he very well might be.
If you're not saying that, then why did you make a comment? It feels like you're stating a fully general counterargument to the view that some statements are clearly worth improving, and that it matters how we say things. That seems like an unattractive view to me, and I'm saying that as someone who is really unhappy with social justice discourse.
Edit: It makes sense to give a reminder that we may sometimes jump to conclusions too quickly, and maybe you didn...
Benjamin Franklin, in his will, left £1,000 each to the cities of Boston and Philadelphia, with the proviso that the money should be invested for 100 years, with 25 percent of the principal to be invested for a further 100 years.
Also of note is that he gave conditions on the investments; the money was to be lent to married men under 25 who had finished an apprenticeship, with two people willing to co-sign the loan for them. So in that regard it was something like a modern microlending program, instead of just trying to maximize returns for ben...
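For a sense of the scale involved, here's a rough compound-interest sketch; the constant 5% annual return is an assumption, since the actual loan returns varied:

```python
# Rough growth of Franklin's bequest, assuming a constant 5% annual return.
rate = 1.05
principal = 1_000                       # pounds, per city

after_first_century = principal * rate**100
print(round(after_first_century))       # ~131,501

# 25% of the principal at that point rides for a further 100 years.
after_second_century = 0.25 * after_first_century * rate**100
print(round(after_second_century))      # ~4.3 million
```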
Presumably there are two categories of heuristics here: ones which relate to actual difficulties in discerning the ground truth, and ones which are irrelevant or stem from a misunderstanding. It seems bad that this list implicitly casts the heuristics as being in the latter category and, rather than linking to why each is irrelevant or a misunderstanding, does something closer to mocking the concern.
For example, I would decompose the "It's not empirically testable" heuristic into two different components. The first is something li...
I certainly don't think agents "should" try to achieve outcomes that are impossible from the problem specification itself.
I think you need to make a clearer distinction here between "outcomes that don't exist in the universe's dynamics" (like taking both boxes and receiving $1,001,000) and "outcomes that can't exist in my branch" (like there not being a bomb in the unlucky case). Because if you're operating just in the branch you find yourself in, many outcomes whose probability an FDT agent is trying to affect are impossible from the problem specification...
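One way to see the first category is to enumerate Newcomb's problem with a perfect predictor; the payoffs below are the standard ones from the literature, and the point is just that one outcome never shows up:

```python
# Newcomb's problem with a perfect predictor: enumerate the outcomes the
# dynamics actually allow.
outcomes = {}
for predicted_one_box in (True, False):
    opaque_box = 1_000_000 if predicted_one_box else 0
    for one_box in (True, False):
        if one_box != predicted_one_box:
            continue  # a perfect predictor never mispredicts
        payoff = opaque_box + (0 if one_box else 1_000)
        outcomes["one-box" if one_box else "two-box"] = payoff

print(outcomes)  # {'one-box': 1000000, 'two-box': 1000}
# 'Two-box and receive 1,001,000' never appears -- that outcome doesn't
# exist in the universe's dynamics.
```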
Oh, an additional detail that I think was part of that conversation: there's only really one way to have a '0-error' state in a hierarchical controls framework, but there are potentially many consonant energy distributions that are dissonant with each other. Whether or not that's true, and whether each is individually positive valence, will be interesting to find out.
(If I had to guess, I would guess the different mutually-dissonant internally-consonant distributions correspond to things like 'moods', in a way that means they...
FWIW I agree with Buck's criticisms of the Symmetry Theory of Valence (both content and meta) and also think that some other ideas QRI are interested in are interesting. Our conversation on the road trip was (I think) my introduction to Connectome Specific Harmonic Waves (CSHW), for example, and that seemed promising to think about.
I vaguely recall us managing to operationalize a disagreement, let me see if I can reconstruct it:
A 'multiple drive' system, like PCT's hierarchical control system, has an easy time explaining independent des...
Thanks! Also, for future opportunities like this, probably the fastest person to respond will be Colm.
But as I understand it, Eliezer regards himself as being able to do unusually well using the techniques he has described, and so would predict his own success in forecasting tournaments.
This is also my model of Eliezer; my point is that my thoughts on modesty / anti-modesty are mostly disconnected from whether or not Eliezer is right about his forecasting accuracy, and mostly connected to the underlying models of how modesty and anti-modesty work as epistemic positions.
How narrowly should you define the 'expert' group?
I want to repeat something to mak...
I think with Eliezer's approach, superforecasters should exist, and it should be possible to be aware that you are a superforecaster. Those both seem like they would be lower probability under the modest view. Whether Eliezer personally is a superforecaster seems about as relevant as whether Tetlock is one; you don't need to be a superforecaster to study them.
I expect Eliezer to agree that a careful aggregation of superforecasters will outperform any individual superforecaster; similarly, I expect Eliezer to think that a careful aggregation of anti-modest ...
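For concreteness, one standard version of "careful aggregation" is averaging forecasts in log-odds space and then extremizing; the individual forecasts and the extremizing factor below are made up:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Made-up individual forecasts for the same event.
forecasts = [0.7, 0.8, 0.65, 0.9]

# Average in log-odds space, then extremize (a factor > 1 pushes the pooled
# forecast away from 0.5, compensating for shared, partial information).
mean_log_odds = sum(logit(p) for p in forecasts) / len(forecasts)
extremizing_factor = 1.5
pooled = sigmoid(extremizing_factor * mean_log_odds)
print(pooled)  # ~0.87, more extreme than the simple average (~0.76)
```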
I think many "apostasy" stories (someone who had the atheist personality-type but grew up in a religious culture, people converting from left to right or back) have this character, and tend to be very popular with the destination audience and unpopular with the source audience. On the one hand, this is unsurprising--both on the levels of tribal affiliation and intellectual dynamics. (If people who use the EA tools come to the EA conclusions, then of course attempts to build alternative conclusions with the EA tools will fail.)
But it seems like--there shoul...