I agree with the general underlying point. I also think that another important issue is that reasoning on counterfactuals makes people more prone to doing things that are unusual AND more prone to errors (e.g. by not taking some other effects into account). Both combined make counterfactual reasoning without empirical data pretty perilous on average, IMO.
In the case of Ali in your example above, for instance, Ali could neglect that his performance will determine the opportunities & impact he has 5 years down the line, and thus that being excited about/liking the job is a major variable. Without counterfactual reasoning, Ali would have intuitively relied much more on excitement to pick the job, but by doing counterfactual reasoning that seemed convincing, he neglected this important variable and made a bad choice.
I think that counterfactual reasoning makes people very prone to ignoring Chesterton's fence.
I think using "unsafe" in a very broad way like this is misleading overall and generally makes the AI safety community look like miscalibrated alarmists.
I agree that when there's no memetic fitness/calibration trade-off, it's always better to be calibrated. But here there is a trade-off. How should we navigate it?
Curious if anyone here knows the relevant literature on the topic, e.g. details in the radical flank literature.
Very glad to see that happening; regranting solves a bunch of problems that centralized grantmaking leaves unsolved.
I mean, I agree that it has nuance, but it's still trained on a set of values that are pretty much those of current Western people, so it will probably put more or less emphasis on various values according to the weight Western people give to each of them.
I may try to write something on that in the future. I'm personally more worried about accidents, and I think that solving accidents causes one to solve misuse pre-AGI. Once AGI is aligned, misuse becomes a major worry again.
Note that saying "this isn't my intention" doesn't prevent net negative effects of a theory of change from applying. Otherwise, doing good would be a lot easier.
I also highly recommend clarifying what exactly you're criticizing, i.e. the philosophy, the movement's norms, or some institutions that are core to the movement.
Finally, I usually find the criticism of people who are a) at the core of the movement and b) highly truth-seeking the most relevant for improving the movement, so I would expect that if you're trying to improve the movement, you may want to focus on these people. There exist relevant criticisms external to the movement, but they usually lack context and thus fail to address some key trade-offs that the movement cares about.
Here's a small list of people I would be excited to hear on EA flaws and their recommendations for change:
Thanks for publishing that, I also had a draft on this lying around somewhere!
I work every day from about 9:30am to 1am, with about 3h off on average and 30 min of walking, which helps me brainstorm.
Technically this is ~12h*7 = 84h. The main reasons are that 1) I don't want us to die, and 2) I think there are increasing marginal returns on working hours in a lot of situations, mostly because in a lot of domains the winner takes all even if they're only 10% better than the others, and because you accumulate more expertise/knowledge in a single person, which gives access to rarer and rarer skills.
Of that, I would say I lose about 15h in unproductive or barely productive work (e.g. Twitter, or working on stuff that is not on my to-do list).
I also spend about 10 to 15h a week in calls.
The rest of it (from 50 to 60h) is productive work.
My productivity (& thus my productive work time) has increased hugely over the past 3 months (my first three months in which I can fully decide how to allocate my time). The total number of hours I work has also increased a bit (like +1h/day), mostly thanks to optimizations of my sleep & schedule.
Can you share the passive time tracking tools you're using?
"Nobody cared about" LLMs is certainly not true - I'm pretty sure the relevant people watched them closely.
What do you mean by "the relevant people"? I would love for us to talk about specifics here and operationalize what we mean. I'm pretty sure E. Macron hasn't thought deeply about AGI (i.e. has never thought about timelines for more than 1h), and I'm at 50% that if he had any deep understanding of the changes it will bring, he would already be racing. Likewise for Israel, a country with a strong track record of becoming a leader in technologies that are crucial for defense.
That many people aren't concerned about AGI or doubting its feasibility by now only means that THOSE people will not pursue it, and any public discussion will probably not change their minds.
I think you wrongly assume here that people have even understood the implications of AGI and that they can't update at all once the first systems start being deployed. What you say could be true if you think that most of your arguments hold because of ChatGPT. I think it's quite plausible that since ChatGPT, and probably even more in 2023, there will be deployments that make nearly everyone who matters aware of AGI. I don't yet have a good sense of how policymakers have updated.
Already there are many alarming posts and articles out there, as well as books like Stuart Russell's "Human Compatible" (which I think is very good and helpful), so keeping the lid on the possibility of AGI and its profound impacts is way too late.
Yeah, thanks to this part I realize that a lot of the debate should happen on specifics rather than at a high level, as we're doing here. Thus, chatting about your book in particular will be helpful for that.
I'm currently in the process of translating it to English so I can do just that. I'll send you a link as soon as I'm finished. I'll also invite everyone else in the AI safety community (I'm probably going to post an invite on LessWrong).
Great! Thanks for doing that!
while discussing the great potential of AGI for humanity should not.
FYI, I don't think that's true.
Looking back at our whole discussion, I realized I didn't mention a fairly important argument: a major failure mode, specifically regarding risks, is the following reaction from ~any country: "Omg, China is developing bad AGIs, so let's develop safe AGIs first!"
This can happen in two ways:
Thanks a lot for engaging with my arguments. I still think you're substantially overconfident about the positive aspects of communicating AGI X-risks to the general public, but I appreciate that you took the time to consider and respond to my arguments.