All of adamShimi's Comments + Replies

Thanks for the thoughtful comment!

Behaviour science
I work in this space, and much of the theory seems very relevant to understanding non-human agents. For instance, I wonder whether there would be value in exploring if models of human behaviour such as COM-B and the FBM could be useful in modelling the actions of AI agents. If it is useful to theorise that a human agent's behaviour only occurs when they have sufficient motivation, sufficient ability, and a trigger to act (as per the FBM), it might also be useful to do so for a non-human agent.
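As a toy illustration of what such a transfer could look like, here is a minimal sketch of an FBM-style condition applied to a generic agent (the attribute names, the multiplicative combination, and the threshold are my own illustrative assumptions, not part of the FBM or of the original comment):

```python
from dataclasses import dataclass

@dataclass
class Agent:
    motivation: float  # 0..1, how much the agent "wants" to act (illustrative)
    ability: float     # 0..1, how easy the action is for the agent (illustrative)

def acts(agent: Agent, trigger_present: bool, threshold: float = 0.5) -> bool:
    """Toy FBM-style rule: behaviour occurs only when a trigger is present
    and motivation combined with ability clears some threshold."""
    return trigger_present and (agent.motivation * agent.ability) >= threshold

# The same predicate can, in principle, be evaluated for a human or an AI agent.
print(acts(Agent(motivation=0.9, ability=0.7), trigger_present=True))   # True
print(acts(Agent(motivation=0.9, ability=0.7), trigger_present=False))  # False
```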

This sounds like a ... (read more)

1
PeterSlattery
2y
Thanks, Adam, this was very helpful! I really appreciate that you took the time to respond in such detail. I will see what I can do for the fellowship. I might be able to convince someone else to do it and then I can collaborate with them :)

Hmm, I think I expressed my point badly in the comment above. What I mean isn't that formal methods will never be useful, just that they're not really useful yet, and that they will require more pure AI safety research to become useful.

The general reason is that all formal methods try to show that a program follows a specification on a model of computation. Right now, a lot of the work on formal methods applied to AI focuses on adapting known formal methods to the specific programs involved (say, neural networks) and to the right model of computation (in what contexts do you use... (read more)
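For a toy illustration of what "showing that a program follows a specification on a model of computation" can look like for a neural network, here is a minimal sketch (my own simplified example with made-up weights, not any real verification tool) that uses interval arithmetic to bound a one-neuron network's output over an entire input range:

```python
# Toy verification sketch: check the spec "for any input x in [0, 1],
# the output stays below 2.0" for the network y = relu(1.5 * x - 0.2).
# (Illustrative only; real tools handle deep networks, soundness, and
# floating-point issues far more carefully.)

def affine_interval(lo, hi, w, b):
    """Propagate the interval [lo, hi] through x -> w*x + b."""
    candidates = (w * lo + b, w * hi + b)
    return min(candidates), max(candidates)

def relu_interval(lo, hi):
    """Propagate the interval [lo, hi] through ReLU."""
    return max(lo, 0.0), max(hi, 0.0)

lo, hi = affine_interval(0.0, 1.0, w=1.5, b=-0.2)  # pre-activation bounds
lo, hi = relu_interval(lo, hi)                     # post-activation bounds

spec_holds = hi < 2.0  # the specification, checked against the upper bound
print(f"output in [{lo}, {hi}]; spec holds: {spec_holds}")  # [0.0, 1.3]; True
```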

This post annoyed me. Which is a good thing! It means that you hit where it hurts, and you forced me to reconsider my arguments. I also had to update (a bit) toward your position, because I realized that my "counter-arguments" weren't that strong.

Still, here they are:

  • I agree with the remark that much work will have both capability and safety consequences. But instead of seeing that as an argument to laud the safety aspect of capability-relevant work, I want to look at the differential technical progress. What makes me think that EA safety i
... (read more)
3
seanrson
4y
Could you say more (or work on that post) about why formal methods will be unhelpful? Why are places like Stanford, CMU, etc. pushing to integrate formal methods with AI safety? Also, Paul Christiano has suggested that formal methods will be useful for avoiding catastrophic scenarios. (Will update with links if you want.)

Since many other answers treat the more general ideas, I want to focus on the "voluntary" sadness of reading/watching/listening to sad stories. I was curious about this myself, because I noticed that reading only "positive" and "joyous" stories eventually feels empty.

The answer seems to be that sad elements in a story bring more depth than the fun/joyous ones. In that sense, sadness in stories acts as a signal of depth, but also as a way to access some deeper parts of our emotions and internal life.

I'm reminded of Mark Manson's q... (read more)

Thanks for that very in-depth answer!

I was indeed thinking about 3., even if 1. and 2. are also important. And I get that the main value of these diagrams is to force a statement to be made that is explicit and as formal as possible.

I guess my question was more about: given two different causal diagrams for the same risk (made by different researchers, for example), do you have an idea of how to compare them? Like finding the first difference along the causal path, or other means of comparison. This seems important because even with clean descriptions of our views, we can still talk past each other if we cannot see where the difference truly lies.

1
MichaelA
4y
Hmm, I'm not sure I fully understand what you mean. But hopefully the following somewhat addresses it:

One possibility is that two different researchers might have different ideas of what the relevant causal pathways actually are. For a simple example, one researcher might not think of the possibility that a risk could progress right from the initial harms to the existential catastrophe, without a period of civilizational collapse first, or might think of that but dismiss it as not even worth considering because it seems so unlikely. A different researcher might think that that path is indeed worth considering. If either of the researchers tried to make an explicit causal diagram of how they think the risk could lead to existential catastrophe, the other one would probably notice that their own thoughts on the matter differ. This would likely help them see where the differences in their views lie, and the researcher who'd neglected that path might immediately say "Oh, good point, hadn't thought of that!", or they might discuss why that seems worth considering to one of the researchers but not to the other. (I would guess that in practice this would typically occur for less obvious paths than that, such as specific paths that can lead to or prevent the development/spread of certain types of information.)

Another possibility is that two different researchers have essentially the same idea of what the relevant causal pathways are, but very different ideas of the probabilities of progression from certain steps to other steps. In that case, merely drawing these diagrams, in the way they're shown in this post, wouldn't be sufficient for them to spot why their views differ. But having the diagrams in front of them could help them talk through how likely they think each particular path or step is. Or they could each assign an actual probability to each path or step. Either way, they should then be able to see why and where their views differ.

In all of these cases, id…

Great post! I feel these diagrams will be really useful for clarifying the possible interventions and the components of existential risks.

Do you think they'll also be useful for comparing different positions on a specific existential risk, like the trajectories in this post? Or do you envision the diagram for a specific risk as a summary of all causal pathways to that risk?

3
MichaelA
4y
Thanks! I hope so.

By "comparing different positions on a specific existential risk", it seems to me that you could mean either:

1. Comparing what different "stages" of a specific risk would be like
  • e.g., comparing what it'd be like if we're at the "implementation of hazardous information" vs "harmful events" stage of an engineered pathogen risk
2. Comparing different people's views on what stage a specific risk is currently at
  • e.g., identifying that one person believes the information required to develop an engineered pathogen just hasn't been developed, while another believes that it's been developed but has yet to be shared or implemented
3. Comparing different people's views on a specific risk more generally
  • e.g., identifying that two people roughly agree on the chances an engineered pathogen could be developed, but disagree on how likely it is that it'd be implemented or that a resulting outbreak would result in collapse/extinction, and that's why they overall disagree about the risk levels
  • e.g., identifying that two people roughly agree on the overall risks from an engineered pandemic, but this obscures the fact that they disagree, in ways that roughly cancel out, on the probabilities of progression from each stage to the next stage. This could be important because it could help them understand why they advocate for different interventions.

(Note that I just randomly chose to go with pathogen examples here; as I say in the post, these diagrams can be used for a wide range of risks.)

I think that, if these diagrams can be useful at all (which I hope they can!), they can be useful for 1 and 3. And I think perhaps you had 3 in mind, as that's perhaps most similar to what the state space model you linked to accomplishes. (I'd guess these models could also be useful for 2, but I'm not sure how often informed people would have meaningful disagreements about what stage a specific risk is currently at.)

Hopefully my examples already make it somew…
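To make the "cancel out" scenario above concrete, here is a toy calculation (the stage names and all numbers are made up by me purely for illustration): if overall risk is modelled as the product of per-stage progression probabilities, two people can agree on the total while disagreeing on every step, which matters because each would then prioritise different interventions.

```python
# Toy sketch (hypothetical numbers): two researchers agree on the overall
# risk from an engineered pathogen (development -> implementation ->
# outbreak leads to catastrophe) but disagree on each stage's probability.
researcher_a = [0.40, 0.50, 0.10]  # P(progressing past each stage)
researcher_b = [0.20, 0.25, 0.40]

def overall_risk(step_probs):
    """Overall risk as the product of per-stage progression probabilities."""
    risk = 1.0
    for p in step_probs:
        risk *= p
    return risk

print(overall_risk(researcher_a))  # ≈ 0.02
print(overall_risk(researcher_b))  # ≈ 0.02 as well, despite step-level disagreement
```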

What about diseases? I admit I know little about this period of history, but the accounts I read (for example in Guns, Germs, and Steel) attribute the advantage to the spread of diseases to the Americas.

Basically, because the Americas lacked many big domesticated mammals, they could not have cities like the European ones, with cattle everywhere. The living conditions in these big cities caused the spread of diseases. And when going to the Americas, the conquistadors took these diseases with them to a population which had never experienced them, causing most of th... (read more)

5
kokotajlod
4y
Diseases are probably part of the explanation for Cortes' and Pizarro's success, but not Afonso's. Also, Cortes got pretty far into his conquest before disease became an issue. Perhaps the disease enabled the Spanish to betray their allies and dominate the region after Tenochtitlan fell, though? I'm not sure. I suspect that insofar as diseases are part of the explanation, it's via the intermediate step of sowing chaos that the conquistadors could exploit.

Later on, diseases would play a much bigger and more direct role, by reducing the native population to the point where they couldn't compete with the influx of European settlers. The east coast of America might have looked different, for example, if there had been 20x more people living there when the colonists started arriving.

Insofar as we think this was an important part of the story, I guess our conclusion would be that a moderate tech and experience advantage, combined with general chaos and disruption, can allow a tiny group to dominate a much larger region. This is all my uninformed opinion though, don't take it too seriously.

Also interested. I had not thought about it before, but since the old generation dying is one way scientific and intellectual changes become fully accepted, this would probably have a big impact on our intellectual landscape and culture.

1
InquilineKea
3y
While the old generation dying is one way of getting scientific and intellectual change enacted, there are longer-term trends towards reduced gatekeeping that may reduce the cost of training (e.g., people proving that they're scientifically competent without having to go through the entire wasteful process of K-12 + PhD). This could inhibit the gatekeeper socialization effects of the old generation, which prevent the new generation from feeling free to express itself without permission. (Programming, at the very least, is much less hierarchical, because people don't need to depend on the validation of an adviser or famous person to get their ideas out, or to gain access to resources critical to experimentation; it just has to work.) Similarly, reductions in the cost of doing biological experiments could also inhibit this effect.

There are power dynamics associated with scientific training and scientific publishing (not to mention that the training seems to help scientists get through publishing, blind review be damned), and there are some trends towards funding people who do work without needing access to gatekeepers (look at trends in funding from Emergent Ventures, or the Patrick Collison network). I've also noticed that people are growing…

I'm curious about the article, but the link points to nothing. ^^

Thanks a lot for this presentation and the corresponding transcript. I am quite new to thinking about animal welfare at all, and even more so to wild animal welfare, but I felt this presentation was easy to follow even from this point of view (my half-decent knowledge of evolution might have helped).

I like the clarification of evolution, and more specifically of the fact that natural selection selects away options with bad fitness or bad relative fitness, instead of optimizing fitness to the maximum. That's a common issue when using theoretical compu... (read more)

1
ishi
4y
Since this topic interests me and I'm killing time, I decided to comment on a few things in your post.

1. Wikipedia has a reasonable article on exaptations as an introduction. I also recommend looking at the Wikipedia article on sexual selection; in my view these topics overlap. (The Wikipedia article on sexual selection looks less complete. I think "Fisher's runaway process", described in that article, is most relevant, but some others prefer the "handicap principle". There are much more recent articles on these topics, which tend to rely on physics formalism, though Fisher, a famous statistician, used earlier forms of it as well.)

2. Regarding "differential reproduction" (your first question), I think the answer is "both". (This overlaps with the Fisher runaway process.) Being a blue bear might be like having a nice car or a prestigious college degree. It may not mean much at all, but once it's around, you had better be a blue bear, have a nice car, or have a good college degree.

3. Your last paragraph, to me, does support the idea that evolution is (almost) a zero-sum game. It's not exactly zero-sum, because then there would be no evolution. It might, for example, prove to be that the best way to improve animal welfare would be for those who care about it to give animals more room; this is the argument made by some well-known philosophers (e.g., Patricia MacCormack) and writers who promote human extinction. I personally don't believe this argument, and I also don't believe they do either, even if they don't know it.

"Maximizing fitness" is an idea that only occurs in the most simplistic forms of evolutionary theory, and it's also due to R. Fisher: "Fisher's fundamental theorem of natural selection". It's well known that it only applies to ideal systems, analogous to Lucretius' universe from thousands of years ago, which assumed the world is basically a set of billiard balls or atoms bouncing off each other. It's called "bean bag genetics". It can explain why a cup of hot coffee…

Once upon a time, my desire to build useful mastery and a career made me neglect my family, and more precisely my little brothers. Not dramatically, but whenever we were together, I was too caught up in my own issues to act like a good big brother, or even to interact properly with them. At some point, I realized that giving time and attention to my family was also important to me, and thus that I could not simply allocate all my mental time and energy to "useful" things.

This happened before I discovered EA, and is not explicitly about the EA c... (read more)

On a tangent, what are your issues with quantum computing? Is it the hype? That might indeed be excessive relative to what we can do now. But the theory is fascinating, there are concrete applications where we should get positive benefits for humanity, and the actual researchers in the field try really hard to clarify what we know and what we don't know about quantum computing.

5
EdoArad
4y
Jaime Sevilla wrote a long (albeit preliminary) and interesting report on the topic.

Thanks a lot for this great post! I think the part I like the most, even more than the awesome deconstruction of arguments and their underlying hypotheses, is the sheer number of times you said "I don't know" or "I'm not sure" or "this might be false". I feel it places you at the same level as your audience (including me), in the sense that you have more experience and technical competence than the rest of us, but you still don't know THE TRUTH, or sometimes even good approximations to it. And the standard way... (read more)

8
Buck
4y
I agree that it's an important question whether AGI has the right qualities to "solve itself". To go through the ones you named:

  • "Personal and economic incentives are aligned against them": I think AI safety has somewhat good properties here. Basically no-one wants to kill everyone, and AI systems that aren't aligned with their users are much less useful. On the other hand, it might be the case that people are strongly incentivised to be reckless and deploy things quickly.
  • "They are obvious when one is confronted with the situation": I think that alignment problems might be fairly obvious, especially if there's a long process of continuous AI progress where unaligned non-superintelligent AI systems do non-catastrophic damage. So this comes down to questions about how rapid AI progress will be.
  • "At the point where the problems become obvious, you can still solve them": If the problems become obvious because non-superintelligent AI systems are behaving badly, then we can still maybe put more effort into aligning increasingly powerful AI systems after that, and hopefully we won't lose that much of the value of the future.
5
Buck
4y
This is a good point. I feel pretty unsure about it; for a contradictory perspective, you might enjoy this article.

Anecdotally, almost everyone I know from older generations eats snails, so it might indeed be generational. Whereas I know approximately the same number of people from each generation who dislike oysters (mostly because of the texture).

1
Daniela R. Waldhorn
4y
Interesting, thank you!

Thanks for the effort in summarizing and synthesizing this tangle of notions! Notably, I learned about axiology, and I am very glad I did.

One potential addition to the discussion of decision theory might be the use of "normative", "descriptive" and "prescriptive" within decision theory itself, which is slightly different. To quote the Decision Theory FAQ on Less Wrong:

We can divide decision theory into three parts (Grant & Zandt 2009; Baron 2008). Normative decision theory studies what an ideal agent (a perfectly rational
... (read more)
1
MichaelA
4y
Very interesting. I hadn't come across that way of using those three terms. Thanks for sharing!

Thanks for this thoughtful analysis! I must admit that I never really considered the welfare of snails as an issue, maybe because I am French and thus culturally used to eating them.

One thing I wanted to confirm anecdotally is the consumption of snails in France. Even if snails with parsley butter are a classic French dish, they are eaten quite rarely (only for celebrations or Christmas and New Year's Eve dinners). And I know many people who don't eat snails because they find them disgusting, even though they have seen people eating them all their lives ... (read more)

3
Daniela R. Waldhorn
4y
Thanks for your comment, adamShimi! Do you have a sense of the profile of people who find eating snails disgusting? I wonder if it's a generational issue, for example. In Spain, eating snails now seems to be much more prevalent among older generations than among younger people.

The geometric intuition underlying this post is already proving useful for me!

Yesterday, while discussing with a friend why I want to change my research topic to AI safety instead of what I currently do (distributed computing), my first intuition was that AI safety aims at shaping the future, while distributed computing is relatively agnostic about it. But a far better intuition comes from considering the vector along the current trajectory in state space, starting at the current position of the world, whose direction and length capture the trajectory and ... (read more)
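Here is a minimal sketch of that geometric intuition (my own toy formalisation; the two state dimensions and all numbers are made up purely for illustration):

```python
import numpy as np

# Toy picture: the world as a point in state space, with a velocity vector
# along its current trajectory. Axes are illustrative, not a real model.
state = np.array([1.0, 0.5])       # e.g., (capabilities, safety/alignment)
trajectory = np.array([0.8, 0.1])  # current direction and speed of progress

# General technological development scales the vector: same direction, faster.
accelerated = state + 2.0 * trajectory

# Safety research instead changes the vector's direction, steering the endpoint.
steered = state + (trajectory + np.array([0.0, 0.4]))

print(accelerated)  # [2.6 0.7]
print(steered)      # [1.8 1. ]
```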

1
David_Kristoffersson
4y
Happy to see you found it useful, Adam! Yes, general technological development corresponding to scaling of the vector is exactly the kind of intuition it's meant to carry.

That's a great criterion! We might be able to find some weird counter-example, but it solves all of my issues. Intellectual work/knowledge might be a part of all actions, but it isn't necessarily on the main causal path.

I think this might actually deserve its own post.

2
MichaelA
4y
I've gone with adding a footnote that links to this comment thread. Probably would've baked this explanation in if I'd had it initially, but I now couldn't quickly find a neat, concise way to add it. Thanks again for prompting the thinking, though!
2
MichaelA
4y
Great! And thanks for the suggestion to make this idea/criterion into its own post. I'll think about whether to do that, just adjust this post's main text to reflect that idea, or just add a footnote in this post.

Thanks for the in-depth answer!

Let's take your concrete example about democracy. If I understand correctly, you separate the progress towards democracy into:

  • discovering/creating the concept of democracy, learning it, and spreading the concept itself, which falls under differential intellectual progress;
  • convincing people to implement democracy and doing the fieldwork to implement it, which falls at least partially under differential progress.

But the thing is, I don't have a clear criterion for distinguishing the two. My first ideas were:

  • differential int
... (read more)
3
MichaelA
4y
I think the first of those is close, but not quite there. Maybe this is how I'd put it (though I hadn't tried to specify it to this extent before seeing your comment):

Differential progress is about actions that advance risk-reducing lasting changes relative to risk-increasing progress, regardless of how these actions achieve that objective. Differential intellectual progress is a subset of differential progress where an increase in knowledge by some participant is a necessary step between the action and the outcomes. It's not just that someone does learn something, or even that it would necessarily be true that someone would end up having learned something (e.g., as an inevitable outcome of the effect we care about). It's instead that someone had to learn something in order for the outcome to occur.

In the democracy example, if I teach someone about democracy, then the way in which that may cause risk-reducing lasting changes is via changes in that person's knowledge. So that's differential intellectual progress (and thus also differential progress, since that's the broader category).

If instead I just persuade someone to be in favour of democracy by inspiring them or making liking democracy look cool, then that may not require them to have changes in knowledge. In reality, they're likely to also "learn something new" along the lines of "this guy gave a great speech", or "this really cool guy likes democracy". But that new knowledge isn't why they now want democracy; the causal pathway went via their emotions directly, with the change in their knowledge being an additional consequence that isn't on the main path. (Similar scenarios could also occur where a change in knowledge was necessary, such as if they choose to support democracy based on now explicitly thinking doing so will win them approval from me or from their friends. I'm talking about cases that aren't like that; cases where it's more automatic and emotion-driven.)

Does that seem clearer to you?

I did find this post clear and useful; it will be my main recommendation if I want to explain this concept to someone else.

I also really like your proposition of "potential information hazards", as at that point in the post I was wondering whether all basic research should be considered an information hazard, which would make the whole concept rather vacuous. Maybe one way to address potential information hazards is to try to quantify how removed they are from potential concrete risks?

Anyway, I'm looking forward to the next posts on dealing with these potential information hazards.

1
MichaelA
4y
Just so you know, we've now (finally!) published the post on how to deal with potential information hazards over on LessWrong. We'll be putting most of our posts on the topic on that forum, as part of a "sequence".
2
MichaelA
4y
Thanks! That's great to hear. And yes, I think that section you point at was important, and I thank David Kristoffersson for pushing me to attend to that distinction between actual harms, "information hazards" (where it's just a risk), and "potential information hazards" (where we don't have a specific way it would be harmful in mind, or something like that). (He didn't formulate things in that way, but highlighted the general issue as one worth thinking about more.)

Thanks a lot for this summary post! I did not know of these concepts, and I feel they are indeed very useful for thinking about these issues.

I do have some trouble with the distinction between intellectual progress and progress though. During my first reading, I felt like all the topics mentioned in the progress section were actually about intellectual progress.

Now, rereading the post, I think I see a distinction, but I have trouble crystallizing it in concrete terms. Is the difference between creating ideas and implementing them? But then it feels very remi... (read more)

2
MichaelA
4y
Thanks for that feedback! I get why you'd feel unsure about those points you mention. That's one of the things I let stay a bit implicit/fuzzy in this post, because (a) I wanted to keep things relatively brief and simple, and (b) I wanted to mostly just summarise existing ideas/usages, and that's one point where I think existing usages left things a bit implicit/fuzzy. But here's roughly how I see things, personally:

The way Tomasik uses the term "differential intellectual progress" suggests to me that he considers "intellectual progress" to also include just spreading awareness of existing ideas (e.g., via education or Wikipedia), rather than conceiving of new ideas. It seems to me like it could make sense to limit "intellectual progress" to just cases where someone "sees further than any who've come before them", and not include times when individuals climb further towards the forefront of humanity's knowledge without actually advancing that knowledge. But it also seems reasonable to include both, and I'm happy to stick with doing so, unless anyone suggests a strong reason not to. So I'd say "differential intellectual progress" covers both adding to the pool of things that some human knows, and cases where some individual merely comes to know/understand more, so that humanity's average knowledge has grown.

None of this directly addresses your questions, but I hope it helps set the scene. What you ask about is instead differential intellectual progress vs differential progress, and where "implementation" or "application" of ideas fits. I'd agree that fundamental vs applied research can be blurry, and that "even the applications of ideas and solutions requires a whole lot of intellectual work." E.g., if we're "merely implementing" the idea of democracy in a new country, that will almost certainly require at least some development of new knowledge and ideas (such as how to best divide the country into constituencies, or whether a parliamentary or presidential system…

As a tool for existential risk research, I feel like the graphical representation will indeed be useful in crystallizing the differences in hypotheses between researchers. It might even serve as a self-assessment tool, for quickly checking some of the consequences of one's own view.

But beyond the trajectories (and maybe specific distances), are you planning on representing the other elements you mention, like the uncertainty or the speed along trajectories? I feel like the more details about an approach can be integrated into a simple graphical representation, the better this tool will serve to disentangle disagreements between researchers.

2
David_Kristoffersson
4y
Thanks for your comment. Yes; the other elements, like uncertainty, would definitely be part of further work on the trajectories model.

Thanks a lot for this podcast! I liked the summary you provided, and I think it is great to see people struggling to make sense of a lot of complex information on a topic, almost live. Given that you repeat multiple times that neither of you is an expert on the subject, I think this podcast is a net positive: it gives information while encouraging listeners to go look for themselves.

Another great point: the criticism of the meme about overreacting. While listening to the beginning, when you said that there was no reason to panic, I wanted to objec... (read more)

2
Howie_Lempel
4y
Thanks - this is helpful.
3
Robert_Wiblin
4y
Thanks for the detailed feedback Adam. :)