I agree that it would be bizarre and absurd to believe, and disingenuous to claim, "Sam thought Kelsey would make him look extremely bad, and was okay with this".
This is not the claim I am making. I don't think you thought that, or claimed that.
The most important claim I'm trying to make is that I think it was obvious that SBF would not want those DMs published, and so it doesn't make sense for you to claim you thought he would be OK with it.
Note that I am not saying that publishing those DMs is definitely bad. Again, it might have been worth it to violate his consent for the greater good. I'm still uncertain about the ethics of violating someone's consent like that, but it's a plausible perspective.
I mostly just don't think you should say you thought he'd be OK with you publishing the DMs, because I think that's very likely false.
That doesn't seem plausible to me. I haven't seen any substantive reason for why you should have thought that.
Again, SBF said things like "fuck regulators" and you knew that he was trying to foster a good public image to regulators. I find the idea that you thought that he thought people would react positively to the leaks highly implausible. And the "fuck regulators" comment was not the only example of something that strikes me as a thing he obviously meant to keep private. The whole chat log was littered with things that he likely did not want public.
And again, you could have just asked him whether he wanted the DMs published.
In my opinion, you were either very naive about what he expected, or you're not being fully honest about what you really thought, and I don't think either possibility reflects well on what you did.
I genuinely thought SBF was comfortable with our interview being published and knew that was going to happen.
For what it's worth, I don't buy this.
My understanding is that you didn't ask SBF whether he wanted the text published. More importantly, I am confident you would have been able to correctly predict that he would say "no" if you did ask. Hence, why you didn't.
The reasons SBF wouldn't want his DMs published are too obvious to belabor: he said things like "fuck regulators", that his "ethics" were nothing but a cover for PR, and he spoke in a conversational rather than professional tone. Even if you actually thought he would probably be OK with those messages being leaked, an ethical journalist would at least ask, because of the highly plausible "no" you would have received.
In my opinion, publishing the DMs without his consent might have been the right thing to do, for the greater good. I do not think you're a bad person for doing it. But I don't think it makes sense to have expected SBF to want the conversation to be published, and I don't think it makes sense for you to claim you thought that.
I'm also not persuaded by the appeal to journalistic norms, since I think journalistic norms generally fall well below high ethical standards.
I'm confused about this post. I don't buy the argument, but really, I'm not sure I understand what the argument is. Text-to-image models have risks? Every technology has risks. What's special about this technology that makes the costs outweigh the benefits?
Consider this paragraph:
Especially in the case of DALLE-2 and Stable Diffusion, we are not convinced of any fundamental social benefits that access to general-purpose image generators provide aside from, admittedly, being entertaining. But this does not seem commensurate with the potentially-devastating harms that deepfakes can have on victims of sex crimes. Thus, it seems that these models, as they have been rolled out, fail this basic cost/benefit test.
I don't find anything about this obvious at all.
Suppose someone in the 1910s said, "we are not convinced of any fundamental social benefits that film provides aside from, admittedly, being entertaining. But this does not seem commensurate with the potentially-devastating harms that film can cause by being used as a vehicle for propaganda. Thus, it seems that films, as they have been rolled out, fail this basic cost/benefit test." Would that have convinced you to abandon film as an art form?
Entertainment has value. Humanity collectively spends hundreds of billions of dollars every year on entertainment. People would hate to live in a world without entertainment. It would be almost dystopian. So, what justifies this casual dismissal of the entertainment value of text-to-image models?
Your primary conclusion is that "the AI research community should curtail work on risky capabilities". But isn't that obvious? Everyone should curtail working on unnecessarily risky capabilities.
The problem is coordinating our behavior. If OpenAI decides not to work on it, someone else will. What is the actual policy being recommended here? Government bans? Do you realize how hard it would be to prevent text-to-image models from being created and shared on the internet? That would require encroaching on the open internet in a way that seems totally unjustified to me, given the magnitude of the risks you've listed here.
What's missing here is any quantification of the actual harms from text-to-image models. Are we talking about 100 people a year being victimized? That would indeed be sad, but compared to potential human extinction from AI, probably not as big of a deal.
I mostly expect overreaction in cases of a weaker signal such as a Russian "test" on territory Russia claims as Russian, or tactical use
I disagree. I will probably evacuate San Francisco for a few weeks if Russia uses a tactical nuke in Ukraine. That said, I agree that there are many other events that may cause EAs to overreact, and it might be worth clearly delineating what counts as a red line, and what doesn't, ahead of time.
I strongly agree with the general point that overreaction can be very costly, and I agree that EAs overreacted to Covid, particularly after it was already clear that the overall infection fatality rate of Covid was under 1%, and roughly 0.02% in young adults.
However, I think it's important to analyze things on a case-by-case basis, and to simply think clearly about the risk we face. Personally, I felt that it was important to react to Covid in January-March 2020 because we didn't understand the nature of the threat yet, and from my perspective, there was a decent chance that it could end up being a global disaster. I don't think the actions I took in that time (mainly stocking up on more food) were that costly, or irrational. After March 2020, the main actions I took were wearing a mask when I went out and avoiding certain social events. This, too, was not very costly.
I think nuclear war is a fundamentally different type of risk than Covid, especially when we're comparing the ex-ante risks of nuclear war versus the ex-post consequences of Covid. In my estimation, nuclear war could kill up to billions of people via very severe disruptions to supply chains. Even at the height of the panic, the most pessimistic credible forecasts for Covid were nowhere near that severe.
In addition, an all-out nuclear war is different from Covid because of how quickly the situation can evolve. With nuclear war, we may live through some version of the following narrative: At one point in time, the world was mostly normal. Mere hours later, the world was in total ruin, with tens of millions of people being killed by giant explosions. By contrast, Covid took place over months.
Given this, I personally think it makes sense to leave SF/NYC/wherever if we get a very clear and unambiguous signal that a large amount of the world may be utterly destroyed in a matter of hours.
One consideration is that, in the long-run, uploading people onto computers would probably squeeze far more value out of each atom than making people into hobbits. In that case, the housing stock would be multiplied by orders of magnitude, since people can be stored in server rooms. Assuming uploaded humans aren't retired, economic productivity would be a lot higher too.
Which is why I'm way more convinced by Gary Marcus' examples than by e.g. Scott Alexander. I don't think they need to be able to describe "true understanding" to demonstrate that current AI is far from human capabilities.
My impression is that this debate is mostly people talking past each other. Gary Marcus will often say something to the effect of, "Current systems are not able to do X". The other side will respond with, "But current systems will be able to do X relatively soon." People will act like these statements contradict, but they do not.
I recently asked Gary Marcus to name a set of concrete tasks he thinks deep learning systems won't be able to do in the near-term future. Along with Ernie Davis, he replied with a set of mostly vague and difficult to operationalize tasks, collectively constituting AGI, which he thought won't happen by the end of 2029 (with no probability attached).
While I can forgive people for being a bit vague, I'm not impressed by the examples Gary Marcus offered. All of the tasks seem like the type of thing that could easily be conquered by deep learning if given enough trial and error, even if the 2029 deadline is too aggressive. I have yet to see anyone -- either Gary Marcus, or anyone else -- name a credible, specific reason why deep learning will fail in the coming decades. Why exactly, for example, do we think that it will stop short of being able to write books (when it can already write essays), or it will stop short of being able to write 10,000 lines of code (when it can already write 30 lines of code)?
Now, some critiques of deep learning seem right: it's currently too data-hungry, and large training runs are very costly, for example. But of course, these objections only tell us that there might be some even more efficient paradigm that brings us AGI sooner. They're not a good reason to expect AGI to be centuries away.
Why the thought that AGI is theoretically possible should make us expect it from the current paradigm (my impression is that most researchers don't expect that, and that's why their survey answers are so volatile with slight changes in phrasing)
Holden Karnofsky does discuss this objection in his blog post sequence:
The argument I most commonly hear that it is "too aggressive" is along the lines of: "There's no reason to think that a modern-methods-based AI can learn everything a human does, using trial-and-error training - no matter how big the model is and how much training it does. Human brains can reason in unique ways, unmatched and unmatchable by any AI unless we come up with fundamentally new approaches to AI." This kind of argument is often accompanied by saying that AI systems don't "truly understand" what they're reasoning about, and/or that they are merely imitating human reasoning through pattern recognition.
I think this may turn out to be correct, but I wouldn't bet on it. A full discussion of why is outside the scope of this post, but in brief:
I am unconvinced that there is a deep or stable distinction between "pattern recognition" and "true understanding" (this Slate Star Codex piece makes this point). "True understanding" might just be what really good pattern recognition looks like. Part of my thinking here is an intuition that even when people (including myself) superficially appear to "understand" something, their reasoning often (I'd even say usually) breaks down when considering an unfamiliar context. In other words, I think what we think of as "true understanding" is more of an ideal than a reality.
I feel underwhelmed with the track record of those who have made this sort of argument - I don't feel they have been able to pinpoint what "true reasoning" looks like, such that they could make robust predictions about what would prove difficult for AI systems. (For example, see this discussion of Gary Marcus's latest critique of GPT3, and similar discussion on Astral Codex Ten).
"Some breakthroughs / fundamental advances are needed" might be true. But for Bio Anchors to be overly aggressive, it isn't enough that some breakthroughs are needed; the breakthroughs needed have to be more than what AI scientists are capable of in the coming decades, the time frame over which Bio Anchors forecasts transformative AI. It seems hard to be confident that things will play out this way - especially because:
Even moderate advances in AI systems could bring more talent and funding into the field (as is already happening).
If money, talent and processing power are plentiful, and progress toward PASTA is primarily held up by some particular weakness of how AI systems are designed and trained, a sustained attempt by researchers to fix this weakness could work. When we're talking about multi-decade timelines, that might be plenty of time for researchers to find whatever is missing from today's techniques.
I think more generally, even if AGI is not developed via the current paradigm, it is still a useful exercise to predict when we could in principle develop AGI via deep learning. That's because if some even more efficient paradigm takes over in the coming years, AGI could arrive sooner than we expect, rather than later.
What operative conclusion can be drawn from the "importance" of this century? If it turned out to be only the 17th most important century, would that affect our choices?
One major implication is that we should spend our altruistic and charity money now, rather than putting it into a fund and investing it, to be spent much later. The main alternative to this view is the view taken by the Patient Philanthropy Project, which invests money until such time as there is an unusually good opportunity.