All of Michael_Cohen's Comments + Replies

Do you have a minute to react to this? Are you satisfied with my response?

How “natural” are intended generalizations (like “Do what the supervisor is hoping I’ll do, in the sense that most humans would mean this phrase rather than in a precise but malign sense”) vs. unintended ones (like “Do whatever maximizes reward”)?

I think this is an important point. I consider the question in this paper, published last year in AI Magazine. See the "Competing Models of the Goal" section, and in particular the "Arbitrary Reward Protocols" subsection. (2500 words)

I think there's something missing from the discussion here, which the key point o... (read more)

This is very high-quality. No disputes, just clarifications.

I don’t just mean meta-orgs.

I think working for a well-financed grantmaking organization is not outrageously unconventional, although I suspect most lean on part-time work from well-respected academics more than OpenPhil does.

And I think 80k may just be an exception (a minor one, to some extent), borne out of an unusually clear gap in the market. I think some of their work should be done in academia instead (basically whatever work it’s possible to do), but some of the very specific stuff like the ... (read more)

Thank you for the edit, and thank you again for your interest. I'm still not sure what you mean by a person "having access to the ground truth of the universe". There's just no sense I can think of where it is true that this is a requirement for the mentor.

"The system is only safe if the mentor knows what is safe." It's true that if the mentor kills everyone, then the combined mentor-agent system would kill everyone, but surely that fact doesn't weight against this proposal at all. In any case, more importantly a) the agent will not aim to kill everyone regar... (read more)

1
Michael_Cohen
1y
Do you have a minute to react to this? Are you satisfied with my response?

Robin Hanson didn't occur to me when I wrote it or any of the times I read it! I was just trying to channel what I thought conventional advice would be.

So basically, just philosophy, math, and some very simple applied math  (like, say, the exponential growth of an epidemic), but already that last example is quite shaky.

In fields where it's possible to make progress with first-principles arguments/armchair reasoning, I think smart non-experts stand a chance of outperforming. I don't want to make strong claims about the likelihood of success here; I just want to say that it's a live possibility. I am much more comfortable saying that outperforming conventional wisdom is extremely unlikely on topics where first-principles arguments/armchair reasoning are insufficient.

(As it happens, EAs aren't really disputing the experts in philosophy, but that's beside the point...)

1
Michael_Cohen
1y
So basically, just philosophy, math, and some very simple applied math  (like, say, the exponential growth of an epidemic), but already that last example is quite shaky.

academics themselves have criticized the peer review system a great deal for various reasons, including predatory journals, incentive problems, publication bias, Why Most Published Research Findings Are False, etc

I think we could quibble on the scale and importance of all of these points, but I'm not prepared to confidently deny any of them. The important point I want to make is: compared to what alternatives? The problem is hard, and even the best solution can be expected to have many visible imperfections. How persuaded would you be by a revolutionary enumerating t... (read more)

3
Ben Millwood
1y
I want to differentiate two kinds of success for a social institution:

1. "reproductive" success, by analogy with evolution: how well the institution establishes and maintains itself as dominant,
2. success at stated goals: for peer review, success at finding the truth, producing high-quality research, etc.

Your argument seems to be (at least in part) that because peer review has achieved success 1, that is strong evidence that it's better than its alternatives at success 2. My argument (in part) is that this is only true if the two kinds of success have some mechanism tying them together. Some example mechanisms could be:

* the institution achieved reproductive success by means of being pushed really hard by people motivated by the desire to build and maintain a really high-quality system,
* the institution is easy to replace with better systems, and better systems are easy to try, so the fact that it hasn't been replaced must mean better systems are hard to find.

I don't think either of these things is true of peer review. (The second is true of AWS, for example.) So what's the mechanism that established peer review as the consensus system that relates to it being a high-quality system? (I'm not saying I have alternatives, just that "consensus means a thing is good" is only sometimes a good argument.)

What "major life goals should include (emphasis added)" is not a sociological question. It is not a topic that a sociology department would study. See my comment that  I agree "conventional wisdom is wrong" in dismissing the philosophy of effective altruism (including the work of Peter Singer). And my remark immediately thereafter: "Yes, these are philosophical positions, not sociological ones, so it is not so outrageous to have a group of philosophers and philosophically-minded college students outperform conventional wisdom by doing first-principles... (read more)

4
Ben Millwood
1y
Maybe I'm just missing something, but I don't get why EAs have enough standing in philosophy to dispute the experts, but not in sociology. I'm not sure I could reliably predict which other fields you think conventional wisdom is or isn't adequate in.
4
lilly
1y
I think the crux of the disagreement is this: you can't disentangle the practical sociological questions from the normative questions this easily. E.g., the practical solution to "how do we feed everyone" is "torture lots of animals" because our society cares too much about having cheap, tasty food and too little about animals' suffering. The practical solution to "what do we do about crime" is "throw people in prison for absolutely trivial stuff" because our society cares too much about retribution and too little about the suffering of disadvantaged populations. And so on. Practical sociological solutions are always accompanied by normative baggage, and much of this normative baggage is bad.

EA wouldn't be effective if it just made normative critiques ("the world is extremely unjust") but didn't generate its own practical solutions ("donate to GiveWell"). EA has more impact than most philosophy departments because it criticizes many conventional philosophical positions while also generating its own practical sociological solutions. This doesn't mean all of those solutions are right—I agree that many aren't—but EA wouldn't be EA if it didn't challenge conventional sociological wisdom.

(Separately, I'd contest that this is not a topic of interest to sociologists. Most sociology PhD curricula devote substantial time to social theory, and a large portion of sociologists are critical theorists; i.e., they believe that "social problems stem more from social structures and cultural assumptions than from individuals... [social theory] argues that ideology is the principal obstacle to human liberation.")

Upvoted this.

You generally shouldn't take Forum posts as seriously as peer-reviewed papers in top journals

I suspect I would advise taking them less seriously than you would advise, but I'm not sure.

It could also imply that EA should have fewer and larger orgs, but that's a question too complicated for this comment to cover

I think there might be a weak conventional consensus in that direction, yes. By looking at the conventional wisdom on this point, we don't have to deal with the complicatedness of the question--that's kind of my whole point. But even more im... (read more)

8
Aaron Gertler
1y
The range of quality in Forum posts is... wide, so it's hard to say anything about them as a group. I thought for a while about how to phrase that sentence and could only come up with the mealy-mouthed version you read.

Maybe? I'd be happy to see a huge number of additional charities at the "median GiveWell grantee" level, and someone has to start those charities. Doesn't have to be people in EA — maybe the talent pool is simply too thin right now — but there's plenty of room for people to create organizations focused on important causes. (But maybe you're talking about meta orgs only, in which case I'd need a lot more community data to know how I feel.)

I agree, and I also think this is what EA people are mostly doing. When I open Swapcard for the most recent EA Global, and look at the first 20 attendees alphabetically (with jobs listed), I see:

* Seven people in academia (students or professors); one is at the Global Priorities Institute, but it still seems like "studying econ at Oxford" would be a good conventional-wisdom thing to do (I'd be happy to yield on this, though)
* Six people working in conventional jobs (this includes one each from Wave and Momentum, but despite being linked to EA, both are normal tech companies, and Wave at least has done very well by conventional standards)
* One person in policy
* Six people at nonprofit orgs focused on EA things

Glancing through the rest of the list, I'd say it leans toward more "EA jobs" than not, but this is a group that is vastly skewed in favor of "doing EA stuff" compared to the broad EA community as a whole, and it's still not obvious that the people with EA jobs/headed for EA jobs are a majority. (The data gets even messier if you're willing to count, say, an Open Philanthropy researcher as someone doing a conventionally wise thing, since you seem to think OP should keep existing.)

Overall, I'd guess that most people trying to maximize their impact with EA in mind are doing so via policy work,
8
Tiresias
1y
Yeah, I'm not sure that people prioritizing the Forum over journal articles is a majority view, but it is definitely something that happens, and there are currents in EA that encourage this sort of thinking. I'm not saying we should not be somewhat skeptical of journal articles. There are huge problems in the peer-review world. But forum/blog posts, what your friends say, are not more reliable. And it is concerning that some elements of EA culture encourage you to think that they are.

Evidence for my claim, based on replies to some of Ineffective Altruism's tweets (who makes a similar critique):

1: https://twitter.com/IneffectiveAlt4/status/1630853478053560321?s=20 Look at replies in this thread
2: https://twitter.com/NathanpmYoung/status/1630637375205576704?s=20 Look at all the various replies in this thread

(If it is inappropriate for me to link to people's Twitter replies in a critical way, let me know. I feel a little uncomfortable doing this, because my point is not to name and shame any particular person. But I'm doing it because it seems worth pushing back against the claim that "this doesn't happen here." I do not want to post a name-blurred screenshot because I think all replies in the thread are valuable information, not just the replies I share, so I want to enable people to click through.)

I'm someone who has read your work (this paper and FGOIL, the latter of which I have included in a syllabus), and who would like to see more work in a similar vein, as well as more formalism in AI safety. I say this to establish my bona fides, the way you established your AI safety bona fides.

Thanks! I should have clarified it has received some interest from some people.

you don't show that "when a certain parameter of a certain agent is set sufficiently high, the agent will not aim to kill everyone", you show something more like "when you can desig

... (read more)
2
Rubi J. Hudson
1y
I'll edit the comment to note that you dispute it, but I stand by the comment. The AI system trained is only as safe as the mentor, so the system is only safe if the mentor knows what is safe. By "restrict", I meant for performance reasons, so that it's feasible to train and deploy in new environments. Again, I like your work and would like to see more similar work from you and others. I am just disputing the way you summarized it in this post, because I think that portrayal makes its lack of splash in the alignment community a much stronger point against the community's epistemics than it deserves.

I'd love to chat with you about directions here, if you're interested. I don't know anyone with a bigger value of  p(survival | West Wing levels of competence in major governments) - p(survival | leave it to OpenAI and DeepMind leadership). I've published technical AI existential safety research at top ML conferences/journals, and I've gotten two MPs in the UK onside this week. You can see my work at michael-k-cohen.com, and you can reach me at michael.cohen@eng.ox.ac.uk.

Glad to hear about this!

I have a recommendation for its structure: have anonymous reviewers review submissions and share their reviews with the authors (perhaps privately) before a rebuttal phase (also perhaps private). Then reviewers can revise their reviews, and chairs can make judgments about which submissions to publish.

I constructed an agent where you can literally prove that if you set a parameter high enough, it won't try to kill everyone, while still eventually at least matching human-level intelligence. Sure it uses a realizability assumption, sure it's intractable in its current form, sure it might require an enormously long training period, but these are computer science problems, not philosophy problems, and they clearly suggest paths forward. The underlying concept is sound. It struck me as undignified to say this in the past, but maybe dignity rightly construed ... (read more)
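
To give a rough, concrete sense of the shape of the idea, here is a toy sketch in Python. This is not the construction in the paper: the ensemble, the threshold, and the mentor interface below are invented purely for illustration, and the actual result concerns a Bayesian agent over a class of world models, not a finite ensemble. The point it illustrates is just that a single pessimism-like parameter can trade autonomy for deference to a mentor.

```python
# Toy sketch only -- NOT the construction in the paper. It illustrates how a
# single "pessimism" parameter can make an agent defer to a mentor rather than
# act on uncertain value estimates. All names and numbers here are invented.

class PessimisticAgent:
    def __init__(self, actions, ensemble, pessimism):
        self.actions = actions      # available actions
        self.ensemble = ensemble    # list of value estimators: f(state, action) -> float
        self.pessimism = pessimism  # higher value -> defer to the mentor more often

    def worst_case_value(self, state, action):
        # Score an action by the most pessimistic estimate in the ensemble.
        return min(f(state, action) for f in self.ensemble)

    def act(self, state, mentor):
        best = max(self.actions, key=lambda a: self.worst_case_value(state, a))
        if self.worst_case_value(state, best) < self.pessimism:
            # Even the best action looks too risky under the worst-case
            # estimate, so defer to the mentor instead of acting autonomously.
            return mentor(state)
        return best


if __name__ == "__main__":
    # Three made-up value estimators that disagree about how good each action is.
    ensemble = [lambda s, a, b=bias: 0.1 * a + b for bias in (-0.2, 0.0, 0.3)]
    mentor = lambda state: 0  # a trivial mentor that always picks action 0
    agent = PessimisticAgent(actions=[0, 1, 2], ensemble=ensemble, pessimism=0.5)
    print(agent.act(state=None, mentor=mentor))  # defers: prints the mentor's choice, 0
```

With the parameter set high enough, the toy agent simply defers; the paper's construction makes an analogous trade rigorous while still eventually at least matching the mentor's performance.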

Here are some of mine to add to Vanessa's list.

One on imitation learning. [Currently an "accept with minor revisions" at JMLR]

One on conservatism in RL. A special case of Vanessa's infra-Bayesianism. [COLT 2020]

One on containment and myopia. [IEEE]

Among many things I agree with, the part I agree the most with:

EAs give high credence to non-expert investigations written by their peers, they rarely publish in peer-review journals and become increasingly dismissive of academia

I think a fair amount of the discussion of intelligence loses its bite if "intelligence" is replaced with what I take to be its definition: "the ability to succeed at a randomly sampled task" (for some reasonable distribution over tasks). But maybe you'd say that perceptions of intelligence in the EA community... (read more)
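
For concreteness, one standard way to formalize "the ability to succeed at a randomly sampled task" is along the lines of Legg and Hutter's universal intelligence measure (the formula below is a gloss added here for illustration; the original comment doesn't spell one out):

```latex
% Legg–Hutter-style universal intelligence: a policy's expected performance
% across a class of environments, weighted toward simpler ones.
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
% E: a class of environments (tasks); K(mu): the Kolmogorov complexity of mu,
% so simpler tasks carry more weight; V^pi_mu: the expected value that
% policy pi achieves in environment mu.
```

Under a definition like this, claims about intelligence are just claims about (weighted) average performance across tasks.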

But that mechanism for belief transmission within EA, i.e. object-level persuasion, doesn't run afoul of your concerns about echochamberism, I don't think.

Getting too little exposure to opposing arguments is a problem. Most arguments are informal so not necessarily even valid, and even for the ones that are, we can still doubt their premises, because there may be other sets of premises that conflict with them but are at least as plausible. If you disproportionately hear arguments from a given community, you're more likely than otherwise to be biased towards the views of that community.

it doesn't follow that it's a good investment overall

Yes, it doesn't by itself--my point was only meant as a counterargument to your claim that the efficient market hypothesis precluded the possibility of political donations being a good investment.

Well, there are >100 million people who have to join some constituency (i.e. pick a candidate), whereas potential EA recruits aren't otherwise picking between a small set of ~~cults~~ philosophical movements. Also, AI PhD-ready people are in much shorter supply than, e.g. Iowans, and they'd be giving up much much much more than someone just casting a vote for Andrew Yang.

2
kbog
5y
There are numerous minor, subtle ways that EAs reduce AI risk. Small in comparison to a research career, but large in comparison to voting. (Voting can actually be one of them.)

we've had two presidents now who actively tried to counteract mainstream views on climate change, and they haven't budged climate scientists at all.

I have updated in your direction.

Of course, AI alignment is substantially more scientifically accepted and defensible than climate skepticism.

Yep.

You only mean this as a possibility in the future, if there is any point where AGI is believed to be imminent, right?

No, I meant starting today. My impression is that coalition-building in Washington is tedious work. Scientists agreed to avoid gene editing in... (read more)

2
kbog
5y
FWIW I don't think that would be a good move. I don't feel like fully arguing it now, but main points (1) sooner AGI development could well be better despite risk, (2) such restrictions are hard to reverse for a long time after the fact, as the story of human gene editing shows, (3) AGI research is hard to define - arguably, some people are doing it already.

That is plausible. But "definitely" definitely wouldn't be called for when comparing Yang with Grow EA. How many EA people who could be sold on an AI PhD do you think could be recruited with $20 million?

2
kbog
5y
I meant that it's definitely more efficient to grow the EA movement than to grow Yang's constituency. That's how it seems to me, at least. It takes millions of people to nominate a candidate.

The other thing is that in 20 years, we might want the president on the phone with very specific proposals. What are the odds they'll spend a weekend discussing AGI with Andrew Yang if Yang used to be president vs. if he didn't?

But as for what a president could actually do: create a treaty for countries to sign that bans research into AGI. Very few researchers are aiming for AGI anyway. Probably the best starting point would be to get the AI community on board with such a thing. It seems impossible today that consensus could be built about such a ... (read more)

4
kbog
5y
You only mean this as a possibility in the future, if there is any point where AGI is believed to be imminent, right? Still, I think you are really overestimating the ability of the president to move the scientific community. For instance, we've had two presidents now who actively tried to counteract mainstream views on climate change, and they haven't budged climate scientists at all. Of course, AI alignment is substantially more scientifically accepted and defensible than climate skepticism. But the point still stands.

If you're super focused on that issue, then it will definitely be better to spend your money on actual AI research, or on some kind of direct effort to push the government to consider the issue (if such an effort exists).

I am, and that's what I'm wondering. The "definitely" isn't so obvious to me. Another $20 million to MIRI vs. an increase in the probability of Yang's presidency by, let's say, 5%--I don't think it's clear cut. (And I think MIRI is the best place to fund research).

2
kbog
5y
What about simply growing the EA movement? That clearly seems like a more efficient way to address x-risk, and something where funding could be used more readily.

Is your claim that AI policy is currently talent-constrained, and having Yang as president would lead to more people working on it, thereby making it money-constrained?

No--just that there's perhaps a unique opportunity for cash to make a difference. Otherwise, it seems like orgs are struggling to spend money to make progress in AI policy. But that's just what I hear.

Can you elaborate on this?

First pass: power is good. Second pass: get practice doing things like autonomous weapons bans, build a consensus around getting countries to agree to intern... (read more)

Additionally, Morning Consult shows higher support than all other pollsters. The average for Steyer in early states is considerably less favorable.

Good to know.

Steyer is running ads with little competition

Really?

5
Michael_S
5y
Yes. People aren't spending much money yet because people will mostly forget about it by the election.

I am in general more trusting, so I appreciate this perspective. I know he's a huge fan of Sam Harris and has historically listened to his podcast, so I imagine he's heard Sam's thoughts (and maybe Stuart Russell's thoughts) on AGI.

The stake of the public good in any given election is much larger than the stake of any given entity, so the correct amount for altruists to invest in an election should be much larger than for a self-interested corporation or person.
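
To make that comparison concrete, here is a back-of-the-envelope formalization (added here as an illustration with hypothetical symbols; the original comment doesn't commit to any particular model):

```latex
% A marginal dollar shifts the preferred outcome's probability by dp/d$.
% A self-interested actor with stake S_firm keeps spending while the expected
% marginal benefit exceeds the one-dollar marginal cost; an altruist who
% internalizes the public stake S_public applies the same rule to a much
% larger stake. With diminishing returns on dp/d$, S_public >> S_firm implies
% a correspondingly larger optimal total spend for the altruist.
S_{\text{firm}} \cdot \frac{dp}{d\$} > 1
\qquad \text{vs.} \qquad
S_{\text{public}} \cdot \frac{dp}{d\$} > 1,
\qquad S_{\text{public}} \gg S_{\text{firm}}.
```

Nothing hinges on the particular functional form; the point is only that the same marginal calculation scales with the size of the stake.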

not that he single-handedly caused Trump's victory.

Didn't claim this.

This is naive.

Not sure what this adds.

5
Michael_Wiebe
5y
Yes, you're right that altruists have a more encompassing utility function, since they focus on social instead of individual welfare. But even if altruists will invest more in elections than self-interested individuals, it doesn't follow that it's a good investment overall. Sorry for being harsh, but my honest first impression was "this makes EAs look bad to outsiders".

MIRI's current size seems to me to be approximately right for this purpose, and as far as I know MIRI staff don't think MIRI is too small to continue making steady progress.

My guess is that this intuition is relatively inelastic to MIRI's size. It might be worth trying to generate the counterfactual intuition here if MIRI were half its size or double its size. If that process outputs a similar intuition, it might be worth attempting to forget how many people MIRI employs in this area, and ask how many people should be working on a topic that by your est... (read more)