AlyssaVance

140 · Joined Dec 2016

Posts
1

Comments
9

Seven ways to become unstoppably agentic

Some rejections are inevitable, and never getting rejected is a sign of unhealthy risk aversion. But I think if you get rejected much more than equivalent people in your situation (eg. applying to twenty colleges and getting no acceptances), changing your strategy is more important than just trying harder.

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

I might be reading too much into this, but the word "legitimate" in "legitimate global regulatory efforts" feels weird here. Like... the idea that "if you, a private AI lab, try to unilaterally stop everyone else from building AI, they will notice and get mad at you" is really important. But the word "legitimate" brings to mind a sort of global institutional-management-nomenklatura class using the word as a status club to go after anything it doesn't like. If eg. you developed a COVID test during 2020, one might say "this test doesn't work" or "this test has bad side effects" or "the FDA hasn't approved it, they won't let you sell this test" or "my boss won't let me build this test at our company"; but saying "this test isn't legitimate" feels like a conceptual smudge that tries to blend all those claims together, as if each implied all of the others. 

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

FWIW, the way I now think about these scenarios is that there's a tradeoff between technical ability and political ability:

 - If you have infinite technical ability (one person can create an aligned Jupiter Brain in their basement), then you don't need any political ability and can do whatever you want.

 - If you have infinite political ability (Xi Jinping cures aging, leads the CCP to take over the world, and becomes God-Emperor of Man), you don't need any technical ability and can just do whatever you want.

I don't think either of those is plausible. A realistic strategy will need both, in varying proportions, and having less of one will demand more of the other. Some closely related ideas are:

 - The weaker and less general an AI is, the safer it is to align and test. Potentially dangerous AIs should be as weak as possible while still doing the job, in the same way that Android apps should have as few permissions as are reasonably practical. A technique that reduces an AI's abilities in some important way, while still fulfilling the main goal, is a net win. (Eg. scrubbing computer code from the train set of something like GPT-3.)

 - Likewise, everything you might try for alignment will almost certainly fail if you turn up the AI power level *enough*, just as any system can be hacked into if you try infinitely hard. No alignment advance will "solve the problem", but it may make a somewhat-more-powerful AI safer to run. (Eg. I doubt better interpretability would do much to help with a Jupiter Brain, but would help you understand smaller AIs.)

 - An unexpected shock (eg. COVID-19, or the release of GPT-3) won't make existing political actors smarter, but may make them change their priorities. (Eg. if, when COVID-19 happened, you had already met everyone at the FDA, had vaccine factories and supply chains built and emergency trial designs ready in advance, it would have been a lot easier to get rapid approval. Likewise, many random SWEs that I tell about PaLM or DALL-E now instinctively see it as dangerous and start thinking about safety; they don't have a plan but now see one as important.)

(I plan to write more about this in the future; this is just a quick conceptual sketch.)

Contra the Giving What We Can pledge

"In general, without a counterfactual in the background all criticism is meaningless"

This seems like a kind of crazy assertion to me. Eg., in 1945, as part of the war against Japan, the US firebombed dozens of Japanese cities, killing hundreds of thousands of civilians. (The bombs were intentionally designed to set cities on fire.) Not being a general or historian, I don't have an exact plan in mind for an alternative way for the past US to have spent its military resources. Maybe, if you researched all the options in enough detail, there really was no better alternative. But it seems entirely reasonable to say that the firebombing was bad, and to argue that (if you were around back then) people should maybe think about not doing that. (The firebombing is obviously not comparable to the pledge; I'm just arguing the general principle here.)

"This is only half-true. I pledged 20%."

The statement was that the pledge recommended 10%, which is true. Of course other people can choose to do other things, but that seems irrelevant.

"Citation needed?"

The exact numbers aren't important here, but the US federal budget is $3.8 trillion, and the US also has a great deal of influence over both private money and foreign money (through regulations, treaties, precedent, diplomatic pressure, etc.). There are three branches of government, of which Congress is one; Congress has two houses, and there are then 435 representatives in the lower house. Much of the money flow was committed a long time ago (eg. Social Security), and would be very hard to change; on the other hand, a law you pass may keep operating and directing money decades into the future. Averaged over everything, I think you get ~$1 billion a year of total influence, order-of-magnitude; 0.1% of that is $1 million, or 57x the $17,400 personal donation. This is fairly conservative, as it basically assumes that all you're doing is appropriating federal dollars to GiveDirectly or something closely equivalent; there are probably lots of cleverer options.
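For anyone who wants to check the arithmetic, here is a rough sketch of the estimate in Python. The even split of the budget across branches, houses, and representatives is my assumption for illustration; the comment above only says "averaged over everything" and rounds to ~$1 billion. The variable names are made up for this example.

```python
# Figures quoted in the comment above.
FEDERAL_BUDGET = 3.8e12       # US federal budget, dollars per year
BRANCHES = 3                  # branches of government
HOUSES = 2                    # houses of Congress
REPRESENTATIVES = 435         # members of the lower house
PERSONAL_DONATION = 17_400    # personal donation used for comparison, dollars

# ASSUMPTION (not stated in the comment): divide the budget evenly across
# branches, houses, and representatives to get a naive per-member figure.
naive_influence = FEDERAL_BUDGET / (BRANCHES * HOUSES * REPRESENTATIVES)

# The comment rounds to ~$1 billion a year, order-of-magnitude.
ROUNDED_INFLUENCE = 1e9
SHARE_REDIRECTED = 0.001      # 0.1% of that influence

redirected = ROUNDED_INFLUENCE * SHARE_REDIRECTED
ratio = redirected / PERSONAL_DONATION

print(f"naive per-member influence: ${naive_influence:,.0f} per year")
print(f"0.1% of the rounded figure: ${redirected:,.0f} per year")
print(f"multiple of the personal donation: {ratio:.0f}x")
```

The naive even split comes out to roughly $1.46 billion per representative per year, which is the same order of magnitude as the rounded $1 billion, and 0.1% of the rounded figure is about 57 times the $17,400 donation.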

"But if your time really is so incredibly valuable, then you shouldn't spend time doing this yourself, you outsource it to people you trust"

The orders of magnitude here aren't even comparable. This might reduce the net cost to your effectiveness from 5% to 2%, or something like that; it's not going to reduce it to 0.0001%, or whatever the number would have to be for the math to work out.

"However, in principle you should still give your money, just not your time."

In practice, there is always some trade-off between money and time (eg. here discusses this, along with lots of other sites). The rate varies depending on who you are, what you're doing, the type of time you're trading off against, etc. But again, it's not going to vary by the orders of magnitude you seem to implicitly assume.

"Indeed, as you point out towards the end of your piece, this is basically Givewell's initial model."

The initial GiveWell audience was mostly trading off against personal leisure time; that obviously isn't the case here.

"the marginal effort is roughly 0"

It seems extremely implausible that someone making a middle-class salary, or someone making an upper-middle-class salary but under very high time pressure and with high expenses, could give away 10% of their income for life and literally never think about it again.

"If we're not including students, what's your source for thinking effective altruists are 'low income'? Low relative to what?"

Relative to their overall expected career paths. In upper-middle-class and upper-class career tracks (finance, law, business management, entrepreneurship, etc.), income is very back-weighted, with the large majority of expected income coming during the later years of the career.

"You used two examples from (a) GWWC's page itself and (b) CEA updates, GWWC's parent organisation, and used this to conclude the other metrics don't exist?"

I can't prove a negative. If they do exist, where are they? If you link to some, I'll happily add them to the post, as I did for 80K's metrics.

"What metrics do you think 80k/REG/EAF/FHI/CSER/MIRI/CFAR/Givewell/any-other-EA-org are using?"

The GWWC pledge count is used as a metric for EA as a whole, rather than for any specific org like MIRI, CFAR, etc. (Also, AFAIK, many of the orgs mentioned don't really even have internal metrics, except things like "total annual budget" that aren't really measures of efficacy.)

"And we know that CEA is aware of the possible issues with being too focused on this to the exclusion of all else because they said exactly that here* on the forum."

That's cool, but as far as I know, these metrics don't yet exist. If they do exist, great, I'll link them here.

"I don't see how that difference is going to be on a par with your Nevada/Alaska comparison"

The important difference isn't the donation amounts (at least for that example). The important differences are a) this is a public commitment, while most GiveWell-influenced donations are private; b) the commitment is made all at once, rather than year-by-year; c) the commitment is the same income fraction for every year, rather than being adjustable on-the-fly; d) the standard deviation of income for pledgers is almost certainly much higher than for GiveWell's initial audience; e) the standard deviation of human capital is higher; f) the standard deviation of amount-of-free-time is higher; g) pledgers now have very different, and much higher-variance, ideas about "the most good" than a typical GiveWell donor in 2009 (though this is somewhat of an "accident of sociology" rather than intrinsic to the pledge itself).

"I didn't really follow the argument being made here; how does the second point follow from the first?"

There's a selection effect where pledge-takers are much less likely to be the type of people who'd be turned off by donating to a "weird" charity, taking a "weird" career, etc., since people like that would probably not pledge in the first place.

Contra the Giving What We Can pledge

I don't think it's at all clear that the people Michael highlighted are a small minority as a percentage of deeply committed EAs. A small minority among all Westerners, sure, but that's not the relevant reference class.

(Thanks for the link BTW, have added to the post)

Contra the Giving What We Can pledge

The personal utility cost of giving 10% is approximately constant, but the benefit/cost ratio definitely isn't, since benefit (the numerator) increases linearly with income. I've edited this sentence to make it clearer.

Why I'm donating to MIRI this year

Thanks Owen! Agreed with your writeup. I would donate to MIRI myself this year, but I unfortunately don't really have spare cash right now :P