I'm the CTO of Redwood Research, a nonprofit focused on applied alignment research. Read more about us here: http://tiny.cc/redwood

I'm also a fund manager on the EA Infrastructure Fund.

Linch's Shortform

What kinds of things do you think it would be helpful to do cost effectiveness analyses of? Are you looking for cost effectiveness analyses of problem areas or specific interventions?

Buck's Shortform

When I was 19, I moved to San Francisco to do a coding bootcamp. I got a bunch better at Ruby programming and also learned a bunch of web technologies (SQL, Rails, JavaScript, etc).

It was a great experience for me, for a bunch of reasons.

  • I got a bunch better at programming and web development.
    • It was a great learning environment for me. We spent basically all day pair programming, which makes it really easy to stay motivated and engaged. And we had homework and readings in the evenings and weekends. I was living in the office at the time, with a bunch of the other students, and it was super easy for me to spend most of my waking hours programming and learning about web development. I think that it was very healthy for me to practice working really long hours in a supportive environment.
    • The basic way the course worked is that every day you’d be given a project with step-by-step instructions, and you’d try to implement the instructions with your partner. I think it was really healthy for me to repeatedly practice the skill of reading the description of a project, then reading the step-by-step breakdown, and then figuring out how to code everything.
    • Because we pair programmed every day, tips and tricks quickly percolated through the cohort. We were programming in Ruby, which has lots of neat little language features that it’s hard to pick up all of on your own; these were transmitted very naturally. I also was pushed to learn my text editor better.
    • The specific content that I learned was sometimes kind of fiddly; it was helpful to have more experienced people around to give advice when things went wrong.
    • I think that this was probably a better learning experience than most tech or research internships I could have gotten. If I’d had access to the best tech/research internships, maybe that would have been better. I think that this was probably a much better learning experience than eg most Google internships seem to be.
  • I met rationalists and EAs in the Bay.
  • I spent a bunch of time with real adults who had had real jobs before. The median age of students was like 25. Most of the people had had jobs before and felt dissatisfied with them and wanted to make a career transition. I think that spending this time with them helped me grow up faster.
  • I somehow convinced my university that this coding bootcamp was a semester abroad (thanks to my friend Andrew Donnellan for suggesting this to me; that suggestion plausibly accelerated my career by six months), which meant that I graduated on schedule even though I then spent six months working for App Academy as a TA (which, incidentally, was also a good experience).

Some ways in which my experience was unusual:

  • I was a much stronger programmer on the way in to the program than most of my peers.
  • I am deeply extroverted and am fine with pair programming every day.

It seems plausible to me that more undergrad EAs should do something like this, especially if they can get college credit for it (which I imagine might be hard for most students—I think I only got away with it because my university didn’t really know what was going on). The basic argument here is that it might be good for them the same way it was good for me.

More specifically, I think that there are a bunch of EAs who want to do technical AI alignment work and who are reasonably strong but not stellar coders. I think that if they did a coding bootcamp between, say, freshman and sophomore year, they might come back to school and be a bunch stronger. The bootcamp I did was focused on web app programming with Ruby and Rails and JavaScript. I think that these skills are pretty generically useful to software engineers. I often am glad to be better than my coworkers at quickly building web apps, and I learned those skills at App Academy (though being a professional web developer for a while also helped). Eg in our current research, even aside from the web app we use for getting our contractors to label data, we have to deal with a bunch of different computers that are sending data back and forth and storing it in databases or Redis queues or whatever. A reasonable fraction of undergrad EAs would seem like much more attractive candidates to me if they’d done a bootcamp. (They’d probably seem very marginally less attractive to normal employers than if they’d done something more prestigious-seeming with that summer, but most people don’t do very prestigious-seeming things in their first summer anyway. And the skills they had learned would probably be fairly attractive to some employers.)

This is just a speculative idea, rather than a promise, but I’d be interested in considering funding people to do bootcamps over the summer (they often cost maybe $15k). I am most interested in funding people to do bootcamps if they are already successful students at prestigious schools, or have other indicators of talent and conscientiousness, and have evidence that they’re EA aligned.

Another thing I like about this is that a coding bootcamp seems like a fairly healthy excuse to hang out in the Bay Area for a summer. I like that they involve working hard and being really focused on a concrete skill that relates to the outside world.

I am not sure whether I’d recommend someone do a web programming bootcamp or a data science bootcamp—though data science might seem more relevant, I think the practical programming stuff in the web programming bootcamp might actually be more helpful on the margin. (Especially for people who are already doing ML courses in school.)

I don’t think there are really any bootcamps focused on ML research and engineering. I think it’s plausible that we could make one happen. Eg I know someone competent and experienced who might run a bootcamp like this over a summer if we paid them a reasonable salary.

Buck's Shortform

Doing lots of good vs getting really rich

Here in the EA community, we’re trying to do lots of good. Recently I’ve been thinking about the similarities and differences between a community focused on doing lots of good and a community focused on getting really rich.

I think this is interesting for a few reasons:

  • I found it clarifying to articulate the main differences between how we should behave and how the wealth-seeking community should behave.
  • I think that EAs make mistakes that you can notice by thinking about how the wealth-seeking community would behave, and then thinking about whether there’s a good reason for us behaving differently.

Here are some things that I think the wealth-seeking community would do.

  • There are some types of people who should try to get rich by following some obvious career path that’s a good fit for them. For example, if you’re a not-particularly-entrepreneurial person who won math competitions in high school, it seems pretty plausible that you should work as a quant trader. If you think you’d succeed at being a really high-powered lawyer, maybe you should do that.
  • But a lot of people should probably try to become entrepreneurs. In college, they should start small businesses, develop appropriate skills (eg building web apps), start trying to make various plans about how they might develop some expertise that they could turn into a startup, and otherwise practice skills that would help them with this. These people should be thinking about what risks to take, what jobs to maybe take to develop skills that they’ll need later, and so on.

I often think about EA careers somewhat similarly:

  • Some people are natural good fits for particular cookie-cutter roles that give them an opportunity to have a lot of impact. For example, if you are an excellent programmer and ML researcher, I (and many other people) would love to hire you to work on applied alignment research; basically all you have to do to get these roles is to obtain those skills and then apply for a job.
  • But for most people, the way they will have impact is much more bespoke and relies much more on them trying to be strategic and spot good opportunities to do good things that other people wouldn’t have otherwise done.

I feel like many EAs don’t take this distinction as seriously as they should. I fear that EAs see that there exist roles of the first type—you basically just have to learn some stuff, show up, and do what you’re told, and you have a bunch of impact—and then they don’t realize that the strategy they should be following is going to involve being much more strategic and making many more hard decisions about what risks to take. Like, I want to say something like “Imagine you suddenly decided that your goal was to make ten million dollars in the next ten years. You’d be like, damn, that seems hard, I’m going to have to do something really smart in order to do that, I’d better start scheming. I want you to have more of that attitude to EA.”

Important differences:

  • Members of the EA community are much more aligned with each other than wealth-seeking people are. (Maybe we’re supposed to be imagining a community of people who wanted to maximize total wealth of the community for some reason.)
  • Opportunities for high impact are biased to be earlier in your career than opportunities for high income. (For example, running great student groups at top universities is pretty high up there in impact-per-year according to me; there isn’t really a similarly good moneymaking opportunity for which students are unusually well suited.)
  • The space of opportunities to do very large amounts of good seems much narrower than the space of opportunities to make money. So you end up with EAs wanting to work with each other much more than the wealth-maximizing people want to work with each other.
  • It seems harder to make lots of money in a weird, bespoke, non-entrepreneurial role than it is to have lots of impact. There are many EAs who have particular roles which are great fits for them and which allow them to produce a whole bunch of value. I know of relatively fewer cases where someone gets a job which seems weirdly tailored to them and is really high paid.
    • I think this is mostly because my sense is that in the for-profit world, it’s hard to get people to be productive in weird jobs, and you’re mostly only able to hire people for roles that everyone involved understands very well already. And so even if someone would be able to produce a huge amount of value in some particular role, it’s hard for them to get paid commensurately, because the employer will be skeptical that they’ll actually produce all that value, and other potential employers will also be skeptical and so won’t bid their price up.

Buck's Shortform

Yeah but this pledge is kind of weird for an altruist to actually follow, instead of donating more above the 10%. (Unless you think that almost everyone believes that most of the reason for them to do the GWWC pledge is to enforce the norm, and this causes them to donate 10%, which is more than they'd otherwise donate.)

Buck's Shortform

[This is an excerpt from a longer post I'm writing]

Suppose someone’s utility function is

U = f(C) + D

Where U is what they’re optimizing, C is their personal consumption, f is their selfish welfare as a function of consumption (log is a classic choice for f), and D is their amount of donations.

Suppose that they have diminishing marginal utility wrt (“with respect to”) consumption (that is, df(C)/dC is strictly monotonically decreasing). Their marginal utility wrt donations is a constant, and their marginal utility wrt consumption is a decreasing function. There has to be some level of consumption where they are indifferent between donating a marginal dollar and consuming it. Below this level of consumption, they’ll prefer consuming dollars to donating them, and so they will always consume them. And above it, they’ll prefer donating dollars to consuming them, and so will always donate them. And this is why the GWWC pledge asks you to input the C such that df(C)/dC is 1, and you pledge to donate everything above it and nothing below it.
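
To make the threshold concrete, here is a minimal sketch, assuming f(C) = k * log(C). The text only says log is a classic choice for f; the scale parameter k, the dollar figures, and the function names are all illustrative assumptions. With this f, df(C)/dC = k / C, which equals 1 at C = k, so the utility-maximizing plan is to consume the first k dollars and donate everything above that. The brute-force search is only there to check the closed form.

```python
import math


def utility(consumption: float, donation: float, k: float) -> float:
    """U = f(C) + D with the assumed f(C) = k * log(C)."""
    return k * math.log(consumption) + donation


def optimal_split(income: float, k: float) -> tuple[float, float]:
    """Consume up to the threshold C* = k (where k / C = 1), donate the rest."""
    consumption = min(income, k)
    return consumption, income - consumption


def brute_force_split(income: float, k: float, step: float = 100.0) -> tuple[float, float]:
    """Check the closed form by trying every consumption level on a grid."""
    candidates = [step * i for i in range(1, int(income / step) + 1)]
    best = max(candidates, key=lambda c: utility(c, income - c, k))
    return best, income - best


if __name__ == "__main__":
    k = 40_000.0  # hypothetical threshold, in dollars
    for income in (30_000.0, 60_000.0, 100_000.0):
        print(income, optimal_split(income, k), brute_force_split(income, k))
```

Under these assumptions, someone earning less than k donates nothing, and every dollar earned above k is donated, which is the all-or-nothing pattern the paragraph describes.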

This is clearly not what happens. Why? I can think of a few reasons.

  • The above is what you get if the selfish and altruistic parts of you “negotiate” once, before you find out how high your salary is going to be. If instead, you negotiate every year to spend some fair share of your resources on altruistic and selfish ends, you get something like what we see.
  • People aren’t scope sensitive about donations, and so donations also have diminishing marginal returns (because small ones are disproportionately good at making people think you’re good).
  • When you’re already donating a lot, other EAs will be less likely to hold consumption against you (perhaps because they want to incentivize rich and altruistic people to hang out in EA without feeling judged for only donating 90% of their $10M annual expenditure or whatever).
  • When you’re high income, expensive time-money tradeoffs like business class flights start looking better. And it’s often pretty hard to tell which purchases are time-money tradeoffs vs selfish consumption, and if your time is valuable enough, it’s not worth very much time to try to distinguish between these two categories.
  • Early-career people want to donate in order to set themselves up for a habit of donating later (and in order to signal altruism to their peers, which might be rational on both a community and individual level).
  • As you get more successful, your peers will be wealthier, and this will push you towards higher consumption. (You can think of this as just an expense that happens as a result of being more successful.)

I think that it seems potentially pretty suboptimal to have different levels of consumption at different times in your life. Like, suppose you’re going to have a $60k salary one year and a $100k salary the next. It would be better from both an altruistic and selfish perspective to concentrate your donations in the year you’ll be wealthier; it seems kind of unfortunate if people are unable to make these internal trades.
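
A quick numerical check of the $60k/$100k example, as a sketch under assumptions that are not in the original: selfish welfare is the sum of log-consumption across the two years, there is no saving or borrowing between years, and the baseline is donating 10% of salary each year. The variable and function names are illustrative.

```python
import math


def selfish_welfare(consumption_by_year: list[float]) -> float:
    """Sum of log-consumption across years (assumed form of selfish welfare)."""
    return sum(math.log(c) for c in consumption_by_year)


salaries = [60_000, 100_000]

# Baseline: donate 10% of salary every year.
even_donations = [s // 10 for s in salaries]

# Alternative: give the same total, but all of it in the higher-salary year.
total_donated = sum(even_donations)
concentrated_donations = [0, total_donated]

even_consumption = [s - d for s, d in zip(salaries, even_donations)]
concentrated_consumption = [s - d for s, d in zip(salaries, concentrated_donations)]

if __name__ == "__main__":
    print("total donated:", sum(even_donations), sum(concentrated_donations))
    print("selfish welfare, even:", selfish_welfare(even_consumption))
    print("selfish welfare, concentrated:", selfish_welfare(concentrated_consumption))
```

With these numbers the total donated is the same in both plans, while selfish welfare is higher when donations are concentrated in the wealthier year. Part of that selfish gain could then be given away, leaving both the selfish and the altruistic side better off than in the baseline.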

EDIT: Maybe a clearer way of saying my main point here: Suppose you're a person who likes being altruistic and likes consuming things. Suppose you don't know how much money you're going to make next year. You'll be better off in expectation from both a selfish and altruistic perspective if you decide in advance how much you're going to consume, and donate however much you have above that. Doing anything other than this is Pareto worse.

Buck's Shortform

[epistemic status: I'm like 80% sure I'm right here. Will probably post as a main post if no-one points out big holes in this argument, and people seem to think I phrased my points comprehensibly. Feel free to leave comments on the google doc here if that's easier.]

I think a lot of EAs are pretty confused about Shapley values and what they can do for you. In particular, Shapley values are basically irrelevant to problems related to coordination between a bunch of people who all have the same values. I want to talk about why.

So Shapley values are a solution to the following problem. You have a bunch of people who can work on a project together, and the project is going to end up making some total amount of profit, and you have to decide how to split the profit between the people who worked on the project. This is just a negotiation problem. 

One of the classic examples here is: you have a factory owner and a bunch of people who work in the factory. No money is made by this factory unless there's both a factory there and people who can work in the factory, and some total amount of profit is made by selling all the things that came out of the factory. But how should the profit be split between the owner and the factory workers? The Shapley value is the most natural and mathematically nice way of deciding on how much of the profit everyone gets to keep, based only on knowing how much profit would be produced given different subsets of the people who might work together, and ignoring all other facts about the situation.

Let's talk about why I don't think it's usually relevant. The coordination problem EAs are usually interested in is: Suppose we have a bunch of people, and we get to choose which of them take which roles or provide what funds to what organizations. How should these people make the decision of what to do?

As I said, the input to the Shapley value is the coalition value function, which, for every subset of the people you have, tells you how much total value would be produced in the case where just that subset tried to work together.
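
For readers who have not seen the calculation, here is a minimal sketch of how Shapley values are computed from a coalition value function, using the factory example. The payoff numbers (100 per worker, nothing without the owner) and all names are made up for illustration. The function averages each player's marginal contribution over every order in which the players could join.

```python
from itertools import permutations


def shapley_values(players, coalition_value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += coalition_value(with_player) - coalition_value(coalition)
            coalition = with_player
    return {p: total / len(orders) for p, total in totals.items()}


def factory_value(coalition):
    """Hypothetical numbers: each worker yields 100 of profit, but only with the owner."""
    if "owner" not in coalition:
        return 0.0
    return 100.0 * sum(1 for member in coalition if member != "owner")


if __name__ == "__main__":
    print(shapley_values(["owner", "worker_a", "worker_b"], factory_value))
```

With these made-up numbers the owner is assigned 100 and each worker 50, which sums to the 200 produced when everyone works together. Note that `coalition_value` has to return a number for every subset of players, which is exactly the input the argument here says is the hard part. The loop also visits every ordering, so this direct approach only works for small groups.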

But if you already have this coalition value function, you've already solved the coordination problem and there’s no reason to actually calculate the Shapley value! If you know how much total value would be produced if everyone worked together, in realistic situations you must also know an optimal allocation of everyone’s effort. And so everyone can just do what that optimal allocation recommended.

Another way of phrasing this is that step 1 of calculating the Shapley value is to answer the question “what should everyone do” as well as a bunch of other questions of the form “what should everyone do, conditioned on only this subset of EAs existing”. But once you’ve done step 1, there’s no reason to go on to step 2.

A related claim is that the Shapley value is no better than any other solution to the bargaining problem. For example, instead of allocating credit according to the Shapley value, we could allocate credit according to the rule “we give everyone just barely enough credit that it’s worth it for them to participate in the globally optimal plan instead of doing something worse, and then all the leftover credit gets allocated to Buck”, and this would always produce the same real-life decisions as the Shapley value.


So I've been talking here about what you could call global Shapley values, where we consider every action of everyone in the whole world. And our measure of profit or value produced is how good the whole world actually ends up being. And you might have thought that you could apply Shapley values in a more local sense. You could imagine saying “let's just think about the value that will be produced by this particular project and try to figure out how to divide the impact among the people who are working on this project”. But any Shapley values that are calculated in that way are either going to make you do the wrong thing sometimes, or rely on solving the same global optimization problem as we were solving before. 

Let's talk first about how the purely local Shapley values sometimes lead to you making the wrong decision. Suppose that some project requires two people and will produce $10,000 worth of value if they cooperate on it. By symmetry, the Shapley value for each of them will be $5,000.

Now let’s suppose that one of them has an opportunity cost: they could have made $6,000 doing something else. Clearly, the two people should still do the $10,000 project instead of the $6,000 project. But if they just made decisions based on the “local Shapley value”, the person with the $6,000 alternative would compare it to their $5,000 share and walk away, so they’d end up not doing the project. And that would end up making things overall worse. The moral of the story here is that the coalition profit function is measured in terms of opportunity cost, which you can’t calculate without reasoning globally. So in the case where one of the people involved had this $6,000 other thing they could have done with their time, the amount of total profit generated from the project is now actually only $4,000. Probably the best way of thinking about this is that you have to pay a $6,000 base salary to the person who could have made $6,000 doing something else, and then you split the $4,000 profit equally. And so one person ends up getting $8,000 and the other one ends up getting $2,000.
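
The arithmetic in this two-person example can be checked with a short sketch. The names `alice` and `bob` are hypothetical labels for the two people, and the helper is a generic permutation-based Shapley calculation rather than anything specific to this post. The "naive" game ignores outside options; the "surplus" game measures value net of opportunity cost, and each person is then paid their outside option plus their share of the surplus.

```python
from itertools import permutations


def shapley_values(players, coalition_value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += coalition_value(with_player) - coalition_value(coalition)
            coalition = with_player
    return {p: total / len(orders) for p, total in totals.items()}


PROJECT_VALUE = 10_000.0
OUTSIDE_OPTION = {"alice": 6_000.0, "bob": 0.0}  # hypothetical names


def naive_local_value(coalition):
    """Project value only, ignoring what each person could earn elsewhere."""
    return PROJECT_VALUE if len(coalition) == 2 else 0.0


def surplus_value(coalition):
    """Project value net of the members' outside options (opportunity cost)."""
    if len(coalition) < 2:
        return 0.0
    return PROJECT_VALUE - sum(OUTSIDE_OPTION[p] for p in coalition)


players = ["alice", "bob"]
naive = shapley_values(players, naive_local_value)
surplus = shapley_values(players, surplus_value)
payout = {p: OUTSIDE_OPTION[p] + surplus[p] for p in players}

# Under the naive split, alice compares her share to her outside option.
alice_joins_under_naive_split = naive["alice"] >= OUTSIDE_OPTION["alice"]

if __name__ == "__main__":
    print("naive split:", naive)
    print("alice joins under naive split:", alice_joins_under_naive_split)
    print("payout with opportunity cost:", payout)
```

The naive split gives each person 5,000, which is less than the 6,000 outside option, so the project does not happen. Accounting for opportunity cost gives payouts of 8,000 and 2,000, matching the figures in the text. The outside option itself is an input here, and knowing it is the part that requires reasoning beyond the local project.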


I think a lot of EAs are hoping that you can use Shapley values to get around a lot of these problems related to coordination and figuring out counterfactual impact and all this stuff. And I think you just basically can't at all. 

I think Shapley values are more likely to be relevant to cases where people have different values, because in this case you have more like a normal negotiation problem, but even here, I think people overstate their relevance. Shapley values are just a descriptive claim about what might happen in the world rather than a normative claim about what should happen. In particular, they assume that everyone has equal bargaining power to start with, which doesn't seem particularly true.

I think the main way that Shapley values are relevant to coordination between people with different values is that they're kind of like a Schelling fair way of allocating stuff. Maybe you want to feel cooperative with other people and maybe you don't want to spend a lot of time going back and forth about how much everyone has to pay, and Shapley values are maybe a nice, fair solution to this. I haven’t thought this through properly yet.

In conclusion, Shapley values are AFAICT not relevant to figuring out how to coordinate between people who have the same goals.

EA Infrastructure Fund: Ask us anything!

I am not sure. I think it’s pretty likely I would want to fund after risk adjustment. I think that if you are considering trying to get funded this way, you should consider reaching out to me first.

EA Infrastructure Fund: Ask us anything!

I would personally be pretty down for funding reimbursements for past expenses.

EA Infrastructure Fund: Ask us anything!

This is indeed my belief about ex ante impact. Thanks for the clarification.
