9066 karma · Joined Aug 2014


Researching Causality and Safe AI at Oxford

Previously, founder (with help from Trike Apps) of the EA Forum.

Discussing research etc at https://twitter.com/ryancareyai.



In summarising Why They Do It, Will says that most fraudsters usually aren't just "bad apples" or doing "cost-benefit analysis" on their risk of being punished. Rather, they fail to "conceptualise what they're doing as fraud". That may well be true on average, but we know quite a lot about the details of this case, which I believe point us in a different direction.

In this case, the other defendants have said they knew that what they were doing was wrong, and that they were misappropriating customers' assets and investing them. That seems to count somewhat against the misconceptualisation hypothesis.

On the other hand, we have some support for the bad apples idea. SBF has said:

In a lot of ways I don't really have a soul. This is a lot more obvious in some contexts than others. But in the end there's a pretty decent argument that my empathy is fake, my feelings are fake, my facial reactions are fake.

So I agree with Spencer, that SBF was at least deficient in affective experience, whether or not he was psychopathic.

Regarding cost-benefit analysis, I would tend to agree with Will that it's unlikely that SBF and company made a detailed calculation of the costs and benefits of their actions (and clearly they calculated incorrectly if they did). 

So based on the specific knowledge of the case, I think that the bad apples hypothesis makes more sense than the cost-benefit and misconceptualisation hypotheses.

There is also a fourth category worth considering - whether SBF's views on side constraints were a likely factor - and I think overwhelmingly yes. Sure, as Will points out, SBF may have commented approvingly about a recent article on side constraints. But more recently, he referred to ethics as "this dumb game we woke Westerners play where we say all the right shibboleths and so everyone likes." Furthermore, if we're doing Facebook archaeology, we should also consider his earlier writing. In May 2012, SBF wrote about the idea of stealing to give:

I'm not sure I understand what the paradox is here. Fundamentally if you are going to donate the money to [The Humane League] and he's going to buy lots of cigarettes with it it's clearly in an act utilitarian's interest to keep the money as long as this doesn't have consequences down the road, so you won't actually give it to him if he drives you. He might predict this and thus not give you the ride, but then your mistake was letting Paul know that you're an act utilitarian, not in being one. Perhaps this was because you've done this before, but then not giving him money the previous time was possibly not the correct decision according to act utilitarianism, because although you can do better things with the money than he can, you might run in to problems later if you keep in. Similarly, I could go around stealing money from people because I can spend the money in a more utilitarian way than they can, but that wouldn't be the utilitarian thing to do because I was leaving out of my calculation the fact that I may end up in jail if I do so.


As others have said, I completely agree that in practice following rules can be a good idea. Even though stealing might sometimes be justified in the abstract, in practice it basically never is because it breaks a rule that society cares a lot about and so comes with lots of consequences like jail. That being said, I think that you should, in the end, be an act utilitarian, even if you often think like a rule utilitarian; here what you're doing is basically saying that society puts up disincentives for braking rules and those should be included in the act utilitarian calculation, but sometimes they're big enough that a rule utilitarian calculation approximates it pretty well in a much simpler fashion.

I'm sure people will interpret this passage in different ways. But it's clear that, at least at this point in time, he was a pretty extreme act utilitarian.

Taking this and other information on balance, it seems clear in retrospect that a major factor is that SBF didn't take side constraints that seriously.

Of course, most of this information wasn't available or wasn't salient in 2022, so I'm not claiming that we should have necessarily worried based on it - that's a further question.

There is also the theoretical possibility of disbursing a larger number of $ per hour of staff capacity.

I think you can get closer to dissolving this problem by considering why you're assigning credit. Often, we're assigning some kind of finite financial rewards. 

Imagine that a group of n people have all jointly created $1 of value in the world, and that if any one of them did not participate, there would be $0 of value. Clearly, we can't give $1 to all of them, because then we would be paying $n to reward an event that only created $1 of value, which is inefficient. If, however, only the first guy (i=1) is an "agent" that responds to incentives, while the others (2<=i<=n) are "environment" whose behaviour is unresponsive to incentives, then it is fine to give the first guy a reward of $1.

This is how you can ground the idea that agents who cooperate should share their praise (something like a Shapley value approach), whereas rival agents who don't buy into your reward scheme should be left out of the Shapley calculation.
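To make the arithmetic concrete, here is a minimal sketch of that calculation. The function names `shapley_values` and `joint_project` are my own illustration, and I'm assuming that "environment" participants always show up, so only incentive-responsive agents are treated as players.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over every join order."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}


def joint_project(agents, total=1):
    """$total is created only if every listed agent participates.

    Assumption: anyone not listed is "environment" and participates
    regardless, so they are not players in the game.
    """
    agents = frozenset(agents)
    return lambda coalition: Fraction(total) if agents <= coalition else Fraction(0)


# Four cooperating agents split the $1 equally: 1/4 each, summing to $1.
all_responsive = shapley_values([1, 2, 3, 4], joint_project([1, 2, 3, 4]))

# If only the first guy responds to incentives, he gets the whole $1.
one_responsive = shapley_values([1], joint_project([1]))
```

Either way the rewards paid out sum to the $1 that was created, so you never end up paying $n for $1 of value.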

Answer by RyanCarey · Apr 08, 2024 · 17

Hi Polkashell,

There are indeed questionable people in EA, as in all communities. EA may be worse in some ways, because of its utilitarian bent, and because many of the best EAs have left the community in the last couple of years.

I think it's common in EA for people to:

  • have high hopes in EA, and have them be dashed, when their preferred project is defunded, when a scandal breaks, and so on. 
  • burn out, after they give a lot of effort to a project. 

What can make such events more traumatic is if EA has become the source of their livelihood, meaning, friendships, etc., i.e. their whole life. 

I think this risk can be reduced by expecting less from EA, and being less invested in it.

The fact that you're already noticing discomfort with your local group suggests that it might be good to step away from EA, or at least hedge your bets in some way. That does not necessarily mean shying away from a cause area X just because there are some EA assholes in it. There are assholes everywhere, after all. Rather, it means figuring out what kind of work and what kind of life make sense for you, not just from an EA perspective. It also means maintaining connections and support structures outside of EA.

I hope that helps

[Julia tells me "I would say I listed it as a possible project rather than calling for it exactly."]

It actually was not just neutrally listed as a "possible" project, because it was the fourth bullet point under "Projects and programs we’d like to see" here.

It may not be worth becoming a research lead under many worldviews. 

I'm with you on almost all of your essay, regarding the advantages of a PhD, and the need for more research leads in AIS, but I would raise another kind of issue - there are not very many career options for a research lead in AIS at present. After a PhD, you could pursue:

  1. Big RFPs. But most RFPs from large funders have a narrow focus area - currently it tends to be prosaic ML, safety, and mechanistic interpretability. And having to submit to grantmakers' research direction somewhat defeats the purpose of being a research lead.
  2. Joining an org working on an adjacent research direction. But they may not exist, depending on what you're excited to work on.
  3. Academia. But you have to be willing to travel, teach a lot, and live on well below the salary for a research contributor.
  4. Little funders (like LTFF). But they may take 3+ months to apply for, and only last a year at a time, and they won't respond to your emails for an explanation of this.
  5. Get hired as a researcher at OpenPhil? But very few will be hired and given research autonomy here.

For many research leads, these options won't be very attractive, and I find it hard to feel positive about convincing people to become research leads until better opportunities are in place. What would make me excited? I think we should have:

A. Research agenda-agnostic RFPs. There needs to be some way for experienced AI safety researchers to figure out whether AI safety is actually a viable long-term career for them. Currently, there's no way to get OpenPhil's opinion on this - you simply have to wait years until they notice you. But there aren't very many AI safety researchers, and there should be a way for them to run this test so that they can decide which way to direct their lives.

Concrete proposal: OpenPhil should say "we want applications from AIS researchers who we might be excited about as individuals, even if we don't find their research exciting" and should start an RFP along these lines.

B. MLGA (Make LTFF great again). I'm not asking much here, but they should be faster, be calibrated on their timelines, respond to email in case of delays, and offer multi-year grants.

Concrete proposal: LTFF should say "we want to fund people for multiple years at a time, and we will resign if we can't get our grantmaking process to work properly."

C. At least one truly research agenda-agnostic research organisation, that will hire research leads to pursue their own research interests.

Concrete proposal: Folks should found an academic department-style research organisation that hires research leads, gets them office space and visas, and gives them a little support to apply for grants to support their teams. Of course this requires a level of interest from OpenPhil and other grantmakers in supporting this organisation. 

Finally, I conclude on a personal note. As Adam knows, and other readers may deduce, I myself am a research lead underwhelmed with options (1-5). I would like to fix C (or A-B) and am excited to talk about ways of achieving this, but a big part of me just wants to leave AIS for a while, as the options outside AIS are so much stronger, from a selfish perspective. Given that AIS has been this way for years, I suspect many others might leave before these issues are fixed.

Thanks for engaging with my criticism in a positive way.

Regarding how timely the data ought to be, I don't think live data is necessary at all - it would be sufficient in my view to post updated information every year or two.

I don't think "applied in the last 30 days" is quite the right reference class, however, because by definition, the averages will ignore all applications that have been waiting for over one month. I think the most useful kind of statistics would:

  1. Restrict to applications from n to n+m months ago, where n>=3
  2. Make a note of what percentage of these applicants haven't received a response
  3. Give a few different percentiles for decision-timelines, e.g. 20th, 50th, 80th, 95th percentiles. 
  4. Include a clear explanation of which applications are being included, or excluded, for example, are you including applications that were not at all realistic, and so were rejected as soon as they landed on your desk?

With such statistics on the website, applicants would have a much better sense of what they can expect from the process.
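For what it's worth, the statistics in points 1-3 are cheap to compute. Below is a rough sketch, not a description of how any funder actually tracks applications. The function `timeline_stats`, its parameters, and the sample dates are all made up for illustration. The one design choice worth noting is that undecided applications are counted as "longer than anything observed" rather than dropped, so a percentile that lands among them is reported as infinite instead of looking artificially short.

```python
import math
from datetime import date


def timeline_stats(applications, today, min_age_days=90, window_days=180,
                   percentiles=(20, 50, 80, 95)):
    """applications: list of (submitted_date, decided_date or None)."""
    # Point 1: only applications submitted between n and n+m months ago.
    cohort = [
        (s, d) for s, d in applications
        if min_age_days <= (today - s).days < min_age_days + window_days
    ]
    if not cohort:
        return None
    # Undecided applications are kept and treated as an unbounded wait.
    waits = sorted((d - s).days if d is not None else math.inf for s, d in cohort)
    # Point 2: share of the cohort still without a response.
    pending_share = sum(1 for w in waits if w == math.inf) / len(waits)
    # Point 3: nearest-rank percentiles of days to decision.
    days_to_decision = {}
    for p in percentiles:
        rank = max(1, math.ceil(p / 100 * len(waits)))
        days_to_decision[p] = waits[rank - 1]
    return {"n": len(cohort), "pending_share": pending_share,
            "days_to_decision": days_to_decision}


# Hypothetical data: four applications old enough to count, one too recent.
sample = [
    (date(2024, 1, 1), date(2024, 1, 21)),
    (date(2024, 1, 10), date(2024, 3, 10)),
    (date(2024, 2, 1), date(2024, 5, 1)),
    (date(2024, 2, 15), None),
    (date(2024, 5, 20), date(2024, 5, 22)),
]
stats = timeline_stats(sample, today=date(2024, 6, 1))
```

On this made-up sample, the quick decision from late May is excluded because it is under three months old, and the 80th and 95th percentiles come out as unbounded because a quarter of the cohort is still waiting.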

I had a similar experience with 4 months of wait (uncalibrated grant decision timelines on the website) and unresponsiveness to email with LTFF, and I know a couple of people who had similar problems. I also found it pretty "disrespectful".

It's hard to understand a) why they wouldn't list the empirical grant timelines on their website, and b) why those timelines would have to be so long.

There is an "EA Hotel", which is decently-sized, very intensely EA, and very cheap.

Occasionally it makes sense for people to accept very low cost-of-living situations. But a person's impact is usually a lot higher than their salary. Suppose that a person's salary is x, their impact 10x, and their impact is 1.1 times higher when they live in SF, due to proximity to funders and AI companies. Then you would have to cut costs by 90% to make it worthwhile to live elsewhere. Otherwise, you would essentially be stepping over dollars to pick up dimes.
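Here is the back-of-envelope version of that calculation, as a sketch under assumptions that are mine rather than anything precise: impact in SF is 10 times salary, impact elsewhere is that figure divided by 1.1, living costs are roughly equal to salary, and a dollar of impact is valued the same as a dollar saved. The function name `required_cost_cut` is hypothetical.

```python
def required_cost_cut(impact_multiple=10.0, sf_boost=1.1):
    """Impact lost by leaving SF, in units of salary.

    Assumes living costs are about one salary, so this is also the
    fraction of costs you would need to save just to break even.
    """
    impact_sf = impact_multiple              # impact in SF, in salaries
    impact_elsewhere = impact_sf / sf_boost  # impact after moving away
    return impact_sf - impact_elsewhere


cut = required_cost_cut()  # roughly 0.91, i.e. about a 90% cost cut
```

If you instead read the 10x as the impact outside SF (so 11x inside), the lost impact is a full salary, and no cost cut short of 100% would be enough.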

Of course there are some theoretical reasons for growing fast. But theory only gets you so far on this issue. Rather, the question depends on whether growing EA is currently promising (I lean against) compared to other projects one could grow. Even if EA looks like the right thing to build, you need to talk to people who have seen EA grow and contract at various rates over the last 15 years, to understand which modes of growth have been healthier and have contributed to real gains in capacity, rather than just an increase in raw numbers. In my experience, one of the least healthy phases of EA was when there was the heaviest emphasis on growth, perhaps around 1.5-4 years ago, whereas it seemed to do better at pretty much all other times.
