

As someone who recently set up an AI safety lab, success rates have certainly been on my mind. It's a challenging endeavour, but I think the reference class we're in might be better than it first appears.

I think a big part of what makes succeeding as a for-profit tech start-up so hard is that many other talented people are chasing the same good ideas. For every Amazon there are thousands of failed e-commerce start-ups. Clearly, Amazon did something much better than the competition. But what if Amazon didn't exist? What if there were a company that was a little more expensive and had longer shipping times? I'd wager that company would still be highly successful.

Far fewer people are working on AI safety. That's a bad thing, but it does at least mean there's more low-hanging fruit to be picked. I agree with [Adam Binks](https://forum.effectivealtruism.org/posts/PJLx7CwB4mtaDgmFc/critiques-of-non-existent-ai-safety-labs-yours?commentId=eLarcd8no5iKqFaNQ) that academic labs might be a better reference class. But even there, AI safety has received far less attention than e.g. developing treatments for cancer or unifying quantum mechanics and general relativity.

So overall it's far from clear to me that it's harder to make progress on AI safety than to solve outstanding challenge problems in academia, or to build a $1bn+ company.

Thanks Lucretia for sharing your experience. This cannot have been an easy topic to write about, and I'm deeply sorry these events happened to you. I really appreciated the clarity of the post and found the additional context beyond the TIME article to be extremely valuable.

I liked your suggestions for actions people can take on an individual level. Building on the idea of back-channel references, I wonder if there's value in having a centralised place to collect and aggregate potential red flags? Personal networks only go so far, and it's often useful to distinguish between isolated incidents and repeated patterns of behaviour. The CEA CH team partially serves this role within EA, but there's no equivalent in the broader Silicon Valley or AI communities.

For people not familiar with the UK, the London metropolitan area houses 20% of the UK's population and a disproportionate share of its economic and research activity. The London-Cambridge-Oxford triangle in particular is by far the research powerhouse of the country, although there are of course some good universities elsewhere (e.g. Durham, St Andrews in the north). Unfortunately, anywhere within an hour's travel of London is going to be expensive. Although I'm sure you can find somewhat cheaper options than Oxford, I expect the cost savings would be modest (noting Oxford is already cheaper than central London), and you'd likely lose something else (e.g. the location is harder to get to, or is a grungy commuter town).

I would like to hear whether CEA considered non-Oxford locations (there's an obvious natural bias given CEA is headquartered in Oxford), but it wouldn't surprise me if the benefit of CEA staff (who will often be running the events) having easy access to the venue genuinely outweighed the likely modest cost savings from locating elsewhere.

A 30-person office could not house the people attending, so you'd need to add the cost of a hotel/Airbnb/renting nearby houses if going down that route. Even taking into account that commercial real estate is usually more expensive than residential, I'd expect attendee accommodation to cost more than office rental, simply because people need more living space than they do conference space.

Additionally, in my experience retreats tend to go much better if everyone is on-site in one location: it encourages more spontaneous interaction outside of the scheduled time. There are also benefits to being outside a city centre (otherwise it's too easy for people to get distracted and wander off).

Was Wytham a wise investment? I'm not sure; I'd love to see a calculation, and it probably comes down to things like the eventual utilisation rate. But I think a fairer reference class would be "renting a conference centre plus hotel" rather than "renting a 30-person office".

Note that a search at https://apps.irs.gov/app/eos/determinationLettersSearch returns no results for FTX Foundation or FTX Philanthropy, so it's possible it's not a 501(c)(3) (although it could still be a non-profit corporation).

Disclaimer: I do not work for FTX, and am basing this answer off publicly available information, which I have not vetted in detail.

In the Future Fund launch post, Nick Beckstead described several entities (FTX Foundation Inc, DAFs) that funds will be disbursed out of: https://forum.effectivealtruism.org/posts/2mx6xrDrwiEKzfgks/announcing-the-future-fund-1?commentId=qtJ7KviYxWiZPubtY I would expect these entities to be sufficiently capitalized to provide continuity of operations, although presumably recent events will have a major impact on their long-run scale.

IANAL, but I'd expect the funds in the foundation/DAF to be fairly secure against bankruptcy or court proceedings: bankruptcy courts can't just claw back money arbitrarily from other creditors, and limited liability corporations provide significant protection for directors. However, I'd expect assets donated to FTX Foundation or associated DAFs to be held largely in-kind (again, this is speculation, but it's standard practice for large philanthropic foundations) rather than liquidated for cash. Those assets' mark-to-market value is likely a lot less than it was a week ago.

Hi Aaron, thanks for highlighting this. We inadvertently published an older version of the write-up that predated your feedback -- this has now been corrected. However, there are still a number of areas in the revised version that I expect you'll take issue with, so I wanted to share a bit of perspective. I think it's excellent that you brought up this disagreement in a comment, and I'd encourage people to form their own opinion.

First, for a bit of context, my grant write-ups are meant to accurately reflect my thought process, including any reservations I have about a grant. They're not meant to present all possible perspectives -- I certainly hope that donors use other data points when making their decisions, including of course CES's own fundraising materials.

My understanding is that you have two main disagreements with the write-up: that I understate CES's ability to have an impact at the federal level, and that I estimate the cost-effectiveness to be lower than you believe it to be.

On the federal level, my updated write-up acknowledges that "CES may be able to have influence at the federal level by changing state-level voting rules on how senators and representatives are elected. This is not something they have accomplished yet, but would be a fairly natural extension of the work they have done so far." However, I remain skeptical regarding the Presidential general election for the reasons stated: it'll remain effectively a two-candidate race until a majority of electoral college votes can be won by approval voting. I do not believe you ever addressed that concern.

Regarding the cost-effectiveness, I believe your core concern was that we counted your total budget as a cost, whereas much of your spending goes towards longer-term initiatives rather than directly winning a present-day approval voting campaign. This was intended as a rough metric -- a more careful analysis would be needed to pinpoint the cost-effectiveness. However, I'm not sure such an analysis would necessarily give a more favourable figure: you presumably went after jurisdictions where winning approval voting reform is unusually easy, so we might well expect your cost per vote to increase in the future. If you have any internal analysis to share on that, I'm sure I and others would be interested to see it.

You could argue from "flashes of insight" and scientific paradigm shifts generally giving rise to sudden progress. We certainly know contemporary techniques are vastly less sample- and compute-efficient than the human brain -- so there does exist some learning algorithm much better than what we have today. Moreover, there probably exists some learning algorithm that would give rise to AGI on contemporary (albeit expensive) hardware. For example, ACX notes there's a supercomputer that can do $10^{17}$ FLOPS vs the estimated $10^{16}$ needed for a human brain. These kinds of comparisons are always a bit apples-to-oranges, but it does seem like compute is probably not the bottleneck (or won't be in 10 years) for a maximally efficient algorithm.
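To make the order-of-magnitude comparison concrete, here's a back-of-the-envelope sketch. Both figures are the rough estimates cited above (via ACX), not precise measurements, so treat the result as purely illustrative:

```python
# Rough compute comparison; both numbers are order-of-magnitude
# estimates, so the ratio is illustrative only.
supercomputer_flops = 1e17  # peak FLOP/s of a top supercomputer (cited estimate)
brain_flops = 1e16          # rough FLOP/s estimate for the human brain

ratio = supercomputer_flops / brain_flops
print(f"supercomputer vs brain: {ratio:.0f}x")  # prints "supercomputer vs brain: 10x"
```

On these numbers there's about an order of magnitude of headroom today, which is why the bottleneck claim hinges on algorithmic efficiency rather than raw compute.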

The nub, of course, is whether such an algorithm is plausibly reachable by a human flash of insight (and not via e.g. detailed empirical study and refinement of a less efficient but working AGI). It's hard to rule out. How simple and universal we think the human brain's algorithm is provides one piece of evidence here -- the more complex and laden with inductive bias (e.g. innate behaviour) it is, the less likely we are to come up with it ourselves. But even if the human brain is a Rube Goldberg machine, perhaps there exists some more straightforward algorithm that evolution did not happen upon.

Personally I'd put little weight on this. I have <10% probability on AGI in the next 10 years, and I'd put no more than 15% on AGI ever being developed via something that looks like a sudden insight rather than more continuous progress. Notably, even if such an insight does happen soon, I'd expect it to take at least 3-5 years to gain recognition and be scaled up sufficiently to work. I do think it's probable enough that we should actively keep an eye out for promising new ideas that could lead to AGI, so we can be ahead of the game. For example, I think it's good that a lot of people working on AI safety were working on language models "before it was cool" (I was not one of them), although we've maybe now piled too much into that area.

I agree with a lot of this post. In particular, getting more precision in timelines is probably not going to help much with persuading most people, or with the high-level strategic questions that Miles mentions. I also expect it will be hard to get much better predictions than we have now: much of the low-hanging fruit has been plucked. However, I'd personally find better timelines quite useful for prioritizing which technical research problems to work on. I might be in a minority here, but I suspect not that small a one (say 25-50% of AI safety researchers).

There are two main ways timelines influence what I'd want to work on. First, they directly change the "deadline" I'm working towards. If I thought the deadline were 5 years away, I'd probably work on scaling up the most promising approaches we have now -- warts and all. If I thought it were 10 years away, I'd try to make conceptual progress that could be scaled up in the future. If it were 20 years away, I'd focus more on longer-term field-building interventions: clarifying what the problems are, helping develop good community epistemics, mentoring people, etc. What matters here is something like the log of the deadline more than the deadline itself (5 vs 10 years is very decision-relevant, 20 vs 25 much less so), which we admittedly have a better sense of, although there's still considerable disagreement.

The second way timelines are relevant is that my prediction of how AI will be developed changes a lot conditioned on timelines. We should probably just try to forecast or analyze how-AI-is-developed directly -- but timelines are perhaps easier to formalize. If timelines are under 10 years, I'd be confident we develop AGI within the current deep learning paradigm; beyond that, the possibilities open up a lot. So overall, longer timelines would push me towards more theoretical work (generally applicable across a range of paradigms) and towards taking bets on underdog areas of ML. There's not much research into, say, how to align an AI built on top of a probabilistic programming language. That's probably not a good use of resources right now -- but if we had a confident prediction that human-level AI was 50 years away, I might change my mind.
