Benjamin_Todd

Glad it's useful! I categorise RL on chain of thought as a type of post-training, rather than test time compute. (Sometimes people lump them together as both 'inference scaling', but I think that's confusing.) I agree RL opens up novel capabilities you can't get just from next token prediction on the internet.

For test time compute, you need exponential increases in compute to get linear increases in accuracy on the benchmark (i.e. accuracy rises roughly with the log of compute). It's similar to the pretraining scaling law.
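A toy sketch of that shape, assuming accuracy gains a fixed amount for every 10x of test time compute. The baseline and slope here are made-up illustrative numbers, not fitted to any benchmark:

```python
import math

def accuracy(compute, base_accuracy=0.30, gain_per_10x=0.10):
    """Toy log-linear curve: each 10x of compute adds the same accuracy gain.

    base_accuracy and gain_per_10x are hypothetical values for illustration.
    """
    return base_accuracy + gain_per_10x * math.log10(compute)

# Each step multiplies compute by 10 but adds the same amount of accuracy.
for compute in [1, 10, 100, 1000]:
    print(f"{compute:>5}x compute -> accuracy {accuracy(compute):.2f}")
```

Going from 1x to 10x costs 9 extra units of compute, while going from 100x to 1000x costs 900, for the same gain on the benchmark.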

I agree test time compute isn't especially explosive – it mainly serves to "pull forward" more advanced capabilities by 1-2 years.

More broadly, you can swap training for inference: https://epoch.ai/blog/trading-off-compute-in-training-and-inference

On brute force, I mainly took Toby's thread to be saying we don't clearly have enough information to know how effective test time compute is vs. brute force. 

Spitballing: EA entrepreneurs should be preparing for one of two worlds:

i) Short timelines with 10x today's funding

ii) Longer timelines with relatively scarce funding

The response to Michael is an interesting point, but it only concerns diminishing returns in individual capabilities of new members. 

Diminishing returns are mainly driven by the quality of opportunities being used up, rather than the capabilities.

IIRC a 10x in resources to get a 3x in impact was a typical response in the old coordination forum survey responses.

In the past at 80k I'd often assume a 3x increase in inputs (e.g. advising calls) to get a 2x increase in outputs (impact-adjusted plan changes), and that seemed to be roughly consistent with the data (though the data don't tell us that much). In some cases, returns seem to diminish a lot faster than that. And you often face diminishing returns at several levels (e.g. 3x as much marketing to get 2x as many applicants to advising).
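To put those rules of thumb on a common scale, here's a rough sketch that assumes returns follow a simple power law, output = input ** exponent. The power-law form is an assumption for illustration, not something the survey or the 80k data establishes:

```python
import math

def implied_exponent(input_multiple, output_multiple):
    """Exponent e such that input_multiple ** e == output_multiple."""
    return math.log(output_multiple) / math.log(input_multiple)

def output_multiple(input_multiple, exponent):
    """Output growth from scaling inputs, under the assumed power law."""
    return input_multiple ** exponent

survey = implied_exponent(10, 3)    # 10x resources -> 3x impact, ~0.48
advising = implied_exponent(3, 2)   # 3x inputs -> 2x outputs, ~0.63

# Two stacked levels (e.g. marketing -> applicants -> plan changes),
# each assumed to follow the 3x -> 2x rule, multiply their exponents.
stacked = advising * advising

print(f"survey exponent:   {survey:.2f}")
print(f"advising exponent: {advising:.2f}")
print(f"3x marketing -> {output_multiple(3, stacked):.2f}x plan changes")
```

Under these assumptions, stacking two levels of 3x-for-2x leaves 3x as much marketing producing only about 1.5x the final output.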

I agree returns are more linear in areas where EA resources are a small fraction of the total, like global health, but that's not the case in areas like AI safety, GCBRs, new causes like digital sentience, or promoting EA.

And even in global health, if GiveWell only had $100m to allocate, average cost-effectiveness would be a lot higher (maybe 3-10x higher?) than where the marginal dollar goes today. If GiveWell had to allocate $10bn, I'd guess returns would be at least several fold lower again on the marginal spending.

Aside: A more compelling argument against growth in this area to me is something like "EA should focus on improving its brand and comms skills, and on making reforms & changing its messaging to significantly reduce the chance of something like FTX happening again, before trying to grow aggressively again"; rather than "the possibility of scandals means it should never grow".

Another one is "it's even higher priority to grow X other movements than EA" rather than "EA is net negative to grow".

Less importantly, I also feel less confident coordination benefits would mean impact per member goes up with the number of members.

I understand that the value of a social network like Facebook grows with the number of members. But many forms of coordination become much harder with the number of members.

As an analogy, it's significantly easier for 2 people to decide where to go to dinner than for 3 people to decide. And 10 people in a group discussion can take ages to come to consensus.

Or, it's much harder to get a new policy adopted in an organisation of 100 than an organisation of 10, because there are more stakeholders to consult and compromise with, and then more people to train in the new policy etc. And large organisations are generally way more bureaucratic than smaller ones.

I think these analogies might be closer than the analogy of Facebook.

You also get effects like this: in a movement of under 1000, it's possible to have met most of the people in person, and to know many of them well; while in a movement of 10,000, coordination has to be based on institutional mechanisms, which tend to involve a lot of overhead and not be as good.

Overall it seems to me that movement growth means more resources and skills, more shared knowledge, infrastructure and brand effects, but it also makes it harder to work together in many ways, and makes the movement less nimble. I feel unsure which effect wins, but I put a fair bit of credence on the coordination term decreasing rather than increasing.

If it were decreasing, and you also add in diminishing returns, then impact per member could be going down quite fast.

Thanks for the analysis! I think it makes sense to me, but I'm wondering if you've missed an important parameter: diminishing returns to resources.

If there are 100 community members they can take the 100 most impactful opportunities (e.g. writing DGB, publicising that AI safety is even a thing), while if there are 1000 people, they will need to expand into opportunities 101-1000, which will probably be lower impact than the first 100 (e.g. becoming the 50th person working on AI safety).

I'd guess a 10x increase to labour or funding working on EA things (even setting aside coordination and reputation issues) only increases impact by ~3x.
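A quick sketch of what that guess implies for the marginal member, assuming total impact follows a power law in the number of members (an illustrative functional form, not the model from the post):

```python
import math

# Assumption: total impact scales as members ** EXPONENT, with the exponent
# chosen so that 10x the members gives 3x the impact.
EXPONENT = math.log(3) / math.log(10)

def total_impact(members):
    """Total impact of the community, in arbitrary units."""
    return members ** EXPONENT

def marginal_impact(members):
    """Impact added by the last member to join."""
    return total_impact(members) - total_impact(members - 1)

ratio = marginal_impact(1000) / marginal_impact(100)
print(f"The 1000th member adds about {ratio:.0%} as much as the 100th")
```

On this assumption, the 1000th member adds roughly 30% as much as the 100th, before accounting for any coordination or reputation effects.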

It seems like that might make a significant difference to the model: if I've understood correctly, the impact of marginal members in the model is currently increasing due to coordination benefits, whereas this could mean it's decreasing.

I'd still guess marginal growth is net positive, but I feel less confident than the post suggests.

Thank you, I appreciate that.

Hey JWS, 

These comments were off-hand and unconstructive, have been interpreted in ways I didn't intend, and Twitter isn't the best venue for them, so I apologise for posting them, and I'm going to delete them. My more considered takes are here. Hopefully I can write more in the future.
