I’m a generalist and open sourcerer who does a bit of everything, but perhaps nothing particularly well. I’m currently a Software Engineer on the Worldview Investigations Team at Rethink Priorities, as well as an undergrad majoring in Computer Science Engineering 👨‍💻 at UC Chile.
We used to have a caching layer meant to fix this, but the real objective is to keep inter-run variability low in the first place.
With "variables are simulated freshly for each model", do you mean that certain probability distributions are re-sampled when performing cause comparisons?
I can second this: CEA's retreats are top tier
Yes! We plan to release the source code soon.
Hi Michael! Some answers:
2. Is there a way to increase the sample size? It's 150,000 by default, and you say it takes billions of samples to see the dominance of x-risk work.
There will be! We hope to release an update in the coming days, adding the ability to change the sample size and run billions of samples. This was tricky because it required some optimizations on our end.
3. Only going 1000 years into the future seems extremely short for x-risk interventions by default if we’re seriously entertaining expectational total utilitarianism and longtermism. It also seems several times too long for the "common-sense" case for x-risk reduction.
We were divided on selecting a reasonable default here, and I agree that a shorter default might be more reasonable for the latter case. This was more of a compromise solution, but I think we could pick either perspective and stick with it for the defaults. That said, I want to emphasize that all default assumptions in CCM should be taken lightly, as we were focused on making a general tool rather than refining (or agreeing upon) our own particular assumptions.
5. It seems the AI Misalignment Megaproject is more likely to fail (with the same probability of backfire conditional on failing) than the Small-scale AI Misalignment Project. Why is that? I would expect a lower chance of doing nothing, but a higher chance of success and a higher chance of backfire.
As with (3), I agree with your reasoning, and we'll probably be updating some of these template projects soon, but I would encourage you to tweak these assumptions to match yours.
There’s a common use case of wanting to force one distribution of samples to be manually correlated with another. However, there are many ways of doing this; it’s unclear what’s best. It probably makes sense to have several options eventually, but it would be good to have one or two robust recommended options.
I recently implemented this for squigglepy, but using the (somewhat) simpler Iman-Conover method instead of copulas. Copulas are more flexible, but I imagine Iman-Conover would be much easier to implement for Squiggle.
If anyone wants to implement something similar for Squiggle, I would love to help in any way.
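For a rough idea of what this looks like, here's a minimal two-variable sketch of the Iman-Conover idea in Python with NumPy/SciPy (not squigglepy's actual implementation; function and variable names are my own, and I draw reference scores from a bivariate normal rather than using the classic van der Waerden scores):

```python
import numpy as np
from scipy import stats

def iman_conover(x, y, target_rho, seed=0):
    """Reorder y (and x) so the pair has roughly the target rank
    correlation, while preserving each marginal distribution.

    Simplified two-variable sketch of the Iman-Conover method:
    draw reference scores with the desired correlation, then
    rank-match each sample to its score column.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    # Reference scores from a bivariate normal with the desired
    # correlation; their rank order defines the pairing.
    cov = [[1.0, target_rho], [target_rho, 1.0]]
    scores = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Rank-match: sort each sample, then place its values in the
    # rank order of the corresponding score column.
    x_out = np.empty(n)
    y_out = np.empty(n)
    x_out[np.argsort(scores[:, 0])] = np.sort(x)
    y_out[np.argsort(scores[:, 1])] = np.sort(y)
    return x_out, y_out

# Correlate a lognormal and a normal sample without changing
# either marginal distribution.
rng = np.random.default_rng(1)
x = rng.lognormal(size=10_000)
y = rng.normal(size=10_000)
xc, yc = iman_conover(x, y, target_rho=0.8)
rho = stats.spearmanr(xc, yc)[0]
```

The nice property here is that only the *pairing* of samples changes: sorting `xc` gives back exactly the values of `x`, so the marginals are untouched while the rank correlation lands near the target.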
I met Yonatan some months ago when I was making some key career decisions regarding whether to pursue AI Safety as a career or look for high impact jobs in software engineering.
While I'm still not settled on the long term, Yonatan pushed me to be more pragmatic and form a strategy around experimenting with different career paths, and in particular recommended I apply to more software engineering roles in the EA community.
Yesterday, I accepted a job offer at Rethink Priorities as a software engineer, and I'm extremely grateful for Yonatan's help. I likely wouldn't have found the motivation to apply for the role (or, more broadly, to SWE roles in EA) without Yonatan's coaching (as well as the excellent coaching from Abigail Novick at 80k, who referred me to Yonatan in the first place).
I think Yonatan is an amazing coach and I would highly recommend talking to him.
I love this project!
I would like to see a detailed write-up of the theory of change, but a priori this seems like a very underexplored area within EA.