mariushobbhahn

2591 karmaJoined Dec 2020

Bio

I recently founded Apollo Research: https://www.apolloresearch.ai/

I was previously doing a Ph.D. in ML at the International Max Planck Research School in Tübingen, worked part-time with Epoch, and did independent AI safety research.

For more see https://www.mariushobbhahn.com/aboutme/

I subscribe to Crocker's Rules

Posts
33

117

There should be more AI safety orgs

mariushobbhahn

· 2y ago

Apollo Research is hiring evals and interpretability engineers & scientists

mariushobbhahn

· 2y ago · 2m read

158

Announcing Apollo Research

mariushobbhahn

· 2y ago

130

The next decades might be wild

mariushobbhahn

· 2y ago

Announcing AI safety Mentors and Mentees

mariushobbhahn

· 2y ago

Disagreement with bio anchors that lead to shorter timelines

mariushobbhahn

· 2y ago

Some advice on independent research

mariushobbhahn

· 2y ago

138

Lessons learned from talking to >100 academics about AI safety

mariushobbhahn

· 2y ago

112

What success looks like

5 authors

· 3y ago · 23m read

183

Announcing Epoch: A research organization investigating the road to Transformative AI

5 authors

· 3y ago · 3m read

Comments
97

This might be the last AI Safety Camp

mariushobbhahn · 1y ago · 39

TL;DR: At least in my experience, AISC was pretty positive for most participants I know and it's incredibly cheap. It also serves a clear niche that other programs are not filling and it feels reasonable to me to continue the program.

I was a participant in the 2021/22 edition. Some thoughts that might make it easier for funders/donors to decide:
1. Impact-per-dollar is probably pretty good for the AISC. It's incredibly cheap compared to most other AI field-building efforts and scalable.
2. I learned a bunch during AISC and I did enjoy it. It influenced my decision to go deeper into AI safety. It was less impactful than e.g. MATS for me but MATS is a full-time in-person program, so that's not surprising.
3. AISC fills a couple of important niches in the AI safety ecosystem in my opinion. It's online and part-time which makes it much easier to join for many people, it implies a much lower commitment which is good for people who want to find out whether they're a good fit for AIS. It's also much cheaper than flying everyone to the Bay or London. This also makes it more scalable because the only bottleneck is mentoring capacity without physical constraints.
4. I think AISC is especially good for people who want to test their fit but who are not super experienced yet. This seems like an important function. MATS and ARENA, for example, feel like they target people a bit deeper into the funnel with more experience who are already more certain that they are a good fit.
5. Overall, I think AISC is less impactful than e.g. MATS even without normalizing for participants. Nevertheless, AISC is probably about ~50x cheaper than MATS. So when taking cost into account, it feels clearly impactful enough to continue the project. I think the resulting projects are lower quality but the people are also more junior, so it feels more like an early educational program than e.g. MATS.
6. I have a hard time seeing how the program could be net negative unless something drastically changed since my cohort. In the worst case, people realize that they don't like one particular type of AI safety research. But since you chat with others who are curious about AIS regularly, it will be much easier to start something that might be more meaningful. Also, this can happen in any field-building program, not just AISC.
7. Caveat: I have done no additional research on this. Maybe others know details that I'm unaware of. See this as my personal opinion and not a detailed research analysis.

There should be more AI safety orgs

mariushobbhahn · 2y ago · 2

I touched on this a little bit in the post. I think it really depends on a couple of assumptions.
1. How much management would they actually get to do in that org? At the current pace of hiring, it's unlikely that someone could build a team as quickly as you can with a new org.
2. How different is their agenda from existing ones? What if they have an agenda that is different from any agenda that is currently done in an org? Seems hard/impossible to use the management skills in an existing org then.
3. How fast do we think the landscape has to grow? If we think a handful of orgs with 100-500 members in total is sufficient to address the problem, this is probably the better path. If we think this is not enough, starting and scaling new orgs seems better.

But like I said in the post, for many (probably most) people starting a new org is not the best move. But for some it is and I don't think we're supporting this enough as a community.

There should be more AI safety orgs

mariushobbhahn · 2y ago · 21

I have heard mixed messages about funding.

From the many people I interact with and also from personal experience it seems like funding is tight right now. However, when I talk to larger funders, they typically still say that AI safety is their biggest priority and that they want to allocate serious amounts of money toward it. I'm not sure how to resolve this but I'd be very grateful to understand the perspective of funders better.

I think the uncertainty around funding is problematic because it makes it hard to plan ahead. It's hard to do independent research, start an org, hire, etc. If there were clarity, people could at least consider alternative options.

Critiques of prominent AI safety labs: Conjecture

mariushobbhahn · 2y ago · 3

The comment I was referring to was in fact yours. After re-reading your comment and my statement, I think I misunderstood your comment originally. I thought it was not only praising the process but also the content itself. Sorry about that.

I updated my comment accordingly to indicate my misunderstanding.

The "all the insights" was not meant as a literal quote but more as a cynical way of saying it. In hindsight, this is obviously bound to be misunderstood and I should have phrased it differently.

Critiques of prominent AI safety labs: Conjecture

mariushobbhahn · 2y ago · 1

I'll only briefly reply because I feel like I've said most of what I wanted to say.
1) Mostly agree but that feels like part of the point I'm trying to make. Doing good research is really hard, so when you don't have a decade of past experience it seems more important how you react to early failures than whether you make them.
2) My understanding is that only about 8 people were involved with the public research outputs and not all of them were working on these outputs all the time. So the 1 OOM in contrast to ARC feels more like a 2x-4x.
3) Can't share.
4) Thank you. Hope my comments helped.
5) I just asked a bunch of people who work(ed) at Conjecture and they said they expect the skill building to be better for a career in alignment than e.g. working with a non-alignment team at Google.

Critiques of prominent AI safety labs: Conjecture

mariushobbhahn · 2y ago · 38

Hmm, yeah. I actually think you changed my mind on the recommendations. My new position is something like:
1. There should not be a higher burden on anti-recommendations than pro-recommendations.
2. Both pro- and anti-recommendations should come with caveats and conditionals whenever they make a difference to the target audience.
3. I'm now more convinced that the anti-recommendation of OP was appropriate.
4. I'd probably still phrase it differently than they did but my overall belief went from "this was unjustified" to "they should have used different wording" which is a substantial change in position.
5. In general, the context in which you make a recommendation still matters. For example, if you make a public comment saying "I'd probably not recommend working for X" the severity feels different than "I collected a lot of evidence and wrote this entire post and now recommend against working for X". But I guess that just changes the effect size and not really the content of the recommendation.

Critiques of prominent AI safety labs: Conjecture

mariushobbhahn · 2y ago · 5

Meta: Thanks for taking the time to respond. I think your questions are in good faith and address my concerns; I do not understand why the comment is downvoted so much by other people.

1. Obviously output is a relevant factor to judge an organization among others. However, especially in hits-based approaches, the ultimate thing we want to judge is the process that generates the outputs to make an estimate about the chance of finding a hit. For example, a cynic might say "what has ARC-theory achieved so far? They wrote some nice framings of the problem, e.g. with ELK and heuristic arguments, but what have they ACtUaLLy achieved?" To which my answer would be, I believe in them because I think the process that they are following makes sense and there is a chance that they would find a really big-if-true result in the future. In the limit, process and results converge but especially early on they might diverge. And I personally think that Conjecture did respond reasonably to their early results by iterating faster and looking for hits.
2. I actually think their output is better than you make it look. The entire simulators framing made a huge difference for lots of people and writing up things that are already "known" among a handful of LLM experts is still an important contribution, though I would argue most LLM experts did not think about the details as much as Janus did. I also think that their preliminary research outputs are pretty valuable. The stuff on SVDs and sparse coding actually influenced a number of independent researchers I know (so much that they changed their research direction to that) and I thus think it was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.
3. (copied from response to Rohin): Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I'm aware of (not all of which are mentioned in the post and I'm not sure I can share) actually seem fairly aligned by VC standards to me. Furthermore, the way I read the critique is something like "Connor didn't tell the VCs about the alignment plans or neglects them in conversation". However, my impression from conversation with (ex-) staff was that Connor was very direct about their motives to reduce x-risks. I think it's clear that products are a part of their way to address alignment but to the best of my knowledge, every VC who invested was very aware of what they're getting into. At this point, it's really hard for me to judge because I think that a) on priors, VCs are profit-seeking, and b) different sources said different things some of which are mutually exclusive. I don't have enough insight to confidently say who is right here. I'm mainly saying, your confidence surprised me given my previous discussions with staff.
4. Regarding confidence: For example, I think saying "We think there are better places to work at than Conjecture" would feel much more appropriate than "we advise against..." Maybe that's just me. I just felt like many statements are presented with a lot of confidence given the amount of insight you seem to have and I would have wanted them to be a bit more hedged and less confident.
5. Sure, for many people other opportunities might be a better fit. But I'm not sure I would e.g. support the statement that a general ML engineer would learn more in general industry than with Conjecture. I also don't know a lot about CoEm but that would lead me to make weaker statements than suggesting against it.

Thanks for engaging with my arguments. I personally think many of your criticisms hit relevant points and I think a more hedged and less confident version of your post would have actually had more impact on me if I were still looking for a job. As it is currently written, it loses some persuasion on me because I feel like you're making too broad unqualified statements which intuitively made me a bit skeptical of your true intentions. Most of me thinks that you're trying to point out important criticism but there is a nagging feeling that it is a hit piece. Intuitively, I'm very averse to everything that looks like a click-bait hit piece by a journalist with a clear agenda. I'm not saying you should only consider me as your audience, I just want to describe the impression I got from the piece.

Critiques of prominent AI safety labs: Conjecture

mariushobbhahn · 2y ago · 10

Maybe we're getting too much into the semantics here but I would have found a headline of "we believe there are better places to work at" much more appropriate for the kind of statement they are making.
1. A blanket unconditional statement like this seems unjustified. Like I said before, if you believe in CoEm, Conjecture probably is the right place to work.
2. Where does the "relatively weak for skill building" come from? A lot of their research isn't public, a lot of engineering skills are not very tangible from the outside, etc. Why didn't they just ask the many EA-aligned employees at Conjecture about what they thought of the skills they learned? Seems like such an easy way to correct for a potential mischaracterization.
3. Almost all AI alignment organizations are "plausibly" net negative. What if ARC evals underestimates their gain-of-function research? What if Redwood's advances in interpretability lead to massive capability gains? What if CAIS's efforts with the letter had backfired and rallied everyone against AI safety? This bar is basically meaningless without expected values.

Does that clarify where my skepticism comes from? Also, once again, my arguments should not be seen as a recommendation for Conjecture. I do agree with many of the criticisms made in the post.

Critiques of prominent AI safety labs: Conjecture

mariushobbhahn · 2y ago · 9

Meta: maybe my comment on the critique reads stronger than intended (see comment with clarifications) and I do agree with some of the criticisms and some of the statements you made. I'll reflect on where I should have phrased things differently and try to clarify below.
Hits-based research: Obviously results are one evaluation criterion for scientific research. However, especially for hits-based research, I think there are other factors that cannot be neglected. To give a concrete example, if I was asked whether I should give a unit under your supervision $10M in grant funding or not, I would obviously look back at your history of results but a lot of my judgment would be based on my belief in your ability to find meaningful research directions in the future. To a large extent, the funding would be a bet on you and the research process you introduce in a team and much less on previous results. Obviously, your prior research output is a result of your previous process but especially in early organizations this can diverge quite a bit. Therefore, I think it is fair to say that both a) the output of Conjecture so far has not been that impressive IMO and b) I think their updates to early results to iterate faster and look for more hits actually is positive evidence about their expected future output.
Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I'm aware of (not all of which are mentioned in the post and I'm not sure I can share) actually seem fairly aligned by VC standards to me. Furthermore, the way I read the critique is something like "Connor didn't tell the VCs about the alignment plans or neglects them in conversation". However, my impression from conversation with (ex-) staff was that Connor was very direct about their motives to reduce x-risks. I think it's clear that products are a part of their way to address alignment but to the best of my knowledge, every VC who invested was very aware of what they're getting into. At this point, it's really hard for me to judge because I think that a) on priors, VCs are profit-seeking, and b) different sources said different things some of which are mutually exclusive. I don't have enough insight to confidently say who is right here. I'm mainly saying, the confidence of OP surprised me given my previous discussions.
On recommendations: I have also recommended people in private not to work at specific organizations. However, this was always conditional on their circumstances. For example, often people aren't aware of what exactly different safety teams are working on, so conditional on their preferences they should probably not work for lab X. Secondly, I think there is a difference between you saying something like this in private, even if it is unconditional, vs in public. In public, the audience is much larger and has much less context, etc. So I feel like your burden of proof is much higher.

lmk if that makes my position and disagreements clearer.

Critiques of prominent AI safety labs: Conjecture

mariushobbhahn · 2y ago · 32

Some clarifications on the comment:
1. I strongly endorse critique of organisations in general and especially within the EA space. I think it's good that we as a community have the norm to embrace critiques.
2. I personally have my criticisms of Conjecture and my comment should not be seen as "everything's great at Conjecture, nothing to see here!". In fact, my main criticisms, of the leadership style and of CoEm not being the most effective thing they could do, are also represented prominently in this post.
3. I'd also be fine with the authors of this post saying something like "I have a strong feeling that something is fishy at Conjecture, here are the reasons for this feeling". Or they could also clearly state which things are known and which things are mostly intuitions.
4. However, I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings and do not throw our epistemics out of the window in the process.
5. My main problem with the post is that they make a list of specific claims with high confidence and I think that is not warranted given the evidence I'm aware of. That's all.
