Frederik

78 karma · Joined Jul 2018

Comments: 14 · Topic contributions: 1

I don't have a great model of the constraints, but my guess is that we're mostly talent- and mentoring-constrained: we need to make more research progress, but we also don't have enough mentoring capacity to upskill new researchers (programs like SERI MATS are trying to change this). We also need to be able to translate that progress into actual systems, so buy-in from the biggest players seems crucial.

I agree that most safety work isn't monetizable. Some things are, e.g. work that makes for a nicer chatbot, but it's questionable whether that actually reduces x-risk. Afaik the companies that focus the most on safety (in terms of employee hours) are Anthropic and Conjecture. I don't know how they aim to make money; for the time being, most of their funding seems to come from philanthropic investors.

When I say that it's in the company's DNA, I mean that the founders value safety for its own sake and not primarily as a money-making scheme. That would explain why they haven't shut down their safety teams even though those teams haven't provided immediate monetary value.

People, including engineers, can definitely spend all their time on safety at DeepMind (I can't speak for OpenAI). Beyond what is publicly available, I obviously can't comment on my perception of DM leadership's priorities when it comes to safety and ethics. In terms of raw numbers, I think it's accurate for both companies that most people are not working on safety.

Thanks a lot! Best of luck with your career development!

(context: I've recently started as a research engineer (RE) on DeepMind's alignment team. All opinions are my own)

Hi, first off, it's really amazing that you are looking into changing your career to help reduce x-risks from AI.

I'll give my perspective on your questions.

1.

a. All of this is on a spectrum, but roughly: there is front-end engineering, which to my knowledge is mostly used to build human-feedback interfaces or general dialogue interfaces like ChatGPT.

Then there's research engineering, which I'd roughly sort into two categories. One is lower-level machine learning engineering: ensuring that you can train, query, and serve (large) models, making ML code more efficient, or making certain types of analyses feasible in the first place. This seems pretty crucial to a lot of orgs and is in especially high demand, afaict. The second is more research-y: analysing existing models, or training/fine-tuning models in order to test a hypothesis that's somehow related to safety. Again, these exist on a spectrum.

c. In my work, I need to use quite a bit of ML/DL knowledge regularly, and it was expected and tested for in the application process (I doubt that would be the case for front-end engineering roles, though). ML theory is basically never used in my work. I think this is similar across orgs working on "prosaic alignment", i.e. working directly with deep learning models, although I could be missing some cases here.

2.

... why are there so many different organizations?

Different beliefs about theories of change and threat models, but I'd say also more money and interested researchers than management capacity. Say there are 100 qualified people who would like to work on team X; chances are team X can only absorb a handful per year without imploding. What are those 100 people going to do? Either start their own org, work independently, or try to find some other team.

a. That's a very valid point. All these organizations state that they aim to reduce AI x-risk. As I see it, they mainly differ along the axes of "threat/development model" (how will dangerous AI be built, and how does it cause x-risk?) and "theory of change" (how can we reduce that x-risk?). Of course there is still substantial overlap between orgs.

b. i. MIRI has chosen to follow a "not-publishing-by-default" policy, which explains some of the lack of publications. Afaict, MIRI still operates and can best be modeled as a group of semi-independent researchers (although I have especially little knowledge here).

c. For a slightly outdated overview, you could check out Larks' review from 2021.

d. This seems like an accurate description; I'm not sure what exactly you would like more clarity on. The field is in fact quite fragmented in that regard. Regarding the private companies (in particular OpenAI, DeepMind, and Anthropic): they were founded with varying degrees of focus on safety, but all of them had the safety of AI systems as an important part of their founding DNA.

3. I don't feel qualified to say much here. My impression is that front-end work comes into play mostly when gathering human feedback/ratings, which is important for some alignment schemes but not others.

4. I'm not aware of groups in Asia. Regarding Europe, there are DeepMind (with some work on technical safety and some on AI governance) and Conjecture, both based in London. There are academic groups working on alignment as well, most notably David Krueger's group at Cambridge and Jan Kulveit's group in Prague. I'm probably forgetting someone here.

Is there a way to do this post hoc, while keeping the comments intact? Otherwise, I'd just leave it as is, now that it has already received answers.

Oops! I was under the impression that I had done that when clicking on "New question". But maybe something somewhere went wrong on my end when switching interfaces :D

Nothing to add -- just wanted to explicitly say I appreciate a lot that you took the time to write the comment I was too lazy to.

I can confirm the last point for Germany at least. There's relatively little stratification among universities. It's mostly about which subject you want to study, with popular subjects like medicine requiring straight A's at basically every university. However, you can get into a STEM program at the top universities without being in the top 10% at the high-school level.

Oh yes, I agree. I think that'd be a wonderful addition and would lower the barrier to entry!

My intuition is that this is quite relevant if the goal is to appeal to a wider audience. Not everyone, maybe not even most people, is drawn in by purely written fiction.

There is much that could be said in response to this.

  1. The tone of your comment is not very constructive. I get that you're upset, but I would love it if we could aim for a higher standard on this platform.

  2. The EA community is not a monolithic super-agent with perfect control over what all its parts do; far from it. That is actually one of the community's strengths (and some might even say that we give too much credit to orthodoxy). So even if everyone on this forum, or in the wider community, agreed that this was a stupid idea, we could still do nothing about it, since it is FTX's money and theirs to do with as they want. It does not make sense to talk about "the EA community" launching this fellowship, in the same way that it does not make sense to say that "the EA community" thinks that [cause XYZ] is the most important one. You could of course argue that the post received a lot of positive attention and that this indicates support from the wider community; that would be a fair argument, but it's far from equivalent to everyone (or even a majority of) the EA community agreeing that this is the best way to spend a few million marginal dollars (a ballpark estimate I just made up).

  3. You are getting the direction of causality wrong. FTX moved to the Bahamas because it is crypto-friendly, which makes sense because FTX is a crypto exchange. Afaict, they want to build an EA community there because that's where they are located, which also makes sense from their perspective (I don't necessarily agree it's the best place for this kind of project, but I can at least understand the reasoning).

  4. You seem to be mixing two arguments here. One is the "PR disaster" angle, which might be valid regardless of the actual merits of the project. The other seems to be an argument against the merits of the project itself, but you don't provide arguments on the object level, so I don't know what to respond to here.
