(xPost LW) How to turn money into AI safety?

Charlie Steiner

Crosspost of https://www.lesswrong.com/posts/Q3itckRo4WoDYzFeL/how-to-turn-money-into-ai-safety, hoping for comments/discussion.

I have heard through the grapevine that we seem to be constrained - there's money that donors and organizations might be happy to spend on AI safety work, but aren't because of certain bottlenecks - perhaps talent, training, vetting, research programs, or research groups are in short supply. What would the world look like if we'd widened some of those bottlenecks, and what are local actions that people can do to move in that direction? I'm not an expert either from the funding or organizational side, but hopefully I can leverage Cunningham's law and get some people more in the know to reply in the comments.

Of the bottlenecks I listed above, I am going to mostly ignore talent. IMO, talented people aren't the bottleneck right now, and the other problems we have are more interesting. We need to be able to train people in the details of an area of cutting-edge research. We need a larger number of research groups that can employ those people to work on specific agendas. And perhaps trickiest, we need to do this within a network of reputation and vetting that makes it possible to selectively spend money on good research without warping or stifling the very research it's trying to select for.

In short, if we want to spend money, we can't just hope that highly-credentialed, high-status researchers with obviously-fundable research will arise by spontaneous generation. We need to scale up the infrastructure. I'll start by taking the perspective of individuals trying to work on AI safety - how can we make it easier for them to do good work and get paid?

There is a series of bottlenecks in the pipeline from interested amateur to salaried professional. From the individual entrant's perspective, they have to start with learning and credentialing. The "obvious path" of training to do AI safety research looks like getting a bachelor's or PhD in public policy, philosophy, computer science, or math (for which there are now fellowships, which is great), trying to focus your work towards AI safety, and doing a lot of self-study on the side. These programs are often an imprecise fit for the training we want - we'd like there to be graduate-level classes that students can take that cover important material in AI policy, technical alignment research, the philosophy of value learning, etc.

Opportunity 1: Develop course materials and possibly textbooks for teaching courses related to AI safety. This is already happening somewhat. Encourage other departments and professors to offer courses covering these topics.

Even if we influence some parts of academia, we may still have a bottleneck where there aren't enough departments and professors who can guide and support students focusing on AI safety topics. This is especially relevant if we want to start training people fast, as in six months from now. To bridge this gap, it would be nice to have training programs, admitting people with bachelor's- or master's-level skills, at organizations doing active AI safety research. Like a three-way cross between internship, grad school, and AI Safety Camp. The intent is not just to have people learn and do work, but also to help them produce credible signals of their knowledge and skills, over a timespan of 2-5 years. Not just being author number 9 out of 18, but having output that they are primarily responsible for. The necessity of producing credible signals of skill makes a lot of sense when we look at the problem from the funders' perspective later.

Opportunity 2: Expand programs located at existing research organizations that fulfill training and signalling roles. This would require staff for admissions, support, and administration.

This would also provide an opportunity for people who haven't taken the "obvious path" through academia, of which there are many in the AI safety community, who otherwise would have to create their own signalling mechanisms. Thus it would be a bad outcome if all these internships got filled up with people with ordinary academic credentials and no "weirdness points," as admissions incentives might push towards. Strong admissions risk-aversion may also indicate that we have lots of talent, and not enough spots (more dakka required).

Such internships would take nontrivial effort and administrative resources - they're a negative for the research output of the individuals who run them. To align the incentives to make them happen, we'd want top-down funding intended for this activity. This may be complicated by the fact that a lot of research happens within corporations, e.g. at DeepMind. But if people actually try, I suspect there's some way to use money to expand training+signalling internships at corporate centers of AI safety research.

Suppose that we blow open that bottleneck, and we have a bunch of people with some knowledge of cutting-edge research, and credible signals that they can do AI safety work. Where do they go?

Right now there are only a small number of organizations devoted to AI safety research, all with their own idiosyncrasies, and all accepting only a small number of new people. And yet we want most research to happen in organizations rather than alone: Communicating with peers is a good source of ideas. Many projects require the efforts or skillsets of multiple people working together. Organizations can supply hardware, administrative support, or other expertise to allow research to go smoother.

Opportunity 3: Expand the size and scope of existing organizations, perhaps in a hierarchical structure. Can't be done indefinitely (will come back to this), but I don't think we're near the limits.

In addition to increasing the size of existing organizations, we could also found new groups altogether. I won't write that one down yet, because it has some additional complications. Complications that are best explored from a different perspective.

If you're a grant-making organization, selectivity is everything. Even if you want to spend more money, if you offer money for AI safety research but have no selection process, a whole bushel of people are going to show up asking for completely pointless grants, and your money will be wasted. But it's hard to filter for people and groups who are going to do useful AI safety research.

So you develop a process. You look at the grantee's credentials and awards. You read their previous work and try to see if it's any good. You ask outside experts for a second opinion, both on the work and on the grantee themselves. Et cetera. This is all a totally normal response to the need to spend limited resources in an uncertain world. But it is a lot of work, and can often end up incentivizing picking "safe bets."

Now let's come back to the unanswered problem of increasing the number of research organizations. In this environment, how does that happen? The fledgling organization would need credentials, previous work, and reputation with high-status experts before ever receiving a grant. The solution is obvious: just have a central group of founders with credentials, work, and reputation ("cred" for short) already attached to them.

Opportunity 4: Entice people who have cred to found new organizations that can get grants and thus increase the amount of money being spent doing work.

This suggests that the number of organizations can grow at most exponentially, through a life cycle where researchers join a growing organization, do work, gain cred, and then bud off to form a new group. Is that really necessary, though? What if a certain niche just obviously needs to be filled - can you (assuming you're Joe Schmo with no cred) found an organization to fill it? No, you probably cannot. You at least need some cred - though we can think about pushing the limits later. Grant-making organizations get a bunch of bad requests all the time, and they shouldn't just fund all of them that promise to fill some niche. There are certainly ways to signal that you will do a good job spending grant money even if you utterly lack cred, but those signals might take a lot of effort for grant-making organizations to interpret and compare to other grant opportunities, which brings us to the "vetting" bottleneck mentioned at the start of the post. Being vetting-constrained means that grant-making organizations don't have the institutional capability to comb through all the signals you might be trying to send, nor can they do detailed follow-up on each funded project sufficient to keep the principal-agent problem in check. So they don't fund Joe Schmo.

But if grant-making orgs are vetting-constrained, why can't they just grow? Or if they want to give more money and the number of research organizations with cred is limited, why can't those grantees just grow arbitrarily?

Both of these problems are actually pretty similar to the problem of growing the number of organizations. When you hire a new person, they need supervision and mentoring from a person with trust and know-how within your organization or else they're probably going to mess up, unless they already have cred. This limits how quickly organizations can scale. Thus we can't just wait until research organizations are most needed to grow them - if we want more growth in the future we need growth now.
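To make the "growth now enables growth later" point concrete, here is a minimal sketch of mentor-limited hiring. It is a toy model, not a claim about any real organization: the function name `simulate_headcount`, the parameter `hires_per_mentor`, and the assumption that every new hire becomes a trusted mentor after exactly one period are all illustrative assumptions.

```python
def simulate_headcount(initial_trusted, hires_per_mentor, periods, start_period=0):
    """Toy model: trusted headcount per period when hiring is mentor-limited.

    Assumptions (hypothetical, for illustration only):
    - each trusted member can supervise `hires_per_mentor` new hires per period;
    - a new hire becomes a trusted mentor by the next period;
    - no hiring happens before `start_period`.
    """
    trusted = initial_trusted
    history = [trusted]
    for period in range(periods):
        if period >= start_period:
            # Hiring is capped by how many people existing staff can mentor.
            trusted += trusted * hires_per_mentor
        history.append(trusted)
    return history


# Same organization, same horizon; the only difference is when growth starts.
grow_now = simulate_headcount(10, 1, 4)
grow_later = simulate_headcount(10, 1, 4, start_period=2)
print(grow_now[-1], grow_later[-1])
```

Under these assumptions, the organization that waits two periods ends the horizon at a quarter of the size of the one that starts immediately, and no amount of money at the end can close that gap within the model.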

Opportunity 5: Write a blog post urging established organizations to actually try to grow (in a reasonable manner), because their intrinsic growth rate is an important limiting factor in turning money into AI safety.

All of the above has been in the regime of weak vetting. What would change if we made grant-makers' vetting capabilities very strong? My mental image of strong vetting is grant-makers being able to have a long conversation with an applicant, every day for a week, rather than a 1-hour interview. Or being able to spend four days of work evaluating the feasibility of a project proposal, and coming back to the proposer with a list of suggestions to talk over. Or having the resources to follow up on how your money is being spent on a weekly basis, with a trusted person available to help the grantee or step in if things aren't going to plan. If this kind of power was used for good, it would open up the ability to fund good projects that previously would have been lost in the noise (though if used for ill it could be used to gatekeep for existing interests). This would decrease the reliance on cred and other signals, and increase the possible growth rate, closer to the limits from "talent" growth.

An organization capable of doing this level of vetting blurs the line between a grant-making organization and a centralized research hub. In fact, this fits into a picture where research organizations have stronger vetting capabilities for individuals than grant-making organizations do for research organizations. In a growing field, we might expect to see a lot of intriguing but hard-to-evaluate research take place as part of organizations but not get independently funded.

Strong vetting would be impressive, but it might not be as cost-effective as just lowering standards, particularly for smaller grants. It's like a stock portfolio - it's fine to invest in lots of things that individually have high variance so long as they're uncorrelated. But a major factor in how low your standards can be is how well weak vetting works at separating genuine applicants from frauds. I don't know much about this, so I'll leave this topic to others.
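The portfolio analogy can be made slightly more precise with a small calculation. This is only a sketch under simplifying assumptions that are mine, not the post's: each grant either pays off a fixed amount or pays nothing, all grants have the same success probability, and outcomes are uncorrelated. The names `portfolio_mean` and `portfolio_std` are hypothetical.

```python
import math


def portfolio_mean(success_prob, payoff):
    """Expected payoff per grant; it does not depend on how many grants are made."""
    return success_prob * payoff


def portfolio_std(success_prob, payoff, num_grants):
    """Standard deviation of the average payoff per grant across a portfolio.

    Assumes identical, uncorrelated all-or-nothing grants, so the variance of
    the average is the single-grant variance divided by the number of grants.
    """
    single_variance = payoff ** 2 * success_prob * (1 - success_prob)
    return math.sqrt(single_variance / num_grants)


# One risky grant versus a hundred of them, with a 10% chance of a payoff of 10.
print(portfolio_mean(0.1, 10))
print(portfolio_std(0.1, 10, 1), portfolio_std(0.1, 10, 100))
```

With these illustrative numbers the expected payoff per grant is unchanged, while the spread of the average shrinks by a factor of ten when going from one grant to a hundred. The caveat from the text still applies: if weak vetting lets in many frauds, the success probability itself falls, and if failures are correlated the variance does not shrink this way.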

The arbitrary growth of research organizations also raises some questions about research agendas (in the sense of a single, cohesive vision). A common pattern of thought is that if we have more organizations, and established organizations have different teams of people working under their umbrellas, then all these groups of people need different things to do, and that might be a bottleneck. On this view, what's best is when groups are working towards a single vision, articulated by the leader, and if we don't have enough visions we shouldn't found more organizations.

I think this picture makes a lot of sense for engineering problems, but not a lot of sense for blue-sky research. Look at the established research organizations - FHI, MIRI, etc. - they have a lot of people working on a lot of different things. What's important for a research group is trust and synergy; the "top-down vision" model is just a special case of synergy that arises when the problem is easily broken into hierarchical parts and we need high levels of interoperability, like an engineering problem. We're not at that stage yet with AI safety or even many of its subproblems, so we shouldn't limit ourselves to organizations with single cohesive visions.

III

Let's flip the script one last time - if you don't have enough cred to do whatever you want, but you think we need more organizations doing AI safety work, is there some special type you can found? I think the answer is yes.

The basic ingredient is something that's both easy to understand and easy to verify. I'm staying at the EA Hotel right now, so it's the example that comes to mind. The concept can be explained in about 10 seconds (it's a hotel that hosts people working on EA causes), and if you want me to send you some pictures I can just as quickly verify that (wonder of wonders) there is a hotel full of EAs here. But the day-to-day work of administrating the hotel is still nontrivial, and requires a small team funded by grant money.

This is the sort of organization that is potentially foundable even without much cred - you promise something very straightforward, and then you deliver that thing quickly, and the value comes from its maintenance or continuation. When I put it that way, now maybe it sounds more like Our World In Data's covid stats. Or like 80kh's advising services. Or like organizations promising various meta-level analyses, intended for easy consumption and evaluation by the grant-makers themselves.

Opportunity 6: If lacking cred, found new organizations with really, extremely legible objectives.

The organization-level corollary of this is that organizations can spend money faster if they spend it on extremely legible stuff (goods and services) rather than new hires. But as they say, sometimes things that are expensive are worse. Overall this post has been very crassly focusing on what can get funded, not what should get funded, but I can be pretty confident that researchers give a lot more bang per buck than a bigger facilities budget. Though perhaps this won't always be true; maybe in the future important problems will get solved, reducing researcher importance, while demand for compute balloons, increasing costs.

I think I can afford to be this crass because I trust the readers of this post to try to do good things. The current distribution of AI safety research is pretty satisfactory to me given what I perceive to be the constraints; we just need more. It turned out that when I wrote this post about the dynamics of more, I didn't need to say much about the content of the research. This isn't to say I don't have hot takes, but my takes will have to stay hot for another day.

Thanks for reading.

Thanks to Jason Green-Lowe, Guillaume Corlouer, and Heye Groß for feedback and discussion at CEEALAR.
