mic

I got into effective altruism in eighth grade from reading 80,000 Hours. I co-founded EA at Georgia Tech in April 2021 and will be a Summer 2022 research intern at the Center for Human-Compatible AI.

Comments

How to become more agentic, by GPT-EA-Forum-v1

As a countervailing perspective, Dan Hendrycks thinks that it would be valuable to have automated moral philosophy research assistance to "help us reduce risks of value lock-in by improving our moral precedents earlier rather than later" (though I don't know if he would endorse this project). Likewise, some AI alignment researchers think it would be valuable to have automated assistance with AI alignment research. If EAs could write a solid EA Forum post just by giving GPT-EA-Forum a good prompt and revising the resulting post, that could help EAs save time and explore a broader space of research directions. Still, I think some risks are:

  • This bot would write content similar to what the EA Forum has already written, rather than advancing EA philosophy
  • The content produced is less likely to be well-reasoned, lowering the quality of content on the EA Forum
Software Developers: How to have Impact? [WIP]

Distributed computing seems to be a skill in high demand among AI safety organizations. Does anyone have recommendations for resources to learn about it? Would it look like using the PyTorch Distributed package or something like a microservices architecture?
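
To make the question concrete, here's a minimal sketch of what I imagine the "PyTorch Distributed package" option looks like: simple data-parallel training with DistributedDataParallel, one process per GPU, launched via torchrun. The model and data below are placeholders, so take this as a rough illustration rather than a claim about what the actual work involves:

```python
# Minimal sketch of data-parallel training with torch.distributed / DistributedDataParallel.
# Assumes launch via: torchrun --nproc_per_node=<num_gpus> train.py
# The model and data are placeholders.
import os
import torch
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")          # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 10).cuda(local_rank)   # placeholder model
    model = DDP(model, device_ids=[local_rank])          # wraps the model to sync gradients
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(100):
        x = torch.randn(32, 512, device=local_rank)             # placeholder batch
        y = torch.randint(0, 10, (32,), device=local_rank)
        loss = F.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()    # gradients are all-reduced across processes here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```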

AGI Ruin: A List of Lethalities

I feel somewhat concerned that, after reading you repeatedly say "use your AGI to (metaphorically) burn all GPUs", someone might actually try to do so, even though their AGI isn't actually aligned or powerful enough to pull it off without causing catastrophic collateral damage. At the very least, the suggestion encourages AI race dynamics – because if you don't make AGI first, someone else will try to burn all your GPUs! – and makes the AI safety community seem thoroughly supervillain-y.

Points 5 and 6 suggest that soon after someone develops AGI for the first time, they must use it to perform a pivotal act as powerful as "melt all GPUs", or else we are doomed. I agree that figuring out how to align such a system seems extremely hard, especially if this is your first AGI. But aiming for such a pivotal act with your first AGI isn't our only option, and this strategy seems much riskier than taking some more time to use our AGI to make further progress on alignment before attempting any pivotal acts. I think it's plausible that all major AGI companies could stick to only developing AGIs that are (probably) not power-seeking for a decent number of years. Remember, even Yann LeCun of Facebook AI Research thinks that AGI should have strong safety measures. Further, we could have compute governance and monitoring to prevent rogue actors from developing AGI, at least until we understand alignment well enough to entrust more capable AGIs with developing strong guarantees against random people creating misaligned superintelligences. (There are also similar comments and responses on LessWrong.)

Perhaps a crux here is that I'm more optimistic than you about things like slow takeoffs, AGI likely being at least 20 years out, the possibility of using weaker AGI to help supervise stronger AGI, and AI safety becoming mainstream. Still, I don't think it's helpful to claim that we must or even should aim to try to "burn all GPUs" with our first AGI, instead of considering alternative strategies.

How to dissolve moral cluelessness about donating mosquito nets

Thanks for writing this! I've seen Hilary Greaves' video on longtermism and cluelessness in a couple university group versions of the Intro EA Program (as part of the week on critiques and debates), so it's probably been influencing some people's views. I think this post is a valuable demonstration that we don't need to be completely clueless about the long-term impact of presentist interventions.

Four Concerns Regarding Longtermism

I'm really sorry that my comment was harsher than I intended. I think you've written a witty and incisive critique which raises some important points, but I had raised my standards since this was submitted to the Red Teaming Contest.

Four Concerns Regarding Longtermism

For future submissions to the Red Teaming Contest, I'd like to see posts that are much more rigorously argued than this. I'm not concerned about whether the arguments are especially novel.

My understanding of the post's key claim is that EA should consider reallocating some resources from longtermist to neartermist causes. This seems plausible – perhaps some types of marginal longtermist donations are predictably ineffective, or it's bad if community members feel that longtermism unfairly has easier access to funding – but I didn't find the four reasons/arguments given in this post particularly compelling.

The section Political Capital Concern appears to claim: if EA as a movement doesn't do anything to help regular near-term causes, people will think that it's not doing anything to help people, and it could die as a movement. I agree that this is possible (though I also think a standalone "longtermism movement" could still be reasonably successful, albeit with far less membership than EA). However, EA continues to dedicate substantial resources to near-term causes – hundreds of millions of dollars of donations each year! – and this number is only increasing, with GiveWell aiming to direct $1 billion in donations per year. EA also continues to highlight its contributions to near-term causes. As a movement, EA is doing fine in this regard.

So then, if the EA movement as a whole is doing fine in this regard, who should change their actions based on the political capital concern? I think it's more interesting to examine whether local EA groups, individuals, and organizations should have a direct positive impact on near-term causes for signalling reasons. The post only gives the following recommendation (which I find fairly vague): "Instead, the thought is: when running your utility models, factor this in however you can. Consider that utility translated from EA resources to present life, when done effectively and messaged well, [4] redounds as well on the gains to future life." However, redirecting resources from longtermism to neartermism has costs for the longtermist projects you're no longer supporting. How do we navigate these tradeoffs? It would have been great to see examples here.

The "Social Capital Concern" section writes:

focusing on longterm problems is probably way more fun than present ones.[7] Longtermism projects seem inherently more big picture and academic, detached from the boring mundanities of present reality.

This might be true for some people, but I think for most EAs, concrete or near-term ways of helping people have a stronger emotional appeal, all else equal. I would find the inverse of the sentence a lot more convincing, to be honest: "focusing on near-term problems is probably way more fun than ones in the distant future. Near-term projects seem inherently more appealing and helpful, grounded in present-day realities."

But that aside, if I am correct that longtermism projects are sexier by nature, when you add communal living/organizing to EA, it can probably lead to a lot of people using flimsy models to talk and discuss and theorize and pontificate, as opposed to creating tangible utility, so that they can work on cool projects without having to get their hands too dirty, all while claiming the mantle of not just the same, but greater, do-gooding.

Longtermist projects may be cool, and their utility may be more theoretical than that of near-term projects, but I'm confused about what you mean when you say they don't involve getting your hands dirty (in a way that near-termist work, such as GiveWell's charity effectiveness research, somehow does). Effective donations have historically been the main neartermist EA activity, and donating is quite hands-off.

So individual EA actors, given social incentives brought upon by increased communal living, will want to find reasons to engage in longtermism projects because it will increase their social capital within the community.

This seems likely, and thanks for raising this critique (especially if it hasn't been highlighted before), but what should we do about it? The red-teaming contest is looking for constructive and action-relevant critiques, and I think it wouldn't be that hard to take some time to propose suggestions. The action implied by the post is that we should consider shifting more resources to near-termism, but I don't think that would necessarily be the right move, compared to, e.g., being more thoughtful about social dynamics and making an effort to welcome neartermist perspectives.

The section on Muscle Memory Concern writes:

I think this is a reason to avoid a disproportionate emphasis on longtermism projects. Because longtermism efficacy is inherently more difficult to calculate with confidence, it can become quite easy to forget how to provide utility quickly and confidently.

I don't know; even the most meta of longtermist projects, such as longtermist community building (or, to go another meta level, support for longtermist community building), is quite grounded in metrics and has short feedback loops, such that you can tell if your activities are having an impact – if not impact on utility across all time, then at least something tangible, such as high-impact career transitions. I think the skills would transfer fairly well to something more near-termist, such as community organizing for animal welfare, or running organizations in general. In contrast, if you're doing charity effectiveness research, whether near-termist or longtermist, it can be hard to tell if your work is any good. And now that we have more EAs getting their hands dirty with projects instead of just earning to give, we have more experience as a community in executing projects, whether longtermist or near-termist.

As for the final section, the discount factor concern:

Future life is less likely to exist than current life. I understand the irony here, since longtermism projects seek to make it more likely that future life exists. But inherently you just have to discount the utility of each individual future life. In the aggregate, there's no question that the utility gains are still enormous. But each individual life should have some discount based on this less-likely-to-exist factor.

I think longtermists are already accounting for the fact that we should discount future people by their likelihood of existing. That said, longtermist expected utility calculations are often more naive than they should be. For example, we often wrongly interpret a 1% reduction in x-risk from one cause as a 1% reduction in x-risk as a whole, or conflate a 1% x-risk reduction this century with a 1% x-risk reduction across all time.
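
To make the conflation concrete, here is a toy calculation with made-up numbers (none of these reflect actual risk estimates):

```python
# Toy numbers, purely illustrative, showing why "reduce x-risk by 1%" is ambiguous.
p_ai, p_bio, p_other = 0.10, 0.03, 0.02          # hypothetical per-century risks by cause
p_total = 1 - (1 - p_ai) * (1 - p_bio) * (1 - p_other)          # ~0.144

# A 1% (relative) reduction in AI risk alone...
p_total_after = 1 - (1 - 0.99 * p_ai) * (1 - p_bio) * (1 - p_other)
print(p_total - p_total_after)   # ~0.001, i.e. total risk falls by ~0.1 points, not 1%

# ...and a reduction in this century's risk is not the same as a reduction in
# risk across all time, if similar hazards recur in future centuries:
p_ever = 1 - (1 - p_total) ** 100   # chance of extinction at some point over 100 such centuries
```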

(I hope you found this comment informative, but I don't know if I'll respond to this comment, as I already spent an hour writing this and don't know if it was a good use of my time.)

What's the value of creating my own fellowship program when I can direct people to the virtual programs?

Some quick thoughts:

  • EA Virtual Programs should be fine in my opinion, especially if you think you have more promising things to do than coordinating logistics for a program or facilitating cohorts
  • The virtual Intro EA Program only has discussions in English and Spanish. If group members would much prefer to have discussions in Hungarian instead, it might be useful for you to find some Hungarian-speaking facilitators.
  • Like Jaime commented, if you're delegating EA programs to EA Virtual Programs, it's best for you to have some contact with participants, especially particularly engaged ones, so that you can have one-on-one meetings exploring their key uncertainties, share relevant opportunities with them, etc.
  • It's rare for the EAIF to provide full-time funding for community building (see this comment)
  • I'd try to see if you could publicize EA Virtual Programs more, such as at Hungarian universities
What does the Project Management role look like in AI safety?
Answer by mic, May 24, 2022

I see two new relevant roles on the 80,000 Hours job board right now:

Here's an excerpt from Anthropic's job posting. It's looking for basic familiarity with deep learning and mechanistic interpretability, but mostly nontechnical skills.

In this role you would:

  • Partner closely with the interpretability research lead on all things team related, from project planning to vision-setting to people development and coaching.
  • Translate a complex set of novel research ideas into tangible goals and work with the team to accomplish them.
  • Ensure that the team's prioritization and workstreams are aligned with its goals.
  • Manage day-to-day execution of the team’s work including investigating models, running experiments, developing underlying software infrastructure, and writing up and publishing research results in a variety of formats.
  • Unblock your reports when they are stuck, and help get them whatever resources they need to be successful.
  • Work with the team to uplevel their project management skills, and act as a project management leader and counselor.
  • Support your direct reports as a people manager - conducting productive 1:1s, skillfully offering feedback, running performance management, facilitating tough but needed conversations, and modeling excellent interpersonal skills.
  • Coach and develop your reports to decide how they would like to advance in their careers and help them do so.
  • Run the interpretability team’s recruiting efforts, in concert with the research lead.

You might be a good fit if you:

  • Are an experienced manager and enjoy practicing management as a discipline.
  • Are a superb listener and an excellent communicator.
  • Are an extremely strong project manager and enjoy balancing a number of competing priorities.
  • Take complete ownership over your team’s overall output and performance.
  • Naturally build strong relationships and partner equally well with stakeholders in a variety of different “directions” - reports, a co-lead, peer managers, and your own manager.
  • Enjoy recruiting for and managing a team through a period of growth.
  • Effectively balance the needs of a team with the needs of a growing organization.
  • Are interested in interpretability and excited to deepen your skills and understand more about this field.
  • Have a passion for and/or experience working with advanced AI systems, and feel strongly about ensuring these systems are developed safely.

Other requirements:

  • A minimum of 3-5 years of prior management or equivalent experience
  • Some technical or science-based knowledge or expertise
  • Basic familiarity in deep learning, AI, and circuits-style interpretability, or a desire to learn
  • Previous direct experience in machine learning is a plus, but not required
The real state of climate solutions - want to help?

You might want to share this project idea in the Effective Environmentalism Slack, if you haven't already done so.

Apply to help run EAGxIndia, Berkeley, Singapore and Future Forum!

Is the application form "EAGxBerkeley, India & Future Forum Organizing Team Expression of Interest" supposed to have questions asking about whether you're interested in organizing the Future Forum? I don't see any; I only see questions about EAGxBerkeley and EAGxIndia.
