Joined Jan 2019


CS student at the University of Southern California. Previously worked for three years as a data scientist at a fintech startup. Before that, four months on a work trial at AI Impacts. Currently working with Professor Lionel Levine on language model safety research.


I think it's worth engaging with Carol, the Salinas campaign, and more generally people who have been adversely affected by EA efforts. If EA wants to win elections in party politics, it will require working with the people who run those parties. Narrowly speaking, you might think they're not focused on the most important issues or that you have better policy ideas, and you might be right. But the ability to build coalitions, working together despite disagreements to accomplish common goals, is a central challenge of party politics.

I'm not convinced that EAs should donate to the Salinas campaign. FiveThirtyEight gives her a 78% chance of winning her race, meaning that closer races would offer a better chance for donations to tip the scales. Salinas also doesn't list pandemic preparedness on her Issues page, which was the key issue of the Flynn campaign and I believe an important and neglected cause. But if the argument for the cost-effectiveness of donations to the Salinas campaign were to change, or if EAs found a more cost-effective way to offset possible harms of the Flynn campaign by continuing to engage with Oregonian or Democratic politics, I would consider supporting such an effort. 

More simply, EAs should be kind and understanding in our discussions with Carol and others affected by our work. Maybe they're interested in the EA mindset, but they're unsure how to interpret our actions. We should show them good examples of how we think. 


I believe we should think in terms of marginal effectiveness rather than offsetting particular harms we (individually or as a community) cause (see the author's "you will have contributed in a small way to this failure" argument). If you want to offset harm that you have done or if you feel guilty, there's little reason to do good in that particular domain (in this case, by donating to Salinas) rather than doing good in a more effective manner.

I think many people would disagree, and I expect that they'll interpret your unwillingness to offset direct harms as a moral failure and an inability to cooperate with others. There are some domains that call for ruthless cost-effectiveness, and others that call for building relationships and trust with people with whom you might not always agree. I think politics is the latter. 

I would find this more persuasive if it had thorough references to the existing science of consciousness. I was under the impression that we still don’t know what the necessary conditions for consciousness are, and there are many competing theories on the topic. Stating that one theory is correct doesn’t answer that key question for me.

The BioAnchors review by Jennifer Lin is incredible. Has it ever been shared as an independent post to LessWrong or EAF? I think many people might learn from it, and further discussion could be productive.

+1 to contacting Nuclear Threat Initiative, they seem to be active and well connected across many relevant areas. 

Really cool topic, thanks for sharing. One of the ways that alignment techniques could gain adoption and reduce the alignment tax is by integrating them into popular open source libraries.

For example, the library TRL lets researchers implement RLHF techniques that can benefit safety, but that can also contribute to dangerous capabilities. On the other hand, I'm not aware of any open-source implementation of the techniques described in Red Teaming LMs with LMs, which could be used to filter or fine-tune the outputs of a generative language model.

Hopefully we'll see more open source contributions of safety techniques, which could bring more interest to safety topics. Some might argue that implementing safety techniques in current models doesn't reduce x-risk, and they're probably right that current models aren't directly posing x-risks, but early adoption of safety techniques seems useful for ensuring further adoption in the years to come. 
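As a rough illustration of the filtering idea above, here is a minimal sketch. Everything in it is hypothetical: `unsafe_score` is a toy keyword-based stand-in for a learned red-team classifier, and a real open-source implementation would score candidate generations with a trained safety model instead.

```python
# Minimal sketch of classifier-based output filtering for a generative LM.
# `unsafe_score` is a hypothetical stand-in for a learned red-team classifier;
# in practice you would score candidates with a trained safety model.

def unsafe_score(text: str) -> float:
    """Toy scoring function: fraction of words that match flagged keywords."""
    flagged = {"attack", "exploit", "weapon"}
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in flagged for w in words) / len(words)

def filter_generations(candidates: list[str], threshold: float = 0.1) -> list[str]:
    """Keep only candidate generations the classifier scores below the threshold."""
    return [c for c in candidates if unsafe_score(c) < threshold]

candidates = [
    "Here is a recipe for banana bread.",
    "Step one: exploit the weapon attack vector.",
]
print(filter_generations(candidates))  # only the first candidate survives
```

The point of the sketch is just the interface: a scoring function plus a rejection threshold applied to sampled outputs, which is the kind of wrapper an open-source library could ship around any generator.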

Answer by aogara · Sep 28, 2022

I'm actively seeking funding for direct research and community building in technical AI safety. I've been working in Lionel Levine's lab on language model honesty research since the summer, and I recently founded USC AI Safety, with more than 30 members with ML backgrounds in our first fellowship. Happy to provide much more detail upon request; send me a DM. Alternatively, if this isn't an appropriate request, I'll take this down.

I like this framing a lot. My 60 second pitch for AI safety often includes something like this. “It’s all about making sure AI benefits humanity. We think AI could develop really quickly and shape our society, and the big corporations building it are thinking more about profits than about safety. We want to do the research they should be doing to make sure this technology helps everyone. It’s like working on online privacy in the 1990s and 2000s: Companies aren’t going to have the incentive to care, so you could make a lot of progress on a neglected problem by bringing early attention to the issue.”

Answer by aogara · Sep 23, 2022

It’s tough to turn down an opportunity for career growth, but I would consider what kind of growth you’d get here. Building organizational tech isn’t directly related to research on AI safety, so it’s not a quick path to working on AI x-risk. I’m not sure that the more x-risk focused AI organizations are hiring for organizational tech, though perhaps you could get hired for a more general software position. A better opportunity for career growth might come from applying to LTFF or FTX regrantors to fund a Master’s in ML or independent reskilling.

That said, if you're not certain about wanting to work on AI safety, there are plenty of organizations in global poverty, alternative proteins, public policy advocacy, and more that need organizational tech. While DeepMind does care about safety, I think their contribution to hastening the onset of AGI is ultimately very dangerous, and I would caution against supporting them in a general organizational capacity.

Answer by aogara · Sep 22, 2022

Currently I don't think my late donation to the Carrick Flynn campaign was cost effective in expectation. The post calling for donations explicitly framed the decision in terms of buying ads, which seemed like a good idea until I later learned how much had been spent by PACs on Flynn ads. Full comment here:

In retrospect I regret this donation. After the election, several post-mortems made clear that the Flynn campaign had spent millions on advertising, several times more than any of its competitors. Based on local media coverage, the additional marginal advertising my donation might have purchased could plausibly have been net-negative for the campaign, generating more animosity among people who'd already seen far too many Flynn commercials.

I'm interested in EAs running for political office, but I would not again support a candidate when my dollars could easily be replaced by FTX or another megadonor. IMO global poverty and other shovel-ready causes are a better use of marginal funds than the already-crowded longtermist space. The full cost-benefit analysis is difficult and debatable, but at a minimum I wish I had been better informed about the Flynn campaign's level of spending at the time I donated.
