quinn

https://quinnd.net

Host of the Technical AI Safety Podcast https://technical-ai-safety.libsyn.com

Co-organizer of EA Philly

Streams Linear Algebra Done Right in Coq on Twitch: https://twitch.tv/quinndougherty92

Comments

Hiring Director of Applied Data & Research - CES

Awesome! I probably won't apply, as I lack a political background and couldn't tell you the first thing about running a poll, but my eyes will be keenly open in case you post a broader data/analytics job as you grow. Good luck with the search! 

The Importance of Artificial Sentience

I'm thrilled about this post. During my first two or three years of studying math/CS and thinking about AGI, my primary concern was the rights and liberties of baby agents (though I wasn't giving suffering nearly adequate thought). Over the years I became more of an orthodox x-risk reducer, and while that process has been full of nutritious exercises, I fully admit that becoming orthodox is a good way to win colleagues and avoid getting shrugged off as a crank at parties, and this may have played a small role: if not motivated reasoning, then at least humbly deferring to people who seem to be thinking more clearly than I am. 

I think this area is sufficiently undertheorized and neglected that the following is only hypothetical, but it could become important: how should one trade off between existential safety (for humans) and suffering risks (for all minds)? 

  1. Value is complex and fragile. There are numerous reasons to be more careful than knee-jerk cosmopolitanism, and if one's intuitions say "for all minds, of course!", it's important to think through what steps one would have to take to become someone who thinks safeguarding humanity is more important than ensuring good outcomes for creatures in other substrates. This was best written about, to my knowledge, in Eliezer Yudkowsky's old Value Theory sequence and, to some extent, the Fun Theory sequence. While those aren't 100% satisfying, I don't think a single go-to sequence is the answer, since a lot of this should be left as an exercise for the reader.
  2. Is anyone worried that x-risk and s-risk signal a future of two opposing factions within EA? That is to say, what are the odds that there's no way for humanity-preservers and suffering-reducers to get along? You can easily imagine disagreement about how to trade off research resources between human existential safety and artificial welfare, but what if we had to reason about deployment? Do we deploy an AI that's 90% safe against some alien paperclipping outcome and achieves a 30% reduction in artificial suffering, or one that's 75% safe against paperclipping and achieves a 70% reduction in artificial suffering? (A crude sketch of why this is underdetermined follows after this list.)
  3. If we're lucky, there will be a galaxy-brained research agenda or program, with some holes or gaps in the theory or implementation that allow and even encourage coalition-building between humanity-preservers and suffering-reducers. I don't think we'll be this lucky in the limiting case, where one humanity-preserver and one suffering-reducer are each at the penultimate stage of their goals. However, we shouldn't be surprised if there is some overlap; the cooperative AI agenda comes to mind. 
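
A crude way to see why the deployment question in point 2 is underdetermined (a toy sketch for concreteness, with made-up linear scoring): suppose we could compress everything into a single weight $w$ on existential safety and $1 - w$ on reducing artificial suffering. Then the first system wins exactly when

$$0.90\,w + 0.30\,(1 - w) > 0.75\,w + 0.70\,(1 - w) \iff 0.55\,w > 0.40 \iff w \gtrsim 0.73,$$

so the decision turns entirely on a weight we currently have no principled way to set.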

I find myself shocked at point #2, at the inadequacy of the current theory of these tradeoffs. Is it premature to worry about that before the AS movement has even published a detailed agenda or proposal for how to allocate research effort, grounded in today's AI field? Much theorizing is needed to even get to that point, but it might be wise to think ahead. 

I look forward to reading the preprint this week. Thanks!

AMA: Ajeya Cotra, researcher at Open Phil

I've been increasingly hearing advice to the effect that "stories" are an effective way for an AI x-safety researcher to figure out what to work on: that drawing scenarios about how you think things could go well or poorly, then doing backward induction to derive a research question, is better than traditional methods of finding a research question. Do you agree with this? It seems like the uncertainty when you draw such scenarios is so massive that one couldn't make a dent in it, but do you think it's valuable for AI x-safety researchers to make significant investments (i.e. more than 30% of their time) in both 1. doing this directly, by telling stories and attempting backward induction, and 2. training so that their stories will be better and more reflective of reality (by studying forecasting, for instance)? 

What would an EA do in the american revolution?

So I read Gwern and I also read this Dylan Matthews piece, and I'm fairly convinced the revolution did not lead to the best outcomes for slaves and indigenous people. I think there are two cruxes for believing it would have been possible to make this determination in real time: 

  1. as Matthews points out, follow the preferences of slaves.
  2. notice that one complaint in the Declaration of Independence was that the British wanted to extend citizenship to indigenous people. 

One of my core assumptions, which is up for debate, is that EAs ought to focus on outcomes for slaves and indigenous people more than on outcomes in general. 

Promoting EA to billionaires?

I'm puzzled by the lack of a push to convert Patrick Collison. Paul Graham once tweeted that Stripe would be the next Google, so if Patrick Collison doesn't qualify as a billionaire yet, it might be a good bet that he will someday (I'm not strictly basing that on PG's authority; I'm also basing it on my personal opinion that Stripe seems like world-domination material). He co-wrote this piece, "We Need a New Science of Progress", and from what I heard in this interview, signs point to a very EA-sympathetic person. 

What would an EA do in the american revolution?

My first guess, based on the knowledge I have, is that the abolitionist faction was good, and that supporting them would have been necessary (though maybe not sufficient) for an EA at the time. Additionally, my guess is that I'd have been able to determine this in real time. 

My upcoming CEEALAR stay

Maybe! I'm only aiming for a steady stream of 2-3 chapters per week. Get in touch if you're interested. I'm re-reading the first quarter of PLF, since a new version was published in the time since I worked through it. 

My upcoming CEEALAR stay

Thanks for the comment. I wasn't aware of your and Rohin's discussion on Arden's post. Did you flesh out the inductive alignment idea on LW or the Alignment Forum? It seems really promising to me. 

Today I want to jot down some notes more substantive than "wait until I post 'Going Long on FV' in a few months". 

FV in AI Safety in particular

As Rohin's comment suggests, both aiming proofs about properties of models at today's type theories and aiming tomorrow's type theories at ML face two classes of obstacles: 1. is it possible? 2. can it be made competitive?
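
To make obstacle 1 concrete, here is a toy sketch of what "a proof about a property of a model" can look like in a present-day proof assistant; the "model" and the specification below are stand-ins I made up, and real verified-ML work targets trained networks and much richer properties:

```coq
(* Toy sketch: a stand-in "model" and a stand-in specification.
   Nothing here reflects actual verified-ML research; it only shows
   the shape of "a proof about a property of a model" in Coq. *)
Require Import ZArith Lia.
Open Scope Z_scope.

(* A stand-in "model": a fixed affine function on integers. *)
Definition model (x : Z) : Z := 2 * x + 1.

(* A stand-in specification: bounded inputs yield bounded outputs. *)
Theorem model_bounded : forall x, 0 <= x <= 10 -> 1 <= model x <= 21.
Proof.
  intros x [Hlo Hhi]. unfold model. lia.
Qed.
```

The hard part of obstacle 1 is that nothing like `model` above resembles a real network, and nothing like `model_bounded` resembles a safety property we actually care about.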

I've gathered that there's a lot of pessimism about 1, in spite of MIRI's investment in type theory and in spite of the word "provably" in CHAI's charter. My personal expected path to impact, as far as 1 is concerned, is "wait until theorists smarter than me figure it out"; I want to position myself to worry about 2. 

I think there's a distinction between theories and products, and I think programmers need to be prepared to commercialize results. There's a fundamental question: should we expect that a theory's competitiveness can be improved by one or more orders of magnitude through engineering effort, or will engineering effort only provide improvements of less than an order of magnitude? I think a lot depends on how you feel about this. 

Asya: 

While I agree that proof assistants right now are much slower than doing math proofs yourself, verification is a pretty immature field. I can imagine them becoming a lot better such that they do actually become better to use than doing math proofs yourself, and don't think this would be the worst thing to invest in.

Asya may not have been speaking about AI safety here, but my basic thinking is that if less primitive proof assistants end up drastically more competitive, and at the same time there are opportunities to convert results in verified ML into tooling, then expertise in this area could gain a lot of leverage. 
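
As a toy illustration of the "tooling" direction (again a made-up stand-in, not a claim about how verified ML would actually be productized): once a definition carries a proof, Coq's extraction mechanism can emit executable OCaml from it, which is one concrete sense of turning a verified result into a tool.

```coq
(* Toy sketch: prove a property of a tiny function, then extract it. *)
Require Import ZArith Lia.
Require Extraction.
Open Scope Z_scope.

(* A tiny stand-in component: integer ReLU. *)
Definition relu (x : Z) : Z := if x <=? 0 then 0 else x.

(* Its specification: the output is never negative. *)
Lemma relu_nonneg : forall x, 0 <= relu x.
Proof.
  intros x. unfold relu.
  destruct (x <=? 0) eqn:E; simpl.
  - lia.                       (* case x <=? 0 = true: relu x reduces to 0 *)
  - apply Z.leb_gt in E. lia.  (* case false: relu x reduces to x, and E gives 0 < x *)
Qed.

(* Print an OCaml version of the verified definition. *)
Extraction Language OCaml.
Recursive Extraction relu.
```

The distance between this kind of toy and a competitive verified-ML toolchain is exactly the order-of-magnitude engineering question above.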

FV in other paths to impact

Rohin: 

it is plausibly still worthwhile becoming an expert on formal verification because of the potential applications to cybersecurity. (Though it seems like in that case you should just become an expert on cybersecurity.)

It's not clear to me that grinding FV directly is as wise as, say, pursuing CompTIA certifications. From the expectation that FV pays dividends in advanced cybersecurity, we cannot conclude that FV is relevant to the early stages of a cybersecurity path. 

Related: Information security careers for GCR reduction. I think the software safety standards in a wide variety of fields have a lot of leverage over outcomes.