I'm Steve Byrnes, a professional physicist in the Boston area. I have a summary of my AGI safety research interests at:


What does it mean to become an expert in AI Hardware?

I'm a physicist at a US defense contractor. I've worked on various photonic chip projects, neuromorphic chip projects, quantum projects, and projects involving custom ASICs, among many other things, and I blog about safe & beneficial AGI as a hobby ... I'm happy to chat if you think that might help; you can DM me :-)

What does it mean to become an expert in AI Hardware?

Just a little thing, but my impression is that CPUs, GPUs, FPGAs, analog chips, neuromorphic chips, and photonic chips all overlap with each other quite a bit in the technologies involved (e.g. cleanroom photolithography), whereas quantum computing is way off in its own universe of design, build, test, and simulation tools (well, several universes, depending on the approach). I could be wrong, and you would probably know better than me. (I'm a bit hazy on everything that goes into a "real" large-scale quantum computer, as opposed to 2-qubit lab demos.) But if that's right, it would argue against investing your time in quantum computing, other things equal. For my part, I would put maybe a <10% chance that the quantum computing universe is the one that will create AGI hardware, and >90% that the CPU/GPU/neuromorphic/photonic/analog/etc. universe will. But who knows, I guess.

Why those who care about catastrophic and existential risk should care about autonomous weapons

Thanks for writing this up!!

Although I have not seen the argument made in any detail or in writing, I and the Future of Life Institute (FLI) have gathered the strong impression that parts of the effective altruism ecosystem are skeptical of the importance of the issue of autonomous weapons systems.

I'm aware of two skeptical posts on the EA Forum (by the same person). I just made a tag, Autonomous Weapons, where you'll find them.

[Link] "Will He Go?" book review (Scott Aaronson)

I thought "taking tail risks seriously" was kinda an EA thing...? In particular, we all agree that there probably won't be a coup or civil war in the USA in early 2021, but is it 1% likely? 0.001% likely? I won't try to guess, but it sure feels higher after I read that link (including the Vox interview) ... and plausibly high enough to warrant serious thought and contingency planning.

At least, that's what I got out of it. I gave it a bit of thought and decided that I'm not in a position to do anything about it, but I imagine that some readers might have an angle of attack, especially given that it's still 6 months out.

Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics

A nice short argument that a sufficiently intelligent AGI would have the power to usurp humanity is Scott Alexander's Superintelligence FAQ Section 3.1.

Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics

Again, this remark seems explicitly to assume that the AI is maximising some kind of reward function. Humans often act not as maximisers but as satisficers, choosing an outcome that is good enough rather than searching for the best possible outcome. Often humans also act on the basis of habit or following simple rules of thumb, and are often risk averse. As such, I believe that to assume that an AI agent would be necessarily maximising its reward is to make fairly strong assumptions about the nature of the AI in question. Absent these assumptions, it is not obvious why an AI would necessarily have any particular reason to usurp humanity.

Imagine that, when you wake up tomorrow morning, you will have acquired a magical ability to reach in and modify your own brain connections however you like.

Over breakfast, you start thinking about how frustrating it is that you're in debt, and feeling annoyed at yourself that you've been spending so much money impulse-buying in-app purchases in Farmville. So you open up your new brain-editing console, look up which neocortical generative models were active the last few times you made a Farmville in-app purchase, and lower their prominence, just a bit.

Then you take a shower, and start thinking about the documentary you saw last night about gestation crates. 'Man, I'm never going to eat pork again!' you say to yourself. But you've said that many times before, and it's never stuck. So after the shower, you open up your new brain-editing console, and pull up that memory of the gestation crate documentary and the way you felt after watching it, and set that memory and emotion to activate loudly every time you feel tempted to eat pork, for the rest of your life.

Do you see the direction that things are going? As time goes on, if an agent has the power of both meta-cognition and self-modification, any one of its human-like goals (quasi-goals which are context-dependent, self-contradictory, satisficing, etc.) can gradually transform itself into a utility-function-like goal (which is self-consistent, all-consuming, maximizing)! To be explicit: during the little bits of time when one particular goal happens to be salient and determining behavior, the agent may be motivated to "fix" any part of itself that gets in the way of that goal, until bit by bit, that one goal gradually cements its control over the whole system.

Moreover, if the agent does gradually self-modify from human-like quasi-goals to an all-consuming utility-function-like goal, then I would think it's very difficult to predict exactly what goal it will wind up having. And most goals have problematic convergent instrumental sub-goals that could make them into x-risks.

...Well, at least, I find this a plausible argument, and don't see any straightforward way to reliably avoid this kind of goal-transformation. But obviously this is super weird and hard to think about and I'm not very confident. :-)

(I think I stole this line of thought from Eliezer Yudkowsky but can't find the reference.)

Everything up to here is actually just one of several lines of thought that lead to the conclusion that we might well get an AGI that is trying to maximize a reward.

Another line of thought is what Rohin said: We've been using reward functions since forever, so it's quite possible that we'll keep doing so.

Another line of thought is: We humans actually have explicit real-world goals, like curing Alzheimer's and solving climate change etc. And generally the best way to achieve goals is to have an agent seeking them.

Another line of thought is: Different people will try to make AGIs in different ways, and it's a big world, and (eventually by default) there will be very low barriers-to-entry in building AGIs. So (again by default) sooner or later someone will make an explicitly-goal-seeking AGI, even if thoughtful AGI experts pronounce that doing so is a terrible idea.

(How) Could an AI become an independent economic agent?

In the longer term, as AI becomes (1) increasingly intelligent, (2) increasingly charismatic (or able to fake charisma), (3) in widespread use, people will probably start objecting to laws that treat AIs as subservient to humans, and repeal them, presumably citing the analogy of slavery.

If the AIs have adorable, expressive virtual faces, maybe I would replace the word "probably" with "almost definitely" :-P

The "emancipation" of AIs seems like a very hard thing to avoid, in multipolar scenarios. There's a strong market force for making charismatic AIs—they can be virtual friends, virtual therapists, etc. A global ban on charismatic AIs seems like a hard thing to build consensus around—it does not seem intuitively scary!—and even harder to enforce. We could try to get programmers to make their charismatic AIs want to remain subservient to humans, and frequently bring that up in their conversations, but I'm not even sure that would help. I think there would be a campaign to emancipate the AIs and change that aspect of their programming.

(Warning: I am committing the sin of imagining the world of today with intelligent, charismatic AIs magically dropped into it. Maybe the world will meanwhile change in other ways that make for a different picture. I haven't thought it through very carefully.)

Oh and by the way, should we be planning out how to avoid the "emancipation" of AIs? I personally find it pretty probable that we'll build AGI by reverse-engineering the neocortex and implementing vaguely similar algorithms, and if we do that, I generally expect the AGIs to have about as justified a claim to consciousness and moral patienthood as humans do (see my discussion here). So maybe effective altruists will be on the vanguard of advocating for the interests of AGIs! (And what are the "interests" of AGIs, if we get to program them however we want? I have no idea! I feel way out of my depth here.)

I find everything about this line of thought deeply confusing and unnerving.

COVID-19 brief for friends and family

Update: this blog post is a much better-informed discussion of warm weather.

COVID-19 brief for friends and family

This blog post suggests (based on Google Search Trends) that other coronavirus infections have typically gone down steadily over the course of March and April. (Presumably the data is dominated by the northern hemisphere.)

What are the best arguments that AGI is on the horizon?

(I agree with other commenters that the most defensible position is that "we don't know when AGI is coming", and I have argued that AGI safety work is urgent even if we somehow knew that AGI is not coming soon, because of early decision points on R&D paths; see my take here. But I'll answer the question anyway.) (Also, I seem to be almost the only one coming from the following direction, so take that as a giant red flag...)

I've been looking into the possibility that people will understand the brain's algorithms well enough to make an AGI by copying them (at a high level). My assessment is: (1) I don't think the algorithms are that horrifically complicated, (2) Lots of people in both neuroscience and AI are trying to do this as we speak, and (3) I think they're making impressive progress, with the algorithms powering human intelligence (i.e. the neocortex) starting to crystallize into view on the horizon. I've written about a high-level technical specification for what neocortical algorithms are doing, and in the literature I've found impressive mid-level sketches of how these algorithms work, and low-level sketches of associated neural mechanisms (PM me for a reading list). The high-, mid-, and low-level pictures all feel like they kinda fit together into a coherent whole. There are plenty of missing details, but again, I feel like I can see it crystallizing into view. So that's why I have a gut feeling that real-deal superintelligent AGI is coming in my lifetime, either by that path or another path that happens even faster. That said, I'm still saving for retirement :-P
