Konstantin

Bio

Happy to chat about AI policy, meta-ethics, movement building, and many other things. Just text me and we can schedule a call!

How others can help me

Skill-building in alignment and AI policy. Want to be my study buddy? Hit me up with any projects/reading groups you want to start :)

I'm also organizing a reading group on AI governance for people with a bit of background knowledge (e.g. having done the AGISF governance track). Get in touch if you'd like to join!

 

Please give me (anonymous) feedback on our interaction.

Comments

Since these dishes are not on the menu, here are the options:

1. Risotto with green beans, zucchini, and onions with red pesto, 13.50€

2. Pasta with broccoli, onions, and beetroot sauce + roasted almonds, 12.70€

3. Fried tofu with salad, olives, dried tomatoes, cucumber, bell peppers, French dressing, and baguette, 13.90€

 

Please bring cash, as the restaurant doesn't accept card payments.

Re: bioweapons convention: Good point, so maybe it's not as straightforward as I described.

Re: predicting AI: You can always choose not to publish the research you are doing, or only inform safety-focused institutions about it. I agree that there are some possible downsides to knowing more precisely when AI will be developed, but there seem to be much worse downsides to not knowing (mainly that nobody is preparing for it policy- and coordination-wise).
I think the biggest risk is getting governments too excited about AI. So I'm actually not super confident that any work on this is 10x more likely to be positive.

Re: policy & alignment: I'm very confident, that there is some form of alignment work that is not speeding up capabilities, especially the more abstract one. Though I agree on interpretability. On policy, I would also be surprised if every avenue of governance was as risky as you describe. Especially laying out big picture strategies and monitoring AI development seem pretty low-risk.

Overall, I think you have done a good job scrutinizing my claims, and I'm much less confident now. Still, I'd be really surprised if every type of longtermist work were as risky as your examples, especially for someone as safety-conscious as you are. (Actually, one clearly positive contribution might be criticizing different approaches and showing their downsides.)
 

Note that even though alignment research may sometimes speed up AI development, most AI safety work still makes alignment more likely overall. So I agree that there are downsides here, but it seems really wild to think that it would be better not to do any alignment research at all.

That's also my understanding. However, Will probably has some influence over it, e.g. he could ask his literary agent to actively approach publishers and even offer money to foreign publishers to translate the book.

Some ideas for career paths that I think have a very low chance of terrible outcomes and a reasonable chance of doing a ton of good for the long-term future (I'm not claiming that they will definitely be net positive; I'm claiming they are more than 10x more likely to be net positive than net negative):

  • Developing early warning systems for future pandemics (and related work) (technical bio work)
  • Strengthening the bioweapons convention and building better enforcement mechanisms (bio policy)
  • Predicting how fast powerful AI is going to be developed to get strategic clarity (AI strategy)
  • Developing theories of how to align AI and reasoning about how they could fail (AI alignment research)
  • Building institutions that are ready to govern AI effectively once it starts being transformative (AI governance)

Besides these, I think that almost all the work longtermists do today has positive expected value, even if it has large downsides. Your comparison to deworming isn't perfect: failed deworming doesn't cause direct harm, so it is still better to give money to ineffective deworming than to do nothing.

Please try to get this book translated into as many languages as possible! I think it's a great chance to draw attention to longtermism in non-English-speaking countries too. Happy to help organize a German translation!

Agree that it depends a lot on the training procedure. However, given high situational awareness, I think we should expect the AI to know its own shortcomings very well.

So I agree that it won't be able to do a backflip on the first try. But it will know that it would likely fail and thus won't rely on plans that require backflips; or, if it does need backflips, it will find a way of learning them without arousing suspicion (e.g. by manipulating a human into training it to do backflips).

I think overthrowing humanity is certainly hard. But it still seems possible for a patient AGI that slowly accumulates wealth and power by exploiting human conflicts, getting involved in crucial economic processes, and potentially gaining control of military communication systems using deepfakes and the wealth and power it has accumulated. (And all of this can be done just by interacting with a computer interface, as in Cotra's example.) It's also fairly likely that there are exploits in the way humans work that we are not aware of, which the AGI would learn from being trained on tons of data, making this even easier.

So overall, I agree the AGI will have bugs, but it will also know that it likely has bugs and will therefore be very careful with any attempt at overthrowing humanity.

Interesting perspective. Though, drawing on Cotra's recent post: if the first AGI is developed through iterations of reinforcement learning in different domains, it seems likely that it will develop a rather accurate view of the world, since that yields the highest rewards. This means the AGI will have high situational awareness, i.e. it will know that it's an AGI and will very likely know about human biases. I therefore think it will also be aware that it contains mental bugs itself and may start actively trying to fix them (since that will be reinforced, as it leads to higher rewards in the long run).
I thus think we should expect it to contain a surprisingly low number of very general bugs such as weird ways of thinking or false assumptions in its worldview.
That's why I believe the first AGI will already be very capable, and smart enough to hide for a long time until it strikes and overthrows its owners.

If you seriously think switching to Notion would improve the productivity of some orgs by 10%, you should write this up as fast as possible and convince them to do so!
 

I honestly don't see why.
I think I'm well below 130, and still 80k advised me. The texts they write about why AI might literally kill all of us and what I could do to prevent that are relevant not only for Oxford graduates but also for me, someone who just attended an average German university. I think everyone can contribute to the world's most pressing problems. What's needed is not intelligence but ambition and open-mindedness. EA is not just math geniuses working on abstract problems; it's hundreds of people running the everyday work of organizations, coming up with new approaches to community building, becoming politically active to promote animal welfare, or earning money to donate to the most important causes. None of these require an above-average IQ.
