All of robertskmiles's Comments + Replies

I have a couple of videos that talk about this! This one sets up the general idea:

This one talks about how likely this is to happen in practice:

My perspective, I think, is that most of the difficulties that people think of as being the extra, hard part of one->many alignment are already present in one->one alignment. A single human is already a barely coherent mess of conflicting wants and goals interacting chaotically, and the strong form of "being aligned to one human" requires a solution that can resolve values conflicts between incompatible 'parts' of that human and find outcomes that are satisfactory to all interests. Expanding this to more than one person is a change of degree but ... (read more)

2 · Geoffrey Miller · 1y
Hi Robert, thanks for your perspective on this. I love your YouTube videos by the way -- very informative and clear, and helpful for AI alignment newbies like me. My main concern is that we still have massive uncertainty about what proportion of 'alignment with all humans' can be solved by 'alignment with one human'. It sounds like your bet is that it's somewhere above 50% (maybe?? I'm just guessing); whereas my bet is that it's under 20% -- i.e. I think that aligning with one human leaves most of the hard problems, and the X risk, unsolved.  And part of my skepticism in that regard is that a great many humans -- perhaps most of the 8 billion on Earth -- would be happy to use AI to inflict harm, up to and including death and genocide, on certain other individuals and groups of humans. So, AI that's aligned with frequently homicidal/genocidal individual humans would be AI that's deeply anti-aligned with other individuals and groups.

To avoid lowering the quality of discussion by just posting snarky memes, I'll explain my actual position:
"People may have bad reasons to believe X" is a good counter against the argument "People believe X, therefore X".  So for anyone whose thought process is "These EAs are very worried about AI so I am too", I agree that there's a worthwhile discussion to be had about why those EAs believe what they do, what their thought process is, and the track record both of similar claims and of claims made by people using similar thought processes. This is bec... (read more)

This sounds like a really useful thing to make!

Do you think there would be value in using the latest language models to do semantic search over this set of (F)AQs, so people can easily see if a question similar to theirs has already been answered? I ask because I'm thinking of doing this for AI Safety questions, in which case it probably wouldn't be far out of my way to do it for librarian questions as well.
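To make the idea a bit more concrete, here's a minimal sketch of the kind of thing I mean, assuming an off-the-shelf sentence-embedding model; the model name and the example FAQ entries are just placeholders, not the actual dataset or implementation:

```python
# Minimal sketch: semantic search over an FAQ using sentence embeddings.
# Assumes the `sentence-transformers` package; model name and FAQ entries
# below are illustrative placeholders only.
from sentence_transformers import SentenceTransformer, util

faq = [
    "What do people mean by the word utility?",
    "What are some intro resources on biosecurity for non-biologists?",
    "How do I test my fit for AI safety research?",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
faq_embeddings = model.encode(faq, convert_to_tensor=True)

def most_similar(question, top_k=3):
    """Return the FAQ entries most semantically similar to `question`."""
    query_embedding = model.encode(question, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, faq_embeddings)[0]
    ranked = sorted(zip(faq, scores.tolist()), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

print(most_similar("Where can I learn the basics of biosecurity?"))
```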

1 · calebp · 2y
That sounds super cool! I expect this will work best for broader/more general questions, e.g. "What do people mean by the word utility?" or "I'm interested in biosecurity, what are some intro resources that would be suitable for someone with little background in biology?", as opposed to "I am a 3rd year undergraduate with a double major in CS and Music; I am worried that majoring in music might put off employers in the AI safety space, how should I test my assumption?". I could of course be wrong about the types of questions this would be better for. Questions so far have been more of the form of the latter than the former. I am not entirely sure why this is, and we have some ideas for generating more questions like the former, so I don't know what the distribution will be like in a few weeks. I'll make a note to get back to you on this further down the line if I think that it would be useful.

I see several comments here expressing an idea like "Perhaps engaging writing is better, but is it worth the extra effort?", and I just don't think that that trade-off is actually real for most people. I think a more conversational and engaging style is quicker and easier to write than the slightly more formal and serious tone which is now the norm. Really good, polished, highly engaging writing may be more work, but on the margin I think there's a direction we can move that is downhill from here on both effort and boringness.

The S-Process is fascinating to me! Do you know of any proper write-ups of how it works? I'm especially interested in code or pseudocode, as I might want to try applying something similar to one of my projects.

2 · Larks · 2y
Unfortunately I don't think so. Here is a rough summary, based on my recollections, but I was only involved in one part of it so my memory or understanding might be awry:

* Charities etc. submit applications
* Funders choose evaluators to deputise (can be paid or unpaid)
* Evaluators read applications, do calls, read background, other due diligence etc.
* Evaluators write up their notes and assign the following parameters for each grant they looked at:
  * Marginal Utility of the First Dollar to this application
    * The process is invariant under a linear transformation so this is less onerous than it sounds
  * Dollar at which Marginal Utility = 0
  * (Optional) convexity/concavity
* Evaluators read each others' notes and discuss, then make any final adjustments to their own inputs.
* Funders read these notes and review recordings of the discussions.
* Funders assign the following parameters to the Evaluators:
  * Marginal Utility of the First Dollar to this Evaluator
  * Dollar at which Marginal Utility = 0
  * (Optional) convexity/concavity
* The simulation then basically waterfalls the dollars down, where each funder gives $1,000 to the evaluator they think has the highest marginal utility, who then gives it to the charity they think has the highest marginal utility. Then all the marginal utilities are updated, and the next $1,000 is allocated to an Evaluator, who again then allocates it to a charity.

There were also some other 'social' elements like disclosure and conflict of interest policies and the like. This has a number of properties:

* If an application is really liked by any one evaluator it will get funded, even if the others dislike it (unless they can persuade the one otherwise).
* Not every evaluator has to look at every grant.
* There is less incentive for evaluators to be dishonest than in other systems.
* It can be counter-intuitive what individual evaluators end up funding, because all their favourite ideas might end up
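Since code or pseudocode was asked for: here is a minimal sketch of the waterfall allocation described above, under the assumption that marginal utility declines linearly from its first-dollar value down to zero at the stated cutoff, ignoring the optional convexity/concavity adjustment; all names and numbers are illustrative, not the actual S-Process implementation.

```python
# Minimal sketch of the waterfall allocation described above.
# Assumption (not from the original): marginal utility declines linearly
# from its first-dollar value to zero at the stated cutoff; the optional
# convexity/concavity parameter is ignored.

def marginal_utility(mu_first_dollar, zero_at, allocated):
    """Linearly declining MU: `mu_first_dollar` at $0, reaching 0 at `zero_at`."""
    if allocated >= zero_at:
        return 0.0
    return mu_first_dollar * (1 - allocated / zero_at)

def waterfall(budget, evaluators, grants, step=1000):
    """Allocate `budget` in `step`-sized chunks.

    evaluators: {name: (mu_first_dollar, zero_at)} -- set by the funder
    grants: {evaluator: {application: (mu_first_dollar, zero_at)}} -- set by evaluators
    """
    to_evaluator = {e: 0.0 for e in evaluators}
    to_grant = {e: {g: 0.0 for g in gs} for e, gs in grants.items()}

    while budget >= step:
        # The funder hands the next chunk to the evaluator with the highest current MU.
        e = max(evaluators, key=lambda ev: marginal_utility(*evaluators[ev], to_evaluator[ev]))
        if marginal_utility(*evaluators[e], to_evaluator[e]) <= 0:
            break  # every evaluator is saturated
        to_evaluator[e] += step

        # That evaluator passes the chunk to the application they value most right now.
        g = max(grants[e], key=lambda ap: marginal_utility(*grants[e][ap], to_grant[e][ap]))
        to_grant[e][g] += step
        budget -= step

    return to_grant

# Toy example: one funder with $10k, two evaluators, three applications.
evaluators = {"A": (1.0, 8000), "B": (0.8, 10000)}
grants = {
    "A": {"charity_1": (1.0, 5000), "charity_2": (0.5, 10000)},
    "B": {"charity_2": (0.9, 6000), "charity_3": (0.7, 4000)},
}
print(waterfall(10000, evaluators, grants))
```

In this toy run each $1,000 goes first to whichever evaluator the funder currently values most, then to whichever application that evaluator currently values most, with all marginal utilities updating between chunks. That is the mechanism behind the property that a single enthusiastic evaluator can get an application funded even if the others dislike it.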

Exciting! But where's the podcast's URL? All I can find is a Spotify link.

Edit: I was able to track it down, here it is https://spkt.io/f/8692/7888/read_8617d3aee53f3ab844a309d37895c143

4 · Kat Woods · 3y
Here it is! Spotify, Google Podcasts, Pocket Casts. Or just search for it in your preferred podcasting app. We put it on all the biggest ones. Just let us know if it's not on one and it'll be easy enough for us to add it.  Added clarification at the top of the post to make it easier to find. :) 

A minor point, but I think this overestimates the extent to which a small number of people with an EA mindset can help in crowded cause areas that lack such people. Like, I don't think PETA's problem is that there's nobody there talking about impact and effectiveness. Or rather, that is their problem, but adding a few people to do that wouldn't help much, because they wouldn't be listened to. The existing internal political structures and discourse norms of these spaces aren't going to let these ideas gain traction, so while EAs in these areas might be abl... (read more)

2 · nadavb · 3y
I totally agree. In order for an impact-oriented individual to contribute significantly in an area, there has to be some degree of openness to good ideas in that area, and if it is likely that no one will listen to evidence and reason then I'd tend to advise EAs to stay away. I think there are such areas where EAs could contribute and be heard. And I think the more mainstream the EA mindset becomes, the more such places will exist. That's one of the reasons why we really should want EA to become more mainstream, and why we shouldn't hide ourselves from the rest of the world by operating in such a narrow set of domains.