All of alex lawsen (previously alexrjl)'s Comments + Replies

Why do you think superforecasters who were selected specifically for assigning a low probability to AI x-risk are well described as "a bunch of smart people with no particular reason to be biased"?

For the avoidance of doubt, I'm not upset that the supers were selected in this way, it's the whole point of the study, made very clear in the write-up, and was clear to me as a participant. It's just that "your arguments failed to convince randomly selected superforecasters" and "your arguments failed to convince a group of superforecasters who were specifically selected for confidently disagreeing with you" are very different pieces of evidence.

One small clarification: the skeptical group was not all superforecasters. There were two domain experts as well. I was one of them.

I'm sympathetic to David's point here. Even though the skeptic camp was selected for their skepticism, I think we still get some information from the fact that many hours of research and debate didn't move their opinions. I think there are plausible alternative worlds where the skeptics come in with low probabilities (by construction), but update upward by a few points after deeper engagement reveals holes in their early thinking.

6
David Mathers
1mo
Ok, I slightly overstated the point. This time, the supers selected were not a (mostly) random draw from the set of supers. But they were in the original X-risk tournament, and in that case too, they were not persuaded to change their credences via further interaction with the concerned (that is, the X-risk experts). Then, when we took the more skeptical of them and gave them yet more exposure to AI safety arguments, that still failed to move the skeptics. I think taken together, these two results show that AI safety arguments are not all that persuasive to the average super. (More precisely, that no amount of exposure to them will persuade supers as a group to the point where they get a median significantly above 0.75% in X-risk by the century's end.)

They weren't randomly selected, they were selected specifically for scepticism!

2
David Mathers
1mo
Ok yes, in this case they were. But this is a follow-up to the original X-risk tournament, where the selection really was fairly random (obviously not perfectly so, but it's not clear in what direction selection effects in which supers participated biased things). And in the original tournament, the supers were also fairly unpersuaded (mostly) by the case for AI X-risk. Or rather, to avoid putting it in too binary a way, they did not move their credences further on hearing more argument after the initial round of forecasting. (I do think the supers' level of concern was enough to motivate worrying about AI given how bad extinction is, so "unpersuaded" is a little misleading.) At that point, people then said 'they didn't spend enough time on it, and they didn't get the right experts'. Now, we have tried further with different experts, more time and effort, lots of back and forth, etc., and those who participated in the second round are still not moved. Now, it is possible that the only reason the participants were not moved the second time round was because they were more skeptical than some other supers the first time round. (Though the difference between medians of 0.1% and 0.3% in X-risk by 2100 is not that great.) But I think if you get 'in imperfect conditions, a random smart crowd were not moved at all, then we tried the more skeptical ones in much better conditions and they still weren't moved at all', the most likely conclusion is that even people from the less skeptical half of the distribution from the first go round would not have moved their credences either had they participated in the second round. Of course, the evidence would be even stronger if the people had been randomly selected the first time as well as the second.

The smart people were selected for having a good predictive track record on geopolitical questions with resolution times measured in months, a track record equaled or bettered by several* members of the concerned group. I think this is much less strong evidence of forecasting ability on the kinds of question discussed than you do.

*For what it's worth, I'd expect the skeptical group to do slightly better overall on e.g. non-AI GJP questions over the next 2 years, they do have better forecasting track records as a group on this kind of question, it's just not a stark difference.

6
David Mathers
1mo
TL;DR Lots of things are believed by some smart, informed, mostly well calibrated people. It's when your arguments are persuasive to (roughly) randomly selected smart, informed, well-calibrated people that we should start being really confident in them. (As a rough heuristic, not an exceptionless rule.) 

I agree this is quite different from the standard GJ forecasting problem. And that GJ forecasters* are primarily selected for and experienced with forecasting quite different sorts of questions. 

But my claim is not "trust them, they are well-calibrated on this". It's more "if your reason for thinking X will happen is a complex multi-stage argument, and a bunch of smart people with no particular reason to be biased, who are also selected for being careful and rational on at least some complicated emotive stuff, spend hours and hours on your argument an... (read more)

The first bullet point of the concerned group summarizing their own position was "non-extinction requires many things to go right, some of which seem unlikely".

This point was notably absent from the sceptics' summary of the concerned position.

Both sceptics and concerned agreed that a different important point on the concerned side was that it's harder to use base rates for unprecedented events with unclear reference classes.

I think these both provide a much better characterisation of the difference than the quote you're responding to.

I'm still saving for retirement in various ways, including by making pension contributions.

If you're working on GCR reduction, you can always consider your pension savings a performance bonus for good work :)

3
NickLaing
2mo
Sometimes I wish we had laughing emoji on the form for nice comments like this. But I get the downsides too :D.

I'm not officially part of the AMA but I'm one of the disagreevotes so I'll chime in.

As someone who's only recently started: the vibe this post gives, that it's hard for me to disagree with established wisdom and/or push the org to do things differently, and that my only role is therefore to 'just push out more money along the OP party line', is just miles away from what I've experienced.

If anything, I think how much ownership I've needed to take for the projects I'm working on has been the biggest challenge of starting the role. It's one that (I hope) I'm rising to... (read more)

4
Austin
5mo
Ah, sorry you got that impression from my question! I mostly meant Harvard in terms of "desirability among applicants" as opposed to "established bureaucracy". My outside impression is that a lot of people I respect a lot (like you!) made the decision to go work at OP instead of one of their many other options. And that I've heard informal complaints from leaders of other EA orgs, roughly "it's hard to find and keep good people, because our best candidates keep joining OP instead". So I was curious to learn more about OP's internal thinking about this effect.

I think (1) unfortunately ends up not being true in the intensive farming case. Lots of things are spread by close enough contact that even intense UVC wouldn't do much (and it would be really expensive).

I wouldn't expect the attitude of the team to have shifted much in my absence. I learned a huge amount from Michelle, who's still leading the team, especially about management. To the extent you were impressed with my answers, I think she should take a large amount of the credit.

On feedback specifically, I've retained a small (voluntary) advisory role at 80k, and continue to give feedback as part of that, though I also think that the advisors have been deliberately giving more to each other.

The work I mentioned on how we make introductions to others and tr... (read more)

This seems extremely uncharitable. It's impossible for every good thing to be the top priority, and I really dislike the rhetorical move of criticising someone who says their top priority is X for not caring at all about Y. 

In the post you're replying to Chana makes the (in my view) virtuous move of actually being transparent about what CH's top priorities are, a move which I think is unfortunately rare because of dynamics like this. You've chosen to interpret this as 'a decision not to have' [other nice things that you want], apparently realised that... (read more)

6
Defacto
7mo
Your reply contains a very strong and, in my view, highly incorrect read, and says I am far too judgemental and critical. Please review my comment again.
1. I'm simply pointing to a practice or principle common in many orgs, companies, startups and teams: to have principles and flow from them, in addition to "maximizing EV" or "maximizing profits". This may be wrong or right.
2. I'm genuinely not judging but keeping it open; like, I literally said this. I specifically suggest writing. While this wasn't the focus, and I haven't thought about it, I probably do think Chana's writing is virtuous.
I actually have very specific reasons to think why the work is shallow, but this is a distinct thing from the principle or choice I've talked about. Community health is hard and the team is sort of given an awkward ball to catch.
An actual uncharitable opinion: I understand this is the EA forum, so as one of the challenges of true communication, critiques and devastating things written by "critics" are often masked or couched as insinuations, but I don't feel like this happened and I kind of resent having to put my comments through these lenses.

I'm fairly disappointed with how much discussion I've seen recently that either doesn't bother to engage with ways in which the poster might be wrong, or only engages with weak versions. It's possible that the "debate" format of the last week has made this worse, though not all of the things I've seen were directly part of that.

I think that not engaging at all, and merely presenting one side while saying that's what you're doing, seems better than presenting and responding to counterarguments (but only the weak ones), which still seems better than strawmanning arguments that someone else has presented.

Thank you for all of your work organizing the event, communicating about it, and answering people's questions. None of these seem like easy tasks!

I'm no longer on the team but my hot take here is that a good bet is just going to be trying really hard to work out which tools you can use to accelerate/automate/improve your work. This interview with Riley Goodside might be interesting to listen to, not only for tips on how to get more out of AI tools, but also to hear about how the work he does in prompting those tools has rapidly changed, but that he's stayed on the frontier because the things he learned have transferred.

Hey, it's not a direct answer but various parts of my recent discussion with Luisa cover aspects of this concern (it's one that frequently came up in some form or other when I was advising), in particular, I'd recommend skimming the sections on 'trying to have an impact right now', 'needing to work on AI immediately', and 'ignoring conventional career wisdom'.

It's not a full answer but I think the section of my discussion with Luisa Rodriguez on 'not trying hard enough to fail' might be interesting to read/listen to if you're wondering about this. 

Responding here to parts of the third point not covered by "yep, not everyone needs identical advice, writing for a big audience is hard" (same caveats as the other reply):

"And for years it just meant I ended up being in a role for a bit, and someone suggested I apply for another one. In some cases, I got those roles, and then I’d switch because of a bunch of these biases, and then spent very little time getting actually very good at one thing because I’ve done it for years or something." - are you sure this is actually bad? If each time you moved to somet

... (read more)

I don't think it's worth me going back and forth on specific details, especially as I'm not on the web team (or even still at 80k), but these proposals are different to the first thing you suggested. Without taking a position on whether this structure would overall be an improvement, it's obviously not the case that just having different sections for different possible users ensures that everyone gets the advice they need.

For what it's worth, one of the main motivations for this being an after-hours episode, which was promoted on the EA forum and my twitte... (read more)

[I left 80k ~a month ago, and am writing this in a personal capacity, though I showed a draft of this answer to Michelle (who runs the team) before posting and she agrees it provides an accurate representation. Before I left, I was line-managing the 4 advisors, two of whom I also hired.]

Hey, I wanted to chime in with a couple of thoughts on your followup, and then answer the first question (what mechanisms do we have in place to prevent this). Most of the thoughts on the followup can be summarised by ‘yeah, I think doing advising well is really hard’.

Advis

... (read more)
3
Grumpy Squid
7mo
Thanks for this in-depth response, it makes me feel more confident in the processes for the period of time when you were at 80K. However, since you have left the team, it would be helpful to know which of these practices your successor will keep in place and how much they will change - for example, since you mentioned you were on the high end for giving feedback on calls. My understanding of many meta EA orgs is that individuals have a fair amount of autonomy. This definitely has its upsides, but it also means that practices can change (substantially) between managers.

Thanks for asking these! Quick reaction to the first couple of questions, I'll get to the rest later if I can (personal opinions, I haven't worked on the web team, no longer at 80k etc. etc.):


I don't think it's possible to write a single page that gives the right message to every user. Looking at the pressing problems page, the second paragraph visible on that page is entirely caveat. It also links to an FAQ, multiple parts of which directly talk about whether people should just take the rankings as given. When you then click through to the... (read more)

5
Yonatan Cale
7mo
Hey Alex :)
1. My own attempt to solve this is to have the article MAINLY split up into sections that address different readers, which you can skip to.
2.
   2.2. [edit: seems like you agree with this. TL;DR: too many caveats already] My own experience from reading EA material in general, and 80k material specifically, is that there are going to be lots of caveats which I didn't (and maybe still don't) know how to parse. It feels almost like how people are polite in the U.S. (but not in Israel), or like fluff in emails: I'm never sure how to understand it, so I mostly skip it.
   2.3. [more important] I think there's a difference between saying "we're not sure" and saying "here's a way that reading this advice might do you more harm than good" (which I personally often write in my docs (example), including something I wrote today)

[not on the LTFF and also not speaking for Open Phil, just giving a personal take]

A few reactions:

  • Which AI systems are conscious seems like a good candidate for an extremely important problem humanity will need to solve at some point.
  • Studying current systems, especially in terms of seeing what philosophical theories of consciousness have to say about them, seems like a reasonable bet to make right now if you're excited about this problem being solved at some point.
  • To the extent you want to bet on a person to help push this field forward, Rob seems like an exce
... (read more)

Can confirm that:

"sr EAs [not taking someone seriously if they were] sloppy in their justification for agreeing with them"

sounds right based on my experience being on both sides of the "meeting senior EAs" equation at various times.

(I don't think I've met Quinn, so this isn't a comment on anyone's impression of them or their reasoning)

So there's now a bunch of speculation in the comments here about what might have caused me and others to criticise this post. 

I think this speculation puts me (and, FWIW, HLI) in a pretty uncomfortable spot for reasons that I don't think are obvious, so I've tried to articulate some of them:
- There are many reasons people might want to discuss others' claims but not accuse them of motivated reasoning/deliberately being deceptive/other bad faith stuff, including (but importantly not limited to): 
a) not thinking that the mistake (or any other behav... (read more)

My comment wasn't about whether there are any positives in using WELLBYs (I think there are), it was about whether I thought that sentence and set of links gave an accurate impression. It sounds like you agree that it didn't, given you've changed the wording and removed one of the links. Thanks for updating it.

I think there's room to include a little more context around the quote from TLYCs.

 



In short, we do not seek to duplicate the excellent work of other charity evaluators. Our approach is meant to complement that work, in order to expand the list o

... (read more)

[Speaking for myself here]

I also thought this claim by HLI was misleading. I clicked several of the links and don't think James is the only person being misrepresented. I also don't think this is all the "major actors in EA's GHW space" - TLYCS, for example, meet reasonable definitions of "major" but their methodology makes no mention of WELLBYs.

4
MichaelPlant
9mo
Hello Alex, Reading back on the sentence, it would have been better to put 'many' rather than 'all'. I've updated it accordingly. TLYCS don't mention WELLBYs, but they did make the comment "we will continue to rely heavily on the research done by other terrific organizations in this space, such as GiveWell, Founders Pledge, Giving Green, Happier Lives Institute [...]". It's worth restating the positives. A number of organisations have said that they've found our research useful. Notably, see the comments by Matt Lerner (Research Director, Founders Pledge) below and also those from Elie Hassenfeld (CEO, GiveWell), which we included in footnote 3 above. If it wasn't for HLI's work pioneering the subjective wellbeing approach and the WELLBY, I doubt these would be on the agenda in effective altruism.

I find this surprising, given that I've heard numbers more like 100-200 $/h claimed by people considerably more senior than top-uni community builders (and who are working in similar fields/with similar goals).

1[comment deleted]10mo

(I'm straight up guessing, and would be keen for an answer from someone familiar with this kind of study)

This also confused me. Skimming the study, I think they're calculating efficacy from something like how long it takes people to get malaria after the booster, which makes sense because you can get it more than once. Simplifying a lot (and still guessing), I think this means that if e.g. on average people get malaria once a week, and you reduce it to once every 10 weeks, you could say this has a 90% efficacy, even though if you looked at how many people ... (read more)

5
JoshuaBlake
1y
They're using 1 minus the hazard ratio: the reduction in the proportion of not-yet-infected people who are infected at any given time. That is, an 80% efficacy would mean that if x% of unvaccinated and as yet uninfected people were infected at some time, then (1-0.8)x% of vaccinated and as yet uninfected people would be. The advantage here is that the proportion of people infected would (unless your vaccine is perfect) eventually go to 100% in both groups, so how long you follow them up for will matter a lot.
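
To make the difference concrete, here's a minimal sketch (my own toy numbers and simplifying assumptions, not figures from the study) comparing the hazard-ratio definition of efficacy with the naive "reduction in the proportion ever infected" definition, assuming constant infection hazards:

```python
# Toy simulation: constant weekly hazards, so time-to-infection is exponential.
# All numbers below are illustrative assumptions, not figures from the study.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
base_hazard = 0.05                               # assumed weekly hazard, unvaccinated
efficacy_hr = 0.80                               # efficacy defined as 1 - hazard ratio
vacc_hazard = base_hazard * (1 - efficacy_hr)

t_unvacc = rng.exponential(1 / base_hazard, n)   # weeks until infection
t_vacc = rng.exponential(1 / vacc_hazard, n)

for follow_up in (10, 50, 200):                  # follow-up length in weeks
    p_u = np.mean(t_unvacc <= follow_up)
    p_v = np.mean(t_vacc <= follow_up)
    print(f"{follow_up:>3} weeks: infected {p_u:.0%} vs {p_v:.0%}, "
          f"naive efficacy = {1 - p_v / p_u:.0%}")

# The hazard-ratio efficacy is 80% throughout, but the 'proportion infected'
# comparison shrinks towards 0% as both groups approach 100% infected.
```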

This is a useful consideration to point out, thanks. I push back a bit below on some specifics, but this effect is definitely one I'd want to include if I do end up carving out time to add a bunch more factors to the model.

I don't think having skipped the neglectedness considerations you mention is enough to call the specific example you quote misleading though, as it's very far from the only thing I skipped, and many of the other things point the other way. Some other things that were skipped:

  • Work after AGI likely isn't worth 0, especially with e.g. Me

... (read more)
4
Benjamin_Todd
1y
Good point, there are reasons why work could get more valuable the closer you are – I should have mentioned that. Also interesting points about option value.

Most podcast apps let you subscribe to an RSS feed, and an RSS feed of the audio is available on the site.

1
OscarD
1y
Ah whoops, thanks! I have only ever subscribed to podcasts by searching for the name within my app, which didn't work, but using the RSS feed worked :)

I'm a little confused about what "too little demand" means in the second paragraph. Both of the below seem like they might be the thing you are claiming:

  • There is not yet enough demand for a business only serving EA orgs to be self sustaining.
  • EA orgs are making a mistake by not wanting to pay for these things even though they would be worth paying for.

I'd separately be curious to see more detail on why your guess at the optimal structure for the provision of the kind of services you are interested in is "EA-specific provider". I'm not confident that it... (read more)

1
Deena Englander
1y
It's your second point that we're addressing:
  • EA orgs are making a mistake by not wanting to pay for these things even though they would be worth paying for.
Although your first point is also true. I don't think that's a problem, though. Most businesses have diverse audiences and I don't think EA is large enough to support most business models independently.

I think "different timelines don't change the EV of different options very much" plus "personal fit considerations can change the EV of a PhD by a ton" does end up resulting in an argument for the PhD decision not depending much on timelines. I think that you're mostly disagreeing with the first claim, but I'm not entirely sure.

In terms of your point about optimal allocation, my guess is that we disagree to some extent about how much the optimal allocation has changed, but that the much more important disagreement is about whether some kind of centrally pl... (read more)

7
Tristan Cook
1y
Yep, that's right that I'm disagreeing with the first claim. I think one could argue the main claim either by:
1. Regardless of your timelines, you (person considering doing a PhD) shouldn't take it too much into consideration
2. I (advising you on how to think about whether to do a PhD) think timelines are such that you shouldn't take timelines too much into consideration
I think (1) is false, and think that (2) should be qualified by how one's advice would change depending on timelines. (You do briefly discuss (2), e.g. the SOTA comment.) To put my cards on the table, on the object level: I have relatively short timelines and think that fewer people should be doing PhDs on the margin. My highly speculative guess is that this post has the effect of marginally pushing more people towards doing PhDs (given the existing association of shorter timelines => shouldn't do a PhD).

(I'm excited to think more about the rest of the ideas in this post and might have further comments when I do)

Commenting briefly to endorse the description of my course as an MVP. I'd love for someone to make a better produced version, and am happy for people to use any ideas from it that they think would be useful in producing the better version

2
alex lawsen (previously alexrjl)
1y
(I'm excited to think more about the rest of the ideas in this post and might have further comments when I do)
2
Abby Hoskin
1y
As someone who did a PhD, this all checks out to me. I especially like your framing  of PhDs "as more like an entry-level graduate researcher job than ‘n more years of school’".  Many people outside of academia don't understand this, and think of graduate school as just an extension of undergrad when it is really a completely different environment. The main reason to get a PhD is if you want to be a professional researcher (either within or outside of academia), so from this perspective, you'll have to be a junior researcher somewhere for a few years anyway.  In the context of short timelines: if you can do direct work on high impact problems during your PhD, the opportunity cost of a 5-7 year program is substantially lower.  However, in my experience, academia makes it very hard to focus on questions of highest impact; instead people are funneled into projects that are publishable by academic journals. It is really hard to escape this, though having a supportive supervisor (e.g., somebody who already deeply cares about x-risks, or an already tenured professor who is happy to have students study whatever they want) gives you a better shot at studying something actually useful. Just something to consider even if you've already decided you're a good personal fit for doing a PhD!

[context: I'm one of the advisors, and manage some of the others, but am describing my individual attitude below]

FWIW I don't think the balance you indicated is that tricky, and think that conceiving of what I'm doing when I speak to people as 'charismatic persuasion' would be a big mistake for me to make. I try to:

 

  • Say things I think are true, and explain why I think them (both the internal logic and external evidence if it exists) and how confident I am.

     
  • Ask people questions in a way which helps them clarify what they think is true, and which t
... (read more)

Epistemic status: I've thought about both how people should think about PhDs and how people should think about timelines a fair bit, both in my own time and in my role as an advisor at 80k, but I wrote this fairly quickly. I'm sharing my take on this rather than intending to speak on behalf of the whole organisation, though my guess is that the typical view is pretty similar.



BLUF:

  • Don’t pay too much attention to median timelines estimates. There’s a lot of uncertainty, and finding the right path for you can easily make a bigger difference than matching t
... (read more)

I'm very happy to see this! Thank you for organising it.

I read this comment as implying that HLI's reasoning transparency is currently better than Givewell's, and think that this is both:

  • False.

  • Not the sort of thing it is reasonable to bring up before immediately hiding behind "that's just my opinion and I don't want to get into a debate about it here".

I therefore downvoted, as well as disagree voting. I don't think downvotes always need comments, but this one seemed worth explaining as the comment contains several statements people might reasonably disagree with.

8
Barry Grimes
1y
Thanks for explaining your reasoning for the downvote. I don’t expect everyone to agree with my comment but if you think it is false then you should explain why you think that. I value all feedback on how HLI can improve our reasoning transparency. However, like I said, I’m going to wait for GWWC’s evaluation before expressing any further personal opinions on this matter.

I'm keen to listen to this, thanks for recording it! Are you planning to make the podcast available on other platforms (Stitcher, Google Podcasts, etc. - I haven't found it)?

whether you have a 5-10 year timeline or a 15-20 year timeline

Something that I'd like this post to address that it doesn't is that having "a timeline", rather than a distribution, seems ~indefensible given the amount of uncertainty involved. People quote medians (or modes, and it's not clear to me that they reliably differentiate between these) ostensibly as a shorthand for their entire distribution, but then discussion proceeds based only on the point estimates.

I think a shift of 2 years in the median of your distribution looks like a shift of only a... (read more)
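
As a concrete (and entirely made-up) illustration of comparing full distributions rather than medians, here's a quick sketch assuming lognormal timelines with an arbitrary spread; the specific numbers are my own assumptions, not anything from the post or comment:

```python
# Sketch with assumed/illustrative numbers only: two timelines distributions
# whose medians differ by 2 years, compared as full distributions rather than
# as point estimates.
from scipy.stats import lognorm

sigma = 0.8                                   # assumed spread (log-space sd)
for median in (10, 12):                       # years to AGI; medians 2 years apart
    dist = lognorm(s=sigma, scale=median)     # for a lognormal, median == scale
    summary = ", ".join(f"P(<{t}y) = {dist.cdf(t):.0%}" for t in (5, 10, 20))
    print(f"median {median:>2}y: {summary}")
```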

1
NickLaing
1y
There's a balance here for communication purposes - concrete potential timelines are easier for some of us to understand than distributions. Perhaps both could be used?
5
simeon_c
1y
Thanks for your comment! That's an important point that you're bringing up. My sense is that at the movement level, the consideration you bring up is super important. Indeed, even though I have fairly short timelines, I would like funders to hedge for long timelines (e.g. fund stuff for China AI Safety). Thus I think that big actors should have in mind their full distribution to optimize their resource allocation. That said, despite that, I have two disagreements:
1. I feel like at the individual level (i.e. people working in governance for instance, or even organizations), it's too expensive to optimize over a distribution and thus you should probably optimize with a strategy of "I want to have solved my part of the problem by 20XX". And for that purpose, identifying the main characteristics of the strategic landscape at that point (which this post is trying to do) is useful.
2. "the EV gained in the worlds where things move quickly is not worth the expected cost in worlds where they don't." I disagree with this statement, even at the movement level. For instance, I think that the trade-off of "should we fund this project which is not the ideal one but still quite good?" is one that funders often encounter, and I would expect that funders have more risk aversion than necessary because when you're not highly time-constrained, it's probably the best strategy (i.e. in every field except AI safety, it's probably a way better strategy to trade off a couple of years against better founders).
Finally, I agree that "the best strategies will have more variance" is not good advice for everyone. The reason I decided to write it rather than not is because I think that the AI governance community tends to have too high a degree of risk aversion (which is a good feature in their daily job), which mechanically penalizes a decent amount of actions that are way more useful under shorter timelines.

Huh, I took 'confidently' to mean you'd be willing to offer much better odds than 1:1.

I'm going to try to stop paying so much attention to the story while it unfolds, which means I'm retracting my interest in betting. Feel free to call this a win (as with Joel).

7
Sabs
1y
well of course I have to try to start negotiating the bet at completely terrible odds for you, how else am I supposed to make any money?
1
Ram Rachum
1y
Thanks. I looked at their GitHub repo, and it's full of visualizations such as these: I guess I'll have to dig deeper to understand the meaning of this chart, but right off the bat I'm not seeing a relatable and intuitive demonstration that laypeople can understand. Thanks for sending it anyway, I might find useful things there.

If there's any money left over after you've agreed a line with Joel and Nuno, I've got next.

2
Sabs
1y
we can do 10k at evens

No worries on the acknowledgement front (though I'm glad you found chatting helpful)!

One failure mode of the filtering idea is that the AGI corporation does not use it because of the alignment tax, or because they don't want to admit that they are creating something that is potentially dangerous

 

I think it's several orders of magnitude easier to get AGI corporations to use filtered safe data than to agree to stop using any electronic communication for safety research. Why is it appropriate to consider the alignment tax of "train on data that someone h... (read more)

1
Peter S. Park
1y
I didn't mean to imply that the cost will be trivial. The cost will either be a significant reduction in communication between AI safety researchers who are far apart (which I agree harms our x-risk reduction efforts), or a resource cost paid by a collaboration of AI safety researchers and EAs with a variety of skillsets to create the infrastructure and institutions needed for secure AI safety research norms. The latter is what I had in mind, and it probably cannot be a decentralized effort like "don't use Google Docs or gmail."

 the two-player zero-sum game can be a decent model of the by-default adversarial interaction

 

I think this is the key crux between you and the several people who've brought up points 1-3. The model you're operating with here is roughly that the alignment game we need to play goes something like this:
1. Train an unaligned ASI
2. Apply "alignment technique"
3. ASI either 'dodges' the technique (having anticipated it), or fails to dodge the technique and is now aligned.

I think most of the other people thinking about alignment are trying to prevent step... (read more)

2
Peter S. Park
1y
Thank you so much for your helpful feedback, Alex! I really appreciate your time. (And I had forgotten to acknowledge you in the post for our extremely helpful conversation on this topic earlier this month; so sorry about that! This has been fixed.)

I think my threat model is relevant even for the category of AI safety plans that fall into "How do we use SGD to create an AI that will be aligned when it reaches high capabilities?", though perhaps with less magnitude. (Please see my reply to Steven's comment.) The fundamental problem underlying a wide class of AI safety plans is that we likely cannot predict the precise moment the AI becomes agentic and/or dangerous, and that we will likely not be able to know a priori which SGD-based plan will work reliably. This means that conditional on building an AGI, enabling safe trial and error towards alignment is probably where most of our survival probability lies. (Even if it's hard!)

I think the filtering idea is excellent and we should pursue it. I think preserving the secrecy-based value of AI safety plans will realistically be a Swiss cheese approach that combines many helpful but incomplete solutions (hopefully without correlated failure modes). One failure mode of the filtering idea is that the AGI corporation does not use it because of the alignment tax, or because they don't want to admit that they are creating something that is potentially dangerous. This is a general concern that has led me to think that a large-scale shift in research norms towards security mindset would be broadly helpful for x-risk reduction.

If you haven't already seen them, you might find some of the posts tagged "task y" interesting to read.

EA fellowships and summer programmes should have (possibly more competitive) "early entry" cohorts with deadlines in September/October, where if you apply by then you get a guaranteed place, funding, and maybe some extra perk to encourage it (this could literally be a Slack with the other participants).

Consulting, finance etc have really early processes which people feel pressure to accept in case they don't get anything else, and then don't want to back out of.

That last comment seems very far from the original post which claimed

We have no good reason, only faith and marketing, to believe that we will accomplish AGI by pursuing the DL based AI route.

If we don't have a biological representation of how BNNs can represent and perform symbolic representation, why do we have reason to believe that we know ANNs can't?

Without an ability to point to the difference, this isn't anything close to a reductio, it's just saying "yeah I don't buy it dude, I don't reckon AI will be that good"

1
cveres
2y
Sorry I think you are misunderstanding the reductio argument. That argument simply undermines the claim that natural language is not based on a generative phrase structure grammar. That is, that non symbolic DL is the "proper" model of language. In fact they are called "language models". I claim they are not models of language, and therefore there is no reason to discard symbolic models ... which is where the need for symbol manipulation comes from. Hence a very different sort of architecture than current DL And of course we can point to the difference between artificial and biological networks. I didn't because there are too many! One of the big ones is back propagation. THE major reason we have ANNs in the first place, completely implausible biologically. No back propagation in the brain. 

Could you mechanistically explain how any of the 'very many ways' in which biological neurons are different means that the capacity for symbol manipulation is unique to them?

They're obviously very different, but what I don't think you've done is show that the differences are responsible for the impossibility of symbolic manipulation in artificial neural networks.

1
cveres
2y
I think I may have said something to confuse the issue. Artificial neural networks certainly ARE capable of representing classical symbolic computations. In fact the first neural networks (e.g. perceptron) did just that. They typically do that with local representations where individual nodes assume the role of representing a given variable. But these were not very good at other tasks like generalisation. More advanced distributed networks emerged with DL being the newest incarnation. These have representations which makes it very difficult (if not impossible) to dedicate nodes to variables. Which does not worry the architects because they specifically believe that the non-localised representation is what makes them so powerful (see Bengio, LeCun and Hinton's article for their Turing award) Turning to real neurons, the fact is that we really don't know all that much about how they represent knowledge. We know where they tend to fire in response to given stimuli, we know how they are connected, and we know that they have some hierarchical representations. So I can't give you a biological explanation of how neural ensembles can represent variables. All I can do is give you arguments that humans DO perform symbolic manipulation on variables, so somehow their brain has to be able to encode this.  If you can make an artificial network somehow do this eventually then fine. I will support those efforts. But we are nowhere near that, and the main actors are not even pushing in that direction. 
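
For readers unfamiliar with the "local representation" point above, here's a minimal sketch (my own illustration, not from the comment) of a single perceptron with hand-set weights computing a classical boolean function, where each input node stands for exactly one variable:

```python
# Illustrative only: a single threshold unit ("perceptron") with hand-set
# weights implements logical AND, with one input node per boolean variable.
def perceptron(x1: int, x2: int, w1: float = 1.0, w2: float = 1.0,
               bias: float = -1.5) -> int:
    """Fires (returns 1) iff the weighted sum of inputs exceeds zero."""
    return int(w1 * x1 + w2 * x2 + bias > 0)

for a in (0, 1):
    for b in (0, 1):
        print(f"AND({a}, {b}) = {perceptron(a, b)}")
```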

I live in London and have quite a lot of EA and non-EA friends/colleagues/acquaintances, and my impression is that group houses "by choice" are much more common among the EAs. It's noteworthy that group houses are common among students and lower paid/early stage working professionals for financial reasons though.

If you agree that bundles of biological neurons can have the capacity for symbolic thought, and that non-classical systems can create something symbolic, I don't understand why you think anything you've said shows that DL cannot scale to AGI, even granting your unstated assumption that symbolic thought is necessary for AGI.

(I think that last assumption is false, but don't think it's a crux here so I'm keen to grant it for now, and only discuss once we've cleared up the other thing)

1
cveres
2y
Biological neurons have very different properties from artificial networks in very many ways. These are well documented. I would never deny that ensembles of biological neurons have the capacity for symbol manipulation. I also believe that non-classical systems can learn mappings between symbols, because this is in fact what they do. Language models map from word tokens to word tokens. What they don't do, as the inventors of DL insist, is learn classical symbol manipulation with rules defined over symbols.