All of PatrickL's Comments + Replies

Do you have any recommendations for doing threat modeling well? In particular, resources that seem applicable to many different risks - nuclear, AI, bio.

Good find, thanks! I'm not very keen on instructing teams to run bug bounties and not other mechanisms, so am not particularly enthusiastic about this. 

It looks like this would focus on infosecurity of the AI systems being used (i.e. can this weapon's AI be hacked?) rather than testing for potential vulnerabilities from the AI systems themselves.

How unusual does he think the current policy interest in AI safety is? Will this be a temporary window or an ever-increasing level of interest?

Yeah good find, I also think that passes the bar. Although I do think people have generally overestimated GPT's essay-writing ability compared to humans, and think I might be falling for that here. 

I'm not planning to change the doc because Bing's AI wasn't released by Feb 23, but if you think it should be included (which would be reasonable given OpenAI pretty obviously made this before Feb 23), it would mean:

  • Experts expected 9 milestones to be met vs actually 11 milestones
  • The calibration curve looks four percentage points worse at the 10% mark
  • Bulls'
... (read more)

The experts thought beating humans at Angry Birds would be relatively easy - they put a 90% chance of it being feasible by now. The surprise is that it has not been done - although I think this is mostly explained by no labs seriously trying it.

4
NickLaing
1y
Yeah I think an AI researcher working for a serious company who solved Angry Birds might be fired for timewasting :D

Hmm I disagree on the numbers - have I got something wrong in the below?

If you assigned 0% or 100% by coin flip, you would get a Brier score of 0.5 (half the time you would get 0, half the time you would get 1), and if you assigned a random probability between 0% and 100% for every question, you would get a Brier score of 0.33. If you put 50% on everything you would indeed get 0.25.

As the experts had to give 10%, 50%, and 90% forecasts, if they had done this at random they would have ended up with a score of 0.36 [1].
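For anyone who wants to check the arithmetic above, here is a minimal sketch. It assumes each forecast is picked uniformly at random from the allowed values and that each event is a 50/50 proposition; the function name `expected_brier` is made up for illustration, and the uniform-random case is entered as its closed form rather than computed.

```python
from fractions import Fraction


def expected_brier(forecasts, base_rate=Fraction(1, 2)):
    """Expected Brier score when one forecast is drawn uniformly at random
    from `forecasts` and the event happens with probability `base_rate`."""
    total = Fraction(0)
    for p in forecasts:
        p = Fraction(p)
        # Squared error is (p - 1)^2 if the event happens, p^2 if it does not.
        total += base_rate * (p - 1) ** 2 + (1 - base_rate) * p ** 2
    return total / len(forecasts)


coin_flip = expected_brier([0, 1])                # 0% or 100% at random
always_half = expected_brier([Fraction(1, 2)])    # 50% on everything
survey_levels = expected_brier(
    [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]  # 10%, 50%, 90% at random
)
# Uniform random probability on [0, 1]: the integral of p^2 dp is 1/3 (closed form).
uniform = Fraction(1, 3)

for name, score in [
    ("coin flip 0%/100%", coin_flip),
    ("uniform random", uniform),
    ("always 50%", always_half),
    ("random 10/50/90", survey_levels),
]:
    print(f"{name}: {float(score):.2f}")
```

The 10%/50%/90% case comes out at 107/300, which rounds to 0.36. That result does not depend on the 50/50 assumption, because the 10% and 90% errors offset each other whatever the base rate is.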

So I think they - including the bu... (read more)

No, it was me who got this wrong. Thanks!

I was also looking at https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/#Data (the dataset itself; the summary doesn't include the milestones). This new version seems like total garbage. The experts continue to predict several of the milestones are five years out, including milestones that were achieved by ChatGPT (ie a few months after the survey) and at least one milestone that had already clearly been achieved by the time the survey was released! 

I've only given the data a quick look and found it hard to analyse - but yeah, many of the for... (read more)

This is great - thanks for this comment! I've gone through each to explain my reasoning. Your comments/sources changed my opinion on Starcraft and Explain - I've updated the post and scores to reflect this, and think the conclusion is now the same but slightly weaker, because the experts' Brier score is 0.2 points worse, but the comparative Brier scores are also worse by a similar amount. There's also my reasoning for other milestones in the appendix (and I've copy-pasted some of them below).

Zach Stein-Perlman from AI Impacts said that he thought "efficien

... (read more)
7
Scott Alexander
1y
Update: I think Bing passes the high school essay bar, based on the section "B- Essays No More" at https://oneusefulthing.substack.com/p/i-hope-you-werent-getting-too-comfortable
8
Scott Alexander
1y
Thank you. I misremembered the transcription question. I now agree with all of your resolutions, with the most remaining uncertainty on translation.

(Scott is correct that I said--and strongly feel--that LLMs don't count for the sorting milestones. Patrick's source for the "sorting large lists" milestone is not an LLM, and Patrick is correct that I later read a draft of this blogpost and deferred to him on whether the "sorting large lists" milestone had been achieved.)

Thanks Adam :)

I have a rough (i.e. considered for <15 minutes) take: if top labs one year ago had attempted these particular milestones, and had the same policies on disclosing capabilities as they currently seem to, then there's a 40-50% chance they would have achieved 2 of Angry Birds, Atari fifty, Laundry and Go low by now. But I don't put much weight on my prediction, whereas I put a lot more weight on my analysis of what has happened (though this is also somewhat subjective!).

I agree though that checking what has actually happened ends... (read more)

I'd be interested in seeing this too! Although I'm not planning to spend the time to do this soon, I'd be up for having a quick chat if someone else was up for it. The constant risk model seems to me like a better-than-nothing model, so could be a slight update if it gives different results.  Or could just use a linear increase of probability from 0 years to 10 years.

FWIW - from this survey alone, I'm not convinced that timelines are systematically longer with a fixed-year framing.  I sampled ten forecasts where probabilities were given in both f... (read more)

I feel like I want to comment, for people that know me, that this wasn't me (same first name, the author's description could probably be me, it's the type of thing I'd say (this isn't me necessarily endorsing the post)).

2
Patrick Sue Domin
1y
Yes, this is a different person. I chose a pseudonym with a different first name than my real first name for the obvious reason.

Thanks for posting this. I thought it was well-considered with lots of good brainstorming of options for valuable policy/system change, so am happy you're joining the community! I think 3 is a particularly interesting idea. I think the EA community is far too large and independent to get to 80% agreement of EAs endorsing something, but there may be some way of surveying EAs in relevant fields for each of your policy proposals. It seems probably valuable to me to get politically-interested EAs to have in-depth knowledge on certain areas, although I think th... (read more)

I noticed Ali Abdaal referencing some of the examples from this intro in his summary of WWOTF, at 07:50 and 10:45 :)

2
Clifford
2y
That's really cool to see - thanks for sharing Patrick!

I would reframe this as 'community builders don't spend enough time on other EA things' rather than 'community builders spend too much time community building'. I know the thought is that less time marketing -> more time on other stuff, but I think it's worth setting a different tone. I worry there's too much of a meme going round like 'community building is not that good a thing to do', where it should be 'we are finding ways community building could be done better - exciting!'

To me, this is a great post for suggesting how community building could be done better through more direct experience - exciting!

I would be very happy to talk to people! I have been to EAG London three times. 

Once, I didn't know anyone and was too shy to reach out to anyone so floated round different talks. Once, I volunteered on reception and also got in some good conversations and workshops. Once, I spent most of the weekend in individual conversations - coming out inspired and exhausted! 

Very happy to talk to anyone if you message me :) I also used to work in the UK's foreign office and department for international development and am now studying AI. I think I'm nice and approachable!

I liked this review - useful tips I will consider if facilitating again!

I've found the connections/friendships formed the most valuable part - which would mean a strong +1 for your suggestion of games/debates. In my version of the usefulness list, the personal connections would probably go number one - particularly those nearby that you can eventually meet in person. 

Also, I get the impression from this you'd be a great facilitator!

I listened and came away with the same feeling as I commented above - IMO Rory is being a good ambassador for GiveDirectly here!

Also, I was excited about this because I thought Rory Stewart was the new Comms Director at No. 10, which I've just realised was an April Fools prank...

2
Simon_M
2y
Thank you - I will update accordingly.

Thanks! Yes this was just my impression from reading, not listening. I'll hopefully get round to listening later and see if that updates my impression.


In 2015, it seems to be ~2% (£200m/£12bn). This was general support for cash transfer schemes which included other features though, like nutritional support. Seems very high though still! Can't see anything more recent - my naive guess would be it's less than this now.

Link on UK spend 2011-15 on cash transfers.

Thanks v much for posting this transcript! I agree this is on net good and think I took a more positive impression from Rory Stewart's points :)

Someone who has worked in international development for 30 years and headed DfID(!) is only just now finding out about cash transfers, and thinks it's the most effective intervention you can do. (Although perhaps his caveat about "for a single poor family" makes it true?)

I didn't get the impression from this transcript that Rory Stewart has just heard of cash transfers - is there any part which implied that? It... (read more)

1
Simon_M
2y
Reading the transcript cold, maybe it doesn't give that impression. If you're willing to listen to the episodes (there are two of them and the topic comes up a few times interspersed throughout) I'd be interested if your view changes with his joke. (He certainly gives off a tone of surprise). I also think this: Pretty strongly gives the impression that he hadn't seen it before. Again, I think it's worth listening to the full context, the impression the listener is given is very much that this is a panacea and better than "charities who are going in doing [..] health programs". I'm very happy to be wrong on this so I am very keen to grab onto anything saying the opposite, I just can't shake the first impression I got from listening.

I loved this article and have used it to explain my interests to family who aren't familiar/emotionally connected with longtermism. I also frequently used OWiD pieces (e.g. health + climate) when working in the FCDO - it became IMO the most credible and impartial source for providing new ideas & information to us, and I think OWiD can achieve this for longtermism-related data.

I wondered if it is possible to add a visualisation of a short animation: first, of the hourglass representing past and present (10 millions of) people, then zooming out to have ... (read more)

Thank you all for comments! Filming has been delayed until April/May 2022, so have a good five months to consider and also practice my quizzing.

My feeling at the moment is to keep it simple and positive (e.g. 'I have recently become really interested in effective altruism - which is about trying to have as positive an impact as possible with your time and money - I joined a community of others thinking and working on the same thing ~4 years ago and love it') - allowing people to look up effective altruism if they are interested but not going in to much det... (read more)

I wondered this as well. I think the context here makes me think otherwise:

It's a positive-spirited show where contestants are treated well, so wouldn't be like a debate or a news article.

It's this or not this (rather than choosing to allocate resource in high-fidelity rather than mass media).

Do you think something broad, like Aaron's suggested  'I like thinking about effective altruism, which is the art of doing as much good as you can with your money and your time'   has possible negatives, like being misinterpreted badly or putting people off EA?

7
DavidNash
2y
I think for most people who hadn't heard of EA, it's very unlikely that they'll start searching for it online after hearing about it briefly on daytime TV. For those that have already heard about EA it may just reinforce what they already think about it, some positive and some negative.  Even just the phrase effective altruism can be interpreted as arrogant if you don't spend some time explaining what you mean. I prefer people to have high value impressions when they come across EA, whether that's online/in person,  rather than having more but less valuable touch points.

Helpful points, thanks! I think that will be the challenge: sounding like a nice relatable person and not an ad, while also fitting in a plug. I think the formula is to have one topic only in your intro- so would prefer not to but probably will have to sacrifice the hobbies to talk EA... and rely on innate charm and humility to get myself across!

This seems very sane advice - thank you!

Out of interest, did you ever hear from anyone who looked into GiveWell after your plug?

6
Aaron Gertler
2y
Yes, though it's not clear whether it was the broadcast plug or some Tweeting/streaming I did later with more plugs. A couple of people just told me they donated without mentioning an amount (and I assume the donations were small). One person claims to have matched all the GiveWell donations I made that year (over $25,000) — I think this claim is probably true, because a GiveWell staffer responded on Twitter to thank them. I also heard from another person who told me they registered for EA Global in part because they remembered my talking about effective altruism, which could have originated with the broadcast plug. That's the only impact that comes to mind.

Thanks for the questions- and sorry for the delay answering. I'll go through 1 and 2 in turn but think 3 is too political for me to answer - sorry!

1) I was instrumental in setting up and playing secretariat to a group of development ministers that convened during the beginning of the covid pandemic. This allowed like-minded ministers to share best practice and coordinate in what was a rapidly changing crisis for many developing countries. Some new principles spun from this group for how to support certain countries and I think it probably made a very slight... (read more)

1
jchen1
3y
Thank you for the detailed response, very helpful!

That's helpful context, thanks! There are definitely chances to be imaginative and creative in my role - but I think significantly less than in smaller, more agile organisations. New ideas are very much encouraged, but it can take more work to see them into practice, as more steps of approval are needed, and they need to fit into a large, well-established institution.

2
Madhav Malhotra
3y
Thank you for the followup! Appreciate you writing about your job just to help others :-)

Great question- I think this is particularly important because a lot of the value from government jobs comes later in your career, so if you are unsure it is a good fit, you particularly want to be gaining transferrable skills. I haven't worked in other EA career paths, which limits my insight a bit, but here's my best bet:

Yes, I think it has given me a good (not excellent) skillset. 

  • Working on policy questions gives good research skills- I've become skilled at digesting complex information from a range of sources, figuring which elements are relevant
... (read more)

Yeah I definitely have colleagues on both ends- some get frustrated when there aren't enough opportunities for spontaneity or risks, others like working in a well-established institution with  set norms. I would add that people are very much encouraged to bring their own style and ideas to work and it's safe to challenge things- but it inevitably is harder to shift culture given the size of the organisation. 

I personally don't mind the peaks and troughs/inconsistency of work- keeps things exciting!

Does that sound consistent to you? I'm conscious terms like hierarchy or openness have different meanings to different people.

1
Madhav Malhotra
3y
I understand the terms have different meanings, yeah :/ I was actually thinking about it in the context of the Big 5 personality trait 'openness'.  I'm not sure whether the work you're describing is consistent. :/ I'd need a lot more specific examples to add judgement there.