All of Lukas_Finnveden's Comments + Replies

[AMA] Announcing Open Phil’s University Group Organizer and Century Fellowships

The page for the Century Fellowship outlines some things that fellows could do, which are much broader than just university group organizing:

When assessing applications, we will primarily be evaluating the candidate rather than their planned activities, but we imagine a hypothetical Century Fellow may want to:

...
When we were originally thinking about the fellowship, one of the cases for impact was making community building a more viable career (hence the emphasis in this post), but it’s definitely intended more broadly for people working on the long-term future. I’m pretty unsure how the fellowship will shake out in terms of community organizers vs researchers vs entrepreneurs long-term – we’ve funded a mix so far (including several people who I’m not sure how to categorize / are still unsure about what they want to do).
Punching Utilitarians in the Face

I'm not saying it's infinite, just that (even assuming it's finite) I assign non-0 probability to different possible finite numbers in a fashion such that the expected value is infinite. (Just like the expected value of an infinite st petersburg challenge is infinite, although every outcome has finite size.)
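To illustrate the St Petersburg point, here is a minimal Python sketch. It assumes the standard textbook payoffs (first heads on flip k pays 2^k with probability 2^-k), which the comment itself does not spell out, and the function name is made up for illustration. Every outcome is finite, but each adds 1 to the expectation, so the truncated expected value grows without bound as the cap is raised.

```python
from fractions import Fraction


def truncated_st_petersburg_ev(max_rounds: int) -> Fraction:
    """Expected payout if the game is capped at max_rounds coin flips.

    Assumed payoffs: outcome k (first heads on flip k) has probability
    2**-k and pays 2**k, so each finite outcome contributes exactly 1.
    """
    return sum(
        (Fraction(1, 2**k) * 2**k for k in range(1, max_rounds + 1)),
        Fraction(0),
    )


# The capped expectation equals the cap, so it diverges as the cap grows.
for cap in (10, 100, 1000):
    print(cap, truncated_st_petersburg_ev(cap))
```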

Punching Utilitarians in the Face

The topic under discussion is whether pascalian scenarios are a problem for utilitarianism, so we do need to take pascalian scenarios seriously, in this discussion.

Punching Utilitarians in the Face

I simply don’t believe that infinities exist, and even though 0 isn’t a probability, I reject the probabilistic argument that any possibility of infinity allows them to dominate all EV calculations.

Problems with infinity doesn't go away just because you assume that actual infinities don't exist. Even with just finite numbers, you can face gambles that have infinite expected value, if increasingly good possibilities have insufficiently rapidly diminishing probabilities. And this still causes a lot of problems.

(I also don't think that's an esoteric possib...

Under mainstream conceptions of physics (as I loosely understand them), the number of possible lives in the future is unfathomably large, but not actually infinite.
Some research questions that you may want to tackle

10^12 might be too low. Making up some numbers: If future civilizations can create 10^50 lives, and we think there's an 0.1% chance that 0.01% of that will be spent on ancestor simulations, then that's 10^43 expected lives in ancestor simulations. If each such simulation uses 10^12 lives worth of compute, that's a 10^31 multiplier on short-term helping.
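The made-up numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated in the comment; the variable names are illustrative, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction

future_lives = 10**50                     # lives future civilizations can create
p_ancestor_sims = Fraction(1, 1000)       # 0.1% chance
fraction_spent = Fraction(1, 10_000)      # 0.01% of resources spent on ancestor simulations
lives_per_simulation = 10**12             # compute cost of one simulation, in lives

expected_sim_lives = future_lives * p_ancestor_sims * fraction_spent
multiplier = expected_sim_lives / lives_per_simulation

print(expected_sim_lives == 10**43)  # expected lives in ancestor simulations
print(multiplier == 10**31)          # multiplier on short-term helping
```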

Some research questions that you may want to tackle

A proper treatment of this should take into account that short-term helping also might have positive effects in lots of simulations to a much greater extent than long-term helping.

4 Zach Stein-Perlman 1mo
Sure, want to change the numbers by a factor of, say, 10^12 to account for simulation? The long-term effects still dominate. (Maybe taking actions to influence our simulators is more effective than trying to cause improvements in the long-term of our universe, but that isn't an argument for doing naive short-term interventions.)
(Even) More Early-Career EAs Should Try AI Safety Technical Research

I agree. Anecdotally, among people I know, I've found aphantasia to be more common among those who are very mathematically skilled.

(Maybe you could have some hypothesis that aphantasia tracks something slightly different than other variance in visual reasoning. But regardless, it sure seems similar enough that it's a bad idea to emphasize the importance of "shape rotating". Because that will turn off some excellent fits.)

What’s the theory of change of “Come to the bay over the summer!”?

But note the hidden costs. Climbing the social ladder can trade off against building things. Learning all the Berkeley vibes can trade off against, e.g., learning the math actually useful for understanding agency.

This feels like a surprisingly generic counterargument, after the (interesting) point about ladder climbing. "This could have opportunity costs" could be written under every piece of advice for how to spend time.

In fact, it applies less to this post than to most advice on how to spend time, since the OP claimed that the environment caused them to...

Some potential lessons from Carrick’s Congressional bid

By the way, as an aside, the final chapter here is that Protect our Future PAC went negative in May -- perhaps a direct counter to BoldPAC's spending. (Are folks here proud of that? Is misleading negative campaigning compatible with EA values?)

I wanted to see exactly how misleading these were. I found this example of an attack ad, which (after some searching) I think cites this, this, this, and this. As far as I can tell:

  • The first source says that Salinas "worked for the chemical manufacturers’ trade association for a year", in the 90s.
  • The second source sa
...

Yeah, bummer, not happy about this.

Thanks for checking. I initially thought _pk's claims were overblown, so it was helpful to get a sanity check. I now agree that the claims were quite misleading. 

I at least do not want to be associated with claims at this level of misleadingness. I guess it's possible that this is just "American politics as usual" (I'm pretty unfamiliar with this space). To the extent that this is normal/default politics, then I guess we have to reluctantly accede to the usual norms. But this appears regrettable, and to the extent it's abnormal, my own opinion is that we should have a pretty high bar before endorsing such actions. 

Replicating and extending the grabby aliens model

(1) maybe doom should be disambiguated between "the short-lived simulation that I am in is turned off"-doom (which I can't really observe) and "the basement reality Earth I am in is turned into paperclips by an unaligned AGI"-type doom.

Yup, I agree the disambiguation is good. In aliens-context, it's even useful to disambiguate those types of doom from "Intelligence never leaves the basement reality Earth I am on"-doom. Since paperclippers probably would become grabby.

Replicating and extending the grabby aliens model

When I model the existence of simulations like us, SIA does not imply doom (as seen in the marginalised posteriors for  in the appendix here). 

It does imply doom for us, since we're almost certainly in a short-lived simulation.

And if we condition on being outside of a simulation, SIA also implies doom for us, since it's more likely that we'll find ourselves outside of a simulation if there are more basement-level civilizations, which is facilitated by more of them being doomed.

It just implies that there weren't necessarily a lot of ...

4 Tristan Cook 3mo
I agree with what you say, though would note (1) maybe doom should be disambiguated between "the short-lived simulation that I am in is turned off"-doom (which I can't really observe) and "the basement reality Earth I am in is turned into paperclips by an unaligned AGI"-type doom. (2) conditioning on me being in at least one short-lived simulation, if the multiverse is sufficiently large and the simulation containing me is sufficiently 'lawful' then I may also expect there to be basement reality copies of me too. In this case, doom is implied for (what I would guess is) most exact copies of me.
Discussion on Thomas Philippon's paper on TFP growth being linear

There's an excellent critique of that paper on LW:

The conclusion is that exponentials look better for longer-run trends, if you do fair comparisons. And that linear being a better fit than exponentials in recent data is more about the error-model than the growth-model, so it shouldn't be a big update against exponential growth.

1 Arjun Yadav 3mo
Great post! I was mainly concerned with the p-values heading haha. I wonder if Thomas Philippon will follow up on all of the attention his paper received.
How about we don't all get COVID in London?

It's table 3 I think you want to look at. For fatigue and other long covid symptoms, belief that you had covid has a higher odds ratio than does confirmed covid

That's exactly what we should expect if long covid is caused by symptomatic covid, and belief-in-covid is a better predictor of symptomatic covid than positive-covid-test. (The latter also picks up asymptomatic covid, so it's a worse predictor of symptomatic covid.)

Announcing What The Future Owes Us

The future's ability to affect the past is truly a crucial consideration for those with high discount rates. You may doubt whether such acausal effects are possible, but in expectation, on e.g. an ultra-neartermist view, even a 10^-100 probability that it works is enough, since anything that happened 100 years ago is >>10^1000 times as important as today is, with an 80%/day discount rate.
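As a rough check on the orders of magnitude, here is a small Python sketch. The comment does not say exactly how an "80%/day discount rate" compounds, so two readings are tried, and both are assumptions: each day keeps 20% of the previous day's weight (a factor of 5 per day going back), or weights are divided by 1.8 per day. Leap days are ignored.

```python
import math

DAYS_IN_100_YEARS = 100 * 365  # leap days ignored for simplicity


def log10_importance_ratio(daily_factor: float, days: int = DAYS_IN_100_YEARS) -> float:
    """log10 of how much more an event `days` ago counts than one today."""
    return days * math.log10(daily_factor)


# Reading A (assumption): each day retains 20% of the weight -> factor 5 per day.
reading_a = log10_importance_ratio(1 / (1 - 0.8))
# Reading B (assumption): weight is divided by (1 + 0.8) per day.
reading_b = log10_importance_ratio(1 + 0.8)

print(f"Reading A: about 10^{reading_a:.0f}")
print(f"Reading B: about 10^{reading_b:.0f}")
```

Under either reading the exponent is far above 1000, so the ">>10^1000" claim goes through, and a 10^-100 probability does not come close to cancelling it.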

Indeed, if we take the MEC approach to moral uncertainty, we can see that this possibility of ultra-neartermism + past influence will dominate our actions for any re...

Future-proof ethics

I think the title of this post doesn't quite match the dialogue. Most of the dialogue is about whether additional good lives are at least somewhat good. But that's different from whether each additional good life is morally equivalent to a prevented death. The former seems more plausible than the latter, to me.

Separating the two will lead to some situations where a life is bad to create but also good to save, once started. That seems more like a feature than a bug. If you ask people in surveys, my impression is that some small fraction of people say that th...

2 Holden Karnofsky 4mo
I think that's a fair point. These positions just pretty much end up in the same place when it comes to valuing existential risk.
We're announcing a $100,000 blog prize

I assume it's fine to prominently link to the EA forum or LW as the place to leave comments? Like e.g. cold takes does.

6 Nick Whitaker 5mo
Yes, that's fine.
Yonatan Cale's Shortform

2. The best workout game I found is "thrill of the fight", I have some tips before you try it. Also, not everyone will like it

What are your tips?

2 Yonatan Cale 6mo
TIPS FOR THRILL OF THE FIGHT

SAFETY
  1. Configure the game (the "guardian") to keep some space from things like walls, so you won't punch them by accident.
  2. Don't straighten your elbow completely when you punch (keep it a bit bent), otherwise you might damage it in real life.
    1. For the same reason, don't do strange bad things with your posture, such as bending your spine sideways.
    2. Generally bad pain is bad, you know.
  3. I can say more about this if it would help.

END YOUR FIRST ROUND EARLY, MAYBE
Consider doing the first round for only 20 seconds or so to avoid becoming overly exhausted without noticing (this has happened to one person I saw). How to stop?
  • You can just take your headset off. If you're like me, you're totally going to forget this, but you can.
  • You can also "hug" your opponent for ~5 seconds to end the fight.

TIPS THAT WOULD APPEAR IN A TUTORIAL, IF IT WOULD EXIST, I THINK
Consider skipping this if you want to investigate the game's mechanics completely by yourself, hpmor style, but here goes:
  • I'd start by fighting the "dummy". You can see your stats on the right of the dummy, including how much damage your last punch did. Then go to "fight".
  • The game cares a lot about how strong you punch.
    • Many punches you do will be too weak and do 0 damage.
    • To see how much damage you did, check the color of the punch when it hits (you'll see):
      • blue = zero damage
      • yellow = nice damage
      • red = a ton of damage
  • You can dodge, including by ducking (great workout if you ask me).
  • You can block. If your opponent's punch hits your glove before it hits you, it will do zero damage to you.
  • Professional boxers on Youtube say that this game is reasonably realistic (even if not perfect), I'd take that as a prior for most uncertainties that I have (mainly around what technique to use).
  • Consider startin
Simplify EA Pitches to "Holy Shit, X-Risk"

You could replace working on climate change with 'working on or voting in elections', which are also all or nothing.

(Edit: For some previous arguments in this vein, see this post.)

3 Neel Nanda 6mo
Yep! I think this phenomenon of 'things that are technically all-or-nothing, but it's most useful to think of them as a continuous thing' is really common. E.g., if you want to reduce the number of chickens killed for meat, it helps to stop buying chicken. This lowers demand, which will on average lower chickens killed. But the underlying thing is meat companies noticing and reducing production, which is pretty discrete and chunky and hard to predict well (though not literally all-or-nothing). Basically any kind of campaign to change minds or achieve social change with some political goal also comes under this. I think AI Safety is about as much a Pascal's Mugging as any of these other things.
New EA Cause Area: Run Blackwell's Bookstore

SSC argued that there was not enough money in politics

To be clear, SSC argued that there was surprisingly little money in politics. The article explicitly says "I don’t want more money in politics".

2 Ben Pace 6mo
That’s right.
External Evaluation of the EA Wiki

Here's one idea: Automatic or low-effort linking to wiki-tags when writing posts or comments. A few different versions of this:

  • When you write a comment or post that contains the exact name of a tag/wiki article, those words automatically link to that tag. (This could potentially be turned on/off in the editor or in your personal prefs.)
  • The same as the above except it only happens if you do something special to the words, e.g. enclose them in [[double brackets]], surround them by [tag] [/tag], or capitalise correctly. (Magic the gathering forums often h
...
3 Paal Fredrik Skjørten Kvarberg 8mo
Strong upvoted! I think something like this would introduce exactly the kinds of people whom we would like to use the wiki, to the wiki. I like the first version best, as many writers might not be aware of the ways to link to tags, and not be aware of what tags exist. Also, this nudges writers to use the same concepts for their words (because it is embarrassing to use a word linked to a tag in another sense than is explained in that tag).
What is the EU AI Act and why should you care about it?

I think this is a better link to FLI's position on the AI act:

(The one in the post goes to their opinion on liability rules. I don't know the relationship between that and the AI act.)

Thank you for spotting that mistake. This is the position I meant to link to; I've replaced the link in the post.
How many EA 2021 $s would you trade off against a 0.01% chance of existential catastrophe?

Seems better than the previous one, though imo still worse than my suggestion, for 3 reasons:

  • it's more complex than asking about immediate extinction. (Why exactly 100 year cutoff? why 50%?)
  • since the definition explicitly allows for different x-risks to be differently bad, the amount you'd pay to reduce them would vary depending on the x-risk. So the question is underspecified.
  • The independence assumption is better if funders often face opportunities to reduce a Y%-risk that's roughly independent from most other x-risk this century. Your suggestion is bette
...
How many EA 2021 $s would you trade off against a 0.01% chance of existential catastrophe?

Currently, the post says:

A risk of catastrophe where an adverse outcome would permanently cause Earth-originating intelligent life's astronomical value to be <50% of what it would otherwise be capable of.

I'm not a fan of this definition, because I find it very plausible that the expected value of the future is less than 50% of what humanity is capable of. Which e.g. raises the question: does even extinction fulfil the description? Maybe you could argue "yes": but the mix of causing an actual outcome compared with what intelligent life is "capable ...

How do people feel about a proposed new definition:
Listen to more EA content with The Nonlinear Library

I see you've started including some text from the post in each episode description, which is useful! Could you also include the URL to the post, at the top of the episode description? I often want to check out comments on interesting posts.

Opportunity Costs of Technical Talent: Intuition and (Simple) Implications

For example, I can't imagine any EA donor paying a non-ML engineer/manager $400,000, even if that person could make $2,000,000 in industry.

Hm, I thought lightcone infrastructure might do that.

Our current salary policy is to pay rates competitive with industry salary minus 30%. Given prevailing salary levels in the Bay Area for the kind of skill level we are looking at, we expect salaries to start at $150k/year plus healthcare (but we would be open to paying $315k for someone who would make $450k in industry).

4 Ozzie Gooen 9mo
Yea; lightcone is much closer to any other group I knew of before. I was really surprised by their announcement. I think it's highly unusual (this seems much higher than the other and previous non-ai eng roles I knew of). I'd also be very surprised if Lightcone chose someone for $400,000 or more. My guess is that they'll be aiming for the sorts of people who aren't quite that expensive. So, I think Lightcone is making a reasonable move here, but it's an unusual move. Also, if we thought that future ea/engineering projects would have to pay $200k-500k per engineer, I think that would change the way we think about them a lot.
Preprint is out! 100,000 lumens to treat seasonal affective disorder

For 100,000 LM, 12 hours a day, that would be 1000W * 12h/day * 20c/kwh = $2.4.
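The electricity estimate above can be reproduced with a short Python sketch. The figures (1000 W, 12 hours a day, 20 cents per kWh) are the ones quoted in the comment; the function name is only illustrative.

```python
def daily_electricity_cost(power_watts: float, hours_per_day: float, price_per_kwh: float) -> float:
    """Cost per day of running a constant load, in the currency of price_per_kwh."""
    energy_kwh = power_watts / 1000 * hours_per_day  # W -> kW, then kWh per day
    return energy_kwh * price_per_kwh


cost = daily_electricity_cost(power_watts=1000, hours_per_day=12, price_per_kwh=0.20)
print(f"${cost:.2f} per day")
```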

yes, when we did the calculation, it was something like €2 per day (for ~6-8 hours per day). Still very cheap for a depression treatment :-)
Why aren't you freaking out about OpenAI? At what point would you start?

The website now lists Helen Toner, but does not list Holden, so it seems he is no longer on the board.

That's pretty wild, especially considering getting Holden on the board was a major condition of OpenPhilanthropy's $30,000,000 grant:

Though it also says the grant was for 3 years, so maybe it shouldn't be surprising that his board seat only lasted that long.

We're Redwood Research, we do applied alignment research, AMA

Hm, could you expand on why collusion is one of the most salient ways in which "it’s possible to build systems that are performance-competitive and training-competitive, and do well on average on their training distribution" could fail?

Is the thought here that — if models can collude — then they can do badly on the training distribution in an unnoticeable way, because they're being checked by models that they can collude with?

Yeah basically.
When pooling forecasts, use the geometric mean of odds

My answer is that we need to understand the resilience of the aggregated prediction to new information.

This seems roughly right to me. And in particular, I think this highlights the issue with the example of institutional failure. The problem with aggregating predictions to a single guess p of annual failure, and then using p to forecast, is that it assumes that the probability of failure in each year is independent from our perspective. But in fact, each year of no failure provides evidence that the risk of failure is low. And if the forecasters' estimate...
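A minimal Python sketch of the updating argument, under made-up assumptions: two equally weighted hypotheses about the annual failure probability (1% and 50%), which are not from the post. Under a single fixed p the chance of failure next year never changes; under a mixture, each failure-free year shifts weight toward the low-risk hypothesis, so the implied chance of failure next year falls.

```python
def next_year_failure_prob(annual_probs, weights, years_survived):
    """P(failure next year | no failure in the first `years_survived` years).

    Each hypothesis i says failure happens independently each year with
    probability annual_probs[i]; weights are the prior credences.
    """
    # Bayes: posterior weight is proportional to prior * P(survived | hypothesis).
    posterior = [w * (1 - p) ** years_survived for p, w in zip(annual_probs, weights)]
    total = sum(posterior)
    return sum(post * p for post, p in zip(posterior, annual_probs)) / total


hypotheses = [0.01, 0.5]  # hypothetical forecasts, for illustration only
priors = [0.5, 0.5]       # equal weights, also an assumption

for years in (0, 1, 5, 10):
    print(years, round(next_year_failure_prob(hypotheses, priors, years), 4))
```

With these inputs the implied annual risk starts at 0.255 and drops toward 0.01 as failure-free years accumulate, which a single pooled p cannot represent.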

1 Jaime Sevilla 10mo
I think this is a good account of the institutional failure example, thank you!
EA Hangout Prisoners' Dilemma

According to wikipedia, the $300  vs $100 is fine for a one-shot prisoner's dilemma. But an iterated prisoner's dilemma would require (defect against cooperate)+(cooperate against defect) < 2*(cooperate cooperate), since the best outcome is supposed to be permanent cooperate/cooperate rather than alternating cooperation/defection.

However, the fact that this game gives out the same $0 for both cooperate/defect and defect/defect means it nevertheless doesn't count as an ordinary prisoner's dilemma. Defecting against someone who defects needs to be s...
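The conditions discussed in this comment can be written down as a small Python check. The mapping of dollar amounts to payoffs is my reading of the comment, not something it states outright: $300 for defecting against a cooperator, $100 each for mutual cooperation, and $0 for both the exploited cooperator and mutual defection.

```python
def prisoners_dilemma_checks(temptation, reward, punishment, sucker):
    """Evaluate the standard prisoner's dilemma payoff conditions.

    temptation: defect against a cooperator
    reward:     mutual cooperation
    punishment: mutual defection
    sucker:     cooperate against a defector
    """
    return {
        "defecting_on_cooperator_pays": temptation > reward,
        "mutual_cooperation_beats_mutual_defection": reward > punishment,
        "defecting_on_defector_pays": punishment > sucker,
        # Iterated condition: alternating exploitation must not beat steady cooperation.
        "iterated_condition": temptation + sucker < 2 * reward,
    }


# Assumed payoffs for the game described in the comment.
checks = prisoners_dilemma_checks(temptation=300, reward=100, punishment=0, sucker=0)
for name, holds in checks.items():
    print(f"{name}: {holds}")
```

Under these assumed payoffs the first two conditions hold, while the last two fail, matching the two objections raised above.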

MichaelA's Shortform

Thanks, I appreciate having something to link to! My independent impression is that it would be even easier to link to and easier to find as a top-level post.

Thanks for the suggestion - I've now gone ahead and made that top-level post :)
Why AI alignment could be hard with modern deep learning

FWIW, I think my median future includes humanity solving AI alignment but messing up reflection/coordination in some way that makes us lose out on most possible value. I think this means that longtermists should think more about reflection/coordination-issues than we're currently doing. But technical AI alignment seems more tractable than reflection/coordination, so I think it's probably correct for more total effort to go towards alignment (which is the status quo).

I'm undecided about whether these reflection/coordination-issues are best framed as "AI risk" or not. They'll certainly interact a lot with AI, but we would face similar problems without AI.

Honoring Petrov Day on the EA Forum: 2021

This was proposed and discussed 2 years ago here.

What should "counterfactual donation" mean?

Say I offer to make a counterfactual donation of $50 to the Against Malaria Foundation (AMF) if you do a thing; which of the following are ok for me to do if you don't?

I think this misses out on an important question, which is "What would you have done with the money if you hadn't offered the counterfactual donation?"

If you were planning to donate to AMF, but then realised that you could make me do X by commiting to burn the money if I don't do X, I think that's not ok, in two senses:

  • Firstly, if you just state that the donation is counterfactual, I would i
...
How to succeed as an early-stage researcher: the “lean startup” approach

I'm confused about your FAQ's advice here. Some quotes from the longer example:

Let’s say that Alice is an expert in AI alignment, and Bob wants to get into the field, and trusts Alice’s judgment. Bob asks Alice what she thinks is most valuable to work on, and she replies, “probably robustness of neural networks”. [...]  I think Bob should instead spend some time thinking about how a solution to robustness would mean that AI risk has been meaningfully reduced. [...] It’s possible that after all this reflection, Bob concludes that impact regularization

...
3 Rohin Shah 1y
In that example, Alice has ~5 min of time to give feedback to Bob; in Toby's case the senior researchers are (in aggregate) spending at least multiple hours providing feedback (where "Bob spent 15 min talking to Alice and seeing what she got excited about" counts as 15 min of feedback from Alice). That's the major difference. I guess one way you could interpret Toby's advice is to simply get a project idea from a senior person, and then go work on it yourself without feedback from that senior person -- I would disagree with that particular advice. I think it's important to have iterative / continual feedback from senior people.
What is the EU AI Act and why should you care about it?

Thank you for this! Very useful.

The AI act creates institutions responsible for monitoring high-risk systems and the monitoring of AI progress as a whole.

In what sense is the AI board (or some other institution?) responsible for monitoring AI progress as a whole?

Sorry I should have said "monitoring AI progress in Europe as a whole" and even then I think it might be misleading. One of the three central tasks of the AI board is to 'coordinate and contribute to guidance and analysis by the Commission and the national supervisory authorities and other competent authorities on emerging issues across the internal market with regard to matters covered by this Regulation;' For example, if a high-risk AI system is compliant but still poses a risk the provider is required to immediately inform the AI Board. The national supervisory authorities must also regularly report back to the AI Board about the results of their market surveillance and more. So the AI Board both gets the mandate and the information to monitor how AI progresses in the EU. And they have to do so to carry out their task effectively even if it's not directly stated anywhere that they are required to do so. I hope this clears it up, I'm happy that you found the post useful!
How to succeed as an early-stage researcher: the “lean startup” approach

One reason to publish papers (specifically) about AI governance (specifically) is if you want to build an academic field working on AI governance. This is good both to get more brainpower and to get more people (who otherwise wouldn't read EA research) to take the research seriously, in the long term. C.f. the last section here

Moral dilemma

Sorry to hear you're struggling! As others have said, getting to a less tormented state of mind should likely be your top priority right now.

(I think this would be true even if you only cared about understanding these issues and acting accordingly, because they're difficult enough that it's hard to make progress without being able to think clearly about them. I think that focusing on getting better would be your best bet even if there's some probability that you'll care less about these issues in the future, as you mentioned worrying about in a diffe...

Most research/advocacy charities are not scalable

With a bunch of unrealistic assumptions (like constant cost-effectiveness), the counterfactual impact should be (impact/resource - opportunitycost/resource) * resource.

If impact/resource is much bigger than opportunitycost/resource (so that the latter is negligible) this is roughly equal to impact/resource * resource, which is one reading of cost-effectiveness * scale.

If so, assuming that resource=$ in this case, this roughly translates to the heuristic "if the opportunity cost of money isn't that high (compared to your project), you should optimise for total impact without thinking much about the monetary costs".
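The formula above is simple enough to sketch in Python. The numbers in the example are invented purely to show the approximation; the function names are illustrative.

```python
def counterfactual_impact(impact_per_resource, opportunity_cost_per_resource, resource):
    """(impact/resource - opportunitycost/resource) * resource, assuming constant cost-effectiveness."""
    return (impact_per_resource - opportunity_cost_per_resource) * resource


def naive_impact(impact_per_resource, resource):
    """cost-effectiveness * scale, ignoring opportunity cost."""
    return impact_per_resource * resource


# Hypothetical project: 10 units of impact per dollar, opportunity cost of 0.1 per dollar.
exact = counterfactual_impact(10, 0.1, 1_000_000)
approx = naive_impact(10, 1_000_000)
print(exact, approx, f"relative error: {(approx - exact) / exact:.1%}")
```

When the opportunity cost is small relative to the project's cost-effectiveness, the two figures differ by about 1% here, which is the sense in which the second term is negligible.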

Good point. We could also read "impact/resource - opportunitycost/resource" as a cost-effectiveness estimate that takes opportunity costs into account. I think Charity Entrepreneurship has been optimizing for this (at least sometimes, based on the work I've seen in the animal space) and they refer to it as a cost-effectiveness estimate, but I think this is not typical in EA. Also, this is looking more like cost-benefit analysis than cost-effectiveness analysis.
Most research/advocacy charities are not scalable

Based on vaguely remembered hearsay, my heuristic has been that the large AI  labs like DeepMind and OpenAI spend roughly as much on compute as they do on people, which would make for a ~2x increase in costs. Googling around doesn't immediately get me any great sources, although this page says "Cloud computing services are a major cost for OpenAI, which spent $7.9 million on cloud computing in the 2017 tax year, or about a quarter of its total functional expenses for that year".

I'd be curious to get a better estimate, if anyone knows anything relevant.

Most research/advocacy charities are not scalable

There may be reasons why building such 100m+ projects is different both from many smaller "hits based" funding of Open Phil projects (as a high chance of failure is unacceptable) and also different than the GiveWell-style interventions.

One reason is that orgs like OpenAI and CSET require such scale just to get started, e.g. to interest the people involved

This sounds like CSET is a 100m+ project. Their OpenPhil grant was for $11m/year for 5 years, and wikipedia says they got a couple of million from other sources, so my guess is they're currently sp...

1 Charles He 1y
Thank you for pointing this out. You are right, and I think maybe even a reasonable guess is that CSET funding is starting out at less than 10M a year.
Yes, I wouldn't say CSET is a mega project, though more CSET-like things would also be amazing.
Further thoughts on charter cities and effective altruism

this page has some statistics on openphil's giving (though it is noted to be preliminary)

[Future Perfect] How to be a good ancestor

Sweden has a “Ministry of the Future,”

Unfortunately, this is now a thing of the past. It only lasted 2014-2016. (Wikipedia on the minister post: )

What are some key numbers that (almost) every EA should know?

The last two should be 10^11 - 10^12 and 10^11, respectively?

Oops, that's why you don't try to do mental arithmetic that will shape the future of our lightcone at 1AM in the morning.
A ranked list of all EA-relevant (audio)books I've read

This has been discussed on lw here:

Strong opinions on both sides, with a majority of people currently thinking about current karma levels occasionally but not always.

Were the Great Tragedies of History “Mere Ripples”?

It seems fine to switch between critiquing the movement and critiquing the philosophy, but I think it'd be better if the switch was made clear.


There are many longtermists that don't hold these views (eg. Will MacAskill is literally about to publish the book on longtermism and doesn't think we're at an especially influential time in history, and patient philanthropy gets taken seriously by lots of longtermists).

Yeah this seems right, maybe with the caveat that Will has (as far as I know) mostly expressed skepticism about this being the most in...

Were the Great Tragedies of History “Mere Ripples”?

Granted, there are probably longtermists that do hold these views, but these views are not longtermism. I don’t know whether Bostrom (whose views seem to be the focus of the book) holds these views. Even if he does, these views are not longtermism.

I haven't read the top-level post (thanks for summarising!); but in general, I think this is a weak counterargument. If most people in a movement (or academic field, or political party, etc) hold a rare belief X, it's perfectly fair to criticise the movement for believing X. If the movement claims that X isn'...

4 Alex HT 2y
Thanks for your comment, it makes a good point. My comment was hastily written and I think my argument that you're referring to is weak, but not as weak as you suggest.

At some points the author is specifically critiquing longtermism the philosophy (not what actual longtermists think and do), e.g. when talking about genocide. It seems fine to switch between critiquing the movement and critiquing the philosophy, but I think it'd be better if the switch was made clear.

There are many longtermists that don't hold these views (e.g. Will MacAskill is literally about to publish the book on longtermism and doesn't think we're at an especially influential time in history, and patient philanthropy gets taken seriously by lots of longtermists). I'm also not sure that lots of longtermists (even of the Bostrom/hinge of history type) would agree that the quoted claim accurately represents their views.

But, I do agree that some longtermists do think
  • there are likely to be very transformative events soon, e.g. within 50 years
  • in the long run, if they go well, these events will massively improve the human condition

And there's some criticisms you can make of that kind of ideology that are similar to the criticisms the author makes.
Scope-sensitive ethics: capturing the core intuition motivating utilitarianism

As a toy example, say that S(x) is some bounded sigmoid function, and my utility function is to maximize E[S(x)]; it's always going to be the case that E[S(x_1)] ≥ E[S(x_2)] ⇔ x_1 ≥ x_2, so I am in some sense scope sensitive, but I don't think I'm open to Pascal's mugging

This seems right to me.

I think it means that there is something which we value linearly, but that thing might be a complicated function of happiness, preference satisfaction, etc.

Yeah, I have no quibbles with this. FWIW, I personally didn't interpret the passage as sayi...

That makes sense; your interpretation does seem reasonable, so perhaps a rephrase would be helpful.
Lessons from my time in Effective Altruism

I agree it's partly a lucky coincidence, but I also count it as some general evidence. Ie., insofar as careers are unpredictable, up-skilling in a single area may be a bit less reliably good than expected, compared with placing yourself in a situation where you get exposed to lots of information and inspiration that's directly relevant to things you care about. (That last bit is unfortunately vague, but seems to gesture at something that there's more of in direct work.)

Yepp, I agree with this. On the other hand, since AI safety is mentorship-constrained, if you have good opportunities to upskill in mainstream ML, then that frees up some resources for other people. And it also involves building up wider networks. So maybe "similar expected value" is a bit too strong, but not that much.
Scope-sensitive ethics: capturing the core intuition motivating utilitarianism

Endorsing actions which, in expectation, bring about more intuitively valuable aspects of individual lives (e.g. happiness, preference-satisfaction, etc), or bring about fewer intuitively disvaluable aspects of individual lives

If this is the technical meaning of "in expectation", this brings in a lot of baggage. I think it implicitly means that you value those things ~linearly in their amount (which makes the second statement superfluous?), and it opens you up to pascal's mugging.

I think it means that there is something which we value linearly, but that thing might be a complicated function of happiness, preference satisfaction, etc. As a toy example, say that S(x) is some bounded sigmoid function, and my utility function is to maximize E[S(x)]; it's always going to be the case that E[S(x_1)] ≥ E[S(x_2)] ⇔ x_1 ≥ x_2, so I am in some sense scope sensitive, but I don't think I'm open to Pascal's mugging. (Correct me if this is wrong though.)
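A minimal Python sketch of the toy example, assuming the logistic function as the bounded sigmoid S and using invented numbers for the mugging (a 10^-10 chance of 10^100 units of value, otherwise nothing). It illustrates that S is still increasing in x, so more is always better, while a tiny chance of an astronomically large payoff cannot dominate the expectation because S never exceeds 1.

```python
import math


def bounded_sigmoid(x: float) -> float:
    """Logistic function in (0, 1), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def expected_utility(lottery):
    """Expected value of S(x) for a lottery given as (probability, outcome) pairs."""
    return sum(p * bounded_sigmoid(x) for p, x in lottery)


def expected_value(lottery):
    """Plain (linear) expected value of the same lottery."""
    return sum(p * x for p, x in lottery)


keep_wallet = [(1.0, 1.0)]                        # sure thing worth 1 unit
pay_mugger = [(1e-10, 1e100), (1 - 1e-10, 0.0)]   # hypothetical mugging

print("linear EV prefers paying:   ", expected_value(pay_mugger) > expected_value(keep_wallet))
print("bounded E[S] prefers paying:", expected_utility(pay_mugger) > expected_utility(keep_wallet))
```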