Lukas_Finnveden

Sorted by New

# Wiki Contributions

External Evaluation of the EA Wiki

Here's one idea: Automatic or low-effort linking to wiki-tags when writing posts or comments. A few different versions of this:

• When you write a comment or post that contains the exact name of a tag/wiki article, those words automatically link to that tag. (This could potentially be turned on/off in the editor or in your personal preferences.)
• The same as the above, except it only happens if you do something special to the words, e.g. enclose them in [[double brackets]], surround them with [tag] [/tag], or capitalise correctly. (Magic: The Gathering forums often have something like this for linking to cards.)
• The same as the above, except there's some helpful search function that helps you find relevant wiki articles. E.g. you type [[ or you click some particular button in the editor, and then a box for searching for tags pops up. (Similar to linking to another page in Roam. This could also be implemented for linking to posts.)
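The double-bracket variant could be sketched with a simple regex pass over the text. This is only an illustration: the `TAG_URLS` table, tag names, and URL scheme below are made up, and a real implementation would query the site's tag database instead.

```python
import re

# Hypothetical mapping from tag names to wiki URLs (illustrative only).
TAG_URLS = {
    "Existential risk": "/tag/existential-risk",
    "Forecasting": "/tag/forecasting",
}

def link_tags(text: str) -> str:
    """Replace [[Tag Name]] with a markdown link if the tag exists;
    otherwise leave the bracketed text unchanged."""
    def repl(match: re.Match) -> str:
        name = match.group(1).strip()
        url = TAG_URLS.get(name)
        return f"[{name}]({url})" if url else match.group(0)

    return re.sub(r"\[\[([^\[\]]+)\]\]", repl, text)

print(link_tags("See [[Forecasting]] and [[Unknown Tag]]."))
# → See [Forecasting](/tag/forecasting) and [[Unknown Tag]].
```

Leaving unknown tags untouched (rather than erroring) matches how wiki-style editors usually behave, and makes the feature safe to run on every comment.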
What is the EU AI Act and why should you care about it?

I think this is a better link to FLI's position on the AI act: https://ec.europa.eu/info/law/better-regulation/have-your-say/initiatives/12527-Artificial-intelligence-ethical-and-legal-requirements/F2665546_en

(The one in the post goes to their opinion on liability rules. I don't know the relationship between that and the AI act.)

How many EA 2021 $s would you trade off against a 0.01% chance of existential catastrophe?

Seems better than the previous one, though imo still worse than my suggestion, for 3 reasons:

• It's more complex than asking about immediate extinction. (Why exactly a 100-year cutoff? Why 50%?)
• Since the definition explicitly allows for different x-risks to be differently bad, the amount you'd pay to reduce them would vary depending on the x-risk. So the question is underspecified.
• The independence assumption is better if funders often face opportunities to reduce a Y% risk that's roughly independent from most other x-risk this century. Your suggestion is better if funders often face opportunities to reduce Y percentage points of all x-risk this century (e.g. if all risks are completely disjunctive, s.t. if you remove a risk, you're guaranteed to not be hit by any other risk).
  • For your two examples, the risks from asteroids and climate change are mostly independent from the majority of x-risk this century, so there the independence assumption is better.
  • The disjunctive assumption can happen if we e.g. study different mutually exclusive cases, e.g. reducing risk from worlds with fast AI take-off vs reducing risk from worlds with slow AI take-off.
  • I weakly think that the former is more common.
  • (Note that the difference only matters if total x-risk this century is large.)

Edit: This is all about what version of this question is the best version, independent of inertia. If you're attached to percentage points because you don't want to change to an independence assumption after there's already been some discussion on the post, then your latest suggestion seems good enough. (Though I think most people have been assuming a low total amount of x-risk, so independence or not probably doesn't matter much for the existing discussion.)

How many EA 2021 $s would you trade off against a 0.01% chance of existential catastrophe?

Currently, the post says:

A risk of catastrophe where an adverse outcome would permanently cause Earth-originating intelligent life's astronomical value to be <50% of what it would otherwise be capable of.

I'm not a fan of this definition, because I find it very plausible that the expected value of the future is less than 50% of what humanity is capable of. Which e.g. raises the question: does even extinction fulfil the description? Maybe you could argue "yes", but comparing an actual outcome with what intelligent life is "capable of" makes all of this unnecessarily dependent on both definitions and empirics about the future.

For purposes of the original question, I don't think we need to deal with all the complexity around "curtailing potential". You can just ask: how much should a funder be willing to pay to remove a 0.01% risk of extinction that's independent from all other extinction risks we're facing? (E.g., a giganormous asteroid is on its way to Earth and has a 0.01% probability of hitting us, causing guaranteed extinction. No one else will notice this in time. Do we pay $X to redirect it?) This seems closely analogous to questions that funders are facing (are we keen to pay to slightly reduce one contemporary extinction risk). For non-extinction x-risk reduction, this extinction estimate will be informative as a comparison point, and it seems completely appropriate that you should also check "how bad is this purported x-risk compared to extinction" as a separate exercise.

Listen to more EA content with The Nonlinear Library

I see you've started including some text from the post in each episode description, which is useful! Could you also include the URL to the post at the top of the episode description? I often want to check out comments on interesting posts.

Opportunity Costs of Technical Talent: Intuition and (Simple) Implications

For example, I can't imagine any EA donor paying a non-ML engineer/manager $400,000, even if that person could make $2,000,000 in industry.

Hm, I thought Lightcone Infrastructure might do that:

Our current salary policy is to pay rates competitive with industry salary minus 30%. Given prevailing salary levels in the Bay Area for the kind of skill level we are looking at, we expect salaries to start at $150k/year plus healthcare (but we would be open to paying $315k for someone who would make $450k in industry).

https://www.lesswrong.com/posts/eR7Su77N2nK3e5YRZ/the-lesswrong-team-is-now-lightcone-infrastructure-come-work-3

100,000 lumens to treat seasonal affective disorder

Preprint is out!

For 100,000 lm, 12 hours a day, that would be 1000 W * 12 h/day * $0.20/kWh = $2.40/day.

Why aren't you freaking out about OpenAI? At what point would you start?

The website now lists Helen Toner, but does not list Holden, so it seems he is no longer on the board.

We're Redwood Research, we do applied alignment research, AMA

Hm, could you expand on why collusion is one of the most salient ways in which "it’s possible to build systems that are performance-competitive and training-competitive, and do well on average on their training distribution" could fail?

Is the thought here that — if models can collude — then they can do badly on the training distribution in an unnoticeable way, because they're being checked by models that they can collude with?

When pooling forecasts, use the geometric mean of odds

My answer is that we need to understand the resilience of the aggregated prediction to new information.

This seems roughly right to me. And in particular, I think this highlights the issue with the example of institutional failure. The problem with aggregating predictions into a single guess p of the annual failure probability, and then using p to forecast, is that it assumes that the probability of failure in each year is independent from our perspective. But in fact, each year of no failure provides evidence that the risk of failure is low. And if the forecasters' estimates initially had a wide spread, then we're very sensitive to new information, and so we should update more on each passing year. This would lead to a high probability of failure in the first few years, but still a moderately high expected lifetime.
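A toy illustration of this point, treating two forecasters' estimates as equally weighted hypotheses about the annual failure rate (the specific rates are made up): the pooled first-year risk is high, but each year of survival shifts weight toward the low-risk hypothesis, so the forecast risk falls over time.

```python
# Two hypotheses about the annual failure probability, with equal prior weight.
rates = [0.001, 0.10]    # 0.1%/year vs 10%/year (illustrative values)
weights = [0.5, 0.5]

def failure_prob(weights, rates):
    """Probability of failure in the coming year, given current weights."""
    return sum(w * r for w, r in zip(weights, rates))

def update_on_survival(weights, rates):
    """Bayesian update of hypothesis weights after one year with no failure."""
    posterior = [w * (1 - r) for w, r in zip(weights, rates)]
    total = sum(posterior)
    return [w / total for w in posterior]

print(f"year 0 risk:  {failure_prob(weights, rates):.4f}")
for _ in range(20):
    weights = update_on_survival(weights, rates)
print(f"year 20 risk: {failure_prob(weights, rates):.4f}")
```

With these numbers the pooled risk starts at about 5% per year but drops to roughly 1% after 20 failure-free years, whereas a fixed pooled p would predict the same 5% every year. That's the sense in which wide initial spread makes the aggregate sensitive to new information.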