There might not be any real disagreement. I'm just saying that there's no direct conflict between "present people having material wealth beyond what they could possibly spend on themselves" and "virtually all resources are used in the way that totalist axiologies would recommend".
What's the argument for why an AI future will create lots of value by total utilitarian lights?
At least for hedonistic total utilitarianism, I expect that a large majority of expected-hedonistic-value (from our current epistemic state) will be created by people who are at least partially sympathetic to hedonistic utilitarianism or other value systems that value a similar type of happiness in a scope-sensitive fashion. And I'd guess that humans are more likely to have such values than AI systems. (At least conditional on my thinking that such values are a g...
I find it plausible that future humans will choose to create far fewer minds than they could. But I don't think that "selfishly desiring high material welfare" will require this. The Milky Way alone has enough stars for each currently alive human to get an entire solar system. Simultaneously, intergalactic colonization is probably possible (see here), and I think the stars in our own galaxy are less than 1-in-a-billion of all reachable stars. (Most of which are also very far away, which further contributes to them not being very interesting to use for s...
compared to MIRI people, or even someone like Christiano, you, or Joe Carlsmith probably have "low" estimates
Christiano says ~22% ("but you should treat these numbers as having 0.5 significant figures") without a time-bound; and Carlsmith says ">10%" (see bottom of abstract) by 2070. So no big difference there.
I'll hopefully soon make a follow-up post with somewhat more concrete projects that I think could be good. That might be helpful.
Are you more concerned that research won't have any important implications for anyone's actions, or that the people whose decisions ought to change as a result won't care about the research?
Similarly, 'Politics is the Mind-Killer' might be the rationalist idea that has aged worst - especially in its influence on EA.
What influence are you thinking about? The position argued in the essay seems pretty measured.
Politics is an important domain to which we should individually apply our rationality—but it’s a terrible domain in which to learn rationality, or discuss rationality, unless all the discussants are already rational. [...]
...I’m not saying that I think we should be apolitical, or even that we should adopt Wikipedia’s ideal of the Neu
I'm relying on my social experience and intuition here, so I don't expect I've got it 100% right, and others may indeed have different interpretations of the community's history with engaging with politics.
But concern about people over-extrapolating from Eliezer's initial post (many such cases) and treating it as more of a norm to ignore politics full-stop seems to have been well established many years ago (related discussion here). I think that there's probably an interaction effect with the 'latent libertarianism' in early LessWrong/Rationalist space ...
I think the strongest argument against EV-maximization in these cases is the Two-Envelopes Problem for Uncertainty about Brain-Size Valuation.
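For reference, here is the generic two-envelopes arithmetic (not specific to brain-size weights; the numbers are illustrative):

```python
# Classic two-envelopes arithmetic: conditioning on your envelope
# holding x, the other envelope "seems" worth 1.25x, and the same
# reasoning run from the other side says the reverse, so naive
# EV-maximizing switching flip-flops.
x = 100.0
seeming_value_of_other = 0.5 * (2 * x) + 0.5 * (x / 2)
```

The brain-size version has the same structure: whichever valuation scheme you fix as the unit, the EV calculation seems to favor weighting by the other one.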
I liked this recent interview with Mark Dybul who worked on PEPFAR from the start: https://www.statecraft.pub/p/saving-twenty-million-lives
One interesting contrast with the conclusion in this post is that Dybul thinks that PEPFAR's success was a direct consequence of how it didn't involve too many people and departments early on — because the negotiations would have been too drawn out and too many parties would have tried to get pieces of control. So maybe a transparent process that embraced complexity wouldn't have achieved much, in practice.
(At other par...
FWIW you can see more information, including some of the reasoning, on page 655 (# written on pdf) / 659 (# according to page searcher) of the report. (H/t Isabel.) See also page 214 for the definition of the question.
Some tidbits:
Experts started out much higher than superforecasters, but updated downwards after discussion. Superforecasters updated a bit upward, but less:
(Those are billions on the y-axis.)
This was surprising to me. I think the experts' predictions look too low even before updating, and look much worse after updating!
The part of the ...
It's the crux between you and Ajeya, because you're relatively more in agreement on the other numbers. But I think that adopting the XPT numbers on these other variables would notably slow down your own timelines, because of the almost complete lack of increase in spending.
That said, if the forecasters agreed with your compute requirements, they would probably also forecast higher spending.
The XPT forecasters are so in the dark about compute spending that I just pretend they gave more reasonable numbers. I'm honestly baffled how they could be so bad. The most aggressive of them thinks that in 2025 the most expensive training run will be $70M, and that it'll take 6+ years to double thereafter, so that in 2032 we'll have reached $140M training run spending... do these people have any idea how much GPT-4 cost in 2022?!?!? Did they not hear about the investments Microsoft has been making in OpenAI? And remember that's what the most aggressive among them thought! The conservatives seem to be living in an alternate reality where GPT-3 proved that scaling doesn't work and an AI winter set in in 2020.
in terms of saving “disability-adjusted life years” or DALYs, "a case of HIV/AIDS can be prevented for $11, and a DALY gained for $1” by improving the safety of blood transfusions and distributing condoms
These numbers are wild compared to e.g. current GiveWell numbers. My guess would be that they're wrong, and if so, that this was a big part of why PEPFAR did comparatively better than expected. Or maybe they were significantly less scalable (measured in cost of the marginal life saved as a function of lives saved so far) than PEPFAR.
If the numbers were r...
Nice, gotcha.
Incidentally, as its central estimate for algorithmic improvement, the takeoff speeds model uses AI and Efficiency's ~1.7x per year, and then halves it to ~1.3x per year (because today's algorithmic progress might not generalize to TAI). If you're at 2x per year, then you should maybe increase the "returns to software" from 1.25 to ~3.5, which would cut the model's timelines by something like 3 years. (More on longer timelines, less on shorter timelines.)
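A quick check of the "halving", which I read as halving in log space, i.e. taking the square root of the yearly multiplier (my interpretation, not anything the model states explicitly):

```python
import math

# Halving a growth multiplier in log space = taking its square root.
halved_1_7 = math.sqrt(1.7)  # ~1.30x per year, matching the ~1.3x figure
halved_2_0 = math.sqrt(2.0)  # ~1.41x per year, if you instead start from 2x
```

That the square root of 1.7 lands almost exactly on 1.3 is what makes me read the halving as log-space rather than arithmetic.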
Yeah sorry, I didn't mean to say this directly contradicted anything you said. It just felt like a good reference that might be helpful to you or other people reading the thread. (In retrospect, I should have said that and/or linked it in response to the mention in your top-level comment instead.)
(Also, personally, I do care about how much effort and selection is required to find good retrodictions like this, so in my book "I didn't look up the data on Google beforehand" is relevant info. But it would have been way more impressive if someone had been able ...
and notably there's been perhaps a 2x speedup in algorithmic progress since 2022
I don't understand this. Why would there be a 2x speedup in algorithmic progress?
And, as I think Eliezer said (roughly), there don't seem to be many cases where new tech was predicted based on when some low-level metric would exceed the analogous metric in a biological system. [...] And the way in which machines perform tasks usually looks very different than how biological systems do it (bird vs. airplanes, etc.).
From Birds, Brains, Planes, and AI:
...This data shows that Shorty [hypothetical character introduced earlier in the post] was entirely correct about forecasting heavier-than-air flight. (For details about the data, see appendix.
I think my biggest disagreement with the takeoff speeds model is just that it's conditional on things like: no coordinated delays, regulation, or exogenous events like war, and doesn't take into account model uncertainty.
Cool, I thought that was most of the explanation for the difference in the median. But I thought it shouldn't be enough to explain the 14x difference between 28% and 2% by 2030, because I think there should be a ≥20% chance that there are no significant coordinated delays, regulation, or relevant exogenous events if AI goes wild in the nex...
My own distribution over the training FLOP for transformative AI is centered around ~10^32 FLOP using 2023 algorithms, with a standard deviation of about 3 OOM.
Thanks for the numbers!
For comparison, takeoffspeeds.com has an aggressive Monte Carlo (with a median of 10^31 training FLOP) that yields a median of 2033.7 for 100% automation — and a p(TAI < 2030) of ~28%. That 28% is pretty radically different from your 2%. Do you know your biggest disagreements with that model?
The 1 OOM difference in training FLOP presumably doesn't explain that much. (Althou...
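Written as a distribution over log10(FLOP), the "10^32 with a standard deviation of 3 OOM" claim can be sketched like this (the normal-in-log-space reading is my assumption):

```python
import math

# Sketch: read "centered around 10^32 FLOP with an SD of 3 OOM" as
# log10(required training FLOP) ~ Normal(mu=32, sigma=3).
def p_required_flop_below(log10_flop: float, mu: float = 32.0, sigma: float = 3.0) -> float:
    """P(FLOP required for TAI <= 10**log10_flop), via the normal CDF."""
    z = (log10_flop - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under this reading, the takeoffspeeds.com median of 10^31 sits at roughly the 37th percentile, i.e. only about a third of a standard deviation below the center, which is why the 1 OOM difference alone shouldn't move the bottom line much.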
The quote continues:
Of the remaining 5 %, around 70 % would eventually be reached by other civilisations, while 30 % would have remained empty in our absence.
I think the 70%/30% numbers are the relevant ones for comparing human colonization vs. extinction vs. misaligned AGI colonization. (Since 5% cuts the importance of everything equally.)
...assuming defensive dominance in space, where you get to keep space that you acquire first. I don't know what happens without that.
This would suggest that if we're indifferent between space being totally uncoloni...
If AGI systems had goals that were cleanly separated from the rest of their cognition, such that they could learn and self-improve without risking any value drift (as long as the values-file wasn't modified), then there would be a straightforward argument that you could stabilise and preserve that system's goals just by storing the values-file with enough redundancy and digital error correction.
So this would make section 6 mostly irrelevant. But I think most other sections remain relevant, insofar as people weren't already convinced that being able to build stabl...
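The redundancy-plus-error-correction idea can be sketched with per-byte majority voting across replicas (illustrative only; a real design would use proper ECC such as Reed-Solomon):

```python
from collections import Counter

# Minimal sketch of "redundancy + digital error correction" for a
# static values-file: store several copies, recover by majority vote.
def store(data: bytes, copies: int = 5) -> list:
    return [bytes(data) for _ in range(copies)]

def recover(replicas: list) -> bytes:
    # Majority vote at each byte position across all replicas.
    return bytes(
        Counter(rep[i] for rep in replicas).most_common(1)[0][0]
        for i in range(len(replicas[0]))
    )
```

As long as fewer than half the replicas are corrupted at any given position, the original file is recovered exactly, which is the sense in which stability becomes a pure storage problem under the clean-separation assumption.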
I really like the proposed calibration game! One thing I'm curious about is whether real-world evidence more often looks like a likelihood ratio or like something else (e.g. pointing towards a specific probability being correct). Maybe you could see this from the structure of priors+likelihood-ratios+posteriors in the calibration game — e.g. check whether the long-run top scorers' likelihood ratios correlated more or less than their posterior probabilities.
(If someone wanted to build this: one option would be to start with pastcasting and then give archived ...
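The priors + likelihood ratios + posteriors bookkeeping such a game needs is just the odds form of Bayes' rule (a sketch; the function name is mine):

```python
# Odds-form Bayes update: posterior odds = prior odds * likelihood ratio.
def posterior(prior: float, likelihood_ratio: float) -> float:
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)
```

Note that posterior(0.5, 3.0) gives 0.75 while posterior(0.2, 3.0) gives ~0.43: two forecasters can report the same likelihood ratio while landing on quite different posteriors, which is exactly what would let the proposed check distinguish the two kinds of correlation.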
And it would probably be a huge mistake to seek out an adderall prescription.
...unless you have other reasons to believe that an Adderall prescription might be good for you. Saliently: if you have adhd symptoms.
Depends on how much of their data they'd have to back up like this. If every bit ever produced or operated on instead had to be 25 bits — that seems like a big fitness hit. But if they're only this paranoid about a few crucial files (e.g. the minds of a few decision-makers), then that's cheap.
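Reading "25 bits per bit" as repetition coding with per-bit majority vote (my framing), the residual failure rate can be computed directly:

```python
from math import comb

# Failure probability of n-fold repetition with per-copy error rate p:
# majority vote fails only if more than half the copies are corrupted.
def repetition_failure(n: int, p: float) -> float:
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )
```

With 25 copies and a (made-up) 1% independent per-copy error rate, the per-bit failure probability drops below 1e-19 — enormous reliability, but at a 25x storage overhead, which is why it only looks cheap if reserved for a few crucial files.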
And there's another question about how much stability contributes to fitness. In humans, cancer tends to not be great for fitness. Analogously, it's possible that most random errors in future civilizations would look less like slowly corrupting values and more like...
This is a great question. I think the answer depends on the type of storage you're doing.
If you have a totally static lump of data that you want to encode on a hard drive and not touch for a billion years, I think the challenge is mostly in designing a type of storage unit that won't age. Digital error correction won't help if your whole magnetism-based hard drive loses its magnetism. I'm not sure how hard this is.
But I think more realistically, you want to use a type of hardware that you regularly use, regularly service, and where you can copy the informati...
I'm not sure how literally you mean "disprove", but on its face, "assume nothing is related to anything until you have proven otherwise" is a reasoning procedure that will never recommend any action in the real world, because we never get that kind of certainty. When humans try to achieve results in the real world, heuristics, informal arguments, and looking at what seems to have worked OK in the past are unavoidable.
Global poverty probably has slower diminishing marginal returns, yeah. Unsure about animal welfare. I was mostly thinking about longtermist causes.
Re 80,000 Hours: I don't know exactly what they've argued, but I think "very valuable" is compatible with logarithmic returns. There are also diminishing marginal returns to direct workers in any given cause, so logarithmic returns on money doesn't mean that money becomes unimportant compared to people, or anything like that.
Because utility and integrity are wholly independent variables, so there is no reason for us to assume a priori that they will always correlate perfectly. So if we wish to believe that integrity and expected value correlated for SBF, then we must show it. We must actually do the math.
This feels a bit unfair when people have argued (i) that utility and integrity will correlate strongly in practical cases (why use "perfectly" as your bar?), and (ii) that they will do so in ways that will be easy to underestimate if you just "do the math".
You might think t...
Because a double-or-nothing coin-flip scales; it doesn't stop having high EV when we start dealing with big bucks.
Risky bets aren't themselves objectionable in the way that fraud is, but to just address this point narrowly: realistic estimates put risky bets at much worse EV when you control a large fraction of the altruistic pool of money. I think a decent first approximation is that EA's impact scales with the logarithm of its wealth. If you're gambling a small amount of money, that means you should be ~indifferent to 50/50 double or nothing (note th...
I think marginal returns probably don't diminish nearly as quickly as the logarithm for neartermist cause areas, but maybe that's true for longtermist ones (where FTX/Alameda and associates were disproportionately donating), although my impression is that there's no consensus on this, e.g. 80,000 Hours has been arguing for donations still being very valuable.
(I agree that the downside (damage to the EA community and trust in EAs) is worse than nothing relative to the funds being gambled, but that doesn't really affect the spirit of the argument. It's very easy to underappreciate the downside in practice, though.)
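The logarithmic-returns point can be made concrete (the wealth and stake numbers are purely illustrative):

```python
import math

# Expected log-utility of a 50/50 double-or-nothing bet of `stake`
# out of total (altruistic) wealth W.
def expected_log_utility(W: float, stake: float) -> float:
    return 0.5 * math.log(W + stake) + 0.5 * math.log(W - stake)
```

With W = 100, a stake of 1 leaves expected utility within ~5e-5 of log(100), i.e. near-indifference, while a stake of 90 costs ~0.8 in log terms: the same "positive EV in dollars" gamble becomes badly negative once the stake is a large fraction of the pool.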
conflicts of interest in grant allocation and workplace appointments should be avoided
Worth flagging: Since there are more men than women in EA, I would expect a greater fraction of EA women than EA men to be in relationships with other EAs. (And trying to think of examples off the top of my head supports that theory.) If this is right, the policy "don't appoint people for jobs where they will have conflicts of interest" would systematically disadvantage women.
(By contrast, considering who you're already in a work-relationship with when choosing who to date ...
Yeah, I agree that multipolar dynamics could prevent lock-in from happening in practice.
I do think that "there is a non-trivial probability that a dominant institution will in fact exist", and also that there's a non-trivial probability that a multipolar scenario will either
If re-running evolution requires simulating the weather and if this is computationally too difficult then re-running evolution may not be a viable path to AGI.
There are many things that prevent us from literally rerunning human evolution. The evolution anchor is not a proof that we could do exactly what evolution did, but instead an argument that if something as inefficient as evolution spat out human intelligence with that amount of compute, surely humanity could do it with a similar amount of compute. Evolution is very inefficient — it has itself be...
For instance we might get WBEs only in hypothetical-2080 but get superintelligent LLMs in 2040, and the people using superintelligent LLMs make the world unrecognisably different by 2042 itself.
I definitely don't just want to talk about what happens / what's feasible before the world becomes unrecognisably different. It seems pretty likely to me that lock-in will only become feasible after the world has become extremely strange. (Though this depends a bit on details of how to define "feasible", and what we count as the start-date of lock-in.)
And I think th...
Chaos theory is about systems where tiny deviations in initial conditions cause large deviations in what happens in the future. My impression (though I don't know much about the field) is that, assuming some model of a system (e.g. the weather), you can prove things about how far ahead you can predict the system given some uncertainty (normally about the initial conditions, though uncertainty brought about by limited compute that forces approximations should work similarly). Whether the weather corresponds to any particular model isn't really susceptible to proofs, but that question can be tackled by normal science.
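The logistic map is a standard toy example of this kind of sensitivity; a quick sketch:

```python
# Sensitivity to initial conditions in the logistic map x -> 4x(1-x),
# a standard toy model of chaotic dynamics.
def trajectory(x0: float, steps: int) -> list:
    xs = [x0]
    for _ in range(steps):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs

a = trajectory(0.2, 100)
b = trajectory(0.2 + 1e-10, 100)  # perturbed by one part in 10^10
divergence = max(abs(x - y) for x, y in zip(a[50:], b[50:]))
```

The perturbation roughly doubles each step (this map's Lyapunov exponent is ln 2), so after a few dozen steps the trajectories are fully decorrelated: the prediction horizon grows only logarithmically in your initial-condition precision, which is the kind of thing one can prove within a model.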
Quoting from the post:
Thus, we suspect that an adequate solution to AI alignment could be achieved given sufficient time and effort. (Though whether that will actually happen is a different question, not addressed since our focus is on feasibility rather than likelihood.)
AI doomers tend to agree with this claim. See e.g. Eliezer in list of lethalities:
...None of this is about anything being impossible in principle. The metaphor I usually use is that if a textbook from one hundred years in the future fell into our hands, containing all of the simpl
Thanks Lizka. I think about section 0.0 as being a ~1-page summary (in between the 1-paragraph summary and the 6-page summary) but I could have better flagged that it can be read that way. And your bullet point summary is definitely even punchier.
Thanks!
You've assumed from the get-go that AIs will follow reinforcement-learning-like paradigms similar to humans' and converge on similar ontologies for looking at the world. You've also assumed these ontologies will be stable - for instance, that an RL agent wouldn't become superintelligent, use reasoning, and then decide to self-modify into something that is not an RL agent.
Something like that, though I would phrase it as relying on the claim that it's feasible to build AI systems like that, since the piece is about the feasibility of lock-in. And in...
I broadly agree with this. For the civilizations that want to keep thinking about their values or the philosophically tricky parts of their strategy, there will be an open question about how convergent/correct their thinking process is (although there's lots you can do to make it more convergent/correct — e.g. redo it under lots of different conditions, have arguments be reviewed by many different people/AIs, etc.).
And it does seem like all reasonable civilizations should want to do some thinking like this. For those civilizations, this post is just saying t...
We used the geometric mean of the samples with the minimum and maximum removed to better deal with extreme outliers, as described in our previous post
I don't see how that's consistent with:
...What is the probability that Russia will use a nuclear weapon in Ukraine in the next MONTH?
- Aggregate probability: 0.0859 (8.6%)
- All probabilities: 0.27, 0.04, 0.02, 0.001, 0.09, 0.08, 0.07
What is the probability that Russia will use a nuclear weapon in Ukraine in the next YEAR?
- Aggregate probability: 0.2294 (23%)
- All probabilities: 0.38, 0.11, 0.11, 0.005, 0.42, 0.2, 0
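Re-running the stated aggregation method on the listed samples (assuming "remove one minimum and one maximum, then take the geometric mean" per question):

```python
import math

# "Geometric mean of the samples with the minimum and maximum removed"
def trimmed_geo_mean(samples):
    trimmed = sorted(samples)[1:-1]  # drop one min and one max
    return math.exp(sum(math.log(x) for x in trimmed) / len(trimmed))

month = [0.27, 0.04, 0.02, 0.001, 0.09, 0.08, 0.07]
year = [0.38, 0.11, 0.11, 0.005, 0.42, 0.2, 0]
```

This gives ~0.053 for the month question (stated aggregate: 0.0859) and ~0.086 for the year question (stated aggregate: 0.2294), so the stated method doesn't reproduce either aggregate from the listed samples — which is the inconsistency being pointed at.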
On the other hand, the critic updated me towards higher numbers on P(nuke London | any nuke). Though I assume Samotsvety have already read it, so I'm not sure how to take that into account. But given that uncertainty, given that that number only comes into play in confusing worlds where everyone's models are broken, and given Samotsvety's 5x higher unconditional number, I will update at least a bit in that direction.
Thanks for the links! (Fyi the first two points to the same page.)
The critic's 0.3 assumes that you'll stay until there are nuclear exchanges between Russia and NATO. Zvi was at 75% if you leave as soon as a conventional war between NATO and Russia starts.
I'm not sure how to compare that situation with the current situation, where it seems more likely that the next escalatory step will be a nuke on a non-NATO target than conventional NATO-Russia warfare. But if you're happy to leave as soon as either a nuke is dropped anywhere or conventional NATO/Russia warfare breaks out, I'm inclined to aggregate those numbers to something closer to 75% than 50%.
Thanks for doing this!
In this squiggle you use "ableToEscapeBefore = 0.5". Does that assume that you're following the policy "escape if you see any tactical nuclear weapons being used in Ukraine"? (Which someone who's currently on the fence about escaping London would presumably do.)
If yes, I would have expected it to be higher than 50%. Do you think very rapid escalation is likely, or am I missing something else?
I think this particular example requires an assumption of logarithmically diminishing returns, but is right with that.
(I think the point about roughly quadratic value of information applies more broadly than just for logarithmically diminishing returns. And I hadn't realised it before. Seems important + underappreciated!)
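A minimal sketch of why the value is roughly quadratic under log returns (the optimal split p = 0.6 is an arbitrary illustration):

```python
import math

# With log returns and an optimal funding split p between two causes,
# the loss from funding fraction q instead grows like (q - p)^2.
def allocation_loss(q: float, p: float = 0.6) -> float:
    def expected_log_return(frac: float) -> float:
        return p * math.log(frac) + (1 - p) * math.log(1 - frac)
    return expected_log_return(p) - expected_log_return(q)
```

Doubling the misallocation roughly quadruples the loss — allocation_loss(0.62) / allocation_loss(0.61) comes out ≈ 4 — which is the signature of a quadratic, and hence of information whose value scales quadratically with how far the funder's beliefs are from correct.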
One quirk to note: If a funder (who I want to be well-informed) is 50/50 on S vs L, but my all-things-considered belief is 60/40, then I would value the first 1% they shift towards my position much more than they do (maybe 10x more?)  ...
I think that's right, except that weak upvotes never become worth 3 points anymore (although this doesn't matter on the EA Forum, given that no one has 25,000 karma), based on this LessWrong GitHub file linked from the LW FAQ.
Nitpicking:
A property of making directional claims like this is that MacAskill always has 50% confidence in the claim I’m making, since I’m claiming that his best-guess estimate is too high/low.
This isn't quite right. Conservation of expected evidence means that MacAskill's current probabilities should match his expectation of the ideal reasoning process. But for probabilities close to 0, this would typically imply that he assigns higher probability to being too high than to being too low. For example: a 3% probability is compatible with 90% probability th...
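A toy illustration with made-up numbers (mine, not MacAskill's):

```python
# A 3% credence is consistent with thinking the ideal reasoning process
# is much more likely to land lower than higher.
scenarios = [
    (0.90, 0.01),  # 90%: ideal process ends at 1% (current estimate too high)
    (0.10, 0.21),  # 10%: ideal process ends at 21% (current estimate too low)
]
expectation = sum(weight * p for weight, p in scenarios)
```

The expectation works out to exactly the current 3%, as conservation of expected evidence requires, even though "too high" is nine times as likely as "too low".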
The term "most important century" pretty directly suggests that this century is unique, and I assume that includes its unusually large amount of x-risk (given that Holden seems to think that the development of TAI is both the biggest source of x-risk this century and the reason for why this might be the most important century).
Holden also talks specifically about lock-in, which is one way the time of perils could end.
See e.g. here:
...It's possible, for reasons outlined here, that whatever the main force in world events is (perhaps digital people, misaligned A
The page for the Century Fellowship outlines some things that fellows could do, which are much broader than just university group organizing:
...When assessing applications, we will primarily be evaluating the candidate rather than their planned activities, but we imagine a hypothetical Century Fellow may want to:
- Lead or support student groups relevant to improving the long-term future at top universities
- Develop a research agenda aimed at solving difficult technical problems in advanced deep learning models
- Start an organization that teaches crit
I'm not saying it's infinite, just that (even assuming it's finite) I assign non-zero probability to different possible finite numbers in such a fashion that the expected value is infinite. (Just like the expected value of an infinite St Petersburg challenge is infinite, although every outcome has finite size.)
The topic under discussion is whether pascalian scenarios are a problem for utilitarianism, so we do need to take pascalian scenarios seriously, in this discussion.
I simply don’t believe that infinities exist, and even though 0 isn’t a probability, I reject the probabilistic argument that any possibility of infinity allows them to dominate all EV calculations.
Problems with infinity don't go away just because you assume that actual infinities don't exist. Even with just finite numbers, you can face gambles that have infinite expected value, if increasingly good possibilities have insufficiently rapidly diminishing probabilities. And this still causes a lot of problems.
(I also don't think that's an esoteric possib...
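For concreteness, the finite-outcomes-but-infinite-EV structure is exactly the St Petersburg gamble:

```python
# St Petersburg gamble: win 2^n with probability 2^-n. Every outcome is
# finite, yet each term contributes 1 to the expectation, so the
# partial sums diverge.
def partial_ev(n_terms: int) -> float:
    return sum((0.5**n) * (2**n) for n in range(1, n_terms + 1))
```

partial_ev(n) is just n, growing without bound even though no single payoff is infinite.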
Here's one line of argument:
- Positive argument in favor of humans: It seems pretty likely that whatever I'd value on-reflection will be represented in a human future, since I'm a human. (And accordingly, I'm similar to many other humans along many dimensions.)
- If AI values were sampled ~randomly (whatever that means), I think that the above argument would be basically enough to carry the day in favor of humans.
- But here's a salient positive argument in favor of why AIs' values will be similar to mine: People will be training AIs to be nice and helpful, which
... (read more)