I'm annoyed at vague "value" questions. If you ask a specific question the puzzle dissolves. What should you do to make the world go better? Maximize world-EV, or equivalently maximize your counterfactual value (not in the maximally-naive way — take into account how "your actions" affect "others' actions"). How should we distribute a fixed amount of credit or a prize between contributors? Something more Shapley-flavored, although this isn't really the question that Shapley answers (and that question is almost never relevant, in my possibly controversial op...
My current impression is that there is no mechanism, funders will do whatever they feel like, and some investors will feel misled...
I now agree funders won't really lose out, at least.
Hmm. I am really trying to fill in holes, not be adversarial, but I mostly just don't think this works.
the funder probably recognizes some value in [] the projects the investors funded that weren't selected for retrofunding
No. If the project produces zero value, then no value for funder. If the project produces positive value, then it's retrofunded. (At least in the simple theoretical case. Maybe in practice small-value projects don't get funded. Then profit-seeking investors raise their bar: they don't just fund everything that's positive-EV, only stuff t...
This is ultimately up to retro funders, and they each might handle cases like this differently.
Oh man, having the central mechanism unclear makes me really uncomfortable for the investors. They might invest reasonably, thinking that the funders would use a particular process, and then the funders use a less generous process...
...In my opinion, by that definition of true value which is accounting for other opportunities and limited resources, they should just pay $100 for it. If LTFF is well-calibrated, they do not pay any more (in expectation) in the impact m
Actually I'm confused again. Suppose:
Bob has a project idea. The project would cost $10. A funder thinks it has a 99% chance of producing $0 value and a 1% chance of producing $100 value, so its EV is $1, and that's less than its cost, so it's not funded in advance. A super savvy investor thinks the project has EV > $10 and funds it. It successfully produces $100 value.
How much is the funder supposed to give retroactively?
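(Sketching the arithmetic for the two candidate rules that come up below, using only the numbers above; this is illustrative, not a claim about what any funder would actually do:)

```python
# Bob's project, using the numbers above.
cost = 10                # dollars needed to run the project
p_success = 0.01         # funder's ex-ante probability of success
value_if_success = 100   # dollars of impact if it works

ex_ante_ev = p_success * value_if_success   # $1 < $10, so no prospective grant

# Candidate retroactive payments, conditional on success:
pay_ex_post_value = value_if_success   # $100
pay_cost_over_p = cost / p_success     # $1,000

# Expected cost to the funder per project like this, under each rule:
print(p_success * pay_ex_post_value)   # $1  (the funder's ex-ante EV)
print(p_success * pay_cost_over_p)     # $10 (the project's cost)
```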
I feel like ex-ante-funder-beliefs are irrelevant and the right question has to be "how much would you pay for the project if you kne...
Ah, hooray! This resolves my concerns, I think, if true. It's in tension with other things you say. For instance, in the example here, "The Good Foundation values the project at $18,000 of impact" and funds the project for $18K. This uses the true-value method rather than the divide-by-P(success) method.
In this context "project's true value (to a funder) = $X" means "the funder is indifferent between the status quo and spending $X to make the project happen." True value depends on available funding and other available opportunities; it's a marginal analysis question.
I agree this would be better — then the funders would be able to fund Alice's project for $1 rather than $10. But still, for projects that are retroactively funded, there's no surplus-according-to-the-funder's-values, right?
Related, not sure: maybe it's OK if the funder retroactively gives something like cost ÷ ex-ante-P(success). What eliminates the surplus is if the funder retroactively gives ex-post-value.
Edit: no, this mechanism doesn't work. See this comment.
Yes. Rather than spending $1 on a project worth $10, the funder is spending $10 on the project — so the funder's goals aren't advanced. (Modulo that the retroactive-funding-recipients might donate their money in ways that advance the funder's goals.)
Thanks.
So if project-doers don't sell all of their equity, do they get retroactive funding for the rest, or just moral credit for altruistic surplus? The former seems very bad to me. To illustrate:
Alice has an idea for a project that would predictably [produce $10 worth of impact / retrospectively be worth $10 to funders]. She needs $1 to fund it. Under normal funding, she'd be funded and there'd be a surplus worth $9 of funder money. In the impact market, she can decline to sell equity (e.g. by setting the price above $10 and supplying the $1 costs herself) and get $10 retroactive funding later, capturing all of the surplus.
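(A minimal sketch of that arithmetic:)

```python
cost, value = 1, 10   # Alice's project: costs $1, predictably worth $10 to funders

# Normal prospective funding: the funder pays the cost and keeps the surplus.
funder_surplus_normal = value - cost    # $9

# Impact market, if Alice self-funds and sells no equity: the funder
# retroactively pays the full $10, so Alice captures the surplus instead.
funder_surplus_market = 0
alice_surplus_market = value - cost     # $9
```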
The latter... might work, I'll think about it.
Oh wait I forgot about the details at https://manifund.org/about/impact-certificates. Specific criticism retracted until learning more; skepticism remains. What happens if a project is funded at a valuation higher than its funding-need? If Alice's project is funded for $5, where does $4 go?
Designing an impact market well is an open problem, I think. I don't think your market works well, and I think the funders were mistaken to express interest. To illustrate:
Alice has an idea for a project that would predictably [produce $10 worth of impact / retrospectively be worth $10 to funders]. She needs $1 to fund it. Under normal funding, she'd be funded and there'd be a surplus worth $9 of funder money. In the impact market, whichever investor reads and understands her project first funds it and later gets $10.
More generally, in your market, all sur...
My favorite AI governance research since this post (putting less thought into this list):
I mostly haven't really read recent research on compute governance (e.g. 1, 2) or international governance (e.g. 1, ...
I appreciate it; I'm pretty sure I have better options than finishing my Bachelor's; details are out-of-scope here but happy to chat sometime.
TLDR: AI governance; maybe adjacent stuff.
Skills & background: AI governance research; email me for info on my recent work.
Location: flexible.
LinkedIn: linkedin.com/in/zsp/.
Email: zacharysteinperlman at gmail.
Other notes: no college degree.
You don't need EA or AI safety motives to explain the event. Later reporting suggested that it was caused by (1) Sutskever and other OpenAI executives telling the board that Altman often lied (WSJ, WaPo, New Yorker) and (2) Altman dishonestly attempting to remove Toner from the board (over the obvious pretext that her coauthored paper Decoding Intentions was too critical of OpenAI, plus allegedly falsely telling board members that McCauley wanted Toner removed) (NYT, New Yorker). As far as I know, there's ~no evidence that EA or AI safety motives were rele...
Thanks!
General curiosity. Looking at it, I'm interested in my total-hours and karma-change. I wish there was a good way to remind me of... everything about how I interacted with the forum in 2022, but Wrapped doesn't do that (and probably ~can't do it; probably I should just skim my posts from that year...)
I object to your translation of actual-votes into approval-votes and RCV-votes, at least in the case of my vote. I gave almost all of my points to my top pick, almost all of the rest to my second pick, almost all of the rest to my third pick, and so forth until I was sure I had chosen something that would make top 3. But e.g. I would have approved of multiple. (Sidenote: I claim my strategy is optimal under very reasonable assumptions/approximations. You shouldn't distribute points like you're trying to build a diverse portfolio.)
we are convinced this push towards decentralization will make the EA ecosystem more resilient and better enable our projects to pursue their own goals.
I'm surprised. Why? What was wrong with the EV sponsorship system?
(I've seen Elizabeth's and Ozzie's posts on this topic and didn't think the downsides of sponsorship were decisive. Curious which downsides were decisive for you.)
[Edit: someone offline told me probably shared legal liability is pretty costly.]
There's no inherent contradiction between "the current sponsorship program isn't a good fit for this sponsor and these sponsored orgs" and "fiscal sponsorship is often a good thing."
There are a number of specific reasons I could see why this wasn't a good fit:
My take is basically that (a) the projects have been run so independently that there was minimal benefit from being within the same legal entity, (b) organizations with very different legal risk profiles sharing a legal entity requires either excessive caution from some or excessive risk to others, and (c) board oversight is important for nonprofits and overseeing so many independent projects with their own CEOs didn't make sense for part-time volunteer board members.
(Also seconding Elizabeth's post.)
An undignified way for everyone to die: an AI lab produces clear, decisive evidence of AI risk/scheming/uncontrollability, freaks out, and tells the world. A less cautious lab ends the world a year later.
A possible central goal of AI governance: cause "an AI lab produces decisive evidence of AI risk/scheming/uncontrollability, freaks out, and tells the world" to quickly result in rules that stop all labs from ending the world.
I don't know how we can pursue that goal.
I don't want to try to explain now, sorry.
(This shortform was intended more as starting-a-personal-list than as a manifesto.)
What's the best thing to read on "Zvi's writing on EAs confusing the map for the territory"? Or at least something good?
Thanks for the engagement. Sorry for not really engaging back. Hopefully someday I'll elaborate on all this in a top-level post.
Briefly: by axiological utilitarianism, I mean classical (total, act) utilitarianism, as a theory of the good, not as a decision procedure for humans to implement.
Thanks. I agree that the benefits could outweigh the costs, certainly at least for some humans. There are sophisticated reasons to be veg(etari)an. I think those benefits aren't cruxy for many EA veg(etari)ans, or many veg(etari)ans I know.
Or me. I'm veg(etari)an for selfish reasons — eating animal corpses or feeling involved in the animal-farming-and-killing process makes me feel guilty and dirty.
I certainly haven't done the cost-benefit analysis on veg(etari)anism, on the straightforward animal-welfare consideration or the considerations you mention...
(I agree it is reasonable to have a bid-ask spread when betting against capable adversaries. I think the statements-I-object-to are asserting something else, and the analogy to financial markets is mostly irrelevant. I don't really want to get into this now.)
Thanks. I agree! (Except with your last sentence.) Sorry for failing to communicate clearly; we were thinking about different contexts.
Thanks.
Some people say things like "my doom-credence fluctuates between 10% and 25% day to day"; this is Dutch-bookable, and they'd make better predictions if they reported what they feel like on average rather than what they feel like today, except insofar as they have new information.
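(To spell out the Dutch book, a hypothetical sketch assuming the swings aren't driven by new information: someone can trade with you on a 25% day and trade back on a 10% day and pocket the difference no matter what happens.)

```python
high, low = 0.25, 0.10   # your quoted doom-credence on different days, no new info

# High day: you buy a ticket paying $1 if doom, at your stated fair price ($0.25).
# Low day:  you sell the same ticket back at your new fair price ($0.10).
guaranteed_loss = high - low   # $0.15 per ticket, whichever way the world goes
print(guaranteed_loss)
```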
Common beliefs/attitudes/dispositions among [highly engaged EAs/rationalists + my friends] which seem super wrong to me:
Meta-uncertainty:
(meta musing) The conjunction of the negations of a bunch of statements seems a bit doomed to get a lot of disagreement karma, sadly. Esp. if the statements being negated are "common beliefs" of people like the ones on this forum.
I agreed with some of these and disagreed with others, so I felt unable to agreevote. But I strongly appreciated the post overall so I strong-upvoted.
Giving a range of probabilities when you should give a probability + giving confidence intervals over probabilities + failing to realize that probabilities of probabilities just reduce to simple probabilities
This is just straightforwardly correct statistics. For example, ask a true Bayesian to estimate the outcome of flipping a coin of unknown bias, and they will construct a probability distribution over coin-flip probabilities, and only reduce this to a single probability when forced to make a bet. But when not taking a bet, they should be doing update...
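(A minimal sketch of the coin example, assuming a uniform Beta(1, 1) prior over the coin's bias: the full posterior is what you keep updating, but the single number you'd bet at is just its mean.)

```python
# Uniform Beta(1, 1) prior over the coin's bias, then observe 7 heads in 10 flips.
heads, tails = 7, 3
alpha, beta = 1 + heads, 1 + tails   # posterior over the bias is Beta(8, 4)

# The distribution matters for future updating, but the fair betting
# probability for the next flip collapses to the posterior mean:
p_heads_next = alpha / (alpha + beta)   # 8 / 12 ≈ 0.667
print(p_heads_next)
```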
I have high credence in basically zero x-risk after [the time of perils / achieving technological maturity and then stabilizing / 2050]. Even if it was pretty low, "pretty low" * 10^70 ≈ 10^70. Most value comes from the worlds with extremely low longterm rate of x-risk, even if you think they're unlikely.
(I expect an effective population much, much larger than 10^10 humans, though I'm not sure "population size" will be a useful concept (e.g. maybe we'll decide to wait billions of years before converting resources to value); but that's not the crux here.)
It takes like 20 hours of focused reading to get basic context on AI risk and threat models. Once you have that, I feel like you can read everything important in x-risk-focused AI policy in 100 hours. Same for x-risk-focused AI corporate governance, AI forecasting, and macrostrategy.
[Edit: read everything important doesn't mean you have nothing left to learn; it means something like you have context to appreciate ~all papers, and you can follow ~all conversations in the field except between sub-specialists, and you have the generators of good overviews lik...
Here are some of the curricula that HAIST uses:
The HAIST website also has a resources tab with lists of technical and policy papers.
I disagree-voted because I feel like I've done much more than 100-hours of reading on AI Policy (including finishing the AI Safety Fundamentals Governance course) and still have a strong sense there's a lot I don't know, and regularly come across new work that I find insightful. Very possibly I'm prioritising reading the wrong things (and would really value a reading list!) but thought I'd share my experience as a data point.
(I agree that the actual ratio isn't like 10^20. In my view this is mostly because of the long-term effects of neartermist stuff,* which the model doesn't consider, so my criticism of the model stands. Maybe I should have said "undervalue longterm-focused stuff by a factor of >10^20 relative to the component of neartermist stuff that the model considers.")
*Setting aside causing others to change prioritization, which it feels wrong for this model to consider.
Thanks. I respect that the model is flexible and that it doesn't attempt to answer all questions. But at the end of the day, the model will be used to "help assess potential research projects at Rethink Priorities" and I fear it will undervalue longterm-focused stuff by a factor of >10^20.
I believe Marcus and Peter will release something before long discussing how they actually think about prioritization decisions.
AFAICT, the model also doesn't consider far future effects of animal welfare and GHD interventions. And against relative ratios like >10^20 between x-risk and neartermist interventions, see:
I haven't engaged with this. But if I did, I think my big disagreement would be with how you deal with the value of the long-term future. My guess is your defaults dramatically underestimate the upside of technological maturity (near-lightspeed von Neumann probes, hedonium, tearing apart stars, etc.) [edit: alternate frame: underestimate accessible resources and efficiency of converting resources to value], and the model is set up in a way that makes it hard for users to fix this by substituting different parameters.
...The significance of existential risk dep
I think you're right that we don't provide a really detailed model of the far future and we underestimate* expected value as a result. It's hard to know how to model the hypothetical technologies we've thought of, let alone the technologies that we haven't. These are the kinds of things you have to take into consideration when applying the model, and we don't endorse the outputs as definitive, even once you've tailored the parameters to your own views.
That said, I do think the model has a greater flexibility than you suggest. Some of these options are hidd...
This was the press release; the actual order has now been published.
One safety-relevant part:
...4.2. Ensuring Safe and Reliable AI. (a) Within 90 days of the date of this order, to ensure and verify the continuous availability of safe, reliable, and effective AI in accordance with the Defense Production Act, as amended, 50 U.S.C. 4501 et seq., including for the national defense and the protection of critical infrastructure, the Secretary of Commerce shall require:
(i) Companies developing or
Both. As you note, Scanlonian contractualism is about reasonable-rejection.
(Personally, I think it's kinda appealing to consider contractualism for deriving principles, e.g. via rational-rejection or more concretely via veil-of-ignorance. I'm much less compelled by thinking in terms of claims-to-aid. I kinda assert that deriving-principles is much more central to contractualism; I notice that https://plato.stanford.edu/entries/contractualism/ doesn't use "claim," "aid," or "assistance" in the relevant sense, but does use "principle.")
(Probably not going to engage more on this.)
I just read the summary but I want to disagree with:
Contractualism says: When your actions could benefit both an individual and a group, don't compare the individual's claim to aid to the group's claim to aid, which assumes that you can aggregate claims across individuals. Instead, compare an individual's claim to aid to the claim of every other relevant individual in the situation by pairwise comparison. If one individual's claim to aid is a lot stronger than any other's, then you should help them.
"Contractualism" is a broad family of theories, many ...
Yeah, I agree; I think the geometric mean is degenerate unless your probability distribution quickly approaches density-0 around 0% and 100%. This is an intuition pump for why the geometric mean is the wrong statistic.
Also if you're taking the geometric mean I think you should take it of the odds ratio (as the author does) rather than the probability; e.g. this makes probability-0 symmetric with probability-1.
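(A small sketch of that symmetry point, with made-up forecasts: pooling odds treats p and 1 - p symmetrically; pooling probabilities doesn't.)

```python
import math

def geo_mean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def pooled_via_odds(probs):
    odds = geo_mean([p / (1 - p) for p in probs])
    return odds / (1 + odds)   # convert the pooled odds back to a probability

probs = [0.1, 0.2, 0.9]            # hypothetical forecasts
flipped = [1 - p for p in probs]   # the same forecasts about not-X

# Geometric mean of probabilities: the two answers don't add to 1.
print(geo_mean(probs) + geo_mean(flipped))                 # ≈ 0.68

# Geometric mean of odds: flipping every forecast exactly flips the answer.
print(pooled_via_odds(probs) + pooled_via_odds(flipped))   # 1.0
```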
[To be clear I haven't read most of the post.]
No, I don’t have a take on deference in EA. I meant: post contests generally give you evidence about which posts to pay attention to, especially if they’re run by OP. I am sharing that I have reason to believe that (some of) these winners are less worth-paying-attention-to than you’d expect on priors.
(And normally this reason would be very weak because the judges engaged much more deeply than I did, but my concerns with the posts I engaged with seem unlikely to dissolve upon deeper engagement.)
Congratulations to the winners.
I haven't engaged deeply with any of the winning posts like the judges have, but I engaged shallowly with 3–4 when they were written. I thought they were methodologically doomed (‘Dissolving’ AI Risk) or constituted very weak evidence even if they were basically right (AGI and the EMH and especially Reference Class-Based Priors). (I apologize for this criticism-without-justification, but explaining details is not worth it and probably the comments on those posts do a fine job.)
Normally I wouldn't say this. But OP is high-stat...
(I agree that geometric-mean-of-odds is an irrelevant statistic and ‘Dissolving’ AI Risk's headline number should be the mean-of-probabilities, 9.7%. I think some commenters noticed that too.)
Yeah, I think the real reason is we think we're safer than OpenAI (and possibly some wanting-power but that mostly doesn't explain their behavior).
See Dario's Senate testimony from two months ago:
...With the fast pace of progress in mind, we can think of AI risks as falling into three buckets:
● Short-term risks are those present in current AI systems or that imminently will be present. This includes concerns like privacy, copyright issues, bias and fairness in the model’s outputs, factual accuracy, and the potential to generate misinformation or propaganda.
● Medium-term risks are those we will face in two to three years. In that time period, Anthropic’s projections suggest that AI systems ma
See https://ea-internships.pory.app/board; you can filter for volunteer roles.
It would be helpful to mention if you have background or interest in particular cause areas.