Lifeextension cites this https://pubmed.ncbi.nlm.nih.gov/24715076/ claiming "The results showed that when the proper dose of zinc is used within 24 hours of first symptoms, the duration of cold miseries is cut by about 50%" I'd be interested if you do a dig through the citation chain. The lifeextension page has a number of further links.
Epistemic status: Fun speculation. I know nothing about public health, and grabbed numbers from the first source I could find for every step of the below. I link to the sources which informed my point estimates.
Here’s my calculation broken down into steps:
...Disagree. The natural, no-Anthropic, counterfactual is one in which Amazon invests billions into an alignment-agnostic AI company. On this view, Anthropic is levying a tax on AI-interest where the tax pays for alignment. I'd put this tax at 50% (rough order of magnitude number).
If Anthropic were solely funded by EA money, and didn't capture unaligned tech funds this would be worse. Potentially far worse since Anthropic impact would have to be measured against the best alternative altruistic use of the money.
I suppose you see this Amazon investment as evide...
I think the modal no-Anthropic counterfactual does not have an alignment-agnostic AI company that's remotely competitive with OpenAI, which means there's no external target for this Amazon investment. It's not an accident that Anthropic was founded by former OpenAI staff who were substantially responsible for OpenAI's earlier GPT scaling successes.
My deeply concerning impression is that OpenPhil (and the average funder) has timelines 2-3x longer than the median safety researcher. Daniel has his AGI training requirements set to 3e29, and I believe the 15th-85th percentiles among safety researchers would span 1e31 +/- 2 OOMs. On that view, Tom's default values are off in the tails.
My suspicion is that funders write off this discrepancy, if noticed, as inside-view bias i.e. thinking safety researchers self-select for scaling optimism. My, admittedly very crude, mental model of an OpenPhil f...
In my opinion, the applications of prediction markets are much more general than these. I have a bunch of AI safety inspired markets up on Manifold and Metaculus. I'd say the main purpose of these markets is to direct future research and study. I'd phrase this use of markets as "A sub-field prioritization tool". The hope is that markets would help me integrate information such as (1) methodology's scalability e.g. in terms of data, compute, generalizability (2) research directions' rate of progress (3) diffusion of a given research direction through the re...
Seems to me safety timeline estimation should be grounded by a cross-disciplinary, research timeline prior. Such a prior would be determined by identifying a class of research proposals similar to AI alignment in terms of how applied/conceptual/mathematical/funded/etc. they are and then collecting data on how long they took.
I'm not familiar with meta-science work, but this would probably involve doing something like finding an NSF (or DARPA) grant category where grants were made public historically and then tracking down what became of those lines of...
Do you have a sense of which argument(s) were most prevalent and which were most frequently the interviewees crux?
It would also be useful to get a sense of which arguments are only common among those with minimal ML/safety engagement. If basic AI safety engagement reduces the appeal of a certain argument, then there's little need for further work on messaging in that area.
A few thoughts on ML/AI safety which may or may not generalize:
You should read successful candidates' SOPs to get a sense of style, level of detail, and content c.f. 1, 2, 3. Ask current EA PhDs for feedback on your statement. Probably avoid writing a statement focused on an AI safety/EA idea which is not in the ML mainstream e.g. IDA, mesa-optimization, etc. If you have multiple research ideas, considering writing more than one (i.e. tailored) SOP and submit the SOP which is most relevant to faculty at each university.
Look at groups' pages to get a sens...
On-demand Software Engineering Support for Academic AI Safety Labs
AI safety work, e.g. in RL and NLP, involves both theoretical and engineering work, but academic training and infrastructure does not optimize for engineering. An independent non-profit could cover this shortcoming by providing software engineers (SWE) as contractors, code-reviewers, and mentors to academics working on AI safety. AI safety research is often well funded, but even grant-rich professors are bottlenecked by university salary rules and professor hours which makes hiring competent...
Re: feasibility of AI alignment research, Metaculus already has Control Problem solved before AGI invented . Do you have a sense of what further questions would be valuable?
Ok, seems like this might have been more a terminological misunderstanding on my end. I think I agree with what you say here, 'What if the “Inner As AGI” criterion does not apply? Then the outer algorithm is an essential part of the AGI’s operating algorithm'.
Ok, interesting. I suspect the programmers will not be able to easily inspect the inner algorithm, because the inner/outer distinction will not be as clear cut as in the human case. The programmers may avoid sitting around by fiddling with more observable inefficiencies e.g. coming up with batch-norm v10.
Good clarification. Determining which kinds of factoring are the ones which reduce valence is more subtle than I had thought. I agree with you that the DeepMind set-up seems more analogous to neural nociception (e.g. high heat detection). My proposed set-up (Figure 5) seems significantly different from the DM/nociception case, because it factors the step where nociceptive signals affect decision making and motivation. I'll edit my post to clarify.
Your new setup seems less likely to have morally relevant valence. Essentially the more the setup factors out valence-relevant computation (e.g. by separating out a module, or by accessing an oracle as in your example) the less likely it is for valenced processing to happen within the agent.
Just to be explicit here, I'm assuming estimates of goal achievement are valence-relevant. How generally this is true is not clear to me.
Thanks for the link. I’ll have to do a thorough read through your post in the future. From scanning it, I do disagree with much of it, many of those points of disagreement were laid out by previous commenters. One point I didn’t see brought up: IIRC the biological anchors paper suggests we will have enough compute to do evolution-type optimization before the end of the century. So even if we grant your claim that learning to learn is much harder to directly optimize for, I think it’s still a feasible path to AGI. Or perhaps you think evolution like optimization takes more compute than the biological anchors paper claims?
Certainly valenced processing could emerge outside of this mesa-optimization context. I agree that for "hand-crafted" (i.e. no base-optimizer) systems this terminology isn't helpful. To try to make sure I understand your point, let me try to describe such a scenario in more detail: Imagine a human programmer who is working with a bunch of DL modules and interpretability tools and programming heuristics which feed into these modules in different ways -- in a sense the opposite end of the spectrum from monolithic language models. This person might program s...
Your interpretation is a good summary!
Re comment 1: Yes, sorry this was just meant to point at a potential parallel not to work out the parallel in detail. I think it'd be valuable to work out the potential parallel between the DM agent's predicate predictor module (Fig12/pg14) with my factored-noxiousness-object-detector idea. I just took a brief look at the paper to refresh my memory, but if I'm understanding this correctly, it seems to me that this module predicts which parts of the state prevent goal realization.
Re comment 2: Yes, this should read "(p...
Ah great, I have pledged. Is this new this year? Or maybe I didn't fill out the pledge last year; I don't remember.
Would it make sense for the Giving Tuesday organization to send out an annual reminder email? I have re-categorized all of my EA newsletters, and so they don't go to my main inbox. Maybe most people have calendar events, or the like, set up. Maybe though for people who almost forgot about Giving Tuesday (like me) a reminder email could be useful!
The question of how to aggregate over time may even have important consequences for population ethics paradoxes. You might be interested in reading Vanessa Kosoy's theory here in which she sums an individual's utility over time with an increasing penalty over life-span. Although I'm not clear on the justification for these choices, the consequences may be appealing to many: Vanessa, herself, emphasizes the consequences on evaluating astronomical waste and factory farming.
Agreed, I've been trying to help out a bit with Matt Barnett's new question here. Feedback period is still open, so chime in if you have ideas!
I suspect most Metaculites are accustomed to paying attention to how a question's operationalization deviates from its intent FWIW. Personally, I find the Montezuma's revenge criterion quite important without which the question would be far from AGI.
My intent with bringing up this question, was more to ask about how Linch thinks about the reliability of long-term predictions with no obvious frequentist-friendly trac...
Sure at an individual level deference usually makes for better predictions, but at a community level deference-as-the-norm can dilute the weight of those who are informed and predict differently from the median. Excessive numbers of deferential predictions also obfuscate how reliable the median prediction is, and thus makes it harder for others to do an informed update on the median.
As you say, it's better if people contribute information where their relative value-add is greatest, so I'd say it's reasonable for people to have a 2:1 ratio of questions on ...
Do your opinion updates extend from individual forecasts to aggregated ones? In particular how reliable do you think is the Metaculus median AGI timeline?
On the one hand, my opinion of Metaculus predictions worsened as I saw how the 'recent predictions' showed people piling in on the median on some questions I watch. On the other hand, my opinion of Metaculus predictions improved as I found out that performance doesn't seem to fall as a function of 'resolve minus closing' time (see https://twitter.com/tenthkrige/status/1296401128469471235). Are there some observations which have swayed your opinion in similar ways?
What kinds of evidence and experience could induce you to update for/against the importance of severe suffering?
Do you believe that exposure to or experience of severe suffering would cause the average EA to focus more heavily on it?
Edit: Moving the question "Thinking counterfactually, what evidence and experiences caused you to have the views you do on severe suffering?" down here because it looks like other commenters already asked another version of it.
Out of the rejection pool, are there any avoidable failure modes that come to mind -- i.e. mistakes made by otherwise qualified applicants which caused rejection? For example, in a previous EA-org application I found out that I ought to have included more detail regarding potential roadblocks to my proposed research project. This seemed like a valuable point in retrospect, but somewhat unexpected given my experience with research proposals outside of EA.
EDIT: (Thanks to Rose for for answering this question individually and agreeing to let me share her ans...
Thanks for the lively discussion! We've covered a lot of ground, so I plan to try to condense what was said into a follow-up blog post making similar points as the OP but taking into account all of your clarifications.
I’m not sure how broadly you’re construing ‘meta-reactions’, i.e. would this include basically any moral view which a person might reach based on the ordinary operation of their intuitions and reason and would all of these be placed on an equal footing?
'Meta-reactions' are the subset of our universalizable preferences which express prefer
...[From a previous DM comment]
...For moral talk to be capable of serving this practical purpose we just need some degree of people being inclined to respond to the same kinds of things or to be persuaded to share the same attitudes. But this doesn’t require any particularly strong, near-universal consensus or consensus on a particular single thing being morally good/bad. [...] This seems compatible with very, very widespread disagreement in fact: it might be that people are disposed to think that some varying combinations of “fraternity, blood revenge, family
Thanks for the long reply. I feel like our conversation becomes more meaningful as it goes on.
...Thanks for clarifying. This doesn't change my response though since I don't think there's a particularly notable convergence in emotional reactions to observing others in pain which would serve to make valenced emotional reactions a particularly central part of the meaning of moral terms. For example, it seems to me like children (and adults) often think that seeing others in pain is funny (c.f. punch and judy shows or lots of other comedy), fun to inflict and o
I don't think there's a particularly noteworthy consensus about it being bad for other people to be in pain
Sorry, I should've been more clear about what I'm referring to. When you say "People routinely seem to think" and "People sometimes try to argue", I suspect we're talking past each other. I am not concerned with such learned behaviors, but rather with our innate neurologically shared emotional response to seeing someone suffering. If you see someone dismembered it must be viscerally unpleasant. If you see someone strike your mother as a toddler it
...I had been accepted to study for a PhD on the implications of Wittgensteinian meta-philosophy for ethics.
Well, I for one, would've liked to have read the thesis! Wonderful, I suppose then most of my background talk was redundant. When it comes to mathematics, I found the arguments in Kripke's 'Wittgenstein on Rules and Private Language' quite convincing. I would love to see someone do an in depth translation applying everything Kripke says about arithmetic to total utilitarianism. I think this would be quite useful, and perhaps work well with my ideas h
...Thanks for the clarification, this certainly helps us get more concrete.
We don't need people to agree even slightly about whether chocolate/durian are tasty or yucky to learn the meanings of terms.
I agree that I was exaggerating my case. In durian-type-food-only worlds we would merely no longer expect 'X is tasty' to convey information to the listener about whether she/he should eat it. This difference does the work in the analogy with morality. Moral language is distinct from expression of other preferences in that we expect morality-based talk to be
...Here's another way of explaining where I'm coming from. The meaning of our words is set by ostensive definition plus our inductive bias. E.g. when defining red and purple we agree upon some prototypical cases of red and purple by perhaps pointing at red and saying 'red'. Then upon seeing maroon for the first time, we call it red because our brains process maroon in a similar way to how they process red. (Incidentally, the first part -- pointing at red -- is also only meaningful because we share inductive biases around pointing and object boundaries.) Of co
...Thank you for following up, and sorry that I haven't been able to respond as succinctly or clearly as I would've liked. I hope to write a follow up post which more clearly describes the flow of ideas from those contained in my comments to the original blog post as your comments have helped me see where my background assumption are likely do differ from others'.
I see now that it would be better to take a step back to explain at a higher level where I'm coming from. My line of reasoning follows from the ideas of the later Wittgenstein: many words have meanin
...Yes, thanks for clarifying. I believe that it is necessarily harder to make correct judgements in the domain of population ethics. My stronger claim is that any such judgements, even if correct, only carry force as mediated through our 'call to universality' meta-emotion. Hence, even if we have the right population axiology, this likely should not over-ride our more mundane moral intuitions.
Thanks for bringing up these points, I should've been more careful with these distinctions.
The learned meaning of moral language refers to our recollection/reaction to experiences. These reactions include approval, preferences and beliefs. I suspect that of these, approval is learned first. I imagine a parent harshly pronouncing 'Bad!' after a toddler gets singed wandering to close to a fire. Preferences enter the picture when we try to extend our use of moral language beyond the simple cases learned as a child. When we try to compare two things that are a
...I think the 'Diets of EAs' question could be a decent proxy for the prominence of animal welfare within EA. I think there are similar questions on metaculus for the general US population https://www.metaculus.com/questions/?order_by=-activity&search=vegetarian
I don't see the ethics question as all that useful, since I think most of population ethics presupposes some form of consequentialism.
Somewhat unrelated, but I'll leave this thought here anyway: Maybe EA metaculus users could perhaps benefit from posting question drafts as short-form posts on the EA forum.
Thanks for doing this, great idea! I think Metaculus could provide some valuable insight into how society's/EA's/philosophy's values might drift or converge over the coming decades.
For instance, I'm curious about where population ethics will be in 10-25 years. Something like, 'In 2030 will the consensus within effective altruism be that "Total utilitarianism is closer to describing our best moral theories than average utilitarianism and person affecting views"?'
Having your insight on how to operationalize this would be useful, since I'm not very happy with
...You discuss at one point in the podcast the claim that as AI systems take on larger and larger real world problems, the challenge of defining the reward function will become more and more important. For example for cleaning, the simple number-of-dust-particles objective is inadequate because we care about many other things e.g. keeping the house tidy and many side constraints e.g. avoiding damaging household objects. This isn't quite an argument for AI alignment solving itself, but it is an argument that the attention and resources poured into AI alignment
...Perhaps include a short form subsection under the Forum Favorites section? It seems to me that most short form posts have very low visibility.
If the forum admins have traffic statistics, they should be able to get a better sense of the visibility issue than I can. In particular, I suspect the short form section receives a fraction of the traffic of the frontpage, but this should be verified empirically.
I enjoyed reading this post! I like Wittgensteinian arguments, and applying them to ethics, so hurrah for this. There was also some lively discussion of it on the EA corner chat.
Another possible misleading motivation for irreducible normativity may be linguistic. It seems to me plausible that anyone who uses the word agony in the standard sense is committing her/himself to agony being undesirable. This is not an argument for irreducible normativity, but it may give you a feeling that there is some intrinsic connection underlying the set of self-evident cas
...Yes, I recently asked a metaculus mod about this, and they said they're hoping to bring back the ai.metaculus sub-domain eventually. For now, I'm submitting everything to the metaculus main domain.
Medium term AI forecasting with Metaculus
I'm working on a collection of metaculus.com questions intended to generate AI domain specific forecasting insights. These questions are intended to resolve in the 1-15 year range, and my hope is that if they're sufficiently independent, we'll get a range of positive and negative resolutions which will inform future forecasts.
I've already gotten a couple of them live, and am hoping for feedback on the rest:
1. When will AI out-perform humans on argument reasoning tasks?
...True, it seems like solar-aid's own estimate these days suggests around $5 per tonne. I can't find a more recent external review unfortunately.
This is a great point!
I'm somewhat hesitant about the CATF recommendation though. After a brief skim of the Founder's Pledge report, looks like they broke down CATF's efforts into three projects which have/are working out well. If we assume that Founder's Pledge reviewed a number of public advocacy/lobbying groups there's likely to have been a multiple testing issue. In that light, the retrospective cost of CO2e/$ estimate may not be predictive of their future CO2e/$ ratio. That said so long as the majority of their funds go into t...
Good catch, thanks.