This post summarizes a new meta-analysis from the Humane and Sustainable Food Lab. We analyze the most rigorous randomized controlled trials (RCTs) that aim to reduce consumption of meat and animal products (MAP). We conclude that no theoretical approach, delivery mechanism, or persuasive message should be considered a well-validated means of reducing MAP consumption. By contrast, reducing consumption of red and processed meat (RPM) appears to be an easier target. However, if RPM reductions lead to more consumption of chicken and fish, this is likely bad for animal welfare and does not reduce the risk of zoonotic outbreaks or of land and water pollution. We also find that many promising approaches await rigorous evaluation.
This post updates a post from a year ago. We first summarize the current paper, and then describe how the project and its findings have evolved.
What is a rigorous RCT?
We operationalize “rigorous RCT” as any study that:
* Randomly assigns participants to treatment and control groups
* Measures consumption directly (rather than, or in addition to, attitudes, intentions, or hypothetical choices) at least one day after treatment begins
* Has at least 25 subjects in each of the treatment and control groups, or, in the case of cluster-assigned studies (e.g. university classes that all attend a lecture together or not), at least 10 clusters in total.
Additionally, studies had to aim to reduce MAP consumption overall, rather than (e.g.) encouraging people to switch from beef to chicken, and had to be publicly available by December 2023.
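As a loose illustration, the inclusion criteria above can be expressed as a simple filter. This is only a sketch; the field names are hypothetical and not taken from the paper's actual coding scheme.

```python
from dataclasses import dataclass

@dataclass
class Study:
    """Hypothetical coding of a candidate study (illustrative only)."""
    randomized: bool                # random assignment to treatment/control
    measures_consumption: bool      # direct consumption measure, not just attitudes
    days_until_measurement: int     # days after treatment begins
    n_treatment: int
    n_control: int
    cluster_assigned: bool
    n_clusters: int
    aims_to_reduce_map: bool        # reduction overall, not e.g. beef-to-chicken swaps
    public_by_dec_2023: bool

def is_rigorous(s: Study) -> bool:
    """Apply the inclusion criteria described above."""
    # Sample-size rule differs for individually vs. cluster-assigned studies
    size_ok = (
        s.n_clusters >= 10
        if s.cluster_assigned
        else (s.n_treatment >= 25 and s.n_control >= 25)
    )
    return (
        s.randomized
        and s.measures_consumption
        and s.days_until_measurement >= 1
        and size_ok
        and s.aims_to_reduce_map
        and s.public_by_dec_2023
    )
```

For example, an individually randomized study with 30 subjects per arm that measures consumption a week later would pass, while one that only measures outcomes on the day treatment begins would not.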
We found 35 papers, comprising 41 studies and 112 interventions, that met these criteria. Eighteen of the 35 papers have been published since 2020.
The main theoretical approaches:
Broadly speaking, studies used Persuasion, Choice Architecture, Psychology, and a combination of Persuasion and Psychology to try to change eating behavior.
Persuasion studies typically provide arguments about animal welfare, health, and the environment as reasons to eat less MAP.
warning - mildly spicy take
In the wake of the release, I was a bit perplexed by how much of Tech Twitter (I answered my own question there) really thought this was a major advance.
But in actuality a lot of the demo was, shall we say, not consistently candid about Gemini's capabilities (see here for discussion and here for the original).
At the moment, all Google has released is a model inferior to GPT-4 (though the multi-modality does look cool), along with an I.O.U. for a totally-superior-model-trust-me-bro to come out some time next year.
Previously, some AI risk people confidently expected Gemini to be substantially superior to GPT-4; as of this year, it clearly isn't. Some EAs were not sceptical enough of a for-profit company hosting a product announcement dressed up as a technical demo and report.
There have been a couple of other cases of this overhype recently, notably 'AGI has been achieved internally' and 'What did Ilya see?!!?!?', where people jumped to assuming a massive leap in capability on the back of very little evidence when in actuality there hadn't been one. That should set off warning flags about 'epistemics', tbh.
On the 'Benchmarks': I think most benchmarks that large LLMs use, while they contain some signal, are mostly noisy due to the significant issue of data contamination (papers like The Reversal Curse indicate this, imo), and since LLMs don't think as humans do, we shouldn't be testing them in similar ways. Here are two recent papers: one from Melanie Mitchell on LLMs failing to abstract and generalise, and another by Jones & Bergen[1] from UC San Diego actually empirically performing the Turing Test with LLMs (the results will shock you).
I think this announcement should make people think near-term AGI, and thus AIXR, is less likely. To me this is what a relatively continuous takeoff world looks like, if there's a takeoff at all. If Google had announced and proved a massive leap forward, then people would have shrunk their timelines even further. So why, given this was a PR-fueled disappointment, should we not update in the opposite direction?
Finally, to get on my favourite soapbox, dunking on the Metaculus 'Weakly General AGI' forecast:
tl;dr - Gemini release is disappointing. Below many people's expectations of its performance. Should downgrade future expectations. Near term AGI takeoff v unlikely. Update downwards on AI risk (YMMV).
I originally thought this was a paper by Mitchell; that was a quick system-1 take that was incorrect, and I apologise to Jones and Bergen.
Thanks for the response!
A few quick responses:
Good to know! That certainly changes my view of whether or not this will happen soon, but it also makes me think the resolution criteria are poor.
You might be interested in the recent OpenPhil RFP on benchmarks and forecasting.