All of geoffrey's Comments + Replies

Until recently, I always had the impression that there was a glut of animal activists and there'd be little point in me participating. It's not something I ever bothered to check.

For one thing, I heard plenty of stories of how hard it is to get a job at an animal organization. So I figured it would be the same for animal activism and that each campaign was saturated with volunteers.

And for another thing, I usually don't hear about pressure campaigns unless they're successful or have tons of people. Understandably nobody wants to promote the mediocre attempts... (read more)

2
Zachary Segall
I had a similar experience when I started volunteering for The Humane League. I joined because I was looking for unskilled EA-approved volunteering and I was under the impression that THL would have plenty of people. It wasn't until I became a local organizer and started digging into the numbers that I realized the activist population is tiny and there was an opportunity for individuals to significantly improve the movement base. Part of my hope in writing this post was to tell more people about the need for non-crazy people with a pulse. I do think getting the word out about that while retaining the idea that we can make real change is a big challenge for organizers. 

Don’t have answers but just wanted to say I really appreciate this mini-compilation of what’s already been done 

For what it's worth, I read that abstract as saying something like, "within the class of interventions studied so far, the literature has yet to settle on any intervention that can reliably reduce animal product consumption by a meaningful amount, where a meaningful amount might be a 1% reduction at Costco scale or a long-term 10% reduction at a single cafeteria. The interventions studied tend to be informational and nudge-style ones like advertising, menu design, and media pamphlets. When effect sizes differ for a given type of inte... (read more)

2
Seth Ariel Green 🔸
It's an interesting question. From the POV of our core contention -- that we don't currently have a validated, reliable intervention to deploy at scale -- whether this is because of absence of evidence (AoE) or evidence of absence (EoA) is hard to say. I don't have an overall answer, and ultimately both roads lead to "unsolved problem." We can cite good arguments for EoA (these studies are stronger than the norm in the field but show weaker effects, and that relationship should be troubling for advocates) or AoE (we're not talking about very many studies at all), and ultimately I think the line between the two is in the eye of the beholder. Going approach by approach, my personal answers are:

1. Choice architecture is probably AoE; it might work better than expected, but we just don't learn very much from 2 studies (I am working on something about this separately).
2. The animal welfare appeals are more EoA, especially those from animal advocacy orgs.
3. Social psych approaches I'm skeptical of, but there weren't a lot of high-quality papers, so I'm not so sure (see here for a subsequent meta-analysis of dynamic norms approaches).
4. I would recommend health appeals for older folks and environmental appeals for Gen Z. So there I'd say we have evidence of efficacy, but expect effects on the order of a few percentage points.

Were I discussing this specifically with a funder, I would say: if you're going to do one of the meta-analyzed approaches -- psych, nudge, environment, health, or animal welfare, or some hybrid thereof -- you should expect small effect sizes unless you have some strong reason to believe that your intervention is meaningfully better than the category average. For instance, animal welfare appeals might not work in general, but maybe watching Dominion is unusually effective. However, as we say in our paper, there are a lot of cool ideas that haven't been tested rigorously yet, and from the point of view of knowledge, I'd like to see those get funded

Seth, for what it's worth, I found your hourly estimates (provided in these forum comments but not something I saw in the evaluator response) on how long the extensions would take to be illuminating. Very rough numbers like this meta-analysis taking 1000 hours for you or a robustness check taking dozens / hundreds of hours more to do properly helps contextualize how reasonable the critiques are.

It's easy for me (even now while pursuing research, but especially before when I was merely consuming it) to think these changes would take a few days. 

It's al... (read more)

9
Seth Ariel Green 🔸
Love talking nitty gritty of meta-analysis 😃

1. IMHO, the "math hard" parts of meta-analysis are figuring out what questions you want to ask, what are sensible inclusion criteria, and what statistical models are appropriate. Asking how much time this takes is the same as asking, where do ideas come from?
2. The "bodybuilding hard" part of meta-analysis is finding literature. The evaluators didn't care for our search strategy, which you could charitably call "bespoke" and uncharitably call "ad hoc and fundamentally unreplicable." But either way, I read about 1000 papers closely enough to see if they qualified for inclusion, and then, partly to make sure I didn't duplicate my own efforts, I recorded notes on every study that looked appropriate but wasn't. I also read, or at least read the bibliographies of, about 160 previous reviews. Maybe you're a faster reader than I am, but ballpark, this was 500+ hours of work.
3. Regarding the computational aspects, the git history tells the story, but specifically making everything computationally reproducible, e.g. writing the functions, checking my own work, setting things up to be generalizable -- a week of work in total? I'm not sure.
4. The paper went through many internal revisions and changed shape a lot from its initial draft when we pivoted in how we treated red and processed meat. That's hundreds of hours. Peer review was probably another 40-hour workweek.
5. As I reread reviewer 2's comments today, it occurred to me that some of their ideas might be interesting test cases for what Claude Code is and is not capable of doing. I'm thinking particularly of trying to formally incorporate my subjective notes about uncertainty (e.g. the many places where I admit that the effect size estimates involved a lot of guesswork) into some kind of... supplementary regression term about how much weight an estimate should get in meta-analysis? Like maybe I'd use Wasserstein-2 distance, as my advisor Don recently proposed? Or B

Would you recommend that I share any such posts with both the authors and the evaluators before making them?

Yes. But zooming back out, I don't know if these EA Forum posts are necessary.

A practice I saw i4replication (or some other replication lab) is that the editors didn't provide any "value-added" commentary on any given paper. At least, I didn't see these in any tweets they did. They link to the evaluation reports + a response from the author and then leave it at that.

Once in a while, there will be a retrospective on how the replications are going as a... (read more)

3
david_reinstein
Thanks for the thoughts. Note that I'm trying to engage/report here because we're working hard to make our evaluations visible and impactful, and this forum seems like one of the most promising interested audiences. But I'm also eager to hear about other opportunities to promote and get engagement with this evaluation work, particularly in non-EA academic and policy circles.

I generally aim to just summarize and synthesize what the evaluators had written and the authors' response, bringing in what seemed like some specific relevant examples, and using quotes or paraphrases where possible. I generally didn't give these as my opinions but rather as the authors' and the evaluators'. Although I did specifically give 'my take' in a few parts. If I recall my motivation, I was trying to make this a little bit less dry to get a bit more engagement within this forum. But maybe that was a mistake. And to this I added an opportunity to discuss the potential value of doing and supporting rigorous, ambitious, and 'living/updated' meta-analysis here and in EA-adjacent areas. I think your response was helpful there, as was the authors'. I'd like to see others' takes.

Some clarifications: The i4replication group does put out replication papers/reports in each case, submits these to journals, and reports on the outcome on social media. But IIRC they only 'weigh in' centrally when they find a strong case suggesting systematic issues/retractions. Note that their replications are not 'opt-in': they aimed to replicate every paper coming out in a set of 'top journals'. (And now they are moving towards research focusing on a set of global issues like deforestation, but still not opt-in.)

I'm not sure what works for them would work for us, though. It's a different exercise. I don't see an easy route towards our evaluations getting attention through 'submitting them to journals' (which, naturally, would also be a bit counter to our core mission of moving research output and rewards

Chiming in here with my outsider impressions on how fair the process seems

@david_reinstein If I were to rank the evaluator reports, evaluation summary, and the EA Forum post in which ones seemed the most fair, I would have ranked the Forum post last. It wasn't until I clicked through to the evaluation reports that I felt the process wasn't so cutting.

Let me focus on one very specific framing in the Forum post, since it feels representative. One heading includes the phrase "this meta-analysis is not rigorous enough". This has a few connotations that you pro... (read more)

2
david_reinstein
Thanks for the detailed feedback; this seems mostly reasonable. I'll take a look again at some of the framings and try to adjust (below, and hopefully later in more detail).

This was my take on how to succinctly depict the evaluators' reports (not my own take) in a way the casual reader would be able to digest. Maybe this was rounding down too much, but not by a lot, I think. Some quotes from Jané's evaluation that I think are representative:

This doesn't seem to reflect 'par for the course' to me, but it depends on what the course is, i.e., what the comparison group is. My own sense/guess is that this is more rigorous and careful than most work in this area of meat consumption interventions (and adjacent) but less rigorous than the meta-analyses the evaluators are used to seeing in their academic contexts and the practices they espouse. But academic meta-analysts will tend to focus on areas where they can find a proliferation of high-quality, more homogeneous research, not necessarily the highest-impact areas. Note that the evaluators rated this 40th and 25th percentile for methods and 75th and 39th percentile overall.

To be honest, I'm having trouble pinning down what the central claim of the meta-analysis is. Is it a claim that "the main approaches being used to motivate reduced meat consumption don't seem to work", i.e., that we can bound the effects as very small, at best? That's how I'd interpret the reporting of the pooled effect's 95% CI as a standardized mean effect of 0.02 to 0.12. I would say that both evaluators are sort of disputing that claim. However, the authors hedge this in places and sometimes it sounds more like they're saying ~"even the best meta-analysis possible leaves a lot of uncertainty" ... an absence of evidence more than an evidence of absence, and this is something the evaluators seem to agree with.

That is/was indeed challenging. Let me try to adjust this post to note that. My goal for this post was to fairly represent the evaluator's

Really enjoyed this. Not much public debate in this space as far as I can see. To two of your cruxes:

Is meta-analysis even useful in these contexts, with heterogeneous interventions, outcomes, and analytical approaches?

Will anyone actually do/fund/reward rigorous continued work?

I've sometimes wondered if it'd be worth funding a "mega study" like Milkman et al. (2021). They tested 54 different interventions to boost exercise among 61,000 gym members. Something similar for meat reduction could allow for some clean apples-to-apples comparisons.

I've seen the numbe... (read more)

1
Seth Ariel Green 🔸
@geoffrey We'd love to run a megastudy! My lab put in a grant proposal with collaborators at a different Stanford lab to do just that, but we ultimately went a different direction. Today, however, I generally believe that we don't even know what the right question to ask is -- though if I had to choose one, it would be: what ballot initiative does the most for animal welfare while also getting the highest levels of public support? E.g., is there some other low-hanging-fruit equivalent to "cage free", like "no mutilation", that would be equally popular? But in general I think we're back to the drawing board in terms of figuring out what study we want to run and getting a version of it off the ground, before we start thinking about scaling up to tens of thousands of people.

@david_reinstein, I suppose any press is good press, so I should be happy that you are continuing to mull on the lessons of our paper 😃 but I am disappointed to see that the core point of my responses is not getting through. I'll frame it explicitly here: when we did one check and not another, or one search protocol and not another, the reason, every single time, is opportunity costs. When I say "we thought it made more sense to focus on the risks of bias that seemed most specific to this literature," I am using the word 'focus' deliberately, in the sense of "focus means saying no," i.e. 'we are always triaging.' At every juncture, navigating the explore/exploit dilemma requires judgment calls. You don't have to like that I said no to you, but it's not a false dichotomy, and I do not care for that characterization.

To the second question of whether anyone will do the kind of extension work, I personally see this as a great exercise for grad students. I did all kinds of replication and extension work in grad school. A deep dive into a subset of contact hypothesis literature I did in a political psychology class in 2014, which started with a replication attempt, eventually morphed into
3
david_reinstein
This does indeed look interesting, and promising. Some quick (maybe naive) thoughts on that particular example, at a skim.

* An adaptive/reinforcement learning design could make a mega study like this cheaper ... you end up putting more resources into the arms that start to become more valuable / where more uncertainty needs to be resolved (see the sketch below).
* I didn't see initially how they did things like multiple hypothesis correction, although I'd prefer something like a Bayesian approach, perhaps with multiple levels of the model... effect category, specific intervention.
* Was there anything about the performance of their successful interventions out-of-sample / in different environments? I'd want to build in some ~gold-standard validation.

The "cost of convincing researchers to work on it" is uncertain to me. If it was already a very well-funded, high-quality study in an interesting area that is 'likely to publish well' (apologies), I assume that academics would have some built-in 'publish or perish' incentives from their universities.

Certainly there is some trade-off here: investing resources, intellectual and time, into more careful, systematic, and robust meta-analysis of a large body of work of potentially varying quality and great heterogeneity comes at the cost of academics and interested researchers organizing better and more systematic new studies. There might be some middle ground where a central funder requires future studies to follow common protocols and reporting standards to enable better future meta-analysis (perhaps as well as outreach to authors of past research to try to systematically dig out missing information).

Seems like there are some key questions here:
* Is there much juice to squeeze out of better meta-analyses of the work that's already been done?
* Even if more meta-analysis doesn't yield much direct value, could it help foster protocols for future trials to be more reliable and systematic?
* Is there a realistic pat
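A minimal sketch of the adaptive-allocation idea in the first bullet above, using Thompson sampling over hypothetical megastudy arms; the arm names and success rates are made up for illustration, not anything from the thread:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical arms with unknown true success rates (e.g., the share of
# participants who reduce meat consumption). All values are made up.
true_rates = {"control": 0.10, "health_appeal": 0.12, "env_appeal": 0.15}
arms = list(true_rates)

# Beta(1, 1) priors on each arm's success rate.
successes = {a: 1 for a in arms}
failures = {a: 1 for a in arms}

for _ in range(5000):  # one participant per round
    # Thompson sampling: sample each posterior, assign to the best draw.
    draws = {a: rng.beta(successes[a], failures[a]) for a in arms}
    arm = max(draws, key=draws.get)
    if rng.random() < true_rates[arm]:  # simulated binary outcome
        successes[arm] += 1
    else:
        failures[arm] += 1

# Most participants end up in the better-performing arms, which is what
# makes the adaptive design cheaper per unit of learning.
for a in arms:
    n = successes[a] + failures[a] - 2
    mean = successes[a] / (successes[a] + failures[a])
    print(f"{a}: n={n}, posterior mean={mean:.3f}")
```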

Ooh this is neat.

I like how it neutralizes the certainty-seeking part of me since it's only me, the difference maker, that has the option of a guaranteed 100% outcome. For the beneficiary, it's always a gamble.

Agree it's more about upbringing and messaging. And also relate a lot to this.

But also I think it's really hard to tell the "cause" of any given problem at an individual level. As recently as a few years ago, I would have put 80% weight on upbringing / messaging (which I agree aren't the identities themselves but something associated with them). Nowadays I'm more agnostic about it.

I think it's fine to seek out affinity groups and culturally-relevant advice to some degree. But also, there's a tradeoff between exploring identities versus applying generic mental health advice. Especially when you get to intersectionality-type stuff like trifectas, where the number of things to explore gets incredibly vast very quickly.

1
Angel Lau
thank you both for sharing your perspectives! :)

I can speak to two of those three identities (EA and Asian). I think one possibility that took me an unusually long time to consider was that maybe my identities didn't matter and I'd still feel the same problems if I was the "default person" in society. And I was working through a lot of identities.

It's a weird way of framing things since we can't have our identities counterfactually removed. Even if we did, we wouldn't be the same person. But I think it's a framework that usually doesn't get mentioned much in mental health circles, especially on the int... (read more)

1
emily.fan
As someone who fits into the trifecta, I think it's less about identities and more about the upbringing / messaging associated with identities. I grew up in a family context with strong filial piety, respect for authority, and lots of comparisons with other people (which leads to feeling not good enough), where EA is dismissed by "authority figures" as something that's stupid to do. I do think that I need to unlearn a lot from my family's messaging. For me personally, I'd say it's much more than 10-25% as a source of stress given how much I need to re-program my brain from all the messaging I received. I'd maybe say like 80%?

Agree the value is high. But practically, there's two big questions that pop to mind since I work / study around this area:

  • If aggregating existing datasets, what's your value-add over what J-PAL, the World Bank, IDInsight, Our World in Data, and numerous unaffiliated academics are already doing? (See the Best of EconTwitter Substack for "public goods", which are sometimes publicly accessible datasets.)
  • If gaining access to new datasets, what are you offering to LMIC governments in exchange? Even making a single batch of data publicly accessible is a lif
... (read more)

What I'm suggesting is really about:

  • Creating new data sets
  • Making inaccessible data sets accessible
  • Making lightweight, live comparison testing more readily available to the sector as a whole 

On the new data sets front, I've been looking at last-mile health record digitization and interoperability. There are some promising cases of traction via smartphone-compatibility like UCS in Tanzania, or MedTrack in Ghana (who I've been directly working with). Speaking for MedTrack, I can say that we're already working with the goal of creating a usable administra... (read more)

This is really good. 

What struck me was all the concrete detail. While it is personal, it's also in service of giving useful lessons to other people. It helps establish how generalizable the career advice would be to other people and it reframes some standard career advice in a way that centers the constraints as a first-order consideration.

I would not have taken the adversity quotient framing seriously otherwise.

The one addition that might help is mentioning whether there were aspects of your career path that felt unusually lucky or aspects of your l... (read more)

8
Rika Gabriel
Thank you so much for the feedback and the kind words! I was trying to strike that balance between personal narrative and broader relatability, so it's validating to hear it landed for you. Great point about sharing what felt unusually lucky or advantageous—I'll definitely weave more of both sides into future posts.

A few things come to mind where I felt unusually lucky: I had English fluency from a young age (schools in the Philippines taught English, and I read constantly) and access to top schools partly due to doing well on standardized exams, which also helped me get scholarships. Being in these schools meant I was surrounded by more privileged peers who sometimes covered my expenses, including food. They also showed me (through how they lived/carried themselves) the power of having money and what it could afford, which helped me grow in my ambitions. (Not in this post, but I initially studied business and engineering in uni because I was working toward earning enough to live on 10% and donate 90%—even before discovering EA.)

I discovered EA while still in university, giving me time and psychological bandwidth to pivot without the sunk costs or maybe lifestyle fears that come later. Regarding lifestyle-related fears: interestingly, I think growing up constrained was also an advantage in some ways, so taking career risks didn't feel as scary.

These subtle advantages (and not-so-subtle advantages, especially around education, where I was extremely privileged) absolutely shaped my path. Again, I very much appreciate the prompt to include this side more!

Do any of you have heuristics for when to “give up” or “pivot” in a job search? Examples could be aiming lower / differently if no response after 10 applications.

Thankfully this is not something I have to worry about for a long time. But I think it’s useful to have some balance to the usual advice of “just keep trying; job searching takes a long time”. Sometimes a job really is unrealistic for a person’s current profile (let’s operationalize that as 1000 job searching hours would still only result in a 1% chance of getting a certain set of jobs).

6
Rika Gabriel
This doesn't directly answer your question, but building on Calum's and Dee's points, I think it might also be helpful to first clarify what approach you're taking in your job search, since different strategies have very different success rates and timelines. It might be helpful to think of job searching as existing on a spectrum: on one end, you have applying through job boards and official channels, and on the other end, you have less structured approaches like networking, volunteering and turning that into a full-time role, creating your own side projects, etc.

For the more structured application approach, I personally use a tiered approach that balances my long-term career vision (the one I spent 100s of hours discerning) with current personal constraints (needing a salary, visa sponsorship, supporting family). It may look something like this, where I allocate a percentage of my job-hunting time to specific roles under the following tiers:

* Tier 1: Organizations I genuinely want to work for or roles I want to be in -- these align with my 10-year career goals (70% of the time I budget for structured applications)
* Tier 2: Roles that could be stepping stones to Tier 1 positions, i.e., that can help me build career capital for my dream role (20% of my time)
* Tier 3: Positions that meet my immediate financial or other personal needs (10% of my time)

However, the actual number of roles I apply for varies significantly -- I have a very specific career path in mind, so there aren't many positions that fit Tier 1, especially when factoring in my personal constraints. This means I might apply to fewer Tier 1 roles in absolute numbers compared to Tier 2 or 3, but invest more time crafting each Tier 1 application. This approach of focusing on time invested rather than sheer number of applications is useful because it takes into account the realities of the job market (e.g., how many jobs that fit Tier 1 are actually available right now), the actual amount of time

hey geoffrey, here are a few drafty thoughts that boil down to “You should probably invest a bunch of time before giving up” and “It’s hard to get useful data from rejections":

  • Like Dee, I spent months and hundreds of hours applying to ~80 jobs before I found my current role. If I were job hunting right now, I would probably invest a similar or greater amount of effort. My impression from many conversations in my personal life is that more people under-apply than over-apply. There’s almost certainly some amount of effort that’s too much, but I’d guess most
... (read more)

Thanks for asking, Geoffrey – I think this is a helpful and important question. My own personal heuristic after switching jobs as a mid-career professional ~2 years ago was something like: if I spend ~100h and get no signal or make any progress, I should either pivot or give up. Now, I think that number could be meaningfully lower or higher for different people and would depend on internal factors like a) time/capacity to search for a job, b) finances (if searching without a steady stream of income in place), and c) intrinsic motivation, and external ... (read more)

0
[comment deleted]

Thanks for this. I'm surprised how consistently the studies point in favor of vegan diets being cheaper on the whole (though I'll caveat none of these are too convincing: the headline RCT is testing a low-fat vegan diet instead of a general vegan diet and the rest are descriptive regressions / modeling exercises).

All that said, I'm wondering if perception of vegan diets being more expensive could be explained by:

  • Fully plant-based diets are cheaper but various "halfway points" are more expensive.
  • Meat-eaters mostly get exposure to the "halfway points". These
... (read more)

Agreed, but I'd be careful not to confuse good mentorship with good management. These usually go hand-in-hand. But sometimes a manager is good because they sacrifice some of your career growth for the sake of the company.

I like the archetypes of ruthless versus empathetic managers described here. It's an arbitrary division and many managers do fulfill both archetypes. But I think it also captures an important dynamic, where managers have to trade off between their own career, your career, and the organization as a whole. Mentorship and career development fa... (read more)

Not sure what the right numbers are but I really like the back-of-the-envelope approach you're taking here. It's simple and concrete enough that it's probably going to bounce around in my head for a while

Good point. In a toy model, it'd depend on relative cuts to labor versus non-labor inputs. Now that I think about it, it probably points towards exiting being better in mission-driven fields. People are more attached to their careers so the non-labor resources get cut deeply while all the staff try to hold onto their jobs.

Maybe I'd amend it to... if you're willing to switch jobs, then you can benefit from increasing marginal returns in some sub-cause areas. Because maybe there's a sub-cause area where lots of staff are quitting (out of fear the cause area ... (read more)

Marginal returns to work (probably) go up with funding cuts, not down.

It can be demoralizing when a field you’re working in gets funding cuts. Job security goes down, less stuff is happening in your area, and people may pay you less attention since they believe others are doing more important work. But assuming you have job security and mostly make career decisions on inside views (meaning you’re not updating too heavily on funders de-prioritizing your cause area), then your skills are more valuable than they were previously.

Lots of caveats apply of course... (read more)

7
Joel Tan🔸
It's true that there are diminishing marginal returns, and with less funding and fewer projects/people around, there is now a bunch of opportunities for impact which you can exploit (where previously someone else would have done it anyway). However, there's also a countervailing reduction in marginal value of labour due to reduced availability of capital and non-labour input, especially since cuts aren't necessarily well targeted (e.g. keeping staff while capital investment is slashed). Loss of infrastructure field-wide is also a critical problem (e.g. all those interventions and programmes piggybacking on AIDS clinics)

My immediate hesitation is whether fresh college graduates would be useful enough to hosting organizations to make this program sustainable.

Last I checked, Peace Corps invests 3 months of formal training into each applicant and requires a minimum 2-year commitment in a role (to allow people to grow into competency). But this version of Animal Advocacy Corps has college undergraduates rotate through multiple organizations for much shorter periods without any training. And I'm not sure how much demand there is for that kind of worker in animal advocacy, even if it's provided for free.

Agreed. I’d extend the claim past ideas and say that EA is very non-elitist (or at least better than most professional fields) at any point of evaluation. 

Maybe because of that, it seems more elitist elsewhere. 

Like the idea but it might be at odds with the recent AI governance shift. In general, policy folks of all stripes, especially the more senior and insider ones, practice a lot of selective disclosure.

Having done a lot of this advice in my 20s, I'd recommend just getting started with an online training program that you find interesting, seems career relevant, and isn't too pie-in-the-sky as a near-term plan. Throughout my life, I think there were one or two that felt unusually good or bad all-things-considered. Even then, training programs are short (~6 weeks) and have no stakes if you stop them.

(The exception is if the training somehow includes hands-on training from someone actively trying to progress in one of your desired career paths. Good mentorship... (read more)

I've done a lot of partially blind hiring processes both within EA and outside it [1]. And as much as I like them (and feel like I've benefited from them), I think there are good reasons why they aren't done more.

  • It seems really hard to create a good blind hiring process. Most of the ones I felt good about were constructed with an immense amount of care, balancing not rejecting good candidates against having enough subtlety to distinguish candidates that would pass a final stage. Even then, I still felt like there would be exceptional candidates that w
... (read more)

A quick drive-by comment on "4. Missed RCT Opportunity": The sample size seems way too small for an RCT to be worth it. There's not much statistical power to work with when researchers are studying a messy intervention with only 6 countries. And I imagine they'd struggle to attribute changes to the Technical Support Units unless it was something truly transformative (at least within the framework of the RCT).[1]

More broadly, I'm not aware of any commonly accepted way to do "small n" impact evaluation yet, especially with something as customized as Technical... (read more)
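To put a rough number on the power problem above, here's a minimal simulation sketch. All values (a 0.5 sd effect, country-level noise) are made-up assumptions for illustration, not numbers from the post:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Illustrative assumptions: country-level outcomes with sd = 1.0
# and a fairly large true effect of 0.5 sd, 3 countries per arm.
effect, sd, n_per_arm, sims = 0.5, 1.0, 3, 10_000

rejections = 0
for _ in range(sims):
    control = rng.normal(0.0, sd, n_per_arm)
    treated = rng.normal(effect, sd, n_per_arm)
    # Two-sample t-test with only 4 degrees of freedom.
    _, p = stats.ttest_ind(treated, control)
    rejections += p < 0.05

# Power comes out far below the conventional 80% target.
print(f"Simulated power with 6 countries: {rejections / sims:.0%}")
```

Even under these generous assumptions, the design detects the effect only a small fraction of the time, which is the sense in which 6 countries leaves little statistical power to work with.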

6
NickLaing
Drive-by hit job successful 🤣 Thanks, I'm going to edit that; I think you are right. To make an RCT work sample-size-wise, you would probably need district-level randomisation, and that wouldn't make sense here when it's only a central-government-level intervention.

Yeah this was what I found too when I looked into private US long-term disability insurance a while back. My recollection was:

  •  there's a surprising number of health exclusions, even for things that happened in your childhood or adolescence
  • it's a lot more expensive in percentage terms if you're at a lower income
  • many disability cases are ambiguous so the insurance company may have you jump through a lot of hoops and paperwork (a strange role-reversal in which the bureaucracy wants to affirm your agency)

I had the impression that it was a great product fo... (read more)

I find searching for in-depth content on the EA Forum vastly better than Reddit. This isn't just relating to EA topics. There are a few academic-ish subreddits that I like and will search when I'm interested in what the amateur experts think on a given topic. Finding relevant posts is about the same on Reddit but finding in-depth comments + related posts is very hard. I usually have to do some Google magic to make that happen.

Also on rare occasion, I end up liking a person's writing style or thinking methods and want to deep dive into what else they've wri... (read more)

I'm loving this series so far. I got two questions if you've got some time to answer them.

What categories do you use for time-tracking? I find research tasks unusually hard to categorize.

Do you find that earlier stages in the Ideation -> Exploration -> Understanding -> Distillation pipeline take more time to get good at? My experience is that I improve at later stages far earlier and far faster than earlier stages (passable at Distillation before Understanding, passable at Understanding before Exploration, passable at Exploration before Ideation). And anecdotally, I heard people can take a very long time to come up with a good research idea.

I found it very valuable but (barring any major changes like there being 5 new organizations of similar impact) I wouldn’t find an expanded or updated version that useful.

Interesting! I think my intuition going into this has always been stretching so that's something I could rethink

Ah I missed the point about the relationship getting flatter before. Thanks for flagging that.

I think I'm more confused about our disagreement now. Let me give you a toy example to show you how I'm thinking about this (sketched in code below). So there are three variables here:

  • latent life satisfaction, which ranges from 0 to infinity
  • reported life satisfaction, which ranges from 0 to 10 and increases with latent life satisfaction
  • probability of divorce, which ranges from 0% to 100% and decreases with latent life satisfaction

And we assume for the sake of contradiction that rescaling is... (read more)
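Since the comment is truncated, here's a minimal sketch of the toy setup above; the specific functional forms (a saturating map to the 0-10 scale, an exponential divorce curve, and a linear rescaling) are illustrative assumptions, not anything from the thread:

```python
import numpy as np

def reported_satisfaction(latent: np.ndarray) -> np.ndarray:
    """Map latent satisfaction in [0, inf) to a reported 0-10 score."""
    return 10 * (1 - np.exp(-latent))  # bounded, increasing (assumed form)

def divorce_probability(latent: np.ndarray) -> np.ndarray:
    """Probability of divorce, decreasing in latent satisfaction."""
    return np.exp(-latent)  # assumed form

latent = np.linspace(0.1, 5, 50)
reported = reported_satisfaction(latent)
divorce = divorce_probability(latent)

# Rescaling: suppose people later "grade harder," compressing reported
# scores by 20%, while the latent -> divorce relationship stays fixed.
rescaled = 0.8 * reported

# The slope of divorce rate against *reported* scores then changes across
# eras even though nothing about latent satisfaction or divorce changed.
slope_before = np.polyfit(reported, divorce, 1)[0]
slope_after = np.polyfit(rescaled, divorce, 1)[0]
print(f"slope before: {slope_before:.3f}, after rescaling: {slope_after:.3f}")
```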

4
Zachary Brown🔸
I think rescaling could make it steeper or flatter, depending on the particular rescaling. Consider that there is nothing that requires the rescaling to be a linear transformation of the original scale (like you've written in your example). A rescaling that compresses the life satisfaction scores that were initially 0-5 into the range 0-3, while leaving the life satisfaction score of 8-10 unaffected will have a different effect on the slope than if we disproportionately compress the top end of life satisfaction scores. Sorry if I expressed this poorly -- it's quite late :)

This is a really neat analysis idea.

At the same time, my hunch is that all three of these exit actions have gotten easier to do and more common from 1990 to 2022. I believe divorce has gotten less stigmatized, the job market rewards more hopping around, and (I think) hospitalization has been recommended more.

If that "easier-to-do" effect is large enough, then it'd be compatible with a very wide range of happiness trends (rising/falling/stable + rescaling/no rescaling) Wondering if you have any thoughts on that.

6
Charlie Harrison
Hi Geoffrey, Thank you! It's possible that these 3 exit actions have gotten easier to do, over time. Intuitively, though, this would be pushing in the same direction as rescaling: e.g., if getting a divorce is easier, it takes less unhappiness to push me to do it. This would mean the relationship should (also) get flatter. So, still surprising, that the relationship is constant (or even getting stronger). 

Hoping to post some shallow takes on global health and development this year!

This is incredible. I skimmed all the sections and I'm impressed with the quality, scope, concreteness, and kindness throughout. 

This is an area where Probably Good blows the 80,000 Hours version out of the water. I'll be pointing people here for a one-stop shop for all their job searching needs and almost certainly coming back to this in the future.

Quick anecdote: I found this dynamic surprising in all the paths you mentioned: academia, research, EA research, and non-profit work. But I realized it very quickly for academia (one month) and painfully slowly for non-profits (several years).

In academia, the markers of success are transparent, the divide between "good" and "bad" jobs is sharp, and even very successful professors complain about the system.

On the other extreme, the non-profit sector is much fuzzier about credentials and career progression. So I might see an employee who graduated from a sch... (read more)

2
AnonymousTurtle
Another random anecdote: I was reading the Wikipedia page of an ultramarathon runner, and apparently her father is a famous mathematician

Hey Ozzie, a few quick notes on why I react but try not to comment on community based stuff these days:

  • I try to limit how many meta-level comments I make. In general I’d like to see more object-level discussion of things and so I’m trying (to mixed success) to comment mostly about cause areas directly.
  • Partly it’s a vote for the person I’d like to be. If I talk about community stuff, part of my headspace will be thinking about it for the next few days. (I fully realize the irony of making this comment.)
  • It’s emotionally tricky since I feel responsibility for
... (read more)

I also share these frustrations with career advice from 80,000 Hours and the EA Forum. There was a time about 2 years back when my forum activity was a lot of snarky complaints (of questionable insight) about career advice and diversity.

Like you mentioned, the career advice usually leaves a lot to be desired in the concrete details of navigating a lack of mentors, lack of credentials, lack of financial runway, family obligations, etc. I've sometimes wondered about writing an article to fill in the gap, but it's not exactly a "one article" sized hole. Maybe ... (read more)

3
gogreatergood
All makes sense, Geoffrey, and glad it's not just me who thinks about these things, especially on the 80k advice. I agree that this list that Julia presents is very impressive and way better than what a lot might do, in some contexts. Your point is well taken and your initial comment was good too; I maybe could have read the meaning a little better, so maybe it was me that boxed it in. Thanks so much; these threads I am posting on here are, I think, the first time I am having productive back-and-forths on the forum, so that's kinda cool :)

This is great stuff. I often find it hard to remember that a lot of initiatives have happened (despite having read 80% of this list already), so this timeline is a good reference.

As an aside, I think others may benefit from reading about diversity initiatives outside EA to remember this is a hard problem. It's totally consistent for EA to be above-the-curve on this and still not move the needle much (directionally I think those two things are true, but I'm not confident on magnitudes), so I'm linking some stuff I've been reading lately:

... (read more)

EA is "above-the-curve" on DEI stuff?

No, EA is the only place in the entire world I have been (and I have been many varied places) where I - a white straight male - am considered diverse, or at least semi-diverse... simply because I come from a typical white background that's not super wealthy; I'm the first to go to college in my family (and not ivy league); etc. (Or at least, there are "socioeconomic diversity" meetups at EAGs where they list me as diverse for these reasons. So I'm going off their definitions.)

And EA is aimed in many ways at maintaining ... (read more)

Echoing what Eva said, I think you should consider waiting a year and then applying for IDE / applied econ masters. An IDE program is probably the right fit given your goals, but I don't know of any beyond Yale's IDE, which expects you to have already worked in development first.

For Applied Econ, I like University of Maryland's Applied Economics Master's program. The program only requires Calc I and is very transparent about what it can do. Dev / global health placements, content, and networking will take a huge hit compared to IDE programs though.

You can use the yea... (read more)

1
rl1004
Understood, thanks so much for the advice!

This is my favorite in the series so far. I really enjoy the tacit knowledge flavor of it and that some of these lessons generalize beyond the CBA domain.

1
Richard Bruns
Thanks! That was what I was hoping for. I've learned things since I started this series, and one of the main ones was to be less academic and more practical.

Would also love this. I think a useful contrast is A/B testing in big tech firms. My amateur understanding is that big tech firms can and should run hundreds of “RCTs” because:

  • No need to acquire subjects.
  • Minimal disruption to business since you only need to siphon off a minuscule portion of your user base.
  • Tech experiments can finish in days while field experiments need at least a few weeks and sometimes years.
  • If we assume treatments are heavy-tailed, then a big tech firm running hundreds of A/B tests is more likely to learn of a weird trick that grows the business when compared to an NGO who may only get one shot (see the sketch after this list).
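A minimal simulation of that last bullet; the Pareto tail and all numbers are made-up assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

# Assumed heavy-tailed treatment effects: most tests find ~1% lifts,
# a rare few find something big (Lomax/Pareto tail, shape = 2).
def draw_effects(n_tests: int) -> np.ndarray:
    return (rng.pareto(2.0, n_tests) + 1) * 0.01

big_win = 0.10  # a "weird trick": a 10%+ lift
sims = 10_000

tech_hits = sum(draw_effects(200).max() >= big_win for _ in range(sims))
ngo_hits = sum(draw_effects(1).max() >= big_win for _ in range(sims))

print(f"P(10%+ lift | 200 A/B tests): {tech_hits / sims:.0%}")  # ~87%
print(f"P(10%+ lift | 1 RCT):         {ngo_hits / sims:.0%}")   # ~1%
```

Under these assumed parameters, a portfolio of many cheap tests is overwhelmingly more likely to surface an outlier win than a single shot.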
2
Fernando Irarrázaval 🔸
Yes, exactly. The marginal cost of an A/B test in tech is incredibly low, while for NGOs an RCT represents a significant portion of their budget and operational capacity.  This difference in costs explains why tech can use A/B tests for iterative learning, trying hundreds of small variations, while NGOs need to be much more selective about what to test.  And despite A/B testing being nearly free, most decisions at big tech firms aren't driven by experimental evidence.

Something I've noticed more on the EA Forum is the increase in drive-by professional posts. Organizations will promote an idea, a job posting, or something else. Then they'll engage as long as they're on the front page before bouncing.

That's fine in small amounts or if the author is a regular contributor. But if the author is just stopping by to do their public engagement, then it breaks the illusion of a community.

And for me, that is the aesthetic draw of the forum. It's a place where experts and amateurs alike coexist in the same space, say things that ar... (read more)

Great comment — this gets at a lot of things that I've been thinking about. And I appreciate you sharing your personal perspective.

drive-by professional posts

I like your description that "it breaks the illusion of a community", that resonates with me. I also think that posts that feel too professional discourage discussion. The flip side is that there are various ways that these kinds of posts create value:

  1. Things like job postings and org calls for donations can pretty straightforwardly contribute to making the world better by encouraging impactful action
  2. A
... (read more)
7
Joseph
Thanks for giving me a term (or perhaps a concept handle?) for this thing. I was vaguely aware of it, and it feels sort of spammy when I see it, but I didn't have a clear vocabulary to describe it previously. I see these kind of "drive-by posts" a lot in subreddits and Slack workspaces, and even a few WhatsApp group chats that I'm in: people will join, post one advertisement/announcement, and then never be heard from again (unless they end up posting another advertisement/announcement after a few months).

This is a class act in reasoning transparency. I love how easy it is to skim and drill down into things for more detail. Same goes for the pre-print and replication code.

Nits:

  • I was confused about what calibration meant, since I think of this exercise as simulation. To me, calibration is taking some observed data and reverse-engineering the elasticity. But this is starting with the elasticity values -- taken from the Bouyssou et al. (2024) meta-analysis -- and seeing how that goes forward in a pretend-tax situation (a sketch follows below).
  • I would have loved to hear about how trustworthy
... (read more)
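For concreteness, here's a minimal sketch of the forward "pretend-tax" exercise described in the first nit; the elasticity and tax values are placeholders, not numbers from the post or from Bouyssou et al. (2024):

```python
# Forward simulation of a meat tax; all values are placeholder assumptions.
own_price_elasticity = -0.6   # % change in quantity per % change in price
tax_rate = 0.20               # 20% tax, assumed fully passed to consumers

baseline_consumption = 100.0  # index units
new_price_ratio = 1 + tax_rate

# Constant-elasticity demand response: Q1 = Q0 * (P1/P0) ** elasticity
new_consumption = baseline_consumption * new_price_ratio ** own_price_elasticity
reduction = 1 - new_consumption / baseline_consumption
print(f"Consumption falls {100 * reduction:.1f}%")  # ~10% under these inputs
```

Calibration, in the sense the reply below uses, would instead be choosing the elasticity so the model matches observed data; the sketch above is the simulation step that runs a chosen elasticity forward.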
4
Soemano Zeijlmans
Thank you for the kind words and the useful feedback!
* My understanding is that calibration is picking numbers for parameters in an otherwise pre-defined model to make it match the real world, and what you're describing I think of as estimation rather than calibration. I'll add a footnote to make my use of the word a little clearer!
* You're right on the validity of the Bouyssou et al. (2024) paper. There are quite some differences in the measured elasticity estimates in the literature, unfortunately. I'm more worried about the own-price elasticity of supply, though, as that literature is much scarcer and older.

I love this tiny hearts on pink banner aesthetic. Especially in the middle of winter where I'm at. 

I'm on a pause with donations for some personal and financial reasons, but if I was on the fence or was procrastinating, this absolutely would have pushed me to do so.

Personal reasons why I wished I delayed donations: I started donating 10% of my income about 6 years back when I was making Software Engineer money. Then I delayed my donations when I moved into a direct work path, intending to make up the difference later in life. I don't have any regrets about 'donating right away' back then. But if I could do it all over again with the benefit of hindsight, I would have delayed most of my earlier donations too.

First, I've been surprised by 'necessary expenses'. Most of my health care needs have been in therapy and denta... (read more)

Hey John, this is very cool to read. I like the focus on what surprised you as a founder (and maybe newcomer?) in the mental health field.

I'm curious to hear more about the implementation details. Could you tell me more about the length, intensity, and duration of a typical treatment program? I saw 6 sessions in a graph, which makes me think this is a once-a-week program with 1-2 hour sessions over 1-2 months.

Fewer sessions are a reliable way to reduce cost, but my understanding is there's a U-shaped curve to cost-effectiveness here. 1 session doesn't have enough... (read more)

4
John Salter
I am! Just under two years delivering psychotherapy interventions, ~5 years in mental health more generally.

We offer a minimum of six weeks, with no arbitrary cap. It's once (or rarely twice) a week for ~1 hour at a time. I'd suggest that six weeks is the most cost-effective if you are limited by supply, but in practice it tends to be longer because often you have spare capacity.

That sounds about right.

Depends on the client. Mostly our counselling is bespoke, but we have some programmes for more specialised issues (e.g. chronic insomnia, addiction, phobia).

Quickly throwing in a related dynamic. I suspect animal welfare folks have more free time to post online.

Career advancement in animal welfare is much more generalist than in global health & development. This means there aren't as many career goals to 'grind' towards, leaving more free time for public engagement. Alternative proteins feel like a space where one can specialize, but that's all I can think of. I'd love to know of other examples.

In contrast, global health & development has many distinct specialities that you have to focus on if you want to ... (read more)

Thanks, this is exactly what I'm looking for. 

Accuracy isn't too important here. More interested in how people approach this

This advice sounds right to me if you already have the signal in hand and are deciding whether to job search.

But if you don't have the signal, then you need to spend time getting it. And then I think the advice hinges on how long it takes to get the signal. Short time-capped projects are great (like OP's support on the 80,000 Hours CEO search). But for learning and then demonstrating new skills on your own, it's not always clear how long you'll need.

Ooh good idea. I should do more of that.

I do think this can run into Goodhart's Law problems. For example, in the early 2010s, back when being a self-taught software engineer was much more doable, it was a strong sign when someone had a GitHub profile with some side projects with a few months of work behind each of them. A GitHub profile correlated with a lot of other desirable things. But then everyone got that advice (including me) and that signal got rapidly discounted.

So I guess I'd qualify that with: press really hard on why the signal is impressive... (read more)

3
Neel Nanda
I think this is a valid long term concern but takes at least a few months to properly propagate - if someone qualified tells you that when hiring they look at a github profile, that's probably pretty good for the duration of your job search

I like this advice a lot but want to add a quick asterisk when transitioning to a new field.

It’s really really hard to tell what an expensive signal is without feedback. If you’re experienced in a field or you hang out with folks who work in a field, then you’ve probably internalized what counts as an “impressive project” to some degree.

In policy land, this cashes out as advice to take a job you don’t want in the organization you do want. Because that’s how you’ll learn what’s valuable and what’s not. Or taking low paid internships and skilled volunteering... (read more)

3
Neel Nanda
I imagine you can get a lot of the value here more cheaply by reaching out to people in the field and asking them a bunch of questions about what signals do and do not impress them? Doing internships etc is valuable to get the supervision to DO the impressive projects, of course. EDIT: Speaking as someone who does hiring of interpretability researchers, I think there's a bunch of signals I look for and ones I don't care about, and sometimes people new to the field have very inaccurate guesses here