All of Jan_Kulveit's Comments + Replies

The Cost of Rejection

I mostly agree with the problem statement.

With the proposed solution of giving people feedback - I've historically proposed this on various occasions, and from what I have heard, one reason for not giving feedback on the side of organizations is something like "feedback opens up space for complaints, drama on social media, or even litigation". The problem looks very different from the side of the org: when evaluating hundreds of applications, it is basically certain some errors are made, some credentials misunderstood, experiences not counted as they shou...

5 Daystar Eld 19d

Yeah, this seems a hard problem to do well and safely from an organizational standpoint. I'm very sympathetic to the idea that it is an onerous cost on the organization's side; what I'm uncertain about is whether it ends up being more beneficial to the community on net.

My vague understanding is that there's likely no legal issues with giving feedback as long as it's impartial. It's instead one of those things where lawyers reasonably advise against doing anything not required since literally anything you do exposes you to risk. Of course you could give feedback that would obviously land you in trouble, e.g. "we didn't hire you because you're [ethnicity]/[gender]/[physical attribute]", but I think most people are smart enough to give feedback of the form "we didn't hire you because legible reason X".

And it's quickly becom...

How to succeed as an early-stage researcher: the “lean startup” approach

I would guess the 'typical young researcher fallacy' also applies to Hinton - my impression is he is basically advising his past self, similarly to Toby. As a consequence, the advice is likely sensible for people-much-like-past-Hinton, but not good general advice for everyone.

In ~3 years most people are able to re-train their intuitions a lot (which is part of the point!). This seems particularly dangerous in cases where expertise in the thing you are actually interested in does not exist, but expertise in something so...

3 rohinmshah 2mo

I agree substituting the question would be bad, and sometimes there aren't any relevant experts in which case you shouldn't defer to people. (Though even then I'd consider doing research in an unrelated area for a couple of years, and then coming back to work on the question of interest.) I admit I don't really understand how people manage to have a "driving question" overwritten -- I can't really imagine that happening to me and I am confused about how it happens to other people. (I think sometimes it is justified, e.g. you realize that your question was confused, and the other work you've done has deconfused it, but it does seem like often it's just that they pick up the surrounding culture and just forget about the question they cared about in the first place.) So I guess this seems like a possible risk. I'd still bet pretty strongly against any particular junior researcher's intuition being better, so I still think this advice is good on net. (I'm mostly not engaging with the quantum example because it sounds like a very just-so story to me and I don't know enough about the area to evaluate the just-so story.)

How to succeed as an early-stage researcher: the “lean startup” approach

Let's start with the third caveat: maybe the real crux is what we think are the best outputs. What I consider some of the best outputs by young researchers of AI alignment is easier to point at via examples, e.g. the mesa-optimizers paper or multiple LW posts by John Wentworth. As far as I can tell, none of these seems to follow the proposed 'formula for successful early-career research'.

My impression is PhD students in AI at Berkeley need to optimise, and actually optimise a lot, for success in an established field (ML/AI),...

2 rohinmshah 2mo

I think the mesa optimizers paper fits the formula pretty well? My understanding is that the junior authors on that paper interacted a lot with researchers at MIRI (and elsewhere) while writing that paper.

I don't know John Wentworth's history. I think it's plausible that if I did, I wouldn't have thought of him as a junior researcher (even before seeing his posts). If that isn't true, I agree that's a good counterexample.

I agree the advice is particularly suited to this audience, for the reasons you describe.

That sounds like the advice in this post? You've added a clause about being picky about the selection of people, which I agree with, but other than that it sounds pretty similar to what Toby is suggesting. If so I'm not sure why a caveat is needed.

Perhaps you think something like "if someone [who is better or who is comparable and has spent more time thinking about something than you] provides feedback, then you should update, but it isn't that important and you don't need to seek it out"? I agree that's more clearly targeting the right thing, but still not great, for a couple of reasons:

  • The question is getting pretty complicated, which I think makes answers a bit more random.
  • Many students are too deferential throughout their PhD, and might correctly say that they should have explored their own ideas more -- without this implying that the advice in this post is wrong.
  • Lots of people do in fact take an approach that is roughly "do stuff your advisor says, and over time become more independent and opinionated"; idk what they would say. I do predict though that they mostly won't say things like "my ideas during my first year were good, I would have had more impact had I just followed my instincts and ignored my advisor". (I guess one exception is that if they hated the project their advisor suggested, but slogged through it anyway, then they might say that -- but I feel like that's more about motivation rather than impact.)

Announcing the launch of EA Impact CoLabs (beta) + request for projects, volunteers and feedback

It's good to see a new enthusiastic team working on this! My impression, based on working on the problem ~2 years ago, is that this has a good chance of providing value in global health and poverty, animal suffering, or parts of meta- cause areas. In the case of x-risk focused projects, something like a 'project platform' seems almost purely bottlenecked by vetting. In the current proposal this seems to mostly depend on the "Evaluation Commission"; as a result, the most important part for x-risk projects seems to be the judgement of the members of this commission and/or its ability to seek external vetting.

8 Mats Olsen 2mo

Thanks Jan! Yes, we even reference your post in our detailed write-up and agree that vetting will be critical and a bottle-neck to maximum positive impact, particularly related to x-risk. Currently we have implemented a plan that we believe is manageable exclusively by a small group of volunteers, and have included a step in the process that involves CEA's Community Health team. Having said that, we don't think that is an ideal stopping point, we hope to expand into other forms of vetting pending general interest in the project, vetting volunteer interest and the building of other functionality or establishment of partnership with outside orgs. You can read more in sections IV.9 and VI.11 of the write-up about our thinking on these topics. Lastly, given your fantastic analysis in the past, if you would like to help out we would welcome any new team members that are interested in or familiar with this metaproject -- you can email [] anytime!

How to succeed as an early-stage researcher: the “lean startup” approach

In my view this text should come with multiple caveats.

- Beware 'typical young researcher fallacy'. Young researchers are very diverse, and while some of them will benefit from the advice, some of them will not. I do not believe there is a general 'formula for successful early-career research'. Different people have different styles of doing research, and even different metrics for what 'successful research' means. While certainly many people would benefit from the advice 'your ideas are bad', some young researchers actually have great ideas, s...

I'm not going to go into much detail here, but I disagree with all of these caveats. I think this would be a worse post if it included the first and third caveats (less sure about the second).

First caveat: I think > 95% of incoming PhD students in AI at Berkeley have bad ideas (in the way this post uses the phrase). I predict that if you did a survey of people who have finished their PhD in AI at Berkeley, over 80% of them would think their initial ideas were significantly worse than their later ideas. (Note also that AI @ Berkeley is a very selective p...

2 tobyshevlane 2mo

Thanks for the caveats Jan, I think that's helpful. It's true that my views have been formed from within the field of AI governance, and I am open to the idea that they won't fully generalise to other fields. I have inserted a line in the introduction that clarifies this.

EA Group Organizer Career Paths Outside of EA

Contrary to what seems an implicit premise of this post, my impression is

- most EA group organizers should have this as a side-project, and should not think of "community building" as their "career path" where they could possibly continue to do it in a company like Salesforce
- the label "community building" is unfortunate for what most of the EA group organizing work should consist of
- most of the tasks in "EA community building" involve skills which are pretty universal and generally usable in most other fields, like "strategizin...

How much does performance differ between people?


For a different take on a very similar topic, check this discussion between me and Ben Pace (my reasoning was based on the same Sinatra paper).

For practical purposes, in case of scientists, one of my conclusions was

Translating into the language of digging for gold, the prospectors differ in their speed and ability to extract gold from the deposits (Q). The gold in the deposits actually is randomly distributed. To extract exceptional value, you have to have both high Q and be very lucky. What is encouraging in selecting the talent is the Q se

...
Some thoughts on EA outreach to high schoolers

The first EuroSPARC was in 2016. Since it targeted 16-19 year olds, my prior would be that participants should still mostly be studying, and not working full-time on EA, or only exceptionally.

Long feedback loops are certainly a disadvantage.

Also in the meantime ESPR underwent various changes and actually is not optimising for something like "conversion rate to an EA attractor state".

The case of the missing cause prioritisation research

Quick reaction:

I. I did spend a considerable amount of time thinking about prioritisation (broadly understood)

My experience so far is

  • some of the foundations / sensible low-hanging fruits were discovered
  • when moving beyond that, I often run into questions which are some sort of "crucial consideration" for prioritisation research, but the research/understanding is often just not there.
  • often work on these "gaps" seems more interesting and tractable than trying to do some sort of "let's try to ignore this gap and move on" move

f...

'Existential Risk and Growth' Deep Dive #2 - A Critical Look at Model Conclusions

I posted a short version of this, but I think people found it unhelpful, so I'm trying to post a somewhat longer version.

  • I have seen some number of papers and talks broadly in the genre of "academic economics"
  • My intuition based on that is, often they seem to consist of projecting complex reality into a space of single-digit real number dimensions and a bunch of differential equations
  • The culture of the field often signals solving the equations is profound/important, and how you do the projection "world -> 10d" is less interestin
...
Neglected EA Regions

I'm not sure you've read my posts on this topic? (1,2)

In the language used there, I don't think the groups you propose would help people overcome the minimum recommended resources, but are at the risk of creating the appearance some criteria vaguely in that direction are met.

  • e.g., in my view, the founding group must have a deep understanding of effective altruism, and, essentially, the ability to go through the whole effective altruism prioritization framework, taking into account local specifics to reach conclusions valid for their region.
...
7 DavidNash 2y

I think I agree with the minimum recommended resources you suggest, but I don't see Facebook group membership requirements as the only filter. It's more likely to be based on seeing what people write/projects they do/future attendance at EA events. Sometimes obstacles can be good but maybe there are people who would be really great organisers if they just knew one other person who was interested or were encouraged to go to EAG. A tangential issue that might be part of this disagreement is that anyone can decide to become a group leader, create a meetup page and start telling people about their version of EA as there is no official licence/certification. That would require more thought as to whether having official groups is a good idea.

Neglected EA Regions

FWIW the Why not to rush to translate effective altruism into other languages post was quite influential but is often wrong / misleading / advocating some very strong prior on inaction, in my opinion

Neglected EA Regions

I don't think this is actually neglected

  • in my view, bringing effective altruism into new countries/cultures is in initial phases best understood as strategy/prioritisation research, not as "community building"
    • importance of this increases with increasing distance (cultural / economic / geographical / ...) from places like Oxford or Bay

(more on the topic here)

  • I doubt the people who are plausibly good founders would actually benefit from such groups, and even less from some vague coordination due to Facebook groups
    • actually I think on the marg
...
3 DavidNash 2y

I agree that Facebook groups are most likely not the ideal coordination tool, but I haven't found a platform that is as widely used without having bigger flaws. I also agree that the impact could be negative if there are people who would build communities just because they met via Facebook but I think a lot of that depends on how it is used. One check is ensuring that people who join understand EA and have a connection to that region. Another is having filters and coaching for people who do want to organise, which should reduce the chance of a negative outcome whilst making it easier for a positive one. I think having someone involved in EA create the various focal points means that we are less likely in the future to see groups appear that have no connection to the wider EA network and research but have already become the default organisation in their area.

AI safety scholarships look worth-funding (if other funding is sane)
  • I don't think it's reasonable to think about FHI DPhil scholarships, and even less so RSP, as mainly a funding program. (maybe ~15% of the impact comes from the funding)
  • If I understand the funding landscape correctly, both EA funds and LTFF are potentially able to fund single-digit number of PhDs. Actually has someone approached these funders with a request like "I want to work on safety with Marcus Hutter, and the only thing preventing me is funding"? Maybe I'm too optimistic, but I would expect such requests to have decent chance of success.
I'm Buck Shlegeris, I do research and outreach at MIRI, AMA



For example, CAIS and something like "classical superintelligence in a box picture" disagree a lot on the surface level. However, if you look deeper, you will find many similar problems. A simple-to-explain example: the problem of manipulating the operator - which has (in my view) some "hard core" involving both math and philosophy, where you want the AI to somehow communicate with humans in a way which at the same time allows a) the human to learn from the AI if the AI knows something about the world b) the operator's values are ...

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA

I think the picture is somewhat correct, and, surprisingly, we should not be too concerned about the dynamic.

My model for this is:

1) there are some hard and somewhat nebulous problems "in the world"

2) people try to formalize them using various intuitions/framings/kinds of math; also using some "very deep priors"

3) the resulting agendas look at the surface level extremely different, and create the impression you have

but actually

4) if you understand multiple agendas deeply enough, you get a sense

  • how they are sometimes "reflecting" t
...

Thanks for the reply! Could you give examples of:

a) two agendas that seem to be "reflecting" the same underlying problem despite appearing very different superficially?

b) a "deep prior" that you think some agenda is (partially) based on, and how you would go about working out how deep it is?

Update on CEA's EA Grants Program

Re: future of the program & ecosystem influences.

What bad things will happen if the program is just closed

  • for the area overlapping with something "community building-ish", CBG will become the sole source of funding, as meta-fund does not fund that. I think at least historically CBG had some problematic influence on global development of effective altruism not because of the direct impact of funding, but because of putting money behind some specific set of advice/evaluation criteria. (To clarify what I mean: I would expect the space would be he
...
3 Nicole_Ross 2y

Thanks for the thoughtful comment. I agree with most of your points, (though am a bit confused on your first one and would like to understand it better if you’d have the time to elaborate. EA Grants didn’t, when I was involved, have an overlapping funding mandate with CBGs, although I think that the distinction was a bit blurrier in the past). I am keen to work with others in the funding ecosystem so it can adapt in a good, healthy way. If you have more specific thoughts on how to make this happen, would love to hear them here or in a call.

Which Community Building Projects Get Funded?

As a side-note: In the case of the Bay area, I'd expect some funding-displacement effects. BERI grant-making is strongly correlated with geography and historically BERI funded some things which could be classified as community building. LTFF is also somewhat Bay-centric, and also there seem to be some LTFF grants which could be hypothetically funded by several orgs. Also some things were likely funded informally by local philanthropists.

To make the model more realistic one should note

  • there is some underlying distribution of "worthy things to fund"
...
EA Hotel Fundraiser 6: Concrete outputs after 17 months

meta: I considered commenting, but instead I'm just flagging that I find it somewhat hard to have an open discussion about the EA hotel on the EA forum in the fundraising context. The feeling part is

  • there is a lot of emotional investment in EA hotel,
  • it seems if the hotel runs out of runway, for some people it could mean basically losing their home.

Overall my impression is posting critical comments would be somewhat antisocial, posting just positives or endorsements is against good epistemics, so the personally safest thing to do for many is not to s...

7 Greg_Colbourn 2y

Regarding emotional investment, I agree that there is a substantial amount of it in the EA Hotel. But I don't think there is significantly more than there is for any new EA project that several people put a lot of time and effort into. And for many people, not being able to do the work they want to do (i.e. not getting funded/paid to do it) is at least as significant as not being able to live where they want to live. Still, you're right in that critical comments can (often) be perceived as being antisocial. I think part of the reason that EA is considered by new people/outsiders to not be so welcoming can be explained by this.

Flagging that there has been a post specifically soliciting reasons against donating to the EA Hotel:

$100 Prize to Best Argument Against Donating to the EA Hotel

And also a Question which solicited critical responses:

Why is the EA Hotel having trouble fundraising?

I agree that the "equilibrium" you describe is not great, except I don't think it is an equilibrium; more that, due to various factors, things have been moving slower than they ideally should have.

EA hotel struggles to collect low tens of $

I'm guessing you meant tens-of-thousan...

5 RomeoStevens 2y

Thanks for fleshing this out.

I agree that the epistemic dynamics of discussions about the EA Hotel aren't optimal. I would guess that there are selection effects; that critics aren't heard to the same extent as supporters.

Relatedly, the amount of discussion about the EA Hotel relative to other projects may be a bit disproportionate. It's a relatively small project, but there are lots of posts about it (see OP). By contrast, there is far less discussion about larger EA orgs, large OpenPhil grants, etc. That seems a bit askew to my mind. One might wonder about the cost-effectiveness of relatively long discussions about small donations, given opportunity costs.

Only a few people decide about funding for community builders world-wide

In practice, it's almost never the only option - e.g. CZEA was able to find some private funding even before CBG existed; several other groups were at least partially professional before CBG. In general it's more like it's better if national-level groups are funded from EA

9 Manuel_Allgaier 2y

Interesting! Up until now, my intuition was that private funding is only feasible after the group has been around for a few years, gathered sufficient evidence for their impact and some (former student) members earn enough and donate to it (at least this was the case for EA Norway, as far as I know). Somewhat off-topic, but if you have time, I'd be curious to hear how CZEA managed to secure early private funding. How long had CZEA been active when it first received funding, what kind of donor and what do you think convinced them? (If you'd rather not share that publicly, feel free to email me at [] and if you lack time to elaborate that's fine too!)

Long-Term Future Fund: August 2019 grant recommendations

The reason may be somewhat simple: most AI alignment researchers do not participate (post or comment) on LW/AF or participate only a little. For more understanding why, check this post of Wei Dai and the discussion under it.

(Also: if you follow just LW, your understanding of the field of AI safety is likely somewhat distorted)

With hypotheses 4. & 5.: I expect at least Oli to have a strong bias towards being more enthusiastic about funding people who like to interact with LW (all other research qualities being equal), so I'm pretty sure it's not the case.

2....

The reason may be somewhat simple: most AI alignment researchers do not participate (post or comment) on LW/AF or participate only a little.

I'm wondering how many such people there are. Specifically, how many people (i) don't participate on LW/AF, (ii) don't already get paid for AI alignment work, and (iii) do seriously want to spend a significant amount of time working on AI alignment or already do so in their free time? (So I want to exclude researchers at organizations, random people who contact 80,000 Hours for advice on how to get involved, people

...
Long-Term Future Fund: August 2019 grant recommendations

In my experience teaching rationality is more tricky than the reference class education, and is an area which is kind of hard to communicate to non-specialists. One of the main reasons seems to be that many people have a somewhat illusory idea of how much they understand the problem.

Get-Out-Of-Hell-Free Necklace

I've suggested something similar for happiness. If you don't want to introduce the weird asymmetry where the negative counts and the positive does not, what you get out of that could be somewhat surprising - it possibly recovers more "common folk" altruism where helping people who are already quite well off could be good, and if you allow more speculative views on the space of mind-states, you are at risk of recovering something closely resembling some sort of "Buddhist utilitarian calculus".

EA Forum 2.0 Initial Announcement

As humans, we are quite sensitive to signs of social approval and disapproval, and we have some 'elephant in the brain' motivation to seek social approval. This can sometimes mess with epistemics.

The karma represents something like the sentiment of people voting on a particular comment, weighted in a particular way. For me, this often did not seem to be a signal adding any new information - when following the forum closely, usually I would have been able to predict what would get downvoted or upvoted.

What seemed problematic to me was 1. a numbe...

EA Forum 2.0 Initial Announcement

This is not a complaint, but take it as a datapoint: I've switched off the karma display on all comments and my experience improved. The karma system tends to mess with my S1 processing.

It seems plausible karma is causing harm in some hard-to-perceive ways. (One specific way is by people updating on karma patterns, mistaking them for some voice of the community / EA movement / ... )

2 MichaelDickens 2y

Can you elaborate on how you turned off karma display? I would love to use your code if you're willing to share it. I strongly dislike posting on the EA Forum because of how the karma system works, and my experience would be vastly improved if I couldn't see post/comment karma.

2 Denise_Melchin 2y

>> I've switched off the karma display on all comments and my experience improved. The karma system tends to mess up with my S1 processing.

Fully understand if you don't want to, but I'm curious if you could elaborate on this. I'm not entirely sure what you mean.

Is there an analysis that estimates possible timelines for arrival of easy-to-create pathogens?

I would expect if organizations working in the area have reviews of expected technologies and how they enable individuals to manufacture pathogens, which is likely the background necessary for constructing timelines, they would not publish too specific documents.

What new EA project or org would you like to see created in the next 3 years?

If people think this is a generally good idea, I would guess CZEA can get it running in a few weeks. Most of the work likely comes from curating the content, not from setting up the service.

Long-Term Future Fund: April 2019 grant recommendations

To clarify - I agree with the benefits of splitting the discussion threads for readability, but I was unenthusiastic about the motivation being voting.

Long-Term Future Fund: April 2019 grant recommendations

I don't think the karma/voting system should be given that much attention or should be used as highly visible feedback on project funding.

I do think that it would help independently of that by allowing more focused discussion on individual issues.

Long-Term Future Fund: April 2019 grant recommendations

I don't think anyone should be trying to persuade IMO participants to join the EA community, and I also don't think giving them "much more directly EA content" is a good idea.

I would prefer Math Olympiad winners to think about the long term, think better, and think independently, than to "join the EA community". HPMoR seems ok because it is not a book trying to convince you to join a community, but mostly a book about ways to think, and a good read.

(If the readers eventually become EAs after reasoning independently, it'...

8 Habryka 3y

Agree with this. I do think there is value in showing them that there exists a community that cares a lot about the long-term-future, and do think there is some value in them collaborating with that community instead of going off and doing their own thing, but the first priority should be to help them think better and about the long-term at all. I think none of the other proposed books achieve this very well.

How x-risk projects are different from startups

I don't think risk of this type is given too much weight now. In my model, considerations like this got at some point in the past rounded off to some over-simplified meme like "do not start projects, they fail and it is dangerous". This is wrong and led to some counterfactual value getting lost.

This was to some extent reaction to the previous mood, which was more like "bring in new people; seed groups; start projects; grow everything". Which was also problematic.

In my view we are looking at something like pendulum swings, where we ...

Just wanted to say I appreciate the nuance you're aiming at here. (Getting that nuance right is real hard)

How x-risk projects are different from startups

Discussing specific examples seems very tricky - I can probably come up with a list of maybe 10 projects or actions which come with large downside/risks, but I would expect listing them would not be that useful and can cause controversy.

A few hypothetical examples

  • influencing a major international regulatory organisation in a way leading to creating some sort of "AI safety certification" in a situation where we don’t have the basic research yet, creating a false sense of security/fake sense of understanding
  • creating a highly distorted version of effecti
...
Request for comments: EA Projects evaluation platform

My impression is you have in mind something different than what was intended in the proposal.

What I imagined was 'priming' the argument-mappers with prompts like

  • Imagine this project fails. How?
  • Imagine this project works, but has some unintended bad consequences. What are they?
  • What would be a strong reason not to associate this project with the EA movement?

(and the opposites). When writing their texts the two people would be communicating and looking at the arguments from both sides.

The hope is this would produce a more complete argument map. One...

Request for comments: EA Projects evaluation platform

My impression was based mostly on our conversations several months ago - quoting the notes from that time

lot of the discussion and debate derives from differing assumptions held by the participants regarding the potential for bad/risky projects: Benjamin/Brendon generally point out the lack of data/signal in this area and believe launching an open project platform could provide data to reduce uncertainty, whereas Jan is more conservative and prioritizes creating a rigorous curation and evaluation system for new projects.

I think it is fair to say you expect...

3 Brendon_Wong 3y

The community has already had many instances of openly writing about ideas, seeking funding on the EA Forum, Patreon, and elsewhere, and posting projects in places like the .impact hackpad and the currently active EA Work Club. Since posting about projects and making them known to community members seems to be a norm, I am curious about your assessment of the risk and what, if anything, can be done about it. Do you propose that all EA project leaders seek approval from a central evaluation committee or something before talking with others about and publicizing the existence of their project? This would highly concern me because I think it's very challenging to predict the outcomes of a project, which is evidenced by the fact that people have wildly different opinions on how good of an idea or how good of a startup something is. Such a system could be very negative EV by greatly reducing the number of projects being pursued by providing initial negative feedback that doesn't reflect how the project would have turned out or decreasing the success of projects because other people are afraid to support a project that did not get backing from an evaluation system. I expect significant inaccuracy from my own project evaluation system as well as the project evaluation systems of other people and evaluation groups. I wrote about the chicken and the egg problem here. As noted in my comments on the announcement post, the angels have significant amounts of funding available. Other funders do not disclose some of these statistics, and while we may do so in the future, I do not think it is necessary before soliciting proposals. The time cost of applying is pretty low, particularly if people are recycling content they have already written. I think we are the first grantmaking group to give all applicants feedback on their application which I th

Request for comments: EA Projects evaluation platform

Summary impressions so far: object-level

  • It seems many would much prefer expediency in median project cases to robustness and safety in rare, low-frequency cases with possibly large negative impact. I do not think this is the right approach when the intention is also to evaluate long-term oriented, x-risk, meta-, cause-X, or highly ambitious projects.
  • I'm afraid there is some confusion about project failure modes. I'm more worried about projects which would be successful in having a team, working successfully in some sense, changing the world, but achie
... (read more)
Request for comments: EA Projects evaluation platform

Thanks Sundanshu! Sorry for not replying sooner, I was a bit overwhelmed by some of the negative feedback in the comments.

I don't think step 1b. has the same bottleneck as current grant evaluators face, because it is less dependent on good judgement.

With your proposal, I think part of it may work, but I would be worried about other parts. With step 2b, I would fear nobody would feel responsible for producing the content.

With 3a or any automatic steps like that, what that lacks is some sort of (reasonably) trusted expert judgement. In my view this is ac... (read more)

Request for comments: EA Projects evaluation platform

FWIW, part of my motivation for the design was

1. there may be projects, mostly in long-term, x-risk, meta- and outreach spaces, which are very negative, but not in an obvious way

2. there may be ideas, mostly in long-term and x-risk, which are infohazards

The problem with 1. is that most of the EV can be driven by just one project with a large negative impact, where the downside is not easy to notice.

It seems to me standard startup thinking does not apply here, because startups generally cannot go way below zero.

I also do not trust arbitrary set of forum us... (read more)

Request for comments: EA Projects evaluation platform

It is possible my reading of your post somewhat blended with some other parts of the discussion, which are in my opinion quite uncharitable readings of the proposal. Sorry for that.

Actually from the list, I talked about it and shared the draft with people working on EA grants, EA funds, and Brendon, and historically I had some interactions with BERI. What I learned is that people have different priors over the existence of bad projects, the ratio of good projects, and the number of projects which should or should not get funded. Also opinions of some of the funders are at odds ... (read more)

Request for comments: EA Projects evaluation platform
You are missing one major category here: projects which are simply bad because they do have approximately zero impact, but aren't particularly risky. I think this category is the largest of the four.

I agree that's likely. Please take the first paragraphs more as motivation than as a precise description of the categories.

Which projects have a chance of working and which don't is often pretty clear to people who have experience evaluating projects quite quickly (which is why Oli suggested 15min for the initial investigation above).

I think we a... (read more)

Request for comments: EA Projects evaluation platform

I'm not sure if we agree or disagree, possibly we partially agree, partially disagree. In case of negative feedback, I think as a funder, you are at greater risk of people over-updating in the direction "I should stop trying".

I agree friends and social neighbourhood may be too positive (that's why the proposed initial reviews are anonymous, and one of the reviewers is supposed to be negative).

When funders give general opinions on what should or should not get started or how you value or not value things, again, I think you are at greater risk of having too much of an influence on the community. I do not believe the knowledge of the funders is strictly better than the knowledge of grant applicants.

(I still feel like I don’t really understand where you’re coming from.)

I am concerned that your model of how idea proposals get evaluated (and then plausibly funded) is a bit off. From the original post:

hard to evaluate which project ideas are excellent, which are probably good, and which are too risky for their estimated return.

You are missing one major category here: projects which are simply bad because they do have approximately zero impact, but aren't particularly risky. I think this category is the largest of the four.

Which projects have ... (read more)

Request for comments: EA Projects evaluation platform

On a meta-level

I'm happy to update the proposal to reflect some of the sentiments. Openly, I find some of them quite strange - e.g. it seems, coalescing the steps into one paragraph and assuming all the results (reviews, discussion, "authoritative" summary of the discussion) will just happen may make it look more flexible. Ok, why not.

Also, you and Oli seem to be worried that I want to recruit people who are currently not doing some high-impact direct work ... instead of just asking a couple of people around me, which would often mea... (read more)

We had some discussion with Brendon, and I think his opinion can be rounded to "there are almost no bad projects, so to worry about them is premature". I disagree with that.

I do not think your interpretation of my opinion on bad projects in EA is aligned with what I actually believe. In fact, I actually stated my opinion in writing in a response to you two days ago which seems to deviate highly from your interpretation of my opinion.

I never said that there are "almost no bad projects." I specifically said I don't think that "m... (read more)

This is an uncharitable reading of my comment in many ways.

First, you suggest that I am worried that you want to recruit people not currently doing direct work. All things being equal, of course I would prefer to recruit people with fewer alternatives. But all things are not equal. If you use people you know for the initial assessments, you will much more quickly be able to iron out bugs in the process. In the testing stages, it's best to have high-quality workers that can perceive and rectify problems, so this is a good use of time for smart, trusted... (read more)

I think it is much harder to give open feedback if it is closely tied with funding. Feedback from funders can easily have too much influence on people, and should be very careful and nuanced, as it comes from some position of power. I would expect adding financial incentives can easily be detrimental for the process. (For a self-referential example, just look at this discussion: do you think the fact that Oli dislikes my proposal and suggests LTF can back something different with $20k will not create at least some unconscious incentives?)

I'm a bit confused... (read more)

Request for comments: EA Projects evaluation platform

It is very easy to replace this stage with e.g. just two reviews.

Some of the arguments for the contradictory version

  • the point of this stage is not to produce EV estimate, but to map the space of costs, benefits, and considerations
  • it is easier to be biased in a defined way than unbiased
  • it removes part of the problem with social incentives

Some arguments against it are

  • such adversarial setups for truth-seeking are uncommon outside of judicial process
  • it may contribute to unnecessary polarization
  • the splitting may feel unnatural

3dpiepgrass3yBased on Habryka's point, what if "stage 1b" allowed the two reviewers to come to their own conclusions according to their own biases, and then at the end, each reviewer is asked to give an initial impression as to whether it's fund-worthy (I suppose this means its EV is equal to or greater than typical GiveWell charity) or not (EV may be positive, but not high enough). This impression doesn't need to be published to anyone if, as you say, the point of this stage is not to produce an EV estimate. But whenever both reviewers come to the same conclusion (whether positive or not), a third reviewer is asked to review it too, to potentially point out details the first reviewers missed. Now, if all three reviewers give a thumbs down, I'm inclined to think ... the applicant should be notified and suggested to go back to the drawing board? If it's just two, well, maybe that's okay, maybe EV will be decidedly good upon closer analysis. I think reviewers need to be able (and encouraged) to ask questions of the applicant, as applications are likely to have some points that are fuzzy or hard to understand. It isn't just that some proposals are written by people with poor communication skills; I think this will be a particular problem with ambitious projects whose vision is hard to articulate. Perhaps the Q&As can be appended to the application when it becomes public? But personally, as an applicant, I would be very interested to edit the original proposal to clarify points at the location where they are first made. And perhaps proposals will need to be rate-limited to discourage certain individuals from wasting too much reviewer time?
Request for comments: EA Projects evaluation platform

I don't see why continuous coordination of a team of about 6 people on slack would be very rigid, or why people would have very narrow responsibilities.

For the panel, having some defined meeting and evaluating several projects at once seems time and energy conserving, especially when compared to the same set of people watching the forum often, being manipulated by karma, being in a way forced to reply to many bad comments, etc.

Request for comments: EA Projects evaluation platform

On the contrary: on slack, it is relatively easy to see the upper bound of attention spent. On the forum, you should look not just at the time spent writing comments, but also at the time and attention of people not posting. I would be quite interested in how much time, for example, CEA+FHI+GPI employees spend reading the forum, in aggregate (I guess you can technically count this.)

4Habryka3y*nods* I do agree that you, as the person organizing the project, will have some sense of how much time has been spent, but I think it won't be super easy for you to communicate that knowledge, and it won't by default help other people get better at estimating the time spent on things like this. It also requires everyone watching to trust you to accurately report those numbers, which I do think I do, but I don't think everyone necessarily has reason to. I do think on Slack you also have to take into account the time of all the people not posting, and while I do think that there will be more time spent just reading and not writing on the EA Forum, I generally think the time spent reading is usually worth it for people individually (and importantly people are under no commitment to read things on the EA Forum, whereas the volunteers involved here would have a commitment to their role, making it more likely that it will turn out to be net-negative for them, though I recognize that there are some caveats where sometimes there are controversial topics that cause a lot of people to pay attention to make sure that nothing explodes).
Request for comments: EA Projects evaluation platform

I don't understand why you assume the proposal is intended as something very rigid, where e.g. if we find the proposed project is hard to understand, nobody would ask for clarification, or why you assume the 2-5h is some dogma. The back-and-forth exchange could also add to 2-5h.

With assigning two evaluators to each project you are just assuming the evaluators would have no say in what to work on, which is nowhere in the proposal.

Sorry but can you for a moment imagine also some good interpretation of the proposed schema, instead of just weak-manning every other paragraph?

I am sorry for appearing to be weak-manning you. I think you are trying to solve a bunch of important problems that I also think are really important to work on, which is probably why I care so much about solving them properly and have so many detailed opinions about how to solve them. While I do think we have strong differences in opinion on this specific proposal, we probably both agree on a really large fraction of important issues in this domain, and I don't want to discourage you from working in this domain, even if I do think this specific propo... (read more)

Request for comments: EA Projects evaluation platform

I would be curious about your model of why the open discussion we currently have does not work well - like here, where user nonzerosum proposed a project, the post was heavily downvoted (at some point to negative karma) without substantial discussion of the problems. I don't think the fact that I read the post after three days and wrote some basic critical argument is good evidence that an individual reviewer and a board is much less likely to notice problems with a proposal than a broad discussion with many people contributing would.

Also when you are ma... (read more)

I also don't see how complex discussion on the forum with the high quality reviews you imagine would cost 5 hours.

I think an initial version of the process, in which you plus maybe one or two close collaborators, would play the role of evaluators and participate in an EA Forum thread, would take less than 5 hours to set up and less than 15 hours of time to actually execute and write reviews on, and I think would give you significant evidence about what kind of evaluations will be valuable and what the current bottlenecks in this space are.

I would be
... (read more)
Request for comments: EA Projects evaluation platform

With the first part, I'm not sure what you would imagine as the alternative - having access to the evaluators' Google Drive so you can count how much time they spent writing? The time estimate is something like an estimate of how much time it can take volunteer evaluators - if all you need is on the order of 5 minutes, you are either really fast or not explaining your decisions.

I expect much more time of experts will be wasted in forum discussions you propose.

8Habryka3yI think in a forum discussion, it's relatively easy to see how much someone is participating in the discussion, and to get a sense of how much time they spent on stuff. I am not super confident that less time would be wasted in the forum discussions I am proposing, but I am confident that I and others would notice if lots of people's time was wasted, which is something I am not at all confident about for your proposal and which strongly limits the downside for the forum case.
Request for comments: EA Projects evaluation platform

As I've already explained in the draft, I'm still very confused by what

An individual reviewer and a board is much less likely to notice problems with a proposal than a broad discussion with many people contributing would ...

should imply for the proposal. Do you suggest that steps 1b. 1d. 1e. are useless or harmful, and having just the forum discussion is superior?

The time of evaluators is definitely definitely definitely not free, and if you treat them as free then you end up exactly in the kind of situation that everyone is complaining about. P... (read more)

My overall sense of this is that I can imagine this process working out, but the first round of this should ideally just be run by you and some friends of yours, and should not require 100+ hours of volunteer time. My expectation is that after you spend 10 hours trying to actually follow this process, with just one or two projects, on your own or with some friends, that you will find that large parts of it won't work as you expected and that the process you designed is a lot too rigid to produce useful evaluations.

Generally almost all of the process is open, so I don't see what should be changed. If the complaint is that the process has stages instead of unstructured discussion, and this makes it less understandable for you, I don't see why.

One part of the process that is not open is the way the evaluators are writing their proposals, which is as I understand it where the majority of person-time is being spent. It also seems that all the evaluations are going to be published in one big batch, making it so that feedback on the evaluation process would take until ... (read more)

As I've already explained in the draft, I'm still very confused by what [...] should imply for the proposal. Do you suggest that steps 1b. 1d. 1e. are useless or harmful, and having just the forum discussion is superior?

I am suggesting that they are probably mostly superfluous, but more importantly, I am suggesting that a process that tries to separate the public discussion into a single stage, that is timeboxed at only a week, will prevent most of the value of public discussion, because there will be value from repeated back and forth at multipl... (read more)

Request for comments: EA Projects evaluation platform

To make the discussions more useful, I'll try to briefly recapitulate parts of the discussions and conversations I had about this topic in private or via comments in the draft version. (I'm often coalescing several views into a more general claim.)

There seems to be some disagreement about how rigorous and structured the evaluations should be - you can imagine a scale where on one side you have just unstructured discussion on the forum, and on the opposite side you have "due diligence", multiple evaluators writing detailed reviews, panel of... (read more)

To respond more concretely to the "due diligence" vs. unstructured discussion section, which I think refers to some discussion me and Jan had on the Google doc he shared:

I think the thing I would like to see is something that is just a bit closer towards structured discussion than what we currently have on the forum. I think there doesn't currently exist anything like an "EA Forum Project discussion thread" and in particular not one that has any kind of process like

"One suggestion for a project per top-level comment. If you ... (read more)

(Here are some of the comments that I left on the draft version of this proposal that I was sent, split out over multiple comments to allow independent voting):

[Compared to an open setup where any reviewer can leave feedback on any project in an open setting like a forum thread] An individual reviewer and a board is much less likely to notice problems with a proposal, and a one-way publishing setup is much more likely to cause people to implement bad projects than a setup where people are actively trying to coordinate work on a proposal in a consolidated t... (read more)

Here are some of the comments that I left on the draft version of this proposal that I was sent (split out over multiple comments to allow independent voting):

I continue to think that just having an open discussion thread, with reviewers participating in the discussion with optional private threads, will result in a lot more good than this.
Based on my experience with the LTF-Fund, I expect 90% of the time there will be one specific person who you need a 5 minute judgement from to judge a project, much more than you need a 2-5h evaluation. This makes an o
... (read more)