Topic Contributions


The biggest risk of free-spending EA is not optics or motivated cognition, but grift

Seems to me that scarcity can also be grift-inducing, e.g. if a tech company only hires the very top performers on its interview, it might find that most hires are people who looked up the questions beforehand and rehearsed the answers. But if the company hires any solid performer, that doesn't induce a rehearsal arms race -- if it's possible to get hired without rehearsing, some people will value their integrity enough to do this.

The CEEALAR model is interesting because it combines a high admission rate with low salaries. You're living with EAs in an undesirable city, eating vegan food, and getting paid peanuts. This seems unattractive to professional grifters, but it might be attractive to deadbeat grifters. Deadbeat grifters seem like a better problem to have since they're less sophisticated and less ambitious on average.

Another CEEALAR thing: living with someone helps you get to know them. It's easier to put up a facade for a funder than for your roommates.

...three conditions that sociologists since the 1950s have considered crucial to making close friends: proximity; repeated, unplanned interactions; and a setting that encourages people to let their guard down and confide in each other, said Rebecca G. Adams, a professor of sociology and gerontology at the University of North Carolina at Greensboro. This is why so many people meet their lifelong friends in college, she added.

Source. When I was at CEEALAR, it seemed to me like the "college dorm" atmosphere was generating a lot of social capital for the EA movement.

I don't think CEEALAR is perfect (and I also left years ago so it may have changed). But the overall idea seems good to iterate on. People have objected in the past because of PR weirdness, but maybe that's what we need to dissuade the most dangerous sort of grifter.

Why Helping the Flynn Campaign is especially useful right now

I think there is no harm in setting up an alert in case there are more threads about him. The earlier you arrive in a thread, the greater the opportunity to influence the discussion. If people are going to be reading a negative comment anyway, I don't think there is much harm in replying, at least on Reddit -- I don't think Reddit tends to generate more views for a thread with more activity, the way Twitter can. In fact, replying to the older threads on Reddit could be a good way to test out messaging, since almost no one is reading at this point, but you might get replies from people who left negative comments and learn how to change their minds. I've had success arguing for minority positions on my local subreddit by being friendly, respectful, and factual.

Beyond that, I'm really not sure. Creating new threads could be a high-risk/high-reward strategy to use if he's falling in the polls. Maybe get him to do an AMA?

My local subreddit's subscriber count is about 20% of the population of the city, and I've never seen a political candidate post there, even though there is lots of politics discussion. I think making an AMA saying what you've learned from talking to voters, and asking users what issues are most important to them, early in a campaign could be a really powerful strategy (edit: esp. if prearranged w/ subreddit moderators). I don't know if there is a comparable subreddit for District 6 though, e.g. this subreddit only has about 1% of the city population according to Wikipedia, and it's mostly pretty pictures right now so they might not like it if you started talking about politics.

Why Helping the Flynn Campaign is especially useful right now

Have you thought about crossposting this to some local subreddits? I searched for Carrick's name on Reddit and he seems to be very unpopular there. People are tired of his ads and think he's gonna be a shill for the crypto industry. Maybe you could make a post like "Why all of the Flynn ads? An explanation from a campaign volunteer".

Some clarifications on the Future Fund's approach to grantmaking

A model that I heard TripleByte used sounds interesting to me.

I wrote a comment about TripleByte's feedback process here; this blog post is great too. In our experience, the fear of lawsuits and PR disasters from giving feedback to rejected candidates was much overblown, even at a massive scale. (We gave every candidate feedback regardless of how well they performed on our interview.)

Something I didn't mention in my comment is that much of TripleByte's feedback email was composed of prewritten text blocks carefully optimized to be helpful and non-offensive. While interviewing a candidate, I would check boxes for things like "this candidate used their debugger poorly", and then their feedback email would automatically include a prewritten spiel with links on how to use a debugger well (or whatever). I think this model could make a lot of sense for the fund:

  • It makes giving feedback way more scalable. There's a one-time setup cost of prewriting some text blocks, and probably a minor ongoing cost of gradually improving your blocks over time, but the marginal cost of giving a candidate feedback is just 30 seconds of checking some boxes. (IIRC our approach was to tell candidates "here are some things we think it might be helpful for you to read" and then when in doubt, err on the side of checking more boxes. For funding, I'd probably take it a step further, and rank or score the text blocks according to their importance to your decision. At TripleByte, we would score the candidate on different facets of their interview performance and send them their scores -- if you're already scoring applications according to different facets, this could be a cheap way to provide feedback.)

  • It minimizes lawsuit risk. It's not that costly to have a lawyer vet a few pages of prewritten text that will get reused over and over. (We didn't have a lawyer look over our feedback emails, and it turned out fine, so this is a conservative recommendation.)

  • It minimizes PR risk. Someone who posts their email to Twitter can expect bored replies like "yeah, they wrote the exact same thing in my email." (Again, PR risk didn't seem to be an issue in practice despite giving lots of freeform feedback along with the prewritten blocks, so this seems like a conservative approach to me.)
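To make the checkbox idea concrete, here is a minimal sketch of how checked boxes could be turned into a feedback email, with blocks ordered by how much they mattered to the decision. Everything in it is an assumption for illustration: the block keys, the block wording, the importance numbers, and the function name `compose_feedback` are hypothetical and not taken from TripleByte's actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackBlock:
    key: str         # checkbox identifier (hypothetical)
    text: str        # prewritten, pre-vetted paragraph (hypothetical wording)
    importance: int  # higher means it weighed more in the decision


# Hypothetical library of prewritten blocks; a real one would be written
# once, vetted, and improved gradually over time.
BLOCKS = {
    "unclear_budget": FeedbackBlock(
        "unclear_budget",
        "The budget was hard to map onto the planned activities.",
        3,
    ),
    "no_track_record": FeedbackBlock(
        "no_track_record",
        "We could not find evidence of earlier work in this area.",
        2,
    ),
    "vague_outcomes": FeedbackBlock(
        "vague_outcomes",
        "The application did not say how success would be measured.",
        1,
    ),
}

INTRO = "Here are some things we think it might be helpful for you to read:"


def compose_feedback(checked, blocks=BLOCKS, intro=INTRO):
    """Build a feedback email body from the boxes a reviewer checked."""
    unknown = set(checked) - set(blocks)
    if unknown:
        raise KeyError(f"no prewritten block for: {sorted(unknown)}")
    # Deduplicate, then put the most decision-relevant blocks first.
    chosen = sorted(
        (blocks[key] for key in set(checked)),
        key=lambda block: (-block.importance, block.key),
    )
    paragraphs = [intro]
    paragraphs += [f"{i}. {block.text}" for i, block in enumerate(chosen, 1)]
    return "\n\n".join(paragraphs)
```

The reviewer's marginal work is only the list of checked keys, e.g. `compose_feedback(["no_track_record", "unclear_budget"])`; the wording itself never changes between applicants, which is what keeps the legal and PR review a one-time cost.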

If I were you, I think I'd experiment with hiring one of the writers of the TripleByte feedback emails as a contractor or consultant. Happy to make an intro.

A few final thoughts:

  • Without feedback, a rejectee is likely to come up with their own theory of why they were rejected. You have no way to observe this theory or vet its quality. So I think it's a mistake to hold yourself to a high bar. You just have to beat the rejectee's theory. (BTW, most of the EA rejectee theories I've heard have been very cynical.)

  • You might look into liability insurance if you don't have it already; it probably makes sense to get it for other reasons anyway. I'd be curious how the cost of insurance changes depending on the feedback you're giving.

Bad Omens in Current Community Building

Assume that people find you more authoritative, important, and hard-to-criticise than you think you are. It’s usually not enough to be open to criticism - you have to actually seek it out or visibly reward it in front of other potential critics.

Chapter 7 in this book had a number of good insights on encouraging dissent from subordinates, in the context of disaster prevention.

Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?

My solution to this problem (originally posted here) is to run builder/breaker tournaments:

  • People sign up to play the role of "builder", "breaker", and/or "judge".
  • During each round of the tournament, triples of (builder, breaker, judge) are generated. The builder makes a proposal for how to build Friendly AI. The breaker tries to show that the proposal wouldn't work. ("Builder/breaker" terminology from this report.) The judge moderates the discussion.
    • Discussion could happen over video chat, in a Google Doc, in a Slack channel, or whatever. Personally I'd do text: anonymity helps judges stay impartial, and makes it less intimidating to enter because no one will know if you fail. Plus, having text records of discussions could be handy, e.g. for fine-tuning a language model to do alignment work.
  • Each judge observes multiple proposals during a round. At the end of the round, they rank all the builders they observed, and separately rank all the breakers they observed. (To clarify, builders are really competing against other builders, and breakers are really competing against other breakers, even though there is no direct interaction.)
  • Scores from different judges are aggregated. The top scoring builders and breakers proceed to the next round.
  • Prizes go to the top-ranked builders and breakers at the end of the tournament.
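The aggregation step is left open above. As one possible reading, here is a minimal sketch that converts each judge's ranking into a score between 0 and 1 and averages those scores per contestant. The scoring rule (a normalized Borda-style count), the tie-break by name, and the function names `aggregate_rankings` and `advance` are my assumptions, not part of the proposal.

```python
from collections import defaultdict


def aggregate_rankings(rankings):
    """Average normalized rank scores across judges.

    `rankings` is a list of lists; each inner list is one judge's ordering
    of the contestants they observed, best first. Judges may observe
    different contestants, so scores are normalized to [0, 1] per judge.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, name in enumerate(ranking):
            # Best place scores 1.0, last place 0.0 (assumed scoring rule).
            score = 1.0 if n == 1 else 1.0 - position / (n - 1)
            totals[name] += score
            counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


def advance(rankings, k):
    """Return the k top-scoring contestants; ties are broken by name."""
    scores = aggregate_rankings(rankings)
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]
```

The same functions would be called once with the judges' builder rankings and once with their breaker rankings, since builders and breakers are ranked separately.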

The hope is that by running these tournaments repeatedly, we'd incentivize alignment progress, and useful insights would emerge from the meta-game:

  • "Most proposals lack a good story for Problem X, and all the breakers have started mentioning it -- if you come up with a good story for it, you have an excellent shot at the top prize"
  • "Almost all the top proposals were variations on Proposal Z, but Proposal Y is an interesting new idea that people are having trouble breaking"
  • "All the top-ranked competitors in the recent tournament spent hours refining their ideas by playing with a language model fine-tuned on earlier tournaments plus the Alignment Forum archive"

I think if I were organizing this tournament, I would try to convince top alignment researchers to serve as judges, at least in the later rounds. The contest will have more legitimacy if prizes are awarded by experts. If you had enough judging capacity, you might even be able to have a panel of judges observe each proposal. If you had too little, you could force contestants to judge some matches they weren't participating in as a condition of entry. [Edit: This might not be the best idea because of perverse incentives. So probably just cash compensation to attract judges is a better idea.]

[Edit 2: One way things could be unfair is if e.g. Builder A happens to be matched with a strong Breaker A, and Builder B happens to be matched with a weaker Breaker B, it might be hard for a judge who observes both proposals to figure out which is stronger. To address this, maybe the judge could observe 4 pairings: Builder A with Breaker A, Builder A with Breaker B, Builder B with Breaker A, and Builder B with Breaker B. That way they'd get to see Builder A and Builder B face the same 2 adversaries, allowing for a more apples-to-apples comparison.]
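The four-pairing scheme in Edit 2 is a full cross of builders and breakers. A minimal sketch, assuming a judge is handed small groups of each (the function name `cross_pairings` is hypothetical):

```python
from itertools import product


def cross_pairings(builders, breakers):
    """Pair every builder with every breaker so each builder faces
    the same set of adversaries."""
    return list(product(builders, breakers))
```

With two builders and two breakers this yields the four matches described above. The number of matches grows as builders times breakers, so the groups assigned to one judge would need to stay small.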

Milan Griffes on EA blindspots

To be frank, I think most of these criticisms are nonsense and I am happy that the EA community is not spending its time engaging with whatever the 'metaphysical implications of the psychedelic experience' are.


If the EA community has not thought sufficiently about a problem, anyone is very welcome to spend time thinking about it and do a write-up of what they learned... I would even wager that if someone wrote a convincing case for why we should be 'taking dharma seriously', then many would start taking it seriously.

These two bits seem fairly contradictory to me.

If you think a position is "nonsense" and you're "happy that the EA community is not spending its time engaging with" it, is someone actually "very welcome" to do a write-up about it on the EA Forum?

In a world where a convincing case can be written for a weird view, should we really expect EAs to take that view seriously, if they're starting from your stated position that the view is nonsense and not worth the time to engage with? (Can you describe the process by which a hypothetical weird-but-correct view would see widespread adoption?)

And, who would take the time to try & write up such a case? Milan said he thinks EA "basically can't hear other flavors of important feedback", suggesting a sense in which he agrees with your first paragraph -- EAs tend to think these views are nonsense and not worth engaging with, therefore there is no point in defending them at length because no one is listening.

I'm reminded of this post which stated:

We were told by some that our critique is invalid because the community is already very cognitively diverse and in fact welcomes criticism... It was these same people that then tried to prevent this paper from being published.

Rhetorical Abusability is a Poor Counterargument

A different and perhaps more relevant question is whether popularizing belief in consequentialism has net bad consequences on the margin.

EA/Rationalist Safety Nets: Promising, but Arduous

Good point. However, since Howie was employed at an EA organization, he might be eligible for the idea described here. One approach is to implement several overlapping ideas, and if there's an individual for whom none of the ideas work, they could go through the process Ozzie described in the OP (with the associated unfortunate downsides).

Democratising Risk - or how EA deals with critics

I believe these are authors already working at EA orgs, not "brave lone researchers" per se.
