
Epistemic status: personal observations and extrapolated personal preferences. Uses one small random sample and one hard data source, but all else is subjective.

 

I think the Forum could use some epistemic healthcare. As Rohin Shah puts it:

for any particular high-karma post, knowing nothing else about the post besides that it is high karma, there is a non-trivial probability that I would find significant reasoning issues in that post. You can't rely solely on karma as a strong signal of epistemics.

I randomly sampled 20 such posts and guess this probability is about 30% [90% CI: 10%, 50%].

But we say we value rigour to an unusual extent. So why aren't we rigorous?

Well, rigour is rare because it's hard. But I argue that our platform can make it easier.

Many posts aren't about reasoning

One innocuous point first: there are lots of reasons to write a post besides asserting a novel claim.

  • Questions
  • Admin-style announcements
  • New ideas to consider
  • Summarising your understanding of other work
  • Coordination, common knowledge
  • Fiction! <3
  • Practising writing

Karma means too much

Less innocuous is that our voting system doesn’t distinguish

  • "I agree"
  • "The argument is rigorous / the evidence is strong"
  • "Makes a novel point"
  • "This makes me feel good"
  • "I encourage you to post more"

or, lately:

  • "This post belongs to my faction on The Current Thing"
     

Is this a problem? That is: is anyone using karma as a measure of rigour or truth? 

Actually yes: via the halo effect (things good in one way get unthinkingly treated as good in other ways) I expect this to be somewhat true. And I expect high karma to cause a post to get read more, if only because of readers' fear of missing out.

I just want karma to mean "worth reading", which is a much easier objective than "is rigorously argued", but which we still often fail to meet. 

People have been arguing for multiple types of upvote for a long time, e.g. at LessWrong ("I agree with this" vs "This persuaded me" vs "I want to encourage this person to keep posting", and the converses). I incline to think that this is too messy, but we can still try to decouple the truth economy from the vibe economy a bit; see below.


So here are some ideas. Some are unobtrusive enough that they're worth changing the UI for. Others are more like norms I'd like us to introduce, or reintroduce.

Feature ideas 

  • UI. Add an extra editor text field for an "Epistemic status" at the start of the post. Time cost: 2 mins per post.
  • UI: A Like button, separate from upvotes. To capture vibes and maybe leave karma to epistemics. Time cost: N/A.
  • UI: Make a post summary required if the post is >1000 words. [EDIT: Can even integrate NLP here for a terrible 80/20 version.] Time cost: 10 mins.
  • UI: Add a “Limitations” text field to the editor. Time cost: 30 mins.
  • Speculative: NLP for claim detection: the site asks you for your probabilities about the main claims (see the sketch after this list). Time cost: 30 mins.
  • Speculative: NLP to autosuggest prediction markets you could open. Time cost: 15 mins.
  • Speculative: Two comment tabs, one for vibes and one for critique.
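
To make the claim-detection idea concrete: below is a minimal sketch of the "terrible 80/20 version", using keyword heuristics rather than real NLP. All names and patterns here are hypothetical, not a spec for the actual feature.

```typescript
// Minimal sketch of the "80/20" claim detector: no ML, just keyword
// heuristics over sentences. All names here are hypothetical.
const CLAIM_MARKERS: RegExp[] = [
  /\bI (think|believe|claim|argue|expect|estimate)\b/i,
  /\b(probably|likely|unlikely|certainly)\b/i,
  /\b\d{1,3}\s?%/, // explicit probabilities
  /\b(therefore|hence|implies)\b/i,
];

function splitSentences(text: string): string[] {
  // Crude splitter; a real version would use a proper tokenizer.
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter(Boolean);
}

function detectClaims(postBody: string): string[] {
  return splitSentences(postBody).filter((sentence) =>
    CLAIM_MARKERS.some((pattern) => pattern.test(sentence))
  );
}

// The editor could then prompt for a probability on each hit:
detectClaims("I think the Forum rewards vibes. The sky is blue.");
// => ["I think the Forum rewards vibes."]
```

The point is that the editor only needs candidate claims to prompt the author with, so even a crude filter like this might capture most of the value.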

Norms

  • You can implement all of the above yourself! The devs don't need to do it for us.
  • e.g. If you liked the post but it doesn’t update your views, just comment on it saying "Nice!".
  • Address criticism in advance. Simulate it if you are unfortunate enough to not have critics. 
    • If you mention someone / some specific view, send a draft to them before posting.
  • To date, most throwaway accounts have been used to write magnificent hard truths we wouldn't otherwise hear. So maybe these are underused.
  • I like Nuno's personal digest and want to see more of these as a workaround for the karma system not meaning what I want it to.

 

A big change, which I'll write up separately, is something like paid, open peer review for posts which seek to seriously update the community. Likely bringing in outsiders, likely post-publication. (This is close to red-teaming but using outsiders, come to think of it.) This will be expensive, but it passes a shallow cost-benefit test, to me.

These ideas increase the effort it takes to post something. I think it's well worth the time, but if you're on the margin of not posting, or if you're new and intimidated, it's likely better to skip them and just post. No excuse for the old-timers though.


Limitations

  • The above is mostly not based on hard data; instead it's 7 years of reading the Forum and a tiny random sample of posts.
  • So I don't know that the above features will repay their time cost or even their dev cost.
  • I am very opinionated.

PS: I used the GraphQL backend (thanks devs!) to get some basic stats:

  • Average post length: 1235 words
  • Average length w/ 125+ karma: 2600 words

If good evidence or arguments took longer to state, then this would be weakly reassuring. But probably that correlation is too weak to say anything.
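
For anyone who wants to reproduce these numbers, a sketch of the kind of query I mean is below. The field names (`baseScore`, `wordCount`) are from memory of the ForumMagnum GraphQL schema and may have drifted, so check them against the live schema first.

```typescript
// Hedged sketch: pull recent posts from the Forum's GraphQL endpoint and
// compare average word counts. Field names may not match the live schema.
const QUERY = `
  query RecentPosts {
    posts(input: { terms: { view: "new", limit: 500 } }) {
      results { baseScore wordCount }
    }
  }
`;

async function averageWordCounts(): Promise<void> {
  const res = await fetch("https://forum.effectivealtruism.org/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: QUERY }),
  });
  const { data } = await res.json();
  const posts: { baseScore: number; wordCount: number }[] = data.posts.results;

  const mean = (xs: number[]) =>
    xs.reduce((a, b) => a + b, 0) / Math.max(xs.length, 1);
  console.log("All posts:", mean(posts.map((p) => p.wordCount)));
  console.log(
    "125+ karma:",
    mean(posts.filter((p) => p.baseScore >= 125).map((p) => p.wordCount))
  );
}

averageWordCounts();
```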


Comments

I expect high karma to cause a post to get read more, if only because of readers' fear of missing out.

I would have phrased this claim a bit more confidently, as there are systems in place that basically ensure this will be the case, at least on average. For example, higher-karma posts stay on the front page longer and are more likely to be selected for curation in the EA Forum Newsletter, the EA Newsletter, and other off-site amplification channels.

Spelling some things out:

Summaries are relevant to epistemics because they make it 10x easier for others to 1) see all your claims, and 2) inspect your logic. This is even better if you post your summary in the form of a philosophy-style argument, as I did here.

The argument:

  1. EA safety is small, even relative to a single academic subfield.
  2. There is overlap between capabilities and short-term safety work.
  3. There is overlap between short-term safety work and long-term safety work.
  4. So AI safety is less neglected than the opening quotes imply.
  5. Also, on present trends, there’s a good chance that academia [EDIT: and industry] will do more safety over time, eventually dwarfing the contribution of EA.

Having a summary as an opener also means that people who swing past your post in 30 seconds (i.e., the majority of your audience) will nevertheless get the gist.

(This is a marginal comment that doesn't need a reply; it is running into gadfly territory.)

Obviously, writing a good summary is costly, and the proposed philosophical, axiom-like writing is harder still.

One guess at what you're getting at: somehow making the rewarding of such summaries a strong norm would improve the Forum, filter out bad thinking, and the very act of such writing would improve discourse and even thought, which is sort of the point of philosophy in the first place.

I guess one issue is that I am skeptical this could be inculcated easily, or at all.

Your response to my other comment ("my shorter UI suggestions (epistemic status and summary) could be required by the server. Behaviour change solved") doesn't address the quality issue, so I guess everything in that comment, especially fluency and the seating of norms, applies again.

I mean, one way to see this whole issue is basically EA's worry about "Eternal September", right? It seems unlikely there would be such a meme/fear if you could just tell people to get good.

Another issue is that this runs into the teeth of the major, needed effort to reduce friction around posting; it seems to push in the opposite direction.

It's very easy to write code that relaxes these constraints for new users, which should serve the friction reduction goal, if that's a goal we should have. 

I have no illusions about the easiness of norm setting, hence code first. This post is a nudge in the direction I want; this is all I wish to do at the mo. 

Good summaries are very hard, but a bad summary is better than no summary. These small changes do not need to solve the whole problem to be worthwhile.

Nice! Related to summary and limitation boxes in the editor, maybe the Forum could offer post templates for different kinds of posts. For example, a template for project proposals could involve a TLDR, summary, theory of change, cost-benefit estimate, next steps, key uncertainties, and requests for feedback/co-founders. Other template candidates might be cause area explorations, criticism of EA, question posts, book reviews, and project reviews.

Edit: An MVP version of this might be suggesting a "role model" post for each content category.
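
One way this could stay cheap to maintain is templates-as-data, so adding a new post type is one new entry rather than new UI code. A hypothetical sketch; every name here is invented:

```typescript
// Hypothetical sketch: post templates as plain data. Invented names.
type PostTemplate = { name: string; sections: string[] };

const TEMPLATES: PostTemplate[] = [
  {
    name: "Project proposal",
    sections: [
      "TLDR",
      "Summary",
      "Theory of change",
      "Cost-benefit estimate",
      "Next steps",
      "Key uncertainties",
      "Requests for feedback/co-founders",
    ],
  },
  {
    name: "Book review",
    sections: ["Summary", "Key claims", "What it updated for me", "Limitations"],
  },
];

// The editor could pre-fill these as headings when the author picks a template.
```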

In my view, forum members are on average too inclined to upvote posts and comments that aren't well-argued because they agree with the conclusion. 

A big change, which I'll write up separately, is something like paid, open peer review for posts which seek to seriously update the community. Likely bringing in outsiders. This will be expensive, but it passes a shallow cost-benefit test, to me.

Do you mean pre-publication or post-publication peer review? If the latter, I suppose the author need not be involved (and need not pay for it). As such, it should not in itself increase the effort it takes to post something. (Though people may spend more effort on the post if they know they'll be scrutinised more; in my view that's not necessarily a bad thing.)

Post! 

I edited the post to make this clearer.

LessWrong has been A/B testing a voting system for "agree/disagree", separate from karma. I would suggest contacting the LW team to learn 1) the results from their experiments and 2) how easy it would be to copy the feature over to the EAF (since the codebases used to be the same).

I saw one of the experiments; it was really confusing.

Speculative: NLP for claim detection: the site asks you for your probabilities about the main claims. Time cost: 30 mins.

You think it'd take only 30 minutes to implement a feature that detects claims in forum posts? I'm not a web developer but that strikes me as wildly optimistic.

I think all time costs stated are time costs to the author of the post.

From a product and ML implementation perspective, and for the NLP component of the problem, I think it might be easy in this case to build an 80%-good solution.

It's less that the system will find and understand all arguments, and more that the author might be asked questions; it's relatively easy to check whether the answers cover the same space as the post content.

My guess is that manipulation won't make sense or even enter into people's minds (given other design choices not related to the NLP), so a useful system that, say, provides guide rails is much easier to implement.
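
To spell out how cheap the "cover the same space" check could be: a hypothetical sketch using bag-of-words cosine similarity. A real system would use embeddings, but the shape is the same.

```typescript
// Hypothetical sketch: does the author's answer cover the post's content?
// Bag-of-words cosine similarity as a stand-in for embeddings.
function bagOfWords(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const word of text.toLowerCase().match(/[a-z']+/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

function cosineSimilarity(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (m: Map<string, number>) =>
    Math.sqrt([...m.values()].reduce((sum, c) => sum + c * c, 0));
  return dot / (norm(a) * norm(b) || 1);
}

const postBody = "AI safety is less neglected than the opening quotes imply.";
const authorAnswer = "I claim safety is less neglected; I'd say 70%.";
// Low similarity would suggest the answers miss part of the post.
console.log(cosineSimilarity(bagOfWords(postBody), bagOfWords(authorAnswer)));
```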

Ah, you're right, I misinterpreted it since the epistemic status suggestion said time per post and that one didn't.

Sinclair has been working on allowing authors to embed Manifold prediction markets inside a LessWrong/EA Forum post! See: https://github.com/ForumMagnum/ForumMagnum/pull/4907

So ideally, you could set up a prediction market for each of these things, e.g.:

  • "How many  epistemic corrections will the author issue in the next week?"
  • "Will this post win an EA Forum prize?"
  • "Will this post receive >50 karma?"
  • "Will a significant critique of this post receive >50 karma?"
  • "Will this post receive a citation in major news media?"

And then bet on these directly from within the Forum!
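
For a sense of what the embed in that PR amounts to: Manifold serves markets on an embed route, so the Forum-side component can be little more than an iframe. The component and prop names below are made up; check the PR and Manifold's current docs for the real interface.

```tsx
// Hypothetical sketch of a market embed, in the spirit of the linked PR.
// Component and prop names are invented; verify the embed URL pattern
// against Manifold's current docs.
import React from "react";

type ManifoldEmbedProps = { creator: string; slug: string };

export function ManifoldEmbed({ creator, slug }: ManifoldEmbedProps) {
  return (
    <iframe
      src={`https://manifold.markets/embed/${creator}/${slug}`}
      title={`Manifold market: ${slug}`}
      style={{ width: "100%", height: 400, border: "none" }}
    />
  );
}

// Usage inside a post:
// <ManifoldEmbed creator="Austin" slug="will-this-post-receive-50-karma" />
```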

The main benefit of prediction markets in posts is not betting on the performance of particular posts, but betting on the claims in the post. I see it more like:

Epistemic status:
- I think 60% to 70% chance of X (and click to bet over/under)
- Y odds-ratio in favor of X (and link to my bet on existing market for X)
- I'm not betting on this, but

Post Summary -> testable prediction in market title
Epistemic comments vs Vibe comments -> comments with bets vs without
Epistemic likes vs Vibe likes -> market movement vs karma
Paid peer review for X -> market subsidy

As mentioned in the post, it seems possible that getting a lot of people to do a lot of work (and, by the way, fighting/undoing a lot of unseen/unspoken optimisations that already exist) could be impractical.

I think solving this requires "fluency" in instigation, and this is one of two limiting factors. Many solutions don't make it past this step.

The other limiting factor is that the seating of norms, and their consequent effects, seems really hard.

For example, some of the proposed features would produce artifacts that would bounce off talented newcomers.

I know someone who worked on social media platforms and design, who might have interesting proposals that probably solve the above (or even solve too much). I guess one reluctance is that this would pull attention from EA work, but with the growing sentiment about this issue, that seems less important.

I might have been unclear: my shorter UI suggestions (epistemic status and summary) could be required by the server. Behaviour change solved.
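
A minimal sketch of what that server-side requirement could look like, with newcomers exempted to protect the friction-reduction goal. Field names and thresholds below are invented:

```typescript
// Hypothetical sketch: server-side validation for the required fields,
// relaxed for new users. All names and thresholds invented.
type Draft = {
  body: string;
  summary?: string;
  epistemicStatus?: string;
  authorPostCount: number;
};

function validatePost(draft: Draft): string[] {
  // Exempt newcomers entirely, so the requirements don't deter first posts.
  if (draft.authorPostCount < 3) return [];

  const errors: string[] = [];
  const wordCount = draft.body.split(/\s+/).filter(Boolean).length;
  if (wordCount > 1000 && !draft.summary) {
    errors.push("Posts over 1000 words need a summary.");
  }
  if (!draft.epistemicStatus) {
    errors.push("Please add an epistemic status (one line is fine).");
  }
  return errors;
}
```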
