Posts tagged community

Quick takes

Marcus Daniell appreciation note @Marcus Daniell, cofounder of High Impact Athletes, came back from knee surgery and is donating half of his prize money this year. He projects raising $100,000. Through a partnership with Momentum, people can pledge to donate for each point he gets; he has raised $28,000 through this so far. It's cool to see this, and I'm wishing him luck for his final year of professional play!
38
harfe
1d
10
FHI shut down yesterday: https://www.futureofhumanityinstitute.org/
An alternate stance on moderation (from @Habryka.)

This is from this comment responding to this post about there being too many bans on LessWrong. Note how LessWrong is less moderated than here in that it (I guess) responds to individual posts less often, but more moderated in that I guess it rate limits people more without reason.

I found it thought-provoking. I'd recommend reading it.

> Thanks for making this post!
>
> One of the reasons why I like rate-limits instead of bans is that it allows people to complain about the rate-limiting and to participate in discussion on their own posts (so seeing a harsh rate-limit of something like "1 comment per 3 days" is not equivalent to a general ban from LessWrong, but should be more interpreted as "please comment primarily on your own posts", though of course it shares many important properties of a ban).

This is a pretty opposite approach to the EA forum which favours bans.

> Things that seem most important to bring up in terms of moderation philosophy:
>
> Moderation on LessWrong does not depend on effort
>
> "Another thing I've noticed is that almost all the users are trying. They are trying to use rationality, trying to understand what's been written here, trying to apply Bayes' rule or understand AI. Even some of the users with negative karma are trying, just having more difficulty."
>
> Just because someone is genuinely trying to contribute to LessWrong, does not mean LessWrong is a good place for them. LessWrong has a particular culture, with particular standards and particular interests, and I think many people, even if they are genuinely trying, don't fit well within that culture and those standards.
>
> In making rate-limiting decisions like this I don't pay much attention to whether the user in question is "genuinely trying" to contribute to LW, I am mostly just evaluating the effects I see their actions having on the quality of the discussions happening on the site, and the quality of the ideas they are contributing.
>
> Motivation and goals are of course a relevant component to model, but that mostly pushes in the opposite direction, in that if I have someone who seems to be making great contributions, and I learn they aren't even trying, then that makes me more excited, since there is upside if they do become more motivated in the future.

I sense this is quite different to the EA forum too. I can't imagine a mod saying I don't pay much attention to whether the user in question is "genuinely trying". I find this honesty pretty stark. Feels like a thing moderators aren't allowed to say. "We don't like the quality of your comments and we don't think you can improve".

> Signal to Noise ratio is important
>
> Thomas and Elizabeth pointed this out already, but just because someone's comments don't seem actively bad, doesn't mean I don't want to limit their ability to contribute. We do a lot of things on LW to improve the signal to noise ratio of content on the site, and one of those things is to reduce the amount of noise, even if the mean of what we remove looks not actively harmful.
>
> We of course also do other things than to remove some of the lower signal content to improve the signal to noise ratio. Voting does a lot, how we sort the frontpage does a lot, subscriptions and notification systems do a lot. But rate-limiting is also a tool I use for the same purpose.
>
> Old users are owed explanations, new users are (mostly) not
>
> I think if you've been around for a while on LessWrong, and I decide to rate-limit you, then I think it makes sense for me to make some time to argue with you about that, and give you the opportunity to convince me that I am wrong. But if you are new, and haven't invested a lot in the site, then I think I owe you relatively little.
>
> I think in doing the above rate-limits, we did not do enough to give established users the affordance to push back and argue with us about them. I do think most of these users are relatively recent or are users we've been very straightforward with since shortly after they started commenting that we don't think they are breaking even on their contributions to the site (like the OP Gerald Monroe, with whom we had 3 separate conversations over the past few months), and for those I don't think we owe them much of an explanation. LessWrong is a walled garden.
>
> You do not by default have the right to be here, and I don't want to, and cannot, accept the burden of explaining to everyone who wants to be here but who I don't want here, why I am making my decisions. As such a moderation principle that we've been aspiring to for quite a while is to let new users know as early as possible if we think them being on the site is unlikely to work out, so that if you have been around for a while you can feel stable, and also so that you don't invest in something that will end up being taken away from you.
>
> Feedback helps a bit, especially if you are young, but usually doesn't
>
> Maybe there are other people who are much better at giving feedback and helping people grow as commenters, but my personal experience is that giving users feedback, especially the second or third time, rarely tends to substantially improve things.
>
> I think this sucks. I would much rather be in a world where the usual reasons why I think someone isn't positively contributing to LessWrong were of the type that a short conversation could clear up and fix, but it alas does not appear so, and after having spent many hundreds of hours over the years giving people individualized feedback, I don't really think "give people specific and detailed feedback" is a viable moderation strategy, at least more than once or twice per user. I recognize that this can feel unfair on the receiving end, and I also feel sad about it.
>
> I do think the one exception here is that if people are young or are non-native English speakers. Do let me know if you are in your teens or you are a non-native English speaker who is still learning the language. People do really get a lot better at communication between the ages of 14-22 and people's English does get substantially better over time, and this helps with all kinds of communication issues.

Again this is very blunt but I'm not sure it's wrong.

> We consider legibility, but it's only a relatively small input into our moderation decisions
>
> It is valuable and a precious public good to make it easy to know which actions you take will cause you to end up being removed from a space. However, that legibility also comes at great cost, especially in social contexts. Every clear and bright-line rule you outline will have people butting right up against it, and de-facto, in my experience, moderation of social spaces like LessWrong is not the kind of thing you can do while being legible in the way that for example modern courts aim to be legible.
>
> As such, we don't have laws. If anything we have something like case-law which gets established as individual moderation disputes arise, which we then use as guidelines for future decisions, but also a huge fraction of our moderation decisions are downstream of complicated models we formed about what kind of conversations and interactions work on LessWrong, and what role we want LessWrong to play in the broader world, and those shift and change as new evidence comes in and the world changes.
>
> I do ultimately still try pretty hard to give people guidelines and to draw lines that help people feel secure in their relationship to LessWrong, and I care a lot about this, but at the end of the day I will still make many from-the-outside-arbitrary-seeming-decisions in order to keep LessWrong the precious walled garden that it is.
>
> I try really hard to not build an ideological echo chamber
>
> When making moderation decisions, it's always at the top of my mind whether I am tempted to make a decision one way or another because they disagree with me on some object-level issue. I try pretty hard to not have that affect my decisions, and as a result have what feels to me a subjectively substantially higher standard for rate-limiting or banning people who disagree with me, than for people who agree with me. I think this is reflected in the decisions above.
>
> I do feel comfortable judging people on the methodologies and abstract principles that they seem to use to arrive at their conclusions. LessWrong has a specific epistemology, and I care about protecting that. If you are primarily trying to...
>
> * argue from authority,
> * don't like speaking in probabilistic terms,
> * aren't comfortable holding multiple conflicting models in your head at the same time,
> * or are averse to breaking things down into mechanistic and reductionist terms,
>
> then LW is probably not for you, and I feel fine with that. I feel comfortable reducing the visibility or volume of content on the site that is in conflict with these epistemological principles (of course this list isn't exhaustive, in-general the LW sequences are the best pointer towards the epistemological foundations of the site).

It feels cringe to read that basically if I don't get the sequences LessWrong might rate limit me. But it is good to be open about it. I don't think the EA forum's core philosophy is as easily expressed.

> If you see me or other LW moderators fail to judge people on epistemological principles but instead see us directly rate-limiting or banning users on the basis of object-level opinions that even if they seem wrong seem to have been arrived at via relatively sane principles, then I do really think you should complain and push back at us. I see my mandate as head of LW to only extend towards enforcing what seems to me the shared epistemological foundation of LW, and to not have the mandate to enforce my own object-level beliefs on the participants of this site.
>
> Now some more comments on the object-level:
>
> I overall feel good about rate-limiting everyone on the above list. I think it will probably make the conversations on the site go better and make more people contribute to the site.
>
> Us doing more extensive rate-limiting is an experiment, and we will see how it goes. As kave said in the other response to this post, the rule that suggested these specific rate-limits does not seem like it has an amazing track record, though I currently endorse it as something that calls things to my attention (among many other heuristics).
>
> Also, if anyone reading this is worried about being rate-limited or banned in the future, feel free to reach out to me or other moderators on Intercom. I am generally happy to give people direct and frank feedback about their contributions to the site, as well as how likely I am to take future moderator actions. Uncertainty is costly, and I think it's worth a lot of my time to help people understand to what degree investing in LessWrong makes sense for them.
Why are April Fools jokes still on the front page? On April 1st, you expect to see April Fools' posts and know you have to be extra cautious when reading strange things online. However, April 1st was 13 days ago and there are still two posts that are April Fools posts on the front page. I think it should be clarified that they are April Fools jokes so people can differentiate EA weird stuff from EA weird stuff that's a joke more easily. Sure, if you check the details you'll see that things don't add up, but we all know most people just read the title or first few paragraphs.
I am not confident that another FTX level crisis is less likely to happen, other than that we might all say "oh this feels a bit like FTX".

Changes:

  • Board swaps. Yeah maybe good, though many of the people who left were very experienced. And it's not clear whether there are due diligence people (which seems to be what was missing).
  • Orgs being spun out of EV and EV being shuttered. I mean, maybe good though feels like it's swung too far. Many mature orgs should run on their own, but small orgs do have many replicable features.
  • More talking about honesty. Not really sure this was the problem. The issue wasn't the median EA it was in the tails. Are the tails of EA more honest? Hard to say
  • We have now had a big crisis so it's less costly to say "this might be like that big crisis". Though notably this might also be too cheap - we could flinch away from doing ambitious things
  • Large orgs seem slightly more beholden to comms/legal to avoid saying or doing the wrong thing.
  • OpenPhil is hiring more internally

Non-changes:

  • Still very centralised. I'm pretty pro-elite, so I'm not sure this is a problem in and of itself, though I have come to think that elites in general are less competent than I thought before (see FTX and OpenAI crisis)
  • Little discussion of why or how the affiliation with SBF happened despite many well connected EAs having a low opinion of him
  • Little discussion of what led us to ignore the base rate of scamminess in crypto and how we'll avoid that in future

Popular comments

Recent discussion

Self-evaluation using LLMs is used in reward modeling, model-based benchmarks like GPTScore and AlpacaEval, self-refinement, and constitutional AI. LLMs have been shown to be accurate at approximating human annotators on some tasks.

But these methods are threatened by self...

Executive summary: Frontier language models exhibit self-preference when evaluating text outputs, favoring their own generations over those from other models or humans, and this bias appears to be causally linked to their ability to recognize their own outputs.

Key points:

  1. Self-evaluation using language models is used in various AI alignment techniques but is threatened by self-preference bias.
  2. Experiments show that frontier language models exhibit both self-preference and self-recognition ability when evaluating text summaries.
  3. Fine-tuning language models to
... (read more)
2
Hauke Hillebrandt
4h
Cool instance of black box evaluation - seems like a relatively simple study technically but really informative. Do you have more ideas for future research along those lines you'd like to see?
3
jimrandomh
14h
Interesting. I think I can tell an intuitive story for why this would be the case, but I'm unsure whether that intuitive story would predict all the details of which models recognize and prefer which other models. As an intuition pump, consider asking an LLM a subjective multiple-choice question, then taking that answer and asking a second LLM to evaluate it. The evaluation task implicitly asks the evaluator to answer the same question, then cross-check the results. If the two LLMs are instances of the same model, their answers will be more strongly correlated than if they're different models; so they're more likely to mark the answer correct if they're the same model. This would also happen if you substitute two humans or two sittings of the same human in place of the LLMs.
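
The intuition pump above can be made concrete with a small sketch. It assumes a deliberately simplified setup that is not part of the comment or the study: each model answers a subjective three-option question by sampling from a fixed distribution, and an evaluator marks an answer "correct" exactly when it matches the evaluator's own independently sampled answer. The distributions `model_a` and `model_b` and the helper `agreement` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the answer distributions below are made-up
# numbers, not measurements from any real model.

def agreement(p, q):
    """Chance that one answer drawn from p and one drawn from q coincide."""
    return sum(pi * qi for pi, qi in zip(p, q))

# Hypothetical answer distributions over three options (A, B, C).
model_a = [0.7, 0.2, 0.1]
model_b = [0.2, 0.7, 0.1]

same_a = agreement(model_a, model_a)  # model A grading its own answers
same_b = agreement(model_b, model_b)  # model B grading its own answers
cross = agreement(model_a, model_b)   # one model grading the other

print(f"A grading A: {same_a:.2f}")
print(f"B grading B: {same_b:.2f}")
print(f"A grading B: {cross:.2f}")
```

Under these assumptions the same-model agreement rate is higher than the cross-model rate. More generally, averaged over the two models, self-agreement cannot fall below cross-agreement, because the sum of squared differences between the two distributions is non-negative. A single model can still agree more with a different, more confident model than with itself, so this toy setup supports the intuition only on average.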

Excerpt from Impact Report 2022-2023

"Creating a better future, for animals and humans

£242,510 of research funded

3 Home Office meetings attended

5 PhDs completed through the FRAME Lab

33 people attended our Training School in Norway and our experimental design training...

Executive summary: FRAME (Fund for the Replacement of Animals in Medical Experiments) is an impactful animal welfare charity working to end the use of animals in biomedical research and testing by funding research into non-animal methods, educating scientists, and advocating for policy changes.

Key points:

  1. In 2022, FRAME funded £242,510 of research into non-animal methods, supported 5 PhD students, and trained 33 people in experimental design.
  2. The FRAME Lab at the University of Nottingham focuses on developing and validating non-animal approaches in areas lik
... (read more)

Anders Sandberg has written a “final report” released simultaneously with the announcement of FHI’s closure. The abstract and an excerpt follow.


Normally manifestos are written first, and then hopefully stimulate actors to implement their vision. This document is the reverse

...

Executive summary: The Future of Humanity Institute (FHI) achieved notable successes in its mission from 2005-2024 through long-term research perspectives, interdisciplinary work, and adaptable operations, though challenges included university politics, communication gaps, and scaling issues.

Key points:

  1. Long-term research perspectives and pre-paradigmatic topics were key to FHI's impact, enabled by stable funding.
  2. An interdisciplinary and diverse team was valuable for tackling neglected research areas.
  3. Operations staff needed to understand the mission as it g
... (read more)
11
MathiasKB
5h
I'm awestruck, that is an incredible track record. Thanks for taking the time to write this out. These are concepts and ideas I regularly use throughout my week and which have significantly shaped my thinking. A deep thanks to everyone who has contributed to FHI, your work certainly had an influence on me.
4
Chris Leong
12h
For anyone wondering about the definition of macrostrategy, the EA forum defines it as follows:

Summary

  1. Many views, including even some person-affecting views, endorse the repugnant conclusion (and very repugnant conclusion) when set up as a choice between three options, with a benign addition option.
  2. Many consequentialist(-ish) views, including many person-affecting
...
1
Kaspar Brandner
12h
I wouldn't agree on the first point, because making Dasgupta's step 1 the "step 1" is, as far as I can tell, not justified by any basic principles. Ruling out Z first seems more plausible, as Z negatively affects the present people, even quite strongly so compared to A and A+. Ruling out A+ is only motivated by an arbitrary-seeming decision to compare just A+ and Z first, merely because they have the same population size (...so what?). The fact that non-existence is not involved here (a comparison to A) is just a result of that decision, not of there really existing just two options. Alternatively there is the regret argument, that we would "realize", after choosing A+, that we made a mistake, but that intuition seems not based on some strong principle either. (The intuition could also be misleading because we perhaps don't tend to imagine A+ as locked in). I agree though that the classification "person-affecting" alone probably doesn't capture a lot of potential intricacies of various proposals.
2
MichaelStJules
8h
We should separate whether the view is well-motivated from whether it's compatible with "ethics being about affecting persons". It's based only on comparisons between counterparts, never between existence and nonexistence. That seems compatible with "ethics being about affecting persons".

We should also separate plausibility from whether it would follow on stricter interpretations of "ethics being about affecting persons". An even stricter interpretation would also tell us to give less weight to or ignore nonidentity differences using essentially the same arguments you make for A+ over Z, so I think your arguments prove too much. For example,

  1. Alice with welfare level 10 and 1 million people with welfare level 1 each
  2. Alice with welfare level 4 and 1 million different people with welfare level 4 each

You said "Ruling out Z first seems more plausible, as Z negatively affects the present people, even quite strongly so compared to A and A+." The same argument would support 1 over 2.

Then you said "Ruling out A+ is only motivated by an arbitrary-seeming decision to compare just A+ and Z first, merely because they have the same population size (...so what?)." Similarly, I could say "Picking 2 is only motivated by an arbitrary decision to compare contingent people, merely because there's a minimum number of contingent people across outcomes (... so what?)" So, similar arguments support narrow person-affecting views over wide ones.

I think ignoring irrelevant alternatives has some independent appeal. Dasgupta's view does that at step 1, but not at step 2. So, it doesn't always ignore them, but it ignores them more than necessitarianism does.

I can further motivate Dasgupta's view, or something similar:

  1. There are some "more objective" facts about axiology or what we should do that don't depend on who presently, actually or across all outcomes necessarily exists (or even wide versions of this). What we should do is first constrained by these "more object

You said "Ruling out Z first seems more plausible, as Z negatively affects the present people, even quite strongly so compared to A and A+." The same argument would support 1 over 2.

Granted, but this example presents just a binary choice, with none of the added complexity of choosing between three options, so we can't infer much from it.

Then you said "Ruling out A+ is only motivated by an arbitrary-seeming decision to compare just A+ and Z first, merely because they have the same population size (...so what?)." Similarly, I could say "Picking 2 is onl

... (read more)

It seems plausible to me that those involved in Nonlinear have received more social sanction than those involved in FTX, even though the latter was obviously more harmful to this community and the world.

I think jail time counts as social sanction!

Nathan Young posted a Quick Take 4h ago

An alternate stance on moderation (from @Habryka.)

Jason commented on Nathan Young's quick take 10h ago

I am not confident that another FTX level crisis is less likely to happen, other than that we might all say "oh this feels a bit like FTX".

Changes:

  • Board swaps. Yeah maybe good, though many of the people who left were very experienced. And it's not clear whether there are
...

For both of these comments, I want a more explicit sense of what the alternative was.

Not a complete answer, but I would have expected communication and advice for FTXFF grantees to have been different. From many well connected EAs having a low opinion of him, we can imagine that grantees might have been urged to properly set up corporations, not count their chickens before they hatched, properly document everything and assume a lower-trust environment more generally, etc. From not ignoring the base rate of scamminess in crypto, you'd expect to have seen stronger and more developed contingency planning (remembering that crypto firms can and do collapse in the wake of scams not of their own doing!), more decisions to build more organizational reserves rather than immediately ramping up spending, etc.

I started working in cooperative AI almost a year ago, and as an emerging field I found it quite confusing at times since there is very little introductory material aimed at beginners. My hope with this post is that by summing up my own confusions and how I understand them...

Thank you Shaun!

I found myself wondering where we would fit AI Law / AI Policy into that model.

I would think policy work might be spread out over the landscape? As an example, if we think of policy work aiming to establish the use of certain evaluations of systems, such evaluations could target different kinds of risk/qualities that would map to different parts of the diagram?

This is a Book Review & Summary of The Art of Gathering: How We Meet and Why It Matters by Priya Parker. 

Rating: 4/5

I've pulled the main insights and actionable recommendations from each chapter, so someone can orient themselves to the main upshots of the ...

1
Tristan Williams
17h
Would love to hear more about what you didn't like, but the other piece sounds like it's worth checking out, I'll try to give it a read soon! 

Ha yes that would have been helpful of me, I agree! Unfortunately, I can't remember much, it was a couple of years ago. I remember experiencing a significant vibes mismatch in the section on excluding people (but maybe I was just being close-minded) and frustration with its wordiness. 
 

1
Tristan Williams
17h
Ah I searched for a post a while back but didn't find anything, might have been because I searched for the full book title, not sure, but thanks for flagging. 

People around me are very interested in AI taking over the world, so a big question is under what circumstances a system might be able to do that—what kind of capabilities could elevate an entity above the melange of inter-agent conflict and into solipsistic hegemony?

We...

Nice post!

We might then expect a lot of powerful attempts to change prevailing ‘human’ values, prior to the level of AI capabilities where we might have worried a lot about AI taking over the world. If we care about our values, this could be very bad. 

This seems like a key point to me, that it is hard to get good evidence on. The red stripes are rather benign, so we are in luck in a world like that. But if the AI values something in a more totalising way (not just satisficing with a lot of x's and red stripes being enough, but striving to make all hum... (read more)