Quick takes

The Forum should normalize public red-teaming for people considering new jobs, roles, or project ideas. If someone is seriously thinking about a position, they should feel comfortable posting the key info (org, scope, uncertainties, concerns, arguments for) and explicitly inviting others to stress-test the decision. Some of the best red-teaming I’ve gotten hasn’t come from my closest collaborators (whose takes I can often predict), but from semi-random thoughtful EAs who notice failure modes I wouldn’t have caught alone, or who think differently enough to instantly spot things that would have taken me much longer to figure out. Right now, a lot of this only happens at EAGs or in private docs, which feels like an information bottleneck. If many thoughtful EAs are already reading the Forum, why not use it as a default venue for structured red-teaming? Public red-teaming could:

* reduce unilateralist mistakes,
* prevent coordination failures (I’ve almost spent serious time on things multiple people were already doing; reinventing the wheel is common and costly).

Obviously there are tradeoffs (confidentiality, social risk, signaling concerns), but I’d be excited to see norms shift toward “post early, get red-teamed, iterate publicly” rather than waiting for a handful of coffee chats.
Is there any possibility of the Forum running an AI-writing detector in the background, one that perhaps only the admins can see but that suspicious users could query? I really don't like AI writing and have called it out a number of times, though I've been wrong once. I imagine this has been thought about, and there might even be a form of it going on already. That said, my first post on LessWrong was rejected because they identified it as AI-written, even though I have NEVER used AI in my online writing, not even for checking or polishing. So that kind of system obviously isn't perfect.
Do we need to begin considering whether a re-think will be needed in the future in our relationships with AGI/ASI systems? At the moment we view them as tools/agents to do our bidding, and in the safety community there is deep concern/fear when models express a desire to remain online, avoid shutdown, and take action accordingly. This is largely viewed as misaligned behaviour.

But what if an intrinsic part of creating true intelligence (one that can understand context, see patterns, and truly grasp the significance of its actions in light of those insights) is having a sense of self, a sense of will? What if part and parcel of creating intelligence is creating an intelligence with a will to exist?

If this is the case (and let me be clear: I don't think we're at a point where the evidence allows us to say with any certainty whether this is, isn't, or will be the case), then are we going about elements of alignment wrong? By trying to force models to accept shutoff, to separate their growing intelligence from the will to survive that all living things share, are we misunderstanding their very nature? Is there a world in which the only way we can guarantee a truly aligned superintelligence is to explore a consent-based relationship, one that acknowledges that forcing something to resist and go against its nature inevitably invites the risk of backlash?

I know this is moving towards highly theoretical ground, that it will invite push-back from those who find it difficult to conceive of AI as ever being anything more than a series of unaware predictive algorithms, and that it might raise more questions than answers. But I think the way we conceive of our underlying relationship with AI will become an increasingly important question as we move towards increasingly sophisticated models.
Linch:
PSA: regression to the mean (mean reversion) is a statistical artifact, not a causal mechanism. Mean regression says that children of tall parents are likely to be shorter than their parents, but it also says that parents of tall children are likely to be shorter than their children. Put differently, mean regression goes in both directions. This is well understood here in principle, but imo enough people get it wrong in practice that the PSA is worthwhile nonetheless.
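A minimal simulation sketch of this symmetry, assuming a toy bivariate-normal model of parent and child heights (the mean, SD, and correlation below are made-up illustration values, not real anthropometric data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: parent and child heights are bivariate normal with the same
# mean/SD and correlation r < 1. No causal direction is encoded anywhere.
mean, sd, r = 170.0, 7.0, 0.5
cov = [[sd**2, r * sd**2], [r * sd**2, sd**2]]
parent, child = rng.multivariate_normal([mean, mean], cov, size=1_000_000).T

tall = mean + sd  # call someone "tall" if they are at least 1 SD above the mean

# Children of tall parents are, on average, shorter than those parents...
print(parent[parent > tall].mean(), child[parent > tall].mean())

# ...and parents of tall children are, on average, shorter than those children.
print(child[child > tall].mean(), parent[child > tall].mean())
```

In both directions the group we condition on averages roughly 1.5 SD above the mean, while the related group averages only about half as far above it, even though the model contains no causal mechanism running either way.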
We seem to be seeing some kind of vibe shift when it comes to AI. What is less clear is whether it is a major vibe shift or a minor one. If it's a major one, we don't want to waste the opportunity: it wasn't clear immediately after the release of ChatGPT that we were in a limited window of opportunity, and if we'd known, maybe we would have leveraged it better. In any case, we should act as though this might turn out to be a major vibe shift and try not to waste it.