Quick takes
Lizka · 2d
Benjamin Lay — "Quaker Comet", early (radical) abolitionist, general "moral weirdo" — died on this day 267 years ago. I shared a post about him a little while back, and still think of February 8 as "Benjamin Lay Day". ...

Around the same time I also made two paintings inspired by his life and work, which I figured I'd share now. One is an icon-style-inspired image based on a portrait of him.[1] The second is based on a print depicting the floor plan of an infamous slave ship (the Brooks). The print was used by abolitionists (mainly(?) the Society for Effecting the Abolition of the Slave Trade) to help communicate the horror of the trade.

I found it useful to paint it (and appreciate having it around today). But I imagine not everyone will want to see it, so I'll skip a few lines here in case you expanded this quick take and decide you want to scroll past or collapse it instead.

.
.
.

[1] The original (see post) was commissioned by Deborah Read as a gift for her husband Benjamin Franklin, who was also Benjamin Lay's friend.
Linch · 11h
Recent generations of Claude seem better at understanding blog posts and making fairly subtle judgment calls than most smart humans. These days, when I read an article that presumably sounds reasonable to most people but contains what seems to me a glaring conceptual mistake, I can put it into Claude, ask it to identify the mistake, and more likely than not Claude lands on the same mistake I identified. Before Opus 4 this was essentially impossible: Claude 3.x models could sometimes identify small errors, but whether they could find a central mistake was a crapshoot, and they certainly couldn't judge it well.

It's possible I'm wrong about the mistakes here and Claude is just being sycophantic, identifying which things I'd regard as the central mistake; but if that's true, in some ways it's even more impressive. Interestingly, both Gemini and ChatGPT failed at these tasks. (They can sometimes directionally approach the error I identified, but their formulation is imprecise and broad, and they only include it in a longer list of potential quibbles rather than zeroing in on the most damning issue.)

For clarity, here are 3 articles I recently asked Claude to reassess (Claude got the central error in 2 of the 3). I'm also a little curious what the LW baseline is here; I did not include my comments in my prompts to Claude.

https://terrancraft.com/2021/03/21/zvx-the-effects-of-scouting-pillars/
https://www.clearerthinking.org/post/what-can-a-single-data-point-teach-you
https://www.lesswrong.com/posts/vZcXAc6txvJDanQ4F/the-median-researcher-problem-1
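For anyone who wants to try this themselves, here is a minimal sketch of the workflow described above, assuming the Anthropic Python SDK (`pip install anthropic`). The model name, file path, and prompt wording are illustrative assumptions, not what was actually used:

```python
# Minimal sketch: paste an article into Claude and ask for its central mistake.
# Assumes ANTHROPIC_API_KEY is set in the environment.
import anthropic

client = anthropic.Anthropic()

# Hypothetical local copy of the article being reassessed.
article_text = open("article.txt").read()

response = client.messages.create(
    model="claude-opus-4-20250514",  # assumption: any recent Opus-class model
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": (
            "Read the following article and identify its single most "
            "central conceptual mistake, if any:\n\n" + article_text
        ),
    }],
)
print(response.content[0].text)
```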
Linch · 2d
I like Scott's Mistake Theory vs Conflict Theory framing, but I don't think it's a complete model of disagreements about policy, nor do I think complete models of disagreement will look like more advanced versions of Mistake Theory + Conflict Theory.

To recap, here are my short summaries of the two theories:

Mistake Theory: I disagree with you because one or both of us are wrong (about what we want, or about how to achieve what we want).

Conflict Theory: I disagree with you because ultimately I want different things than you do. The Marxists, whom Scott was originally arguing against, will natively see this as being about individual or class material interests, but this can be smoothly updated to include values and ideological conflict as well.

I polled several people about alternative models of political disagreement at the same level of abstraction as Conflict vs Mistake, and people usually got to "some combination of mistakes and conflicts." To that obvious model, I want to add two other theories (this list is incomplete). Consider Thomas Schelling's 1960 opening to The Strategy of Conflict. I claim that this "rudimentary/obvious idea," that the conflict and cooperative elements of many human disagreements are structurally inseparable, is central to a secret third thing distinct from Conflict vs Mistake. If you grok the "obvious idea," we can derive something like:

Negotiation Theory(?): I have my desires. You have yours. I sometimes want to cooperate with you, and sometimes not. I take actions maximally good for my goals and respect you well enough to assume that you will do the same; in practice, however, a "hot war" is unlikely to be in either of our best interests.

In the Negotiation Theory framing, disagreement/conflict arises from dividing the goods in non-zero-sum games. I think the economists' and game theorists' "standard models" of negotiation are natively closer to "conflict theory" than "mistake theory" (e.g., their models often assume rationality, which means the "ca…
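To make the "dividing the goods in non-zero-sum games" point concrete, here is a toy sketch of the textbook Nash bargaining model: two players split a surplus of 1, each with a fallback payoff if negotiation fails, and the solution maximizes the product of their gains over those fallbacks. This is a standard game-theory example, not anything from the quick take itself, and the numbers are made up:

```python
# Toy Nash bargaining: split a surplus of 1 between two players.
import numpy as np

d1, d2 = 0.2, 0.1  # disagreement points: payoffs each gets if talks fail

shares = np.linspace(0, 1, 10_001)           # player 1's share of the surplus
gains = (shares - d1) * ((1 - shares) - d2)  # the Nash product
# Rule out splits where either player does worse than their fallback.
gains[(shares < d1) | ((1 - shares) < d2)] = -np.inf

best = shares[np.argmax(gains)]
print(f"Player 1 gets {best:.3f}, player 2 gets {1 - best:.3f}")
# The solution splits the surplus *above the disagreement points* evenly:
# surplus over fallbacks is 1 - 0.2 - 0.1 = 0.7, so player 1 ends up
# with 0.2 + 0.35 = 0.55. Conflict enters only through the fallbacks.
```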
The AI Eval Singularity is Near

* AI capabilities seem to be doubling every 4-7 months.
* Humanity's ability to measure capabilities is growing much more slowly.
* This implies an "eval singularity": a point at which capabilities grow faster than our ability to measure them (a back-of-the-envelope sketch of this gap follows the appendix below).
* It seems like the singularity is ~here in cybersecurity, CBRN, and AI R&D (supporting quotes below).
* It's possible that this is temporary, but the people involved seem pretty worried.

Appendix - quotes on eval saturation

Opus 4.6

* "For AI R&D capabilities, we found that Claude Opus 4.6 has saturated most of our automated evaluations, meaning they no longer provide useful evidence for ruling out ASL-4 level autonomy. We report them for completeness, and we will likely discontinue them going forward. Our determination rests primarily on an internal survey of Anthropic staff, in which 0 of 16 participants believed the model could be made into a drop-in replacement for an entry-level researcher with scaffolding and tooling improvements within three months."
* "For ASL-4 evaluations [of CBRN], our automated benchmarks are now largely saturated and no longer provide meaningful signal for rule-out (though as stated above, this is not indicative of harm; it simply means we can no longer rule out certain capabilities that may be prerequisites to a model having ASL-4 capabilities)."
* It also saturated ~100% of the cyber evaluations.

Codex-5.3

* "We are treating this model as High [for cybersecurity], even though we cannot be certain that it actually has these capabilities, because it meets the requirements of each of our canary thresholds and we therefore cannot rule out the possibility that it is in fact Cyber High."
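As a back-of-the-envelope illustration of the first three bullets: if capabilities double every 4-7 months (from the post) while eval coverage grows at, say, ~30% per year (a guessed figure, not from the post), the gap between the two compounds quickly. All numbers here are illustrative assumptions:

```python
# Back-of-the-envelope version of the eval-singularity claim.
EVAL_GROWTH_PER_YEAR = 1.3  # assumption: evals improve ~30%/year

for doubling_months in (4, 7):
    # A doubling every d months is a factor of 2**(12/d) per year.
    cap_growth_per_year = 2 ** (12 / doubling_months)
    gap = cap_growth_per_year / EVAL_GROWTH_PER_YEAR
    print(
        f"doubling every {doubling_months} mo: "
        f"capability x{cap_growth_per_year:.1f}/yr vs "
        f"evals x{EVAL_GROWTH_PER_YEAR}/yr -> gap widens x{gap:.1f}/yr"
    )
```

Under these assumptions the measurement gap widens by roughly 2.5x to 6x per year, which is the sense in which capabilities outrun our ability to measure them.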
As a university organizer at a very STEM-focused state school, I suspect that students getting liberal arts degrees are more easily convinced to pursue a career in direct work. If so, it could be because direct work compares more favorably with the other career options available to liberal arts graduates, or because the clearer career outcomes of STEM majors create more path dependence and friction when they consider switching careers. This is potentially another thing to keep in mind when trying to compare the successes of EA uni groups.