

Quick takes

Naive projection about o4 and beyond

The Codeforces Elo progression from o1-mini to o3-mini was around 400 points (with compute costs held constant). Similarly, the Elo jumps from 4o (~800) to o1-preview (~1250) to o1-mini (~1650) were also each around 400 points (the compute costs of 4o appear similar to those of o1-mini, while they're higher for o1-preview).

People from OpenAI report that o4 is now being trained and that training runs take around three months in the current "reasoning paradigm". So if we were to engage in naive projection, we might project a continued ~400 point Codeforces progression every three months. Below is a naive such projection for the o1-mini cost range, with the dates referring to when model scores are announced (not when the models are released).

* March 2025 (March 14th?): o4 ~2400
* June 2025: o5 ~2800
* September 2025: o6 ~3200
* December 2025: o7 ~3600
  * If high compute adds around 700 Elo points for full o7 (as it does for o3), this would give full o7 a superhuman score of ~4300
* March 2026: o8 ~4000 (a score only ever achieved by two people)
* June 2026: o9 ~4400 (superhuman level for cheap)

Part of the motivation for making such a naive projection is that it can provide a salient yardstick to hold future progress up against, to notice whether progress on this benchmark is slowing down, keeping pace, or accelerating.

Additionally, as further motivation, one can note that there is some precedent for Elo scores improving linearly over time in other domains, e.g. in chess:

Likewise, while they're more subjective, Elo scores on the LLM leaderboard also appear to have increased fairly consistently by an average of ~20 points per month over the last year (the trend has continued beyond the graph below; the current top 10 average is at the ~1360 level one would have predicted based on a naive extrapolation of the post-2023-11 trendline below):
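For anyone who wants to hold future results up against this yardstick, the Codeforces projection in the quick take above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the starting point (o4 at ~2400 in March 2025), the +400 step every three months, and the ~700 point high-compute bonus are taken from the quick take, while the names `project`, `add_months`, and the constants are hypothetical and chosen just for illustration.

```python
from datetime import date

# Assumptions taken from the quick take; all names here are illustrative.
BASE_MODEL_NUMBER = 4          # o4
BASE_SCORE = 2400              # approximate Codeforces Elo projected for o4
BASE_DATE = date(2025, 3, 1)   # announcement month projected for o4
STEP_ELO = 400                 # projected gain per generation
STEP_MONTHS = 3                # projected time between generations
HIGH_COMPUTE_BONUS = 700       # rough extra Elo from high compute (as for o3)


def add_months(d: date, months: int) -> date:
    """Return the first day of the month `months` after `d`."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def project(generations: int) -> list[tuple[str, str, int]]:
    """Linearly extrapolate (model name, 'YYYY-MM', projected Elo) rows."""
    rows = []
    for i in range(generations):
        when = add_months(BASE_DATE, i * STEP_MONTHS)
        rows.append(
            (f"o{BASE_MODEL_NUMBER + i}", when.isoformat()[:7], BASE_SCORE + i * STEP_ELO)
        )
    return rows


for name, month, score in project(6):
    print(f"{month}: {name} ~{score} (high compute: ~{score + HIGH_COMPUTE_BONUS})")
```

Comparing announced scores against these rows is one simple way to see whether progress is slowing down, keeping pace, or accelerating relative to the naive trend.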
Update (January 28): Marco Rubio has now issued a temporary waiver for "humanitarian programs that provide life-saving medicine, medical services, food, shelter and subsistence assistance."[1]

PEPFAR's funding was recently paused as a result of the recent executive order on foreign aid.[2] (It was previously reauthorized until March 25, 2025.[3]) If not exempted, this would pause PEPFAR's work for three months, effective immediately.

Marco Rubio has issued waivers for some forms of aid, including emergency food aid, and has the authority to issue a similar waiver for PEPFAR, allowing it to resume work immediately.[4] Rubio has previously expressed (relatively generic) positive sentiments about PEPFAR on Twitter,[5] and I don't have specific reason to think he's opposed to PEPFAR, as opposed to simply not caring strongly enough to give it a waiver without anyone encouraging him to.

I think it is worth considering calling your representatives to suggest that they encourage Rubio to give PEPFAR a waiver, similarly to the waiver he provided to programs giving emergency food aid. I have a lot of uncertainty here (in particular, I'm not sure whether this is likely to persuade Rubio), but I think it is fairly unlikely to make things actively worse. I think the argument in favor of calling is likely stronger for people who are represented by Republicans in Congress; I expect Rubio would care much more about pressure from his own party than about pressure from the Democrats.

1. ^ https://apnews.com/article/trump-foreign-assistance-freeze-684ff394662986eb38e0c84d3e73350b
2. ^ My primary source for this quick take is Kelsey Piper's Twitter thread, as well as the Tweets it quotes and the articles it and the quoted Tweet link to. For a brief discussion of what PEPFAR is, see my previous Quick Take.
3. ^ https://www.kff.org/policy-watch/pepfars-short-term-reauthorization-sets-an-uncertain-course-for-its-long-term-future/
4. ^ htt
I've made a public Forum Events calendar, which you can add to your Gcal. Hopefully, this will give people more time to think about and write for events like Debate Weeks or Theme Weeks. Let me know if you have any issues adding the calendar, or have suggestions for other ways to track Forum events.
I found this result in the MCF 2024 survey interesting:

This survey was hard and only given to a small number of people, so we shouldn't read too much into the specific numbers, but I think it's still a data point against putting significant weight on replaceability concerns if you have a job offer from an org you consider impactful.

Survey respondents here (who all work at EA orgs like Open Phil, 80k, CEA, Giving What We Can) are saying that if they make someone a job offer, they would need to receive, in the typical case for junior staff, tens of thousands of dollars to be indifferent about that person taking the job instead of the next best candidate. As someone who's been involved in several hiring rounds, this sounds plausible to me.

If you get a job offer from an org you consider impactful, I suggest not putting significant weight on the idea that the next best candidate could also take the role and have just as much or more impact as you, unless you have a good reason to think you're in an atypical situation. There's often a (very) large gap!

FYI the question posed was:

(There's a debate to be had about how "EA org receiving X in financial compensation" compares to "value to the world in $ terms" or "value in EA-aligned donations", but I stand by the above bolded claim.)

Full disclosure: I work at CEA and helped build the survey, so I'm somewhat incentivised to say this work was interesting and valuable.
Which interesting EA-related Bluesky accounts do you know of?

I'm not using Twitter anymore since it's being used to promote hateful views, but Bluesky is quite a cool online space in my opinion.

I'm making a list of Bluesky accounts of EA-related organisations and key people. If you're active on Bluesky or some of your favourite EA orgs or key people are, please leave a comment with a link to their profile!

I've also made an EA (GHD+AW+CC) Starter Pack in case you're interested. Let me know who I should add! Effective Environmentalism also has a pack with effectiveness-oriented climate change accounts.

Some accounts in no particular order:

* @effectiveenvironmentalism.org (my org)
* @hannahritchie.bsky.social
* @alexholst.bsky.social
* @vegardbeyer.bsky.social
* @jamesozden.bsky.social
* @rowanemslie.bsky.social
* @goodfoodinst.bsky.social
* @80000hours.bsky.social
* @ourworldindata.org
* @givinggreen.bsky.social
* @brucefriedrich.bsky.social
* @cleanaircatf.bsky.social
* @ayudaefectiva.bsky.social
* @givingwhatwecan.bsky.social
* @soemanozeijlmans.eu (that's me)