All posts


Friday, 3 May 2024

AI safety 6
Building effective altruism 5
Collections and resources 4
Community 4
Opportunities to take action 3
Announcements and updates 3

Frontpage Posts

15 · niplav · 6m read
187 · kta · 15m read

Quick takes

72
William_S
16d
5
I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to managing a team of 4 people which worked on trying to understand language model features in context, leading to the release of an open source "transformer debugger" tool. I resigned from OpenAI on February 15, 2024.
Not sure how to post these two thoughts so I might as well combine them.
In an ideal world, SBF should have been sentenced to thousands of years in prison. This is partially due to the enormous harm done to both FTX depositors and EA, but mainly for basic deterrence reasons; a risk-neutral person will not mind 25 years in prison if the ex ante upside was becoming a trillionaire.
However, I also think many lessons from SBF's personal statements (e.g. his interview on 80k) are still as valid as ever. Just off the top of my head:
* Startup-to-give as a high EV career path. Entrepreneurship is why we have OP and SFF! Perhaps also the importance of keeping as much equity as possible, although in the process one should not lie to investors or employees more than is standard.
* Ambition and working really hard as success multipliers in entrepreneurship.
* A career decision algorithm that includes doing a BOTEC and rejecting options that are 10x worse than others.
* It is probably okay to work in an industry that is slightly bad for the world if you do lots of good by donating. [1] (But fraud is still bad, of course.)
Just because SBF stole billions of dollars does not mean he has fewer virtuous personality traits than the average person. He hits at least as many multipliers as the average reader of this forum. But importantly, maximization is perilous; some particular qualities like integrity and good decision-making are absolutely essential, and if you lack them your impact could be multiplied by minus 20.
[1] The unregulated nature of crypto may have allowed the FTX fraud, but things like the zero-sum zero-NPV nature of many cryptoassets, or its negative climate impacts, seem unrelated. Many industries are about this bad for the world, like HFT or some kinds of social media. I do not think people who criticized FTX on these grounds score many points. However, perhaps it was (weak) evidence towards FTX being willing to do harm in general for a perceived greater good, which is maybe plausible especially if Ben Delo also did market manipulation or otherwise acted immorally. Also note that in the interview, SBF didn't claim his donations offset a negative direct impact; he said the impact was likely positive, which seems dubious.
(EA) Hotel dedicated to events, retreats, and bootcamps in Blackpool, UK? I want to try and gauge what the demand for this might be. Would you be interested in holding or participating in events in such a place? Or in working to run them? Examples of hosted events could be: workshops, conferences, unconferences, retreats, summer schools, coding/data science bootcamps, EtG accelerators, EA charity accelerators, intro to EA bootcamps, AI Safety bootcamps, etc. This would be next door to CEEALAR (the building is potentially coming on the market), but most likely run by a separate, but close, limited company (which would charge, and funnel profits to CEEALAR, but also subsidise use where needed). Note that being in Blackpool in a low cost building would mean that the rates charged by such a company would be significantly less than elsewhere in the UK (e.g. £300/day for use of the building: 15 bedrooms and communal space downstairs to match that capacity). Maybe think of it as Wytham Abbey, but at the other end of the Monopoly board: only 1% of the cost! (A throwback to the humble beginnings of EA?) From the early days of the EA Hotel (when we first started hosting unconferences and workshops), I have thought that it would be good to have a building dedicated to events, bootcamps and retreats, where everyone is in and out as a block, so as to minimise overcrowding during events, and inefficiencies of usage of the building either side of them (from needing it mostly empty for the events); CEEALAR is still suffering from this with its event hosting. The yearly calendar could be filled up with e.g. four 10-12 week bootcamps/study programs, punctuated by four 1-3 week conferences or retreats in between. This needn't happen straight away, but if I don't get the building now, the option will be lost for years.
Having it next door in the terrace means that the building can be effectively joined to CEEALAR, making logistics much easier (and another option for the building could be a further expansion of CEEALAR proper [1]). Note that this is properly viewed as an investment to take into account a time-limited opportunity, and shouldn't be seen as fungible with donations (to CEEALAR or anything else); if nothing happens I can just sell the building again and recoup most/all of the costs (selling shouldn’t be that difficult, given property prices are rising again in the area due to a massive new development in the town centre).
[1] CEEALAR has already expanded once. When I bought the second building it also wasn’t ideal timing, but it never is; I didn’t want to lose option value.
New: “card view” for frontpage posts We’re testing out a new “card view” for the main post list on the home page. You can toggle the layout by clicking the dropdown circled in red below. You can see more details in GitHub here. Let us know what you think! :)
I've said that people voting anonymously is good, and I still think so, but when I have people downvoting me for appreciating little jokes that other people post on my shortform, I think we've become grumpy.