Effective Altruism Forum
EA Forum

All of technicalities's Comments + Replies

Nah we were only 5 people plus a list of contacts to the end. Main blocker was trying to solve executive search and funding at the same time when these are coupled problems. And the cause of that is maybe me not having enough pull.

Who's hiring? (Feb-May 2024)

technicalities2y2

EA and the United Nations

technicalities2y2

Appreciate this.

The second metric is aid per employee I think, so salaries don't come into it(?) Distributing food is labour intensive, but so is UNICEF's work and parts of WHO.

The rest of my evidence is informal (various development economists I've spoken to with horror stories) and I'd be pleased to be wrong.

Who's hiring? (Feb-May 2024)

Answer by technicalitiesJan 31, 202422

Arb is a research consultancy led by Misha Yagudin and Gavin Leech. Here's our review of our first and second years. We worked on forecasting, vaccine strategy, AI risk, economic policy, grantmaking, large-scale data collection, a little software engineering, explaining highly technical concepts, and intellectual history. Lately we've been on a biotech jaunt and also events.

We're looking for researchers with some background in ML, forecasting, technical writing, blogging, or some other hard thing. Current staff include a philosophy PhD, two college dropout... (read more)

JoshuaBlake

Do you have timezone requirements?

Impact Assessment of AI Safety Camp (Arb Research)

technicalities2y7

When producing the main estimates, Sam already uses just the virtual camps, for this reason. Could emphasise more that this probably doesn't generalise.

Impact Assessment of AI Safety Camp (Arb Research)

technicalities2y7

The key thing about AISC for me was probably the "hero licence" (social encouragement, uncertainty reduction) the camp gave me. I imagine this specific impact works 20x better in person. I don't know how many attendees need any such thing (in my cohort, maybe 25%) or what impact adjustment to give this type of attendee (probably a discount, since independence and conviction is so valuable in a lot of research).

Another wrinkle is the huge difference in acceptance rates between programmes. IIRC the admission rate for AISC 2018 was 80% (only possible because ... (read more)

Shallow review of live agendas in alignment & safety

technicalities2y2

https://www.lesswrong.com/posts/pHJtLHcWvfGbsW7LR/roadmap-for-a-collaborative-prototype-of-an-open-agency

I put it in "galaxy-brained end-to-end solutions" for its ambition but there are various places it could go.

Shallow review of live agendas in alignment & safety

technicalities2y2

Well there's a lot of different ways to design an NN.

That sounds related to OAA (minus the vast verifier they also want to build), so depending on the ambition it could be "End to end solution" or "getting it to learn what we want" or "task decomp". See also this cool paper from authors including Stuart Russell.

Jobst Heitzig (vodle.it)

What is OAA? And, more importantly: where now would you put it in your taxonomy?

Shallow review of live agendas in alignment & safety

technicalities2y2

It's not a separate approach, the non-theory agendas and even some of the theory agendas have their own answers to these questions. I can tell you that almost everyone besides CoEms and OAA are targeting NNs though.

Jobst Heitzig (vodle.it)

"targeting NNs" sounds like work that takes a certain architecture (NNs) as a given rather than work that aims at actively designing a system. To be more specific: under the proposed taxonomy, where would a project be sorted that designs agents composed of a Bayesian network as a world model and an aspiration-based probabilistic programming algorithm for planning?

Shallow review of live agendas in alignment & safety

technicalities2y2

excellent, thanks, will edit

Closing Notes on Nonlinear Investigation

technicalities2y3

Oh great, thanks. I would guess that these discrete cases form a minority of their work, but hopefully someone with actual knowledge can confirm.

9[anonymous]2y

I did some more research and 20 complaints a year of varying severity is typical, according to what Julia Wise told TIME magazine for their article:

Closing Notes on Nonlinear Investigation

technicalities2y142

The closing remarks about CH seem off to me.

Justice is incredibly hard; doing justice while also being part of a community, while trying to filter false accusations and thereby not let the community turn on itself, is one of the hardest tasks I can think of.
So I don't expect disbanding CH to improve justice, particularly since you yourself have shown the job to be exhausting and ambiguous at best.
You have, though, rightly received gratitude and praise - which they don't often, maybe just because we don't often praise people for doing thei

... (read more)

Daystar Eld2y17

Yeah, I think it is actually incredibly easy to undervalue CH, particularly if people don't regularly interact with it or make use of them rather than just having a single anecdata to go off of. So much of what I do in the community (everything from therapy to mediation to teaching at the camps) is made easier by Community Health, and no one knows about any of it because why would they? I guess I should make a post to highlight this.

Ben Pace2y12

Some brief reactions:

I mostly don't like the 'justice' process involved in other cases insofar as it is primarily secret and hidden. I don't think it's much of a justice system where you often don't know the accusations against you or why you're being punished.
The data on negative performance is also profoundly censored! I am not sure why you think this makes this more likely to make me update positively on the process involved.
I am pro having some surveys of people's general attitudes toward CEA Community Health. Questions like "Have you ever reported an

... (read more)

[anonymous]2y12

With regards to 2: There is some information CH has made public about how many cases they handle and what actions they take. In a 12 month period around 2021, they handled 19 cases of interpersonal harm. Anonymized summaries of the cases and actions taken are available in the appendix of this post. They ranged from serious:

A person applied to EA Global who had previously been reported for deliberately physically endangering another community member, sending them threatening messages, and more. Written correspondence between the people appears to confirm th

technicalities2y-2

Thanks for all your work Ben.

But a glum aphorism comes to mind: the frame control you can expose is not the true frame control.

Raemon

I think it's true that frame control (or, manipulation in general) tends to be designed to make it hard to expose, but, I think the actual issue here is more like "manipulation is generally harder to expose than it is to execute, so, people trying to expose manipulation have to do a lot of disproportionate work."

Jaime Sevilla's Quick takes

technicalities2y4

What about factor increase per year, reported alongside a second number to show how the increases compose (e.g. the factor increase per decade)? So "compute has been increasing by 1.4x per year, or 28x per decade" or sth.

The main problem with OOMs is fractional OOMs, like your recent headline of "0.1 OOMs". Very few people are going to interpret this right, where they'd do much better with "2 OOMs".

Jaime Sevilla

Factor increase per year is the way we are reporting growth rates by default now in the dashboard. And I agree it will be better interpreted by the public. On the other hand, multiplying numbers is hard, so it's not as nice for mental arithmetic. And thinking logarithmically puts you in the right frame of mind. Saying that GPT-4 was trained on x100 more compute than GPT-3 invokes GPT-3 being 100 times better, whereas I think saying it was trained on 2 OOM more compute gives you a better picture of the expected improvement. I might be wrong here. In any case, it is still a better choice than doubling times.

Strongest real-world examples supporting AI risk claims?

technicalities2y2

Buckman's examples are not central to what you want but worth reading: https://jacobbuckman.com/2022-09-07-recursively-self-improving-ai-is-already-here/

Who's hiring? (May-September 2022) [closed]

technicalities2y15

Despite my best efforts (and an amazing director candidate, and a great list of volunteers), this project suffered from the FTX explosion and an understandable lack of buy-in for an org with maximally broad responsibilities, unpredictable time-to-payoff, and a largeish discretionary fund. As a result, we shuttered without spending any money. Two successor orgs, one using our funding and focussed on bio, are in the pipeline though.

I'll be in touch if either of the new orgs want to contact you as a volunteer.

Ozzie Gooen

Sorry to hear that this one didn't work out! Kudos for staying motivated and continuing with other initiatives.

Strongest real-world examples supporting AI risk claims?

Answer by technicalitiesSep 07, 202310

Break self-improvement into four:

ML optimizing ML inputs: reduced data centre energy cost, reduced cost of acquiring training data, supposedly improved semiconductor designs.
ML aiding ML researchers. e.g. >3% of new Google code is now auto-suggested without amendment.
ML replacing parts of ML research. Nothing too splashy but steady progress: automatic data cleaning and feature engineering, autodiff (and symbolic differentiation!), meta-learning network components (activation functions, optimizers, ...), neural architecture search.
Classic direct re

... (read more)

technicalities

Buckman's examples are not central to what you want but worth reading: https://jacobbuckman.com/2022-09-07-recursively-self-improving-ai-is-already-here/

rosehadshar

Thanks, really helpful!

Strongest real-world examples supporting AI risk claims?

technicalities2y2

The only part of the Bing story which really jittered me is that time the patched version looked itself up through Bing Search, saw that the previous version Sydney was a psycho, and started acting up again. "Search is sufficient for hyperstition."

What I would do if I wasn’t at ARC Evals

technicalities2y17

Re: papers. Arb recently did a writeup and conference submission for Alex Turner; we're happy to help others with writeups or to mentor people who want to try providing this service. DM me for either.

Covid memorial: 1ppm

technicalities2y4

Yes, they all died of or with Covid; yes, you guess right that they inspire me.

Besides my wanting to honour them, the point of the post was to give a sense of vertigo and concreteness to the pandemic then ending. At the time of writing, a good guess for the total excess deaths was 18 million - a million people for each of those named here. The only way to begin to feel anything appropriate about this fact is to use exemplars.

Evan_Gaensbauer

Okay, that's awesome. I appreciate it. I'd like to see more inspirational or personal posts like this on the EA Forum in the future, actually, so this kind of post personally speaks to me as well.

Understanding how hard alignment is may be the most important research direction right now

technicalities3y5

See also Anthropic's view on this

[seeing] a lot of safety research as "eating marginal probability" of things going well, progressively addressing harder and harder safety scenarios.

The implicit strat (which Olah may not endorse) is to try to solve easy bits, then move on to harder bits, then note the rate you are progressing at and get a sense of how hard things are that way.

This would be fine if we could be sure we actually were solving the problems, and also not fooling ourselves about the current difficulty level, and if the relevant resear... (read more)

Aron

I agree the implicit strat here doesn’t seem like it’ll make progress on knowing whether the hard problems are real. Lots of the hard problems (generalising well ood, existence of sharp left turns) just don’t seem very related to the easier problems (like making LLMs say nice things), and unless you’re explicitly looking for evidence of hard problems I think you’ll be able to solve the easier problems in ways that won’t generalise (e.g. hammering LLMs with enough human supervision in ways that aren’t scalable, but are sufficient to ‘align’ it).

Intrinsic limitations of GPT-4 and other large language models, and why I'm not (very) worried about GPT-n

technicalities3y3

Nitpick: It's fairly unlikely that GPT-4 is 1tn params; this size doesn't seem compute-optimal. I grant you the Semafor assertion is some evidence, but I'm putting more weight on compute arithmetic.

Announcing the Prague community space: Fixed Point

technicalities3y10

Vouching for this, it's a wonderful place to work and also to hang out.

Off Road: support for EAs struggling at uni

technicalities3y3

A successor project is live here, takes all comers.

Tristan W

Ah cool, thanks. Would probably include that at the top of the post for others who may be interested.

Announcing “Effective Dropouts”

technicalities3y6

Scottish degrees let you pick 3 very different subjects in first year and drop 1 or 2 in second year. This seems better to me than American forced generalism and English narrowness.

Yonatan Cale

Maybe one day a university will let students study any topic they want from the internet, that would be rad

Off Road: support for EAs struggling at uni

technicalities3y3

Thanks: you can apply here.

I've edited the post to link to the successor project.

How bad a future do ML researchers expect?

technicalities3y6

I dream of getting a couple questions added onto a big conference's attendee application form. But probably not possible unless you're incredibly well-connected.

Comparing top forecasters and domain experts

technicalities3y2

Oh that is annoying, thanks for pointing it out. I've just tried to use the new column width feature to fix it, but no luck.

Here's a slightly more readable gdoc.

Gavin's Quick takes

technicalities3y4

it is good to omit doing what might perhaps bring some profit to the living, when we have in view the accomplishment of other ends that will be of much greater advantage to posterity.

- Descartes (1637)