All of Girish_Sastry's Comments + Replies

Yes! We plan to release the source code soon.

I'd also be interested in funding activities like this. This could inform how much we can learn about models without distributing weights.

A center applying epistemic best practices to predicting & evaluating AI progress

Artificial Intelligence and Epistemic Institutions

Forecasting and evaluating AI progress is difficult and important. Current work in this area is distributed across multiple organizations or individual researchers, not all of whom possess (a) the technical expertise, (b) knowledge & skill in applying epistemic best practices, and (c) institutional legitimacy (or otherwise suffer from cultural constraints). Activities of the center could in... (read more)

On policy analysis, you write:

I will argue that despite the fact that there is overlap, and many of the ideas are well known, the knowledge and experience of policy analysis has much to offer effective altruism in achieving the goals of improving the world. Not only that, but it offers a paradigm for how to reasonably pull from multiple disciplines in helping make decisions - exactly what this series of posts is trying to help with.

 

Did you ever end up writing up those thoughts? I skimmed the rest of the posts in the series but didn't find it.

3
Davidmanheim
2y
Unfortunately, no - as I said here: https://forum.effectivealtruism.org/posts/xAZ4pCFtYdkMi54JS/introduction-a-primer-for-politics-policy-and-international?commentId=fuEtwRwTHdrevJPJr

I don't think I quite follow your criticism of FLOP/s; can you say more about why you think it's not a useful unit? It seems like you're saying that a linear extrapolation of FLOP/s isn't accurate to estimate the compute requirements of larger models. (I know there are a variety of criticisms that can be made, but I'm interested in better understanding your point above)

3
beth​
5y
The issue is that FLOPS cannot accurately represent computing power across different computing architectures, in particular between single CPUs and computing clusters. As an example, let's compare 1 computer of 100 MFLOPS with a cluster of 1000 computers of 1 MFLOPS each. The latter option has 10 times as many FLOPS, but there is a wide variety of computational problems on which the former will always be much faster. This means that FLOPS don't meaningfully tell you which option is better; it will always depend on how well the problem you want to solve maps onto your hardware.

In large-scale computing, the bottleneck is often the communication speed in the network. If the calculations you have to do don't neatly fall apart into roughly separate tasks, the different computers have to communicate a lot, which slows everything down. Adding more FLOPS (computers) won't prevent that in the slightest. You cannot extrapolate FLOPS estimates without justifying why the communication overhead doesn't make the estimated quantity meaningless on parallel hardware.
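To make the communication-overhead point concrete, here is a toy back-of-the-envelope model (not from the original comment; the workload size, number of synchronization steps, and per-step network cost are all illustrative assumptions) comparing the single 100 MFLOPS machine against the 1000 × 1 MFLOPS cluster:

```python
# Toy model of the comparison above: one 100 MFLOPS machine vs. a cluster of
# 1000 x 1 MFLOPS machines. The cluster has 10x the aggregate FLOPS, but pays
# a communication cost at every synchronization step. All numbers are
# illustrative assumptions, not measurements.

def time_single(total_flop, flops):
    # Wall-clock time on one machine: pure compute, no communication.
    return total_flop / flops

def time_cluster(total_flop, n_nodes, flops_per_node, sync_steps, seconds_per_sync):
    # Wall-clock time on a cluster: compute is split perfectly across nodes,
    # but each synchronization step adds a fixed network cost.
    compute = total_flop / (n_nodes * flops_per_node)
    communication = sync_steps * seconds_per_sync
    return compute + communication

total_flop = 1e12  # 1 TFLOP of total work (assumed)

single = time_single(total_flop, flops=100e6)  # one 100 MFLOPS machine
cluster = time_cluster(
    total_flop,
    n_nodes=1000,
    flops_per_node=1e6,    # 1 MFLOPS each -> 10x the aggregate FLOPS
    sync_steps=20_000,     # assumed: how often nodes must exchange data
    seconds_per_sync=0.5,  # assumed: network cost per synchronization
)

print(f"single machine: {single:,.0f} s")  # 10,000 s
print(f"cluster:        {cluster:,.0f} s")  # 11,000 s -- slower despite 10x FLOPS
```

With these assumed numbers the cluster comes out slower even though it has ten times the nominal FLOPS; shrink the per-step network cost or the number of synchronizations and the result flips, which is exactly the "depends on how the problem maps onto your hardware" point.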

How'd you decide to focus on going into research, even before you decided that developing technical skills would be helpful for that path?

4
RyanCarey
5y
I was influenced at that time by people like Matt Fallshaw and Ben Toner, who thought that for sufficiently good intellectual work, funding would be forthcoming. It seemed like insights were mostly what was needed to reduce existential risks...

Thanks for the great post. Ryan, I'm curious how you figured this at an early stage:

I figured that in the longer term, my greatest chance at having a substantial impact lay in my potential as a researcher, but that I would have to improve my maths and programming skills to realize that.

5
RyanCarey
5y
I thought that more technical skills were rarer, were neglected in some parts of academia (e.g. in history), and were the main thing holding me back from being able to understand papers about emerging technologies... Also, I asked Carl S, and he thought that if I was to go into research, these would be the best skills to get. Nowadays, one could ask a lot more different people.

What key metrics do research analysts pay attention to in the course of their work? More broadly, how do employees know that they're doing a good job?

2
Holden Karnofsky
6y
We do formal performance reviews twice per year, and we ask managers to use their regular (~weekly) check-ins with reports to sync up on performance such that nothing in these reviews should be surprising. There's no unified metric for an employee's output here; we set priorities for the organization, set assignments that serve these priorities, set case-by-case timelines and goals for the assignments (in collaboration with the people who will be working on them), and compare output to the goals we had set.

By (3), do you mean the publications that are listed under "forecasting" on MIRI's publications page?

I agree that this makes sense in the "ideal" world, where potential donors have better mental models of this sort of research pathway. I've found this sort of thinking useful as a potential donor.

From an organizational perspective, I think MIRI should put more effort into producing visible explanations of their work (well, depending on their strategy to get funding). As worries about AI risk become more widely known, there will be a larger pool of potential donations to research in the area. MIRI risks becoming out-competed by others who are bet... (read more)

Do you share Open Phil's view that there is a > 10% chance of transformative AI (defined as in Open Phil's post) in the next 20 years? What signposts would alert you that transformative AI is near?

Relatedly, suppose that transformative AI will happen within about 20 years (not necessarily a self-improving AGI). Can you explain how MIRI's research will be relevant in such a near-term scenario (e.g. if it happens by scaling up deep learning methods)?

5
Jessica_Taylor
7y
I share Open Phil’s view on the probability of transformative AI in the next 20 years. The relevant signposts would be answers to questions like “how are current algorithms doing on tasks requiring various capabilities”, “how much did this performance depend on task-specific tweaking on the part of programmers”, “how much is performance projected to improve due to increasing hardware”, and “do many credible AI researchers think that we are close to transformative AI”.

In designing the new ML-focused agenda, we imagined a concrete hypothetical (which isn’t stated explicitly in the paper): what research would we do if we knew we’d have sufficient technology for AGI in about 20 years, and this technology would be qualitatively similar to modern ML technology such as deep learning? So we definitely intend for this research agenda to be relevant to the scenario you describe, and the agenda document goes into more details. Much of this research deals with task-directed AGI, which can be limited (e.g. not self-improving).

The authors of the "Concrete Problems in AI Safety" paper distinguish between misuse risks and accident risks. Do you think in these terms, and how does your roadmap address misuse risk?