Founded Northwestern EA club. Studied Math and Econ.
Starting a trading job in a few months and self-studying Python. Talk to me about cost-benefit analysis!
It seems plausible that there are ≥100,000 researchers working on ML/AI in total. That’s a ratio of ~300:1, capabilities researchers:AGI safety researchers.
Barely anyone is going for the throat of solving the core difficulties of scalable alignment. Many of the people who are working on alignment are doing blue-sky theory, pretty disconnected from actual ML models.
One question I'm always left with is: what is the boundary between being an AGI safety researcher and a capabilities researcher?
For instance, my friend is getting his PhD in machine learning; he barely knows about EA or LW and definitely wouldn't call himself a safety researcher. However, when I talk to him, it seems like the vast majority of his work deals with figuring out how ML systems behave when put in situations foreign to their training data.
I can't claim to really understand what he is doing, but it sounds to me a lot like safety research. And it's not clear to me that this is "blue-sky theory": a lot of his work is high-level maths proofs, but he also does lots of interfacing with ML systems and testing things on them. Is it fair to call my friend a capabilities researcher?
So I can choose then?
Yes, but to be very specific, we should call the problems A and B (for instance, the quiz is problem A and the exam is problem B), and a choice to work on problem A equates to spending your resource[1] on problem A in a certain time frame. We can represent this as $a_{i,j}$, where $i$ is the period in which we chose a and $j$ is the number of times we have picked a before. $j$ is somewhat irrelevant for problem A, since we can use at most one resource to study, but it is relevant for problem B to represent the diminishing returns via $U(b_{i,j})$ decreasing in $j$.
What do we mean by 'last'? Do you mean that the choice in period 1, $a_1$, yields benefits (or costs) in periods 1 and 2, while the choice in period 2, $a_2$, only affects outcomes in period 2?
Neither, if I'm understanding you correctly. I mean that the Scale of problem A in period 2, $U(A_2)$, is 0. This also implies that the marginal utility of working on problem A in period 2 is 0. For instance, if I study for my quiz after it happens, that studying is worthless. This is different from the diminishing returns that are at play when repeatedly studying for the same exam.
This is the extreme end of the spectrum, though. We can generalize by acknowledging that the marginal utility of a given problem is a function of time. For instance, it's better to knock on doors for an election the day before than three years before, but probably not infinitely better.
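As a toy sketch of "marginal utility as a function of time" in Python (all numbers below are made up purely for illustration, not from the discussion):

```python
# Toy sketch: marginal utility of spending one resource on a problem,
# as a function of time. All numbers are illustrative only.

def mu_quiz(period: int) -> float:
    """Extreme case: studying is worth 10 points in period 1, worthless afterwards."""
    return 10.0 if period <= 1 else 0.0

def mu_door_knocking(days_before_election: int) -> float:
    """Softer case: knocking the day before beats knocking 3 years out,
    but not infinitely so; the value decays smoothly, then drops to 0 afterwards."""
    if days_before_election < 0:  # election already happened
        return 0.0
    return 5.0 / (1.0 + 0.01 * days_before_election)

print(mu_quiz(1), mu_quiz(2))                          # 10.0 0.0
print(mu_door_knocking(1), mu_door_knocking(3 * 365))  # ~4.95 vs ~0.42
```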
Can you define this a bit? Which 'choices' have different scale, and what does that mean?
I think I actually used scale with two meanings: MU per resource, and also "if we solve the entire problem, how much is that worth?" The latter is basically importance, as described in the ITN framework, except I meant it in terms of the total work done rather than as a function of the percent of work done. Generally, though, I think people treat it as a constant (which I'm not sure they should...), and in that case we are basically talking about the same thing, except they divide by a factor of 100, which again doesn't matter for this discussion.
I think what Eliot meant is importance, so that's what I'm going to define it as, but I think you picked up on this confusion which is my bad.
By choices, I meant the problems, like the quiz or the exam. I think I used the incorrect wording here, though, since "choices" also denotes a specific decision to spend a resource on a problem. My fault for the confusion.
Maybe you want to define the sum of benefits
E.g.,
$U(a_1) = a$,
$U(a_2) = 0$,
$U(b_1) = b$,
$U(b_2) = \delta b$,
Yes, basically, but I think that
$U(a_{1,0}) = a$
and
$U(a_{2,j}) = 0$
and
$U(b_{i,j}) = \delta^j b$
are better notations, although it doesn't really matter; I got what you were saying.
where a and b are positive numbers, and $\delta$ is a diminishing returns parameter?
Essentially yes, but with my notation.
For 'different scale' do you just mean something like $b > a$?
No. Taking b to mean $U(b_{1,0})$, b is the marginal utility of spending a resource in period 1 on problem B, not the total utility to be gained by solving problem B. Using the test example, the scale of B is either $\sum_{j=0}^{\infty} \delta^{j} b = \frac{b}{1-\delta}$, since this is the maximum grade I can achieve based on the convergent geometric sum described, or 20%, since this is the maximum grade total, although maybe it's literally impossible for me to reach this. I'm not actually sure which to use, but I guess let's go with 20%, and denote the convergent sum as $\sum_{j=0}^{\infty} \delta^{j} b$, meaning $\frac{b}{1-\delta}$.
What I meant was $U(B) > U(A)$, or 20% > 10% in the test example.
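To make the geometric-sum reading concrete (b = 12% and δ = 0.4 are illustrative values I'm picking so that the sum hits the 20% cap; they're not from the thread):

$$\text{Scale}(B) \;=\; \sum_{j=0}^{\infty}\delta^{j}b \;=\; \frac{b}{1-\delta}\,,\qquad \text{e.g. } b = 12\%,\ \delta = 0.4 \;\Rightarrow\; \frac{12\%}{1-0.4} = 20\% \;>\; 10\% = \text{Scale}(A).$$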
So this is like the sum of benefits above, if $b + \delta b > a$?
I think this was the point I was trying to make with the examples I gave you: the decision at t = 1 in a sequence of decisions that maximizes utility over multiple periods of time is not the same as the decision that maximizes utility at t = 1, which is what I believe you are pointing out here. In effect, $\arg\max_{x_1} U(x_1)$ need not be the first element of $\arg\max_{(x_1, x_2)} \big[ U(x_1) + U(x_2) \big]$.
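Here's a minimal sketch of that point in Python, with illustrative numbers I'm choosing myself (a = 10 for the quiz; b = 12 and δ = 0.4 for the exam, so the convergent sum caps at 20); it enumerates the four two-period sequences and shows that the myopic period-1 choice differs from the first step of the utility-maximizing sequence:

```python
from itertools import product

# Illustrative values (my assumption, not numbers from the thread).
A_VALUE, B_VALUE = 10.0, 12.0
DELTA = 0.4  # diminishing-returns factor for repeated work on problem B

def utility(choice: str, period: int, times_picked_before: int) -> float:
    """Marginal utility of spending the period's single resource on `choice`."""
    if choice == "a":
        return A_VALUE if period == 1 else 0.0  # problem A is worthless after period 1
    return B_VALUE * DELTA ** times_picked_before  # diminishing returns on problem B

def total_utility(sequence: tuple) -> float:
    total, picks = 0.0, {"a": 0, "b": 0}
    for period, choice in enumerate(sequence, start=1):
        total += utility(choice, period, picks[choice])
        picks[choice] += 1
    return total

sequences = list(product("ab", repeat=2))
best = max(sequences, key=total_utility)
greedy_first = max("ab", key=lambda c: utility(c, 1, 0))

print({"".join(s): total_utility(s) for s in sequences})
# {'aa': 10.0, 'ab': 22.0, 'ba': 12.0, 'bb': 16.8}
print("utility-maximizing sequence:", best)     # ('a', 'b')
print("myopic period-1 choice:", greedy_first)  # 'b', which differs from best[0] == 'a'
```

With these numbers the myopic choice is b (12 > 10), but the best sequence starts with a, which is exactly the mismatch described above.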
But actually, I think the claim I originally made in response to him was a lot simpler than that, more along the lines of "A problem being urgent does not mean that its current scale is higher than if it was not urgent", taking $U(A_i)$ to be the Scale of problem A in the $i$th period, and taking problem A to be urgent to mean $U(A_2) = 0$, which I'm getting from the OP saying
Some areas can be waited for a longer time for humans to work on, name it, animal welfare, transhumanism.
my original claim in response to Elliot is something like:
$U(A_2) = 0$ and $U(A'_2) = U(A'_1)$
does not imply
$U(A_1) > U(A'_1)$
where A' is the same problem as A but without the urgency (the quiz moved to Wednesday instead of Monday).
The fact that I get no value out of studying for a Monday quiz on Tuesday doesn't mean the quiz is now worth more than 10% of my grade. On the flip side, if the quiz were moved to Wednesday, it would still be worth 10% of my grade.
I think it was maybe not what Eliot meant. That being said, taking his words literally I do think this is what he implied. I'm not really sure honestly haha.
But that's not just 'because a has no value in period 2' but also because of the diminishing returns on b (otherwise I might just choose b in both periods).
Correct. I think there are further specifications that might make my point less niche, but I'm not sure.
As an aside, I'm not sure I'm correct about any of this but I do wish the forum was a little more logic and math-heavy so that we could communicate better.
We could model a situation where you have multiple resources in every period, but here I choose to model it as if you have a single resource to spend in each period.
I don't fully comprehend why we can't include it. It seems like the ITN framework does not describe the future of the marginal utility per resource spent on the problem, but rather the MU/resource right now. If we want to generalize the ITN framework across time, which theoretically we need to do to choose a sequence of decisions, we need to incorporate the fact that tractability and scale are functions of time (and, even further, of the previous decisions we make).
All this is going to do is change the resulting answer from MU/$ to (MU/$)(t), where t is time; everything still cancels out the same as before. In practice, I don't know if this is actually useful.
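A sketch of what I mean, with made-up factor functions (none of this comes from the ITN write-ups themselves; the only point is that the units still cancel to utility per dollar, now as a function of t):

```python
# Sketch of ITN with time-dependent factors. The three functions below are
# invented for illustration only; what matters is that the units still cancel.

def importance(t: float) -> float:
    """Utility gained per % of the problem solved, as a function of time."""
    return 100.0 if t < 2 else 0.0       # the problem stops mattering from t = 2 on

def tractability(t: float) -> float:
    """% of the problem solved per % increase in resources."""
    return 0.05 / (1.0 + t)              # progress gets harder over time

def neglectedness(t: float) -> float:
    """% increase in resources per extra dollar."""
    return 1.0 / (1_000.0 + 100.0 * t)   # more money flows in over time

def mu_per_dollar(t: float) -> float:
    # (utility/%solved) * (%solved/%resources) * (%resources/$) = utility/$
    return importance(t) * tractability(t) * neglectedness(t)

print(mu_per_dollar(0.0), mu_per_dollar(1.0), mu_per_dollar(3.0))  # 0.005, ~0.00227, 0.0
```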
Assume there are two societies that passed the great filter and are now grabby: society EA and society NOEA.
Society EA, you could say, is quite similar to our own society. The majority of the dominant species is not concerned with passing the great filter, and most individuals are inadvertently increasing the chance of the species' extinction. However, a small contingent became utilitarian rationalists and specced heavily into reducing x-risk. Since the group passed the great filter, you can assume this was in large part due to this contingent of EAs/guardian angels.
Now, society NOEA is a species that passed the filter, but they didn't have EA rationalists. The only way they were able to pass the filter was that, as a species, they are overall quite careful and thoughtful. The whole species, rather than a divergent few, has enough of a security mindset that there was no special group that "saved" them.
Which species would we prefer to get more control of resources?
The punchline is that the very fact that we "need" EA on Earth might provide evidence that our values are worse than those of a species that didn't need EA to pass the filter.