
Egg Syntax

5 karma · Joined Apr 2023


Posts
1


Terminology suggestion: standardize terms for probability ranges

Egg Syntax · 8mo ago · 1m read

Comments
6

Why *not* just send people to Bluedot (FBB#4)

Egg Syntax · 21d ago · 4

From my perspective as a researcher not involved with fieldbuilding, this post misses an important distinction. I do occasionally suggest that new people take a BlueDot course (or apply to AI Safety Camp, or SPAR, or one of the other excellent programs out there), but far more often than that I point new people to the BlueDot curriculum. I commonly see others doing the same; I think it's become the default AIS 101 reading. Maybe you're mistaking that for people pushing the BlueDot course on everyone new to the field?

As a more general and perhaps contrarian pushback: AI safety (other than governance) isn't at all a local problem, and so there's no particular reason to focus on local groups. I realize that some people find it inherently motivating to be in the same room with other people in their own community and build social bonds, so there's some value there. But in general I think it's more valuable for people to find ways to fill important vacant niches in the AIS ecosystem than to focus on replicating another organization but in <location>. That can be supplemented with informal local groups that exist to serve those social needs.

"It’s well-known that the AIS community is mentor and management-constrained"

That's not obvious to me; I do think there are constraints there, but my sense is that the field is currently bottlenecked mainly by funding (1, 2).

"If you have a young friend interested in AI Safety, they might just be fine with taking a local group’s course if they have the opportunity. It won’t be run as professionally as Bluedot’s course, but they are more likely to give AI Safety the benefit of the doubt."

Why are they more likely to give AIS the benefit of the doubt? Won't that be most likely to happen if their exposure is to the highest-quality course they have access to?

On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI

Egg Syntax · 10mo ago · 1

"I think Ryan's solution shows that the intelligence is coming for him, and not from Chat-GPT4o."

If this is true, then substituting in a less capable model should have equally good results; would you predict that to be the case? I claim that plugging in an older/smaller model would produce much worse results, and if that's the case then we should consider a substantial part of the performance to be coming from the model.

"This is what Chollet is talking about in the podcast when he says... 'I’m pretty skeptical that we’re going to see an LLM do 80% in a year. That said, if we do see it, you would also have to look at how this was achieved.'"

This seems to me to be Chollet trying to have it both ways. Either a) ARC is an important measure of 'true' intelligence (or at least of the ability to reason over novel problems), and so we should consider LLMs' poor performance on it a sign that they aren't generally intelligent, or b) ARC isn't a very good measure of true intelligence, in which case LLMs' performance on it isn't very important. Those can't be simultaneously true. I think that nearly everywhere except in that quote, Chollet has claimed (and continues to claim) that a) is true.

On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI

Egg Syntax · 10mo ago · 1

"I'm quite confused, given the fact that all of the weights in the transformer are frozen after training and RLHF, why it's called learning at all. The model certainly isn't learning anything."

I would frame it as: the model is learning but then forgetting what it's learned (due to its inability to move anything from working/short-term memory to long-term memory). We see the same thing in human learning (one example: I've learned an enormous number of six-digit confirmation codes, each of which I remember just long enough to enter it into the website that's asking for it), although of course not so consistently.

On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI

Egg Syntax · 10mo ago · 2

I have thoughts, but a question first: you link a Kambhampati tweet where he says,

"...as the context window changes (with additional prompt words), the LLM, by design, switches the CPT used to generate next token--given that all these CPTs have been pre-computed?"

What does 'CPT' stand for here? It's not a common ML or computer science acronym that I've been able to find.

On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI

Egg Syntax · 10mo ago · 1

"humans that are just visually observing/predicting the patterns."

I don't think that's actually any simpler than doing it as JSON; it's just that our brains are tuned for (and we're more accustomed to) doing it visually. Depending on the specifics of the JSON format, there may be a bit of advantage to being able to have adjacency be natively two-dimensional, but I wouldn't expect that to make a huge difference.
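To make the adjacency point concrete, here is a minimal sketch, assuming a hypothetical 3x3 grid stored as nested JSON lists of single-digit values and serialized without whitespace. The grid contents and the helper names (`neighbours_2d`, `cell_offset`) are illustrative assumptions, not anything taken from the post or from a particular ARC solution. It shows that left/right neighbours sit a fixed two characters apart in the text, while up/down neighbours sit a whole row's worth of characters apart.

```python
import json

# Hypothetical 3x3 grid; each value is a single-digit colour index (assumption).
grid = [[0, 1, 0],
        [2, 3, 4],
        [0, 5, 0]]


def neighbours_2d(grid, row, col):
    """Values directly above, below, left and right of (row, col), in that order."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [grid[row + dr][col + dc]
            for dr, dc in offsets
            if 0 <= row + dr < len(grid) and 0 <= col + dc < len(grid[0])]


# Compact serialization with no spaces: "[[0,1,0],[2,3,4],[0,5,0]]".
serialized = json.dumps(grid, separators=(",", ":"))


def cell_offset(row, col, width):
    """Character index of cell (row, col) in the compact string.

    Only valid under the assumptions above: single-digit values, no whitespace.
    Each row "[a,b,c]," takes 2 * width + 2 characters; two opening brackets
    precede the first cell.
    """
    return 2 + row * (2 * width + 2) + 2 * col


width = len(grid[0])
horizontal_gap = cell_offset(1, 2, width) - cell_offset(1, 1, width)
vertical_gap = cell_offset(2, 1, width) - cell_offset(1, 1, width)

print(neighbours_2d(grid, 1, 1))
print(serialized)
print(horizontal_gap, vertical_gap)
```

Under these assumptions the vertical gap grows with row width while the horizontal gap stays constant, which is the "bit of advantage" of natively two-dimensional adjacency; the information content is otherwise identical in both forms.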

On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI

Egg Syntax · 10mo ago · 1

Typo watch:

"Dwarkesh is annoyed because he thinks that François is conceptually defining LLM-like models as incapable of memorisation"

I assume you mean 'incapable of generalization' here?
