James Fodor

1276 karma · Melbourne VIC, Australia

Bio

Hi, my name is James Fodor. I am a longtime student and EA organiser from Melbourne. I love science, history, philosophy, and using these to make a difference in the world.

Sequences
1

Critique of Superintelligence

Comments
22

Hi Toby, thanks for the comment.

I have read about some of the work on tackling the ARC dataset, and I am not at all confident that the approaches which perform well have anything to do with generalisable reasoning. The problem remains that there is no validation that the benchmark measures what it claims to. I don't know what methods o3 used to solve it, but until I do, I don't believe the marketing hype released by OpenAI that it must be generalisable reasoning.

As to why we'd see inference time scaling if chain-of-thought consisted of not much more than post-hoc rationalizations, this is still an open question but it seems to be partly driven by increased compute time and number of tokens. I don't have the full answer here, but the evidence we do have strongly cautions against just assuming these models are doing what we might describe as 'genuine reasoning'.

Hi David,

The point I was trying to communicate here was simply that our design was able to find a pattern of differences between the control and treatment groups which is interpretable (i.e. in terms of different ages and career stage). I think this provides some validation of the design, in that if large enough differences exist then our measures pick up these differences and we can statistically measure them. We don't, for instance, see an unintelligible mess of results that would cast doubt on the validity of our measures or the design itself. Of course, if, as you point out, the effect size for attending the conference is smaller, then we won't be able to detect that given our sample size. For most of our measures this was around 15-20%. But given we were able to measure sufficiently large effects using this design, I think it provides justification for thinking that a large enough sample size using a similar study design would be able to detect smaller effects, if they existed. Hope that clarifies a bit.

I think it is appropriate for the movement to reflect at this time on whether there are systematic problems or failings within the community that might have contributed to this problem. I have publicly argued that there are, and though I might be wrong about that, I do think it's entirely reasonable to explore these issues. I don't think it's reasonable to just continually assert that it was all down to a handful of bad actors and refuse to discuss the possibility of any deeper or broader problems. I like to think that the EA community can learn and grow from this experience.

I disagree that events can't be evidence for or against philosophical positions. If empirical claims about human behaviour or the real-world operation of ethical principles are relevant to the plausibility of competing ethical theories, then I think events can provide evidential value for philosophical positions. Of course that raises a much broader set of issues and doesn't really detract from the main point of this post, but I thought I would push back on that specific aspect.

I love the research focus of this piece and the lack of waffle. Very impressed.

"Is it really "grossly immoral" to do the same thing in crypto without telling depositors?"
Yes

Great point about ventilation. I am not aware of any evidence that hand sanitisation in particular is merely 'safety theater'. Surface transmission may not be the major method of viral spread, but it still is a method, and hand sanitisation is a very simple intervention. Also, to emphasise something I mentioned in the post, masks are definitely not 'safety theater'. It is good to see that the revised COVID protocol now mentions that mask use will be encouraged and widely available.

I don't understand how Australia's travel policy is relevant. I'm not asking for anything particularly unusual or onerous; I would just expect that a community of effective altruists would follow WHO guidelines regarding methods to reduce the spread of COVID. I honestly don't understand the negative reaction.

Thanks Amy, I think these clarifications significantly improve the policy. I disagree with the decision not to mandate masks, but I understand there will be differences in views there. However, mentioning that they are encouraged may be just as effective at ensuring widespread use. That was part of my original concern: I did not feel this aspect of norm-setting was as evident in the original version of the policy.

It doesn't seem to me this has much relevance to EA.
