Martín Soto

34 karmaJoined Jul 2022

Posts
1

General advice for transitioning into Theoretical AI Safety

Martín Soto

· 3y ago · 12m read

Comments
3

My favorite articles by Brian Tomasik and what they are about

Martín Soto · 1y ago · 3

This especially follows from eliminativism

My understanding was that eliminativism (qualia are just a kind of physical process, not a different essence or property existing separate from physical reality) is orthogonal to panpsychism (whether particles have qualia), and thus to whether particles suffer. Or is there some connection I'm missing?

EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem

Martín Soto · 2y ago · 8

(Comment cross-posted from LessWrong. Excuse the formatting, I'm on mobile.)

Hi Elizabeth, I feel like what I wrote in those long comments has been strongly misrepresented in your short explanations of my position in this post, and I kindly ask for a removal of those parts of the post until this has been cleared up (especially since I had in the past offered to provide opinions on the write-up). Sadly I only have 10 minutes to engage now, but here are some object-level ways in which you've misrepresented my position:

"The charitable explanation here is that my post focuses on naive veganism, and Soto thinks that’s a made-up problem."

Of course, my position is not as hyperbolic as this.

"his desired policy of suppressing public discussion of nutrition issues with plant-exclusive diets will prevent us from getting the information to know if problems are widespread"

In my original answers I address why this is not the case (private communication serves this purpose more naturally).

"I have a lot of respect for Soto for doing the math and so clearly stating his position that “the damage to people who implement veganism badly is less important to me than the damage to animals caused by eating them”."

As I mentioned many times in my answer, that's not the (only) trade-off I'm making here. More concretely, I consider the effects of these interventions on community dynamics and epistemics possibly even worse (due to future actions the community might or might not take) than the suffering experienced by farmed animals murdered for members of our community to consume in the present day.

"I can’t trust his math because he’s cut himself off from half the information necessary to do the calculations. How can he estimate the number of vegans harmed or lost due to nutritional issues if he doesn’t let people talk about them in public?"

Again, I addressed this in my answers, and argued that data of the kind you will obtain are still not enough to derive the conclusions you were deriving.

More generally, my concerns were about framing and about how much posts like this one can affect sensible advocacy and the ethical backbone of this community. There is indeed a trade-off here between transparent communications and communal dynamics, but that happens in all communities and ignoring it in ours is wishful thinking. It seems like none of my worries have been incorporated into the composition of this post, in which you have just doubled down on the framing. I think these worries could have been presented in a much healthier form without incurring all of those framing costs, and I think its publication is net-negative due to the latter.

How would a language model become goal-directed?

Martín Soto · 2y ago · 2

My take would be: Okay, so you have achieved that, instead of the whole LLM being an agent, it just simulates an agent. Has this gained much for us? I feel like this is (almost exactly) as problematic. The simulated agent can just treat the whole LLM as its environment (together with the outside world), and so try to game it like any agentic enough misaligned AI would: it can act deceptively so as to keep being simulated inside the LLM, try to gain power in the outside world which (if it has a good enough understanding of minimizing loss) it knows is the most useful world model (so that it will express its goals as variables in that world model), etc. That is, you have just pushed the problem one step back, and instead of the LLM-real world frontier, you must worry about the agent-LLM frontier.

Of course we can talk more empirically about how likely and when these dynamics will arise. And it might well be that the agent being enclosed in the LLM, facing one further frontier between itself and real-world variables, is less likely to arrive at real-world variables. But I wouldn't count on it, since the relationship between the LLM and the real world would seem way more complex than the relationship between the agent and the LLM, and so most of the work is gaming the former barrier, not the latter.