AlexMennen

Comments

Announcing the Buddhists in EA Group
Thus, I present to you, the Buddhists in EA Facebook group.

Dead link. It says: "Sorry, this content isn't available right now. The link you followed may have expired, or the page may only be visible to an audience you're not in."

Why I think the Foundational Research Institute should rethink its approach

My critique of analytic functionalism is that it is essentially nothing but an assertion of this vagueness.

That's no reason to believe that analytic functionalism is wrong, only that it is not sufficient by itself to answer very many interesting questions.

Without a bijective mapping between physical states/processes and computational states/processes, I think my point holds.

No, it doesn't. I only claim that most physical states/processes have only a very limited collection of computational states/processes that they can reasonably be interpreted as. I don't claim that every physical state/process has exactly one computational state/process that it can reasonably be interpreted as, and certainly not that every computational state/process has exactly one physical state/process that can reasonably be interpreted as it. Those are totally different things.

it feels as though you're pattern-matching me to IIT and channeling Scott Aaronson's critique of Tononi

Kind of. But to clarify, I wasn't trying to argue that there will be problems with the Symmetry Theory of Valence that derive from problems with IIT. When I first heard about IIT, before Scott Aaronson wrote the blog post demonstrating this, I already figured that there were probably trivial counterexamples to the claim that Phi measures consciousness, and that I could perhaps come up with one if I thought about the formula enough. So although I used that critique of IIT as an example, I was mainly going off of intuitions I had prior to it. I can see why this kind of very general criticism from someone who hasn't read the details could be frustrating, but I don't expect I'll look into it enough to say anything much more specific.

I mention all this because I think analytic functionalism - which is to say radical skepticism/eliminativism, the metaphysics of last resort - only looks as good as it does because nobody’s been building out any alternatives.

But people have tried developing alternatives to analytic functionalism.

Why I think the Foundational Research Institute should rethink its approach

That said, I do think theories like IIT are at least slightly useful insofar as they expand our vocabulary and provide additional metrics that we might care a little bit about.

If you expanded on this, I would be interested.

Why I think the Foundational Research Institute should rethink its approach

Speaking of the metaphysical correctness of claims about qualia sounds confused, and I think precise definitions of qualia-related terms should be judged by how useful they are for generalizing our preferences about central cases. I expect that any precise definition for qualia-related terms that anyone puts forward before making quite a lot of philosophical progress is going to be very wrong when judged by usefulness for describing preferences, and that the vagueness of the analytic functionalism used by FRI is necessary to avoid going far astray.

Regarding the objection that shaking a bag of popcorn can be interpreted as carrying out an arbitrary computation, I'm not convinced that this is actually true, and I suspect it isn't. It seems to me that the interpretation would have to be doing essentially all of the computation itself, and it should be possible to make precise the sense in which brains and computers simulating brains carry out a certain computation that waterfalls and bags of popcorn don't. The defense of this objection that you quote from McCabe is weak; the uncontroversial fact that many slightly different physical systems can carry out the same computation does not establish that an arbitrary physical system can be reasonably interpreted as carrying out an arbitrary computation.

I think the edge cases that you quote Scott Aaronson bringing up are good ones to think about, and I do have a large amount of moral uncertainty about them. But I don't see these as problems specific to analytic functionalism. These are hard problems, and the fact that some more precise theory about qualia may be able to easily answer them is not a point in favor of that theory, since wrong answers are not helpful.

The Symmetry Theory of Valence sounds wildly implausible. There are tons of claims that people put forward, often contradicting other such claims, that some qualia-related concept is actually some other simple thing. For instance, I've heard claims that goodness is complexity and that what humans value is increasing complexity. Complexity and symmetry aren't quite opposites, but they're certainly anti-correlated, and both theories can't be right.

These sorts of theories never end up getting empirical support, although their proponents often claim to have it. For example, proponents of Integrated Information Theory often cite the fact that the cerebrum has a higher Phi value than the cerebellum as support for the hypothesis that Phi is a good measure of the amount of consciousness a system has, as if comparing two data points were enough to support such a claim. It turns out that large regular rectangular grids of transistors, and the operation of multiplication by a large Vandermonde matrix, both have arbitrarily high Phi values, yet the claim that Phi measures consciousness still survives and claims empirical support despite this damning disconfirmation. And I think the "goodness is complexity" people also provided examples of good things that they thought they had established are complex, and of bad things that they thought they had established are not.

I know this sounds totally unfair, but I won't be at all surprised if you claim to have found substantial empirical support for your theory, and I still won't take it at all seriously if you do, because any evidence you cite will inevitably be highly dubious. The heuristic that claims of the form "a qualia-related concept is really some other simple thing" are wrong, and that claims of empirical support for them never hold up, seems to be pretty well supported. I am almost certain that there are trivial counterexamples to the Symmetry Theory of Valence, even though you may have developed a theory sophisticated enough to avoid the really obvious failure modes, like claiming that a square experiences more pleasure and less suffering than a rectangle because its symmetry group is twice as large.
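To spell out the arithmetic behind that last example: an axis-aligned square centered at the origin has 8 rigid symmetries (the dihedral group D4), while a non-square rectangle has only 4 (the Klein four-group). A rough brute-force check, purely illustrative, with vertex coordinates chosen just for convenience:

import itertools

def count_symmetries(vertices):
    # Count the rigid motions of the plane, among the 8 signed coordinate
    # permutations (rotations and reflections fixing the origin), that map
    # the vertex set onto itself.
    vertex_set = set(vertices)
    count = 0
    for swap in (False, True):  # optionally swap x and y
        for sx, sy in itertools.product((1, -1), repeat=2):  # sign flips
            def transform(p, swap=swap, sx=sx, sy=sy):
                x, y = (p[1], p[0]) if swap else (p[0], p[1])
                return (sx * x, sy * y)
            if {transform(v) for v in vertices} == vertex_set:
                count += 1
    return count

square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]       # illustrative coordinates
rectangle = [(2, 1), (2, -1), (-2, 1), (-2, -1)]
print(count_symmetries(square))     # 8 -- dihedral group D4
print(count_symmetries(rectangle))  # 4 -- Klein four-group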

My current thoughts on MIRI's "highly reliable agent design" work

There's a strong possibility, even in a soft takeoff, that an unaligned AI would not act in an alarming way until after it achieves a decisive strategic advantage. In that case, the fact that it takes the AI a long time to achieve a decisive strategic advantage wouldn't do us much good, since we would not pick up an indication that anything was amiss during that period.

Reasons an AI might act in a desirable manner before but not after achieving a decisive strategic advantage:

Prior to achieving a decisive strategic advantage, the AI relies on cooperation with humans to achieve its goals, which provides an incentive not to act in ways that would result in it getting shut down. An AI may be capable of following these incentives well before achieving a decisive strategic advantage.

It may be easier to give an AI a goal system that aligns with human goals in familiar circumstances than to give it one that aligns with human goals in all circumstances. An AI with such a goal system would act in ways that align with human goals while it has little optimization power, but in ways that are not aligned with human goals once it has sufficiently large optimization power. It may attain that much optimization power only after achieving a decisive strategic advantage (or before achieving one, but after acquiring the ability to behave deceptively, as in the previous reason).

What Should the Average EA Do About AI Alignment?

5) Look at the MIRI and 80k AI Safety syllabus, and see how much of it looks like something you'd be excited to learn. If applicable to you, consider diving into that so you can contribute to the cutting edge of knowledge. This may make most sense if you do it through

...

Semi-regular Open Thread #35

Do any animal welfare EAs have anything to say on animal products from ethically raised animals, and how to identify such animal products? It seems plausible to me that consumption of such animal products could even be morally positive on net, if the animals are treated well enough to have lives worth living, and raising them does not reduce wild animal populations much more than the production of non-animal-product substitutes. Most animal welfare EAs seem confident that almost all animals raised for the production of animal products do not live lives worth living, and that most claims by producers that their animals are treated well are false. However, there are independent organizations (e.g. the Cornucopia Institute's egg and dairy scorecards) that agree that such claims are often false, but also claim to be able to identify producers that do treat their animals well. Thoughts?

Lunar Colony

One thing to keep in mind is that we currently don't have the ability to create a space colony that can sustain itself indefinitely. So pursuing a strategy of creating a space colony in case human life on Earth is destroyed should probably look like capacity-building toward an indefinitely self-sustaining space colony, rather than just creating a space colony.

A new reference site: Effective Altruism Concepts

Even though the last paragraph of the expected value maximization article now says that it's talking about the VNM notion of expected value, the rest of the article still seems to be talking about the naive notion of expected value that is linear with respect to things of value (in the examples given, years of fulfilled life). This makes the last paragraph seem pretty out of place in the article.

Nitpicks on the risk aversion article: "However, it seems like there are fewer reasons for altruists to be risk-neutral in the economic sense" is a confusing way of starting a paragraph about how it probably makes sense for altruists to be close to economically risk-neutral as well. And I'm not sure what "unless some version of pure risk-aversion is true" is supposed to mean.
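To make concrete the distinction these two nitpicks turn on (my own notation, not the article's): if X_a denotes the amount of value produced by action a, measured in something like years of fulfilled life, then the two notions recommend different objectives:

\[
\text{naive expected value: } \max_a \, \mathbb{E}[X_a]
\qquad\qquad
\text{VNM expected utility: } \max_a \, \mathbb{E}[u(X_a)]
\]

for some utility function u. Risk-neutrality in the economic sense is the special case where u is affine, u(x) = ax + b with a > 0, in which case the two criteria rank options identically.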
