Donald Hobson

43 karma · Joined Sep 2019

Posts
1

Comments
23

The problem with "anthropomorphic AI" approaches is

  1. The human mind is complicated and poorly understood.
  2. Safety degrades fast with respect to errors. 

Let's say you are fairly successful. You produce an AI that is really, really close to the human mind in the space of all possible minds. A mind that wouldn't be particularly out of place in a mental institution. They can produce paranoid ravings about the shapeshifting lizard conspiracy millions of times faster than any biological human.

Ok, so you make them a bit smarter. The paranoid conspiracies get more complicated and somewhat more plausible. But at some point they are sane enough to attempt AI research and produce useful results, while their alignment plan is still totally insane.

In order to be useful, the anthropomorphic AI approach needs to do more than make an AI that thinks similarly to humans. It needs to be able to target the more rational, smart and ethical portion of mind space.

Humans can chuck the odd insane person out of the AI labs. Sane people are more common and tend to think faster. A team of humans can stop any one of their number from crowning themselves world king.

In reality, I think your anthropomorphic AI approach gets you an AI that is arguably kind of humanlike in some ways, and that takes over the world, because it didn't resemble the right parts of the right humans, in the right ways, closely enough in the places where it matters.

If that B-52 had exploded, the death toll would probably have been smaller than that of the Hiroshima bomb. (It landed in a random bit of countryside, not a city.) Unless you think that accident would have triggered all-out nuclear war? Sure, it would have had a large impact on American politics, quite possibly leading to total nuclear disarmament, at least of America. But no anthropic or fine-tuning argument applies.

Suppose we were seeing lots of "near misses": near misses of events that, had they happened, would have destroyed a random American town. Clearly that isn't anthropic effects or anything similar. I would guess something about nuclear safety engineers being more or less competent in various ways. Or nuclear disarmament supporters in high places who want lots of near-miss scares. Or the bombs are mostly duds, but the government doesn't want to admit this.

Let's imagine a toy model where we only care about current people. It's still possible for most of the potential QALYs to be far in the future (if lifetimes are very long, or quality is very high, or both). So in this model, almost all the QALYs come from currently existing people living for a trillion years in an FAI utopia.

So let's suppose there are two causes, AI safety and preventing nuclear war. Nuclear war will happen unless prevented, and will lead to half as many current people reaching ASI (either it kills them directly, or it delays the development of ASI). Let the QALYs of (no nukes, FAI) be X.

Case 1) Currently P(FAI)=0.99, and AI safety research will increase that to P(FAI)=1. If we work on AI safety, a nuclear war happens, half the people survive to ASI, and we get U=0.5X. If we work on preventing nuclear war, the AI is probably friendly anyway, so U=0.99X.

Case 2) Currently P(FAI)=0, but AI safety research can increase that to P(FAI)=0.01. Then if we prevent nuclear war, we get practically 0 utility, and if we work on AI safety, we get U=0.005X: a 1% chance of the 50% of survivors living in a post-FAI utopia.
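To make the arithmetic explicit, here is a minimal sketch in Python (the probabilities and the 50% survival fraction are just the toy numbers from the two cases above):

```python
# Toy model: expected utility of a cause choice, in units of X
# (the QALYs of the "no nukes, FAI" outcome).

def expected_utility(p_fai, survival_fraction):
    """P(friendly AI) times the fraction of current people who reach ASI."""
    return p_fai * survival_fraction

# Case 1: alignment is nearly solved already.
u_ai_safety = expected_utility(p_fai=1.00, survival_fraction=0.5)  # nukes happen -> 0.5X
u_nukes     = expected_utility(p_fai=0.99, survival_fraction=1.0)  # nukes prevented -> 0.99X
print(u_ai_safety, u_nukes)  # 0.5 vs 0.99: preventing nuclear war wins

# Case 2: alignment is very unlikely even with effort.
u_ai_safety = expected_utility(p_fai=0.01, survival_fraction=0.5)  # 0.005X
u_nukes     = expected_utility(p_fai=0.00, survival_fraction=1.0)  # ~0X
print(u_ai_safety, u_nukes)  # 0.005 vs 0: AI safety wins
```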

Of course, this is assuming all utility ultimately resides in a post-ASI utopia, as well as not caring about future people. If you put a substantial fraction of utility on pre-ASI world states, then the calculation is different (either by being really pessimistic about the chance of alignment, or by applying some form of time discounting so you don't care too much about the far future of existing people either).

I think sentience is purely computational; it doesn't matter what the substrate is. Suppose you are asleep. I toss a coin: heads, I upload your mind into a highly realistic virtual copy of your room; tails, I leave you alone. Now I offer you some buttons that switch the paths of various trolleys in various real-world trolley problems, with a dependency on the coin flip. Say if you are real, pressing the red button gains 2 util, and if you are virtual, pressing costs 3 util. As you must (by the assumption that the simulation is accurate) make the same decision in reality and virtuality, then to get max util, you must act as if you are uncertain.
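Spelling that out (a minimal sketch; the fair coin and the +2 / -3 utilities are the ones stipulated above):

```python
# One decision has to cover both branches of the coin flip, so the best
# you can do is maximise expected utility over the 50/50 mixture.
p_real = 0.5
u_press = p_real * 2 + (1 - p_real) * (-3)  # +2 if real, -3 if virtual
u_dont_press = 0.0

print(u_press, u_dont_press)  # -0.5 vs 0.0: you don't press, behaving exactly
                              # as if you were genuinely 50/50 uncertain
                              # about being the upload.
```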

"I have no idea if I'm currently sentient or not" is a very odd thing to say. 

Maybe it is chemical structure. Maybe a test tube full of just dopamine and nothing else is ever so happy as it sits forgotten on the back shelves of a chemistry lab. Isn't it convenient that the sentient chemicals are full of carbon and get on well with human biochemistry? Like, what if all the sentient chemical structures contained americium? No one would have been sentient until the nuclear age, and people could make themselves a tiny bit sentient at the cost of radiation poisoning.

"But, humans could suffer digital viruses, which could be perhaps worse than the biological ones." Its possible for the hardware to get a virus, like some modern piece of malware, that just happens to be operating on a computer running a digital mind. Its possible for nasty memes to spread.  But in this context we are positing a superintelligent AI doing the security, so neither of those will happen.

Fixing digital minds is easier than fixing chemical minds, for roughly the same reason that fixing digital photos is easier than fixing chemical ones. With chemical photos, often you have a clear idea what you want to do (just make this area lighter), yet doing it is difficult. With chemical minds, sometimes you have a clear idea what you want to do (just reduce the level of this neurotransmitter), yet doing it is hard.

 

"But then, how would you differentiate a digital virus from an interaction, if both would change some aspects of the code or parameters?" If those words describe a meaningful difference, then there must be some way to tell. We are positing a superintelligence with total access to every bit flipped, so yes it can tell. "how can you tell between pictures of cats and dogs when they are both just grids of variously coloured pixels?"

 

"Ah, I think there is nothing beyond 'healthy.' Once one is unaffected by external and internal biological matters, they are healthy." 

Sure. But did you define "healthy" in a way that agrees with this? And wouldn't mind uploading reduce the chance of getting cancer in the future? The AI has no reason not to apply whatever extreme tech it can to reduce the chance of you ever getting ill by another 0.0001%.

I don't think the human difficulty of imagining what super-healthy is is the reason the AI needs nanobots. A person who is, say, bulletproof is easy to imagine, and probably not achievable with just good nutrition, but is achievable with nanobots. The same goes for biology that is virus-proof, cancer-proof, etc.

I can imagine mind uploading quite easily. 

There may be some "super-healthy" so weird and extreme that I can't imagine it. But there is already a bunch of weird extreme stuff I can imagine. 

In a competitive market, the investors make a tiny amount of profit. Suppose widgets cost $100 to make, and all widget buyers always choose the cheapest widget. If you start selling them for $200, someone else can undercut you by selling them for $180. The only equilibrium is selling widgets for $102 or something: just enough to make a slim profit, but not enough for anyone to try undercutting you. Of course, if you and you alone have a way to magic widgets out of nothing, then you can make fat profits. This is roughly how a lot of markets work. It's why there are commodity prices, and why those making commodities have slim profit margins.
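A toy illustration of that undercutting dynamic (hypothetical numbers, assuming buyers always pick the cheapest seller and a seller only undercuts while doing so still leaves them above cost):

```python
# Sellers keep undercutting the lowest price as long as there is still
# room to do so profitably.
cost, price, step = 100.0, 200.0, 1.0

while price - step > cost:  # someone can still profitably undercut...
    price -= step           # ...so they do

print(price)  # settles at 101.0: a slim margin, too thin to undercut further
```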

Note that it is easy for investors to lose money by being stupid. And they can potentially turn a large profit if they are a unique source of exceptionally good info. They just can't turn a large profit in a large pool of similarly competent investors.

The investors have skin in the game, and as long as the investors maximize money, the system should work. The problem is that if people are prepared to burn their own money, they can burn other people's money with it. (Ultimately meaning some charities get less funding if the silly money gets added in the wrong place than they would if the silly-money people had just burned their cash at home.)

"I'd not like to ingest nanobots which would be something like a worm infection but worse!"

For a huge range of goals, the optimum answer involves some kind of nanobot (unless even deeper magic tech exists). If you want a person to be healthy, nanobots can make them healthier than any good nutrition can.

The idea I was getting at is that asking an AI for better nutrition, meant the way you mean it, is greatly limiting the options for what you actually want. Suppose you walk a lot to get to places, and your shoes are falling apart. You ask the AI for new shoes, when it could have given you a fancy car. By limiting the AI to "choice of food" rather than "choice of every arrangement of atoms allowed by physics" you are greatly reducing the amount the AI can optimize your health. 

"You know, as far as seeing ourselves on a path to doom, I don't see why development of a superintelligent rogue AI isn't treated like development of a superweapon."

 

Because distinguishing it from benign, non-superintelligent AI is really hard.

So you are the FBI, and you have a big computer running some code. You can't tell if it's a rogue superintelligence or the next DALL-E by looking at the outputs; a rogue superintelligence will trick you until it's too late. Once it's run at all on a computer that isn't in a sandboxed bunker, it's probably too late. So you have to notice people writing code, and read that code before it's run. There are many smart people writing code all the time, and that code is often illegible spaghetti. Maybe the person writing the code will know, or at least suspect, that it might be a rogue superintelligence. Maybe not.

Lots of computer scientists are in practice rushing to develop self-driving cars, the next GPT, all sorts of AI services. The economic incentive is strong.

1) The subsidy market is a buffer of money, not a long-term source. R should be adjusted so that the amounts of money flowing in and flowing out are the same.

3) I was trying to generalize a Vickrey auction, but I think I messed that up. Just use whatever kind of auction you feel like.

5) If the amount raised is less than the cost bar, the project can try to muddle through with the money they have, or can refund the investors and cancel the thing.

7) Yes, and scaled down to reverse the scaling up.

None of the examples illustrate the investors making positive returns. The scheme is deliberately set up so that, in the limit of ideal markets, the investors make nothing. In practice the investors would probably make something, but hopefully not much. The investors sometimes win, but it's counterbalanced by a loss.

This scheme should pay any project that produces at least the threshold level of utils per dollar, and not pay any other projects, always selecting the highest utils-per-dollar variant if there are multiple variants. If we assume project managers are paid a fixed fair salary, there is nothing to threaten them with, unless you want to threaten to cut off funding to an effective charity because you think it could be even more effective.

This lack of incentive (so long as your utils per dollar are above some threshold) is what would happen in a regular retroactive system where all shares are sold, or where you didn't in practice care how much "credit" you ended up with (because you couldn't turn that credit into money).

Giving the project manager 1% of the shares would produce some incentive in the right direction. Or you could just hope they are all altruists. 

I was kind of thinking of the subsidy market more as a currency exchange.

You could replace the scaled-down and scaled-up money with two different kinds of crypto token.

The whole point of the subsidy market is to move money from the easy wins with loads of money to the marginal cases that are just about worth doing, while leaving the no-hopers behind.
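To make the currency-exchange framing concrete, here is one possible toy reading in Python. This is my own illustrative sketch, not the mechanism as specified in the original post: the class name, the single rate R, and the balancing rule are all assumptions.

```python
class SubsidyExchange:
    """Toy 'currency exchange' between money paid in by well-funded easy
    wins and scaled-up payouts to marginal projects. R is the exchange
    rate; it is nudged so that inflow and outflow balance over time.
    (Illustrative only; details may differ from the original proposal.)"""

    def __init__(self, rate=1.2):
        self.rate = rate      # R: how much marginal funding is scaled up
        self.inflow = 0.0     # money paid into the buffer
        self.outflow = 0.0    # subsidy paid out of the buffer

    def pay_in(self, amount):
        """An easy win contributes money to the buffer."""
        self.inflow += amount

    def subsidise(self, base_funding):
        """A marginal project's funding is scaled up by R; the extra
        (R - 1) * base_funding comes out of the buffer."""
        extra = (self.rate - 1.0) * base_funding
        self.outflow += extra
        return base_funding * self.rate

    def adjust_rate(self, k=0.05):
        """Nudge R toward the level where money in matches money out."""
        if self.outflow > self.inflow:
            self.rate *= 1 - k   # buffer draining: be less generous
        elif self.inflow > self.outflow:
            self.rate *= 1 + k   # buffer filling: be more generous
```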

Well, if you have it, I'll take it. In the general scenario, a very powerful benevolent AI is left to do whatever it thinks is best. If the AI decides that freedom is one of humans' top values, it will try to make the world better while optimizing for human freedom. Giving humans more freedom in practice than the typical government does is not a particularly high bar. Of course, plenty of people might want the AI micromanaging every detail of their life, and the AI will do a really good job of it. But I would think that, ideally, freedom should be there for those who want it.

 

It's also worth noting that there is a fairly common belief that we are on a path to probable doom, and that any AI that offered anything better than paperclips is worth taking. So, even if your AI was much too controlling and humans would prefer a less controlling one, many EAs would say "best AI we are going to get".
