Joseph Miller

347 karma · Joined July 2022

Comments (14)
EA promoted earning to give. When the movement largely moved away from it, not enough work was done to make that distance.

Why would we want to do that? Earning to give is a good way to help the world. Maybe not the best, but still good.

It's also worth remembering that this is advertising. Claiming to be a little bit better on some cherry-picked metrics a year after GPT-4 was released is hardly a major accelerant in the overall AI race.

Fair point. On the other hand, the perception is in many ways more important than the actual capability in terms of incentivizing competitors to race faster.

Also, based on early user reports, it seems to actually be noticeably better than GPT-4.

Yes, I think this is a reasonable response. However, it seems to rest on the assumption that just trying a bit harder at safety makes a meaningful difference. If alignment is very hard, then Anthropic's AIs are just as likely to kill everyone as other labs'. It seems very unclear whether having "safety-conscious" people at the helm will make any difference to our chance of survival, especially when they are almost always forced to make the exact same decisions as people who are not safety-conscious in order to stay at the helm.

Even if they are right that it is important to stay in the race, what Anthropic should be doing is:

  • Calling for governments to enforce a worldwide Pause, such that they can stop racing towards superintelligence without worrying about other labs getting ahead.
  • Trying to agree with other labs to decelerate race dynamics.
  • Warning politicians and the public that automation of all office jobs may be just around the corner.
  • Setting out their views as to how politics works in a world with superintelligence.
  • Declaring in advance what would compel them to consider AIs as moral patients.

All of which they could do while continuing to compete in the race. RSPs are nice, but not sufficient.

People are clearly using agree/disagree voting wrong. What does it mean to agree-vote a question?

Was there some blocker that caused this to happen now, rather than 6 months / 1 year ago?

I mean something like "the scenario where there is no pause and also no other development that currently seems very unlikely and changes the level of risk dramatically (e.g. a massive breakthrough in human brain emulation next year)."

Specifically, we are looking to use cost-effective Internet messaging tools to communicate the evidence that disempowering AI poses serious dangers (to economic livelihoods and personal safety) for people of every industry, for people of every country, and for humanity as a whole.

Thanks for clarifying; I can see why you'd want to make your mission statement broad enough to encompass future activity.

What "cost-effective Internet messaging tools" do you imagine you will be using in the near future?

We use evidence-based outreach to inform people of the threats that advanced AI poses to their economic livelihoods and personal safety (HOW). Our mission is to create a united front for humanity, driving national and international coordination on robust solutions to AI-driven disempowerment (WHAT).

I'm not sure if this was the aim of the mission statement, but after reading this I still do not know what StakeOut.AI does in a concrete way.

Is there a risk that Mustafa's company could speed up the race towards dangerous capabilities?

Disheartening to hear a pretty weak answer to this critical question. Analysis of his answer:

First, I think the primary threat to the stability of the nation-state is not the existence of these models themselves, or indeed the existence of these models with the capabilities that I mentioned. The primary threat to the nation-state is the proliferation of power.

I'm really not sure what this means, and I'm surprised Rob didn't follow up on it. I think he must mean that they won't be open-sourcing the weights, which is certainly good. However, it's unclear how much this matters if the model is available to call from an API. The argument may be that other actors can't fine-tune the model to remove the guardrails they have put in place to make the model completely safe. I was impressed to hear his claim about jailbreaks later on:

It isn’t susceptible to any of the jailbreaks or prompt hacks, any of them. If anybody gets one, send it to me on Twitter.

Although strangely he also said:

it doesn’t generate code;

This is trivial to disprove, so I'm not sure what he meant by it. Regardless, I think that providing API access to a model distributes a lot of the "power" of the model to everyone in the world.

I’m not in the AGI intelligence explosion camp that thinks that just by developing models with these capabilities, suddenly it gets out of the box, deceives us, persuades us to go and get access to more resources, gets to inadvertently update its own goals.

There hasn't ever been any very solid rebuttal of the intelligence explosion argument. It mostly gets dismissed on the basis of sounding like sci-fi. You can make a good argument that dangerous capabilities will emerge before we reach this point, and that we may have a "slow take-off" in that sense. However, it seems to me that we should expect recursive self-improvement to happen eventually, because there is no fundamental reason why it isn't possible and it would clearly be useful for achieving any task. So the question is whether it will start before or after TAI. It's pretty clear that no one knows the answer to this question, so it's absurd to be gambling the future of humanity on this point.

Me not participating certainly doesn’t reduce the likelihood that these models get developed.

The AI race currently consists of a small handful of companies. A CEO who was actually trying to minimize the risk of extinction would at least attempt to coordinate a deceleration among these four or five actors before dismissing it as a hopeless tragedy of the commons.

Related: Advantages of Cutting Your Salary

In all seriousness, I think this is a good point.
