Slightly against aligning with neo-luddites

Matthew_Barnett

To summarize,

When considering whether to delay AI, the choice before us is not merely whether to accelerate or decelerate the technology. We can choose what type of regulations are adopted, and some options are much better than others.
Neo-luddites do not fundamentally share our concern about AI x-risk. Thus, their regulations will probably not, except by coincidence, be the type of regulations we should try to install.
Adopting the wrong AI regulations could lock us into a suboptimal regime that may be difficult or impossible to leave. So we should likely be careful not to endorse a proposal because it's "better than nothing" unless it's also literally the only chance we get to regulate AI.
In particular, arbitrary data restrictions risk preventing researchers from having access to good data that might help with alignment, potentially outweighing the (arguably) positive effect of slowing down AI progress in general.

It appears we are in the midst of a new wave of neo-luddite sentiment.

Earlier this month, digital artists staged a mass protest against AI art on ArtStation. A few people are reportedly already getting together to hire a lobbyist to advocate more restrictive IP laws around AI generated content. And anecdotally, I've seen numerous large threads on Twitter in which people criticize the users and creators of AI art.

Personally, this sentiment disappoints me. While I sympathize with the artists who will lose their income, I'm not persuaded by the general argument. The value we could get from nearly free, personalized entertainment would be truly massive. In my opinion, it would be a shame if humanity never allowed that value to be unlocked, or restricted its proliferation severely.

I expect most LessWrong readers to agree with me on this point — that it is not worth sacrificing a technologically richer world just to protect workers from losing their income. Yet there is a related view that I have recently heard some of my friends endorse: that nonetheless, it is worth aligning with neo-luddites, incidentally, in order to slow down AI capabilities.

On the most basic level, I think this argument makes some sense. If aligning with neo-luddites simply means saying "I agree with delaying AI, but not for that reason" then I would not be very concerned. As it happens, I am sympathetic to many of the arguments in Katja Grace's recent post about delaying AI in order to ensure existential AI safety.

Yet I worry that some people intend their alliance with neo-luddites to extend much further than this shallow rejoinder. I am concerned that people might work with neo-luddites to advance their specific policies, and particular means of achieving them, in the hopes that it's "better than nothing" and might give us more time to solve alignment.

In addition to possibly being mildly dishonest, I'm quite worried such an alliance will be counterproductive on separate, purely consequentialist grounds.

If we think of AI progress as a single variable that we can either accelerate or decelerate, with other variables held constant upon intervention, then I agree it could be true that we should do whatever we can to impede the march of progress in the field, no matter what that might look like. Delaying AI gives us more time to reflect, debate, and experiment, which prima facie, I agree, is a good thing.

A better model, however, is that there are many factor inputs to AI development. To name the main ones: compute, data, and algorithmic progress. To the extent we block only one avenue of progress, the others will continue. Whether that's good depends critically on the details: what's being blocked, what isn't, and how.

One consideration, which has been pointed out by many before, is that blocking one avenue of progress may lead to an "overhang" in which the sudden release of restrictions leads to rapid, discontinuous progress, which is highly likely to increase total AI risk.

But an overhang is not my main reason for cautioning against an alliance with neo-luddites. Rather, my fundamental objection is that their specific strategy for delaying AI is not well targeted. Aligning with neo-luddites won't necessarily slow down the parts of AI development that we care about, except by coincidence. Instead of aiming simply to slow down AI, we should care more about ensuring favorable differential technological development.

Why? Because the constraints on AI development shape the type of AI we get, and some types of AIs are easier to align than others. A world that restricts compute will end up with different AGI than a world that restricts data. While some constraints are out of our control — such as the difficulty of finding certain algorithms — other constraints aren't. Therefore, it's critical that we craft these constraints carefully, to ensure the trajectory of AI development goes well.

Passing subpar regulations now (the type of regulations not explicitly designed to provide favorable differential technological progress) might lock us into a bad regime. If later we determine that other, better targeted regulations would have been vastly better, it could be very difficult to switch our regulatory structure to adjust. Choosing the right regulatory structure to begin with likely allows for greater choice than switching to a different regulatory structure after one has already been established.

Even worse, the subpar regulations could even make AI harder to align.

Suppose the neo-luddites succeed, and the US Congress overhauls copyright law. A plausible consequence is that commercial AI models will only be allowed to be trained on data that was licensed very permissively, such as data that's in the public domain.

What would AI look like if it were only allowed to learn from data in the public domain? Perhaps interacting with it might feel like interacting with someone from a different era — a person from over 95 years ago, whose copyrights have now expired. That's probably not the only consequence, though.

Right now, if an AI org needs some data that they think will help with alignment, they can generally obtain it, unless that data is private. Under a different, highly restrictive copyright regime, this fact may no longer be true.

If deep learning architectures are marble, data is the sculptor. Restricting what data we're allowed to train on shrinks our search space over programs, carving out which parts of the space we're allowed to explore, and which parts we're not. And it seems abstractly important to ensure our search space is not carved up arbitrarily — in a process explicitly intended for unfavorable ends — even if we can't know now which data might be helpful to use, and which data won't be.

True, if very powerful AI is coming very soon (<5 years from now), there might not be much else we can do except for aligning with vaguely friendly groups, and helping them pass poorly designed regulations. It would be desperate, but sensible. If that's your objection to my argument, then I sympathize with you, though I'm a bit more optimistic about how much time we have left on the clock.

If very powerful AI is more than 5 years away, we will likely get other chances to get people to regulate AI from a perspective we sympathize with. Human extinction is actually quite a natural thing to care about. Getting people to regulate AI for that explicit reason just seems like a much better, and more transparent strategy. And while AI gets more advanced, I expect this possibility will become more salient in people's minds anyway.


Comments (17)


Minh Nguyen · Dec 29 2022 · 24

I upvoted this because AI-related advocacy has become a recent focus of mine. My background is from organising climate protests, and I think EAs have a bit of a blindspot when it comes to valuing advocacy. So it's good to have this discussion. However, I do disagree on a few points.

1. Just Ask: In broad strokes, I think people tend to overestimate exactly how unreasonable and persistent initial objections will be. My simplest rebuttal would be: How do you know these advocates would even disagree with your approach? An approach I'm considering now is to find a decent AI Governance policy proposal, present it to the advocates explaining how it solves their problem and see who says yes. If half of them say no, you work with the other half. Before assuming the "neo-Luddites" won't listen to reason, shouldn't you ... ask? Present them with options? I don't see why it's not at least worth reaching out to potential allies, and I don't see why it's an irredeemable sin to be angry at something with no clear solutions, when no one has presented a solution. The assumptions made here are perhaps ironic.

2. Counterfactuals: I think by most estimates, anti-AI advocacy only grows from here. Having a lot of structurally unemployed angry people is historically a recipe for trouble. You then have to consider that reactionary responses will happen regardless of whether "we align with them". If they are as persistently unreasonable as you say they are, they will force bad policy regardless. They will influence mainstream discourse towards their views, and be loud enough to crowd out our "more reasonable" views. I just think it makes a lot of sense to engage these groups early on, and make an earnest effort to make our case. Because the counterfactual is that they get bad policies passed without our input.

3. False dichotomy of advocates and researchers: I speak more generally here. In my time in climate risk, everyone had an odd fixation on separating climate advocates and researchers.^[1] I don't think this split was helpful for epistemics or strategy overall. Because then you had scientists who had all the solutions and epistemics that the public/policymakers generally ignored out of lack of engagement, and the advocates who started latching onto poorly-informed and counterproductive radical agendas, and were constantly rebutted with "why are we listening to you clueless youngsters and not the scientists (who we ignore anyway)". It was just a constant headache to have two subgroups needlessly divide themselves while the clock ran down. Like sure, the advocates were ... not the most epistemically rigorous. And the scientists generally struggled to put across their concerns. But I'd greatly prefer if everyone valued more communication/coordination, and not less.

And for my sanity's sake, I'd like the AI risk community to not repeat this dynamic.

^[1] I suspect most of this dichotomy was not made in good faith, but simply by people uncomfortable with the premise of anthropogenic climate change and throwing out fallacies to discredit any arguments they're confronted with in their daily lives.

John_Maxwell · Dec 27 2022 · 11

...their regulations will probably not, except by coincidence, be the type of regulations we should try to install.

A priori, I'd expect a randomly formulated AI regulation to be about 50% likely to be an improvement on the status quo, since the status quo wasn't selected for being good for alignment.

Adopting the wrong AI regulations could lock us into a suboptimal regime that may be difficult or impossible to leave.

I don't see good arguments supporting this point. I tend to think the opposite -- building a coalition to pass a regulation now makes it easier to pass other regulations later.

arbitrary data restrictions risk preventing researchers from having access to good data that might help with alignment

OpenAI claims that ChatGPT is an "alignment advance". I think this is way off -- the approach they're using just isn't good enough in principle. Incrementally improving on ChatGPT's "alignment" the way OpenAI is doing leads to disaster, IMO. You don't write the code for a nuclear reactor through trial and error.

If an alignment scheme doesn't work with arbitrary data restrictions, it's probably not good enough in principle. Even all the data on the internet is "arbitrarily restricted" relative to all the data that exists or could exist. If my alignment scheme fails with public domain data only, why shouldn't it also fail with all the data on the internet? (And if an alignment scheme works on public domain data only, it should be possible to add in more data to boost performance before using the AI for something mission-critical.)

I think a better argument for your conclusion is that incentivizing researchers to move away from big data approaches might make AI research more accessible and harder to control. Legal restrictions also favor open source, which has the same effect. We don't want cutting-edge AI to become something that exists on the margins, like using bittorrent for piracy.

I suspect the right compromise is for AI art companies to pay a license fee to any artist in their training data who signs up to be paid. We want to reduce company profits and decrease AI progress incentives without pushing things towards open source a bunch. I'd take the same approach for language models like GPT. I think our interests are actually fairly well-aligned with artists here, in the sense that the approach which allocates profits away from AI companies and towards artists is probably also the best approach for humanity's long-run future.

Matthew_Barnett · Dec 27 2022 · 5

A priori, I'd expect a randomly formulated AI regulation to be about 50% likely to be an improvement on the status quo, since the status quo wasn't selected for being good for alignment.

I don't agree.

It's true that the status quo wasn't selected for being good for alignment directly, but it was still selected for things that are arguably highly related to alignment. Our laws are the end result of literally thousands of years of experimentation, tweaking, and innovation in the face of risks. In that time, numerous technologies and threats have arisen, prompting us to change our laws and norms to adapt.

To believe that a literal random change to the status quo has a 50% chance of being beneficial, you'd likely have to believe that AI is so radically outside the ordinary reference class of risks that it is truly nothing whatsoever like we have ever witnessed or come across before. And while I can see a case for AI being highly unusual, I don't think I'd be willing to go that far.

I don't see good arguments supporting this point. I tend to think the opposite -- building a coalition to pass a regulation now makes it easier to pass other regulations later.

Building a coalition now makes it easier to pass other similar regulations later, but it doesn't necessarily make it easier to switch to an entirely different regulatory regime.

Laws and their associated bureaucracies tend to entrench themselves. Suppose that as a result of neo-luddite sentiment, the people hired to oversee AI risks in the government concern themselves only with risks to employment, ignoring what we'd consider to be more pressing concerns. I think it would be quite a lot harder to fire all of them and replace them with people who care relatively more about extinction, than to simply hire right-minded people in the first place.

If an alignment scheme doesn't work with arbitrary data restrictions, it's probably not good enough in principle. Even all the data on the internet is "arbitrarily restricted" relative to all the data that exists or could exist. If my alignment scheme fails with public domain data only, why shouldn't it also fail with all the data on the internet?

I think it might be worth quoting Katja Grace from a few days ago,

My weak guess is that there’s a kind of bias at play in AI risk thinking in general, where any force that isn’t zero is taken to be arbitrarily intense. Like, if there is pressure for agents to exist, there will arbitrarily quickly be arbitrarily agentic things. If there is a feedback loop, it will be arbitrarily strong. Here, if stalling AI can’t be forever, then it’s essentially zero time. If a regulation won’t obstruct every dangerous project, then it is worthless. Any finite economic disincentive for dangerous AI is nothing in the face of the omnipotent economic incentives for AI. I think this is a bad mental habit: things in the real world often come down to actual finite quantities.

Likewise, I think actual quantities of data here might matter a lot. I'm not confident at all that arbitrarily restricting 98% of the supply of data won't make the difference between successful and unsuccessful alignment, relative to allowing the full supply of data. I do lean towards thinking it won't make that difference, but my confidence is low, and I think it might very easily come down to the specific details of what's being allowed and what's being restricted.

On the other hand, I'm quite convinced that, abstractly, it is highly implausible that arbitrarily limiting what data researchers have access to will be positive for alignment. This consideration leaves me on the side of caution, and inclines me to say we should probably not put in place such arbitrary restrictions.

I think a better argument for your conclusion is that incentivizing researchers to move away from big data approaches might make AI research more accessible and harder to control. Legal restrictions also favor open source, which has the same effect. We don't want cutting-edge AI to become something that exists on the margins, like using bittorrent for piracy.

I think that's also true. It's a good point that I didn't think to put into the post.

John_Maxwell · Dec 27 2022 · 3

Our laws are the end result of literally thousands of years of experimentation

The distribution of legal cases involving technology over the past 1000 years is very different than the distribution of legal cases involving technology over the past 10 years. "Law isn't keeping up with tech" is a common observation nowadays.

a literal random change to the status quo

How about we revise to "random viable legislation" or something like that. Any legislation pushed by artists will be in the same reference class as the "thousands of years of experimentation" you mention (except more recent, and thus better adapted to current reality).

AI is so radically outside the ordinary reference class of risks that it is truly nothing whatsoever like we have ever witnessed or come across before

Either AI will be transformative, in which case this is more or less true, or it won't be transformative, in which case the regulations matter a lot less.

Suppose that as a result of neo-luddite sentiment, the people hired to oversee AI risks in the government concern themselves only with risks to employment, ignoring what we'd consider to be more pressing concerns.

If we're involved in current efforts, maybe some of the people hired to oversee AI risks will be EAs. Or maybe we can convert some "neo-luddites" to our point of view.

simply hire right-minded people in the first place

Sounds to me like you're letting the perfect be the enemy of the good. We don't have perfect control over what legislation gets passed, including this particular legislation. Odds are decent that the artist lobby succeeds even with our opposition, or that current legislative momentum is better aligned with humanity's future than any legislative momentum which occurs later. We have to think about the impact of our efforts on the margin, as opposed to thinking of a "President Matthew Barnett" scenario.

On the other hand, I'm quite convinced that, abstractly, it is highly implausible that arbitrarily limiting what data researchers have access to will be positive for alignment.

It could push researchers towards more robust schemes which work with less data.

I want a world where the only way for a company like OpenAI to make ChatGPT commercially useful is to pioneer alignment techniques that will actually work in principle. Throwing data & compute at ChatGPT until it seems aligned, the way OpenAI is doing, seems like a path to ruin.

As an intuition pump, it seems possible to me that a solution for adversarial examples would make GPT work well even when trained on less data. So by making it easy to train GPT on lots of data, we may be letting OpenAI neglect adversarial examples. We want an "alignment overhang" where our alignment techniques are so good that they work even with a small dataset, and become even better when used with a large dataset. (I guess this argument doesn't work in the specific case of safety problems which only appear with a large dataset, but I'm not sure if there's anything like that.)

Another note: I've had the experience of sharing alignment ideas with OpenAI staff. They responded by saying "what we're doing seems good enough" / not trying my idea (to my knowledge). Now they're running into problems which I believe the ideas I shared might've solved. I wish they'd focus more on finding a solid approach, and less on throwing data at techniques I view as subpar.

Remmelt · Jan 31 2023 · 9

A friend in AI Governance just shared this post with me.

I was blunt in my response, which I will share below:

~ ~ ~

Two cruxes for this post:

Is aligning AGI to be long-term safe even slightly possible – practically given default AI scaled training and deployment trends and complexity of the problem (see Yudkowsky’s list of AGI lethalities) or theoretically given strict controllability limits (Yampolskiy) and uncontrollable substrate-needs convergence (Landry)?

If clearly, pre-aligning AGI to not cause a mass extinction is not even slightly possible, then IMO splitting hairs about “access to good data that might help with alignment” is counterproductive.

Is a “richer technological world” worth the extent to which corporations are going to automate away our ability to make our own choices (starting with our own data), the increasing destabilisation of society, and the toxic environmental effects of automating technological growth?

These are essentially rhetorical questions, but they cover the points I would ask someone who proposes desisting from collaborating with other groups who notice related harms and risks of corporations scaling AI.

To be honest, the reasoning in this post seems rather motivated without examination of underlying premises.

These sentences particularly:

“A world that restricts compute will end up with different AGI than a world that restricts data. While some constraints are out of our control — such as the difficulty of finding certain algorithms — other constraints aren't. Therefore, it's critical that we craft these constraints carefully, to ensure the trajectory of AI development goes well. Passing subpar regulations now — the type of regulations not explicitly designed to provide favorable differential technological progress — might lock us into bad regime.”

It assumes AGI is inevitable, and therefore we should be picky about how we constrain developments towards AGI.

It also implicitly assumes that continued corporate scaling of AI counts as positive “progress” – at least for the kind of world they imagine would result and want to live in.

The tone also comes across as uncharitable. As if they are talking down to others they have not spent time trying to listen carefully to, take the perspective of, and paraphrase back their reasoning to (at least nothing is written about/from those attempts in the post).

Frankly, we cannot be held back by motivated techno-utopian arguments from taking collective action against exponentially increasing harms and risks (in extents of the scale and local impacts). We need to work with other groups to gain traction.

~ ~ ~

Noah Scales · Dec 27 2022 · 6

You wrote

Earlier this month, digital artists staged a mass protest against AI art on ArtStation. A few people are reportedly already getting together to hire a lobbyist to advocate more restrictive IP laws around AI generated content. And anecdotally, I've seen numerous large threads on Twitter in which people criticize the users and creators of AI art.

and

Personally, this sentiment disappoints me. While I sympathize with the artists who will lose their income, I'm not persuaded by the general argument. The value we could get from nearly free, personalized entertainment would be truly massive. In my opinion, it would be a shame if humanity never allowed that value to be unlocked, or restricted its proliferation severely.

and

it is not worth sacrificing a technologically richer world just to protect workers from losing their income.

Are you arguing from principle here?

Artists' (the workers') work is being imitated by the AI tools, so cost-effectively that an artist's contributions, once public, render the artist's continuing work unnecessary to produce work in their style.

Is the addition of technology T with capability C that removes need for worker W with job role R and capability C more important than loss of income I to worker W, for all T, C, W, R, and I?

Examples of capabilities could be:

summarizing existing research work (for example, an AI Safety paper)
collecting typical data used to make predictions (for example, cost and power of compute)
monitoring new research work (for example, recent publications and their relationships, such as supporting, building on or contradicting)
hypothesizing about preconditions for new developments (for example, conditions suitable for AGI development)
developing new theories or models (for example, of AI Safety)
testing new theories or models (for example, of AI Safety)

Loss of income could be:

partial (for example, a reduction in grant money for AI Safety workers as those funds are diverted to automation projects with 10-20 year timelines)
complete (for example, replacement of 50 AI Safety workers with 10 workers that rely on semi-automated research tools)

The money allocated to workers could be spent on technology instead.

Investments in specific technologies T1, T2 with capabilities C1, C2 can start with crowd-sourcing from workers W1, W2,..., Wk, and more formal automation and annotation projects targeting knowledge K developed by workers Wk+1, ..., Wn (for example, AI Safety researchers) who do not participate in the crowd-sourcing and automation effort but whose work is accessible.

You repeatedly referred to "we" as in:

True, if very powerful AI is coming very soon (<5 years from now), there might not be much else we can do except for aligning with vaguely friendly groups, and helping them pass poorly designed regulations.

However, a consequence of automation technology is that it removes the political power (both money and responsibility) that accrued to the workers that it replaces. For example, any worker in the field of AI Safety, to the extent that her job depends on her productivity and cost-effectiveness, will lose both her income and status as the field progresses to include automation technology that can replace her capabilities. Even ad hoc automation methods (for example, writing software that monitors cost and power of compute using web-scraping and publicly available data) remove a bit of that status. In that way, the AI Safety researcher loses status among her peers and her influence on policy that her peers direct. The only power left to the researcher is as an ordinary voter in a democracy.

Dividing up and replacing the responsibilities for the capabilities Ci of an individual W1 can help an ad hoc approach involving technologies Ti corresponding to the capabilities of that worker. Reducing the association of the role with the status can dissolve the role and sometimes the job of the worker who held that role. The role itself can disappear from the marketplace, along with the interests that it represents. For example, although artists have invested many years in their own talents, skills, and style, within a year they lost their status and income to some new AI software. I think artists have cause to defend their work from AI. The artist role won't disappear from the world of human employment entirely but the future of the role has been drastically reduced and has permanently lost a lot of what gave it social significance and financial attractiveness, unless the neo-luddites can defend paid employment in art from AI.

Something similar can happen to AI Safety researchers, but will anyone object? AI Safety researcher worker capabilities and roles could be divided and dissolved into larger job roles held by fewer people with different titles, responsibilities, and allegiances over time as the output of the field is turned into a small, targeted knowledge-base and suite of tools for various purposes.

If you are in fact arguing from principle, then you have an opportunity to streamline the process of AI safety research work through efforts such as:

collecting AI Safety research work on an ongoing basis as it appears in different venues and making it publicly accessible
annotating the research work to speed up queries for common questions such as:
- what are controversies in the field, that is, who disagrees with whom about what and why?
- what is the timeline of development of research work?
- what literature addresses specific research questions (for example, on compute developments, alternative technologies, alignment approaches, specific hypotheses in the field, prediction timelines)?
- what are summaries of current work?
paying for public hosting of AI Safety information of this type as well as ad hoc tools (for example, the compute power tracker)

I'm sure you could come up with better ideas to remove AI Safety worker grant money from those workers, and commensurately benefit the cost-effectiveness of AI Safety research. I've read repeatedly that the field needs workers and timely answers. Automation seems like a requirement, or an alternative, not only to reduce the financial and time constraints on the field but also to serve its purpose effectively.

While artists could complain that AI art does a disservice to their craft and reduces the quality of art produced, I think the tools imitating those artists have developed to the point that they serve the purpose and artists know it and so does the marketplace. If AI Safety researchers are in a position to hold their jobs a little while longer, then they can assist the automation effort to end the role of AI Safety researchers and move on to other work that much sooner! I see no reason to hold you back from applying the principle that you seem to hold, though I don't hold it myself.

AI Safety research is a field that will hopefully succeed quickly and end the need for itself within a few decades. Its workers can move on, presumably to newer and better things. New researchers in the field can participate in automation efforts and then find work in related fields, either in software automation elsewhere or other areas such as service work for which consumers still prefer a human being. Supposedly the rapid deployment of AGI in business will grow our economies relentlessly and at a huge pace, so there should be employment opportunities available (or free money from somewhere).

If any workers have a reason to avoid neo-ludditism, it would have to be AI Safety researchers, given their belief in a future of wealth, opportunity, and leisure that AI helps produce. Their own unemployment would be just a blip of however long before the future they helped manifest rescues them. Or they can always find other work, right? After all they work on the very technology depriving others of work. A perfectly self-interested perspective from which to decide whether neo-ludditism is a good idea for themselves.

EDIT: sorry, I spent an hour editing this to convey my own sense of optimism and include a level of detail suitable for communicating the subtle nuances I felt deserved inclusion in a custom-crafted post of this sort. I suppose ChatGPT could have done better? Or perhaps a text processing tool and some text templates would have sped this up. Hopefully you find these comments edifying in some way.

SharmakeDec 28 20222

AI Safety research is a field that will hopefully succeed quickly and end the need for itself within a few decades. Its workers can move on, presumably to newer and better things. New researchers in the field can participate in automation efforts and then find work in related fields, either in software automation elsewhere or other areas such as service work for which consumers still prefer a human being. Supposedly the rapid deployment of AGI in business will grow our economies relentlessly and at a huge pace, so there should be employment opportunities available (or free money from somewhere).

This is the most important paragraph in a comment that I strongly agree with. Thanks for saying it.

Noah ScalesDec 29 202215

You seem to genuinely want to improve AGI Safety researcher productivity.

I'm not familiar with resources available on AGI Safety, but it seems appropriate to:

develop a public knowledge-base
fund curators and oracles of the knowledge-base (library scientists)
provide automated tools to improve oracle functions (of querying, summarizing, and relating information)
develop ad hoc research tools to replace some research work (for example, to predict hardware requirements for AGI development).
NOTE: the knowledge-base design is intended to speed up the research cycle, skipping the need for the existing hodge-podge of tools in place now

The purpose of the knowledge-base should be:

goal-oriented (for example, produce a safe AGI soon)
with a calendar deadline (for example, by 2050)
meeting specific benchmarks and milestones (for example, an "aligned" AI writing an accurate research piece at decreasing levels of human assistance)
well-defined (for example, achievement of AI human-level skills in multiple intellectual domains with benevolence demonstrated and embodiment potential present)

Let's consider a few ways that knowledge-bases can be put together:

1. the forum or wiki: what LessWrong and the EA Forum do. There's haphazard:
- tagging
- glossary-like list
- annotations
- content feedback
- minimal enforced documentation standards
- no enforced research standards
- minimal enforced relevance standards
- poor-performing search.
- WARNING: Forum posts don't work as knowledge-base entries. On this forum, you'll only find some information by the author's name if you know that the author wrote it and you're willing to search through hundreds of entries by that author. I suspect, from my own time searching with different options, that most of what's available on this forum is not read, cited, or easily accessible. The karma system does not reflect documentation, research, or relevance standards. The combination of the existing search and karma system is therefore a poor fit for a research knowledge-base.
2. the library: library scientists are trained to:
- build a knowledge-base.
- curate knowledge.
- follow content development to seek out new material.
- acquire new material.
- integrate it into the knowledgebase (indexing, linking).
- follow trends in automation.
- assist in document searches.
- perform as oracles, answering specific questions as needed.
- TIP: Library scientists could help any serious effort to build an AGI Safety knowledge-base and automate use of its services.
3. with automation: You could take this forum and add automation (either software or paid mechanical turks) to:
- write summaries.
- tag posts.
- enforce documentation standards.
- annotate text (for example, annotating any prediction statistics offered in any post or comment).
- capture and archive linked multimedia material.
- link wiki terms to their use in documents.
- verify wiki glossary meanings against meanings used in posts or comments.
- create new wiki entries as needed for new terms or usages.
- NOTE: the discussion forum format creates more redundant information rather than better citations, as well as divergence of material from any specific purpose or topic that is intended for the forum. A forum is not an ideal knowledgebase, and the karma voting format reflects trends, but the forum is a community meeting point with plenty of knowledge-base features for users to work on, as their time and interest permits. It hosts interesting discussions. Occasionally, actual research shows up on it.
4. with extreme automation: A tool like ChatGPT is unreliable and prone to errors (for example, in programming software), but when guided and treated as imperfect, it can perform in an automated workflow. For example, it can:
- provide text summaries.
- be part of automation chains that:
  - provide transcripts of audio.
  - provide audio of text.
  - provide diagrams of relationships.
  - graph data.
  - draw scenario pictures or comics.
- act as a writing assistant or editor.
- TIP: Automation is not a tool that people should only employ by choice. For example, someone who chooses to use an accounting ledger and a calculator rather than Excel is slowing down an accounting team's performance.
- CAUTION: Once AIs enter the world of high-level concept processing, their errors have large consequences for research. Their role should be to assist human tasks, as cognitive aids, not as human replacements, at least until they are treated as having potential equivalent to that of humans, and are therefore subject to the same performance requirements and measurements as humans.

Higher level analysis

The ideas behind improving cost-effectiveness of production include:

standardizing: take a bunch of different work methods, find the common elements, and describe the common elements as unified procedures or processes.
streamlining: examining existing work procedures and processes, identifying redundant or non-value-added work, and removing it from the workflow by various means.
automating: using less skilled human or faster/more reliable machine labor to replace steps of expert or artisan work.

Standardizing research is hard, but AGI Safety research seems disorganized, redundant, and slow right now. At the highest chunk level, you can partition AGI Safety development into education and research, and partition research into models and experiments.

education
research models
research experiments

The goal of the knowledge-base project is to streamline education and research of models in the AGI Safety area. Bumming around on lesswrong or finding someone's posted list of resources is a poor second to a dedicated online curated library that offers research services. The goal of additional ad hoc tools should be to automate what researchers now do as part of their model development. A further goal would be to automate experiments toward developing safer AI, but that is going outside the scope of my suggestions.

Caveats

In plain language, here are my thoughts on pursuing a project like the one I have proposed. Researchers in any field worry about grant funding, research trends, and professional reputation. Doing anything quickly is going to cross purposes with others involved, or ostensibly involved, in reaching the same goal. The more well-defined the goal, the more people will jump ship, want to renegotiate, or panic. Once benchmarks and milestones are added, financial commitments get negotiated and the threat of funding bottlenecks ripples across the project. As time goes on, the funding bottlenecks manifest, or internal mismanagement blows up the project. This is a software project, so the threat of failure is real. It is also a research project without a guaranteed outcome of either AGI Safety or AGI, adding to the failure potential. Finally, the field of AGI Safety is still fairly small and not connected to long-term income potential, meaning that researchers might abandon an effective knowledge-base project for lack of interest, perhaps claiming that the problem "solved itself" once AGI becomes mainstream, even if no AGI Safety goals were actually accomplished.

Minh NguyenDec 29 20222

Strong upvoted because this is indeed an approach I'm investigating in my work and personal capacity.

For other software fields/subfields, upskilling can be done fairly rapidly, by grinding knowledge bases with fast feedback loops. It is possible to become as good as a professional software engineer independently and in a short timeframe.

If AI Safety wants to develop its talent pool to keep up with the AI Capabilities talent pool (which is probably growing much faster than average), researchers, especially juniors, need an easy way to learn quickly and conveniently. I think existing researchers may underrate this, since they're busy putting out their own fires and finding their own resources.

Ironically, it has not been quick and convenient for me to develop this idea to a level where I'd work on it, so thanks for this.

Noah ScalesDec 29 20221

Sure. I'm curious how you will proceed.

I'm ignorant of whether AGI Safety will contribute to safe AGI or AGI development. I suspect that researchers will shift to capabilities development without much prompting. I worry that AGI Safety is more about AGI enslavement. I've not seen much defense or understanding of rights, consciousness, or sentience assignable to AGI. That betrays a lack of concern over social justice and related workers' rights issues. The only scenarios that get attention are the inexplicable "kill all humans" scenarios, but not the more obvious "the humans really mistreat us" scenarios. That is a big blindspot in AGI Safety.

I was speculating about how the research community could build a graph database of AI Safety information alongside a document database containing research articles, CC forum posts and comments, other CC material from the web, fair use material, and multimedia material. I suspect that the core AI Safety material is not that large and far far less than AI Capabilities material. The graph database could provide more granular representation of data and metadata and so a richer representation of the core material but that's an aside.

A quick experiment would be to represent a single AGI Safety article in a document database, add some standard metadata and linking, and then go further.

Here's how I'd do it:

1. take an article.
2. capture article metadata (author, date, abstract, citations, the typical stuff)
3. establish glossary word choices.
4. link glossary words to outside content.
5. use text-processing to create an article summary. Hand-tune if necessary.
6. use text-processing to create a concise article rewrite. Hand-tune if necessary.
7. Translate the rewrite into a knowledge representation language.
- begin with Controlled English.
- develop an AGI Safety controlled vocabulary. NOTE: as articles are included in the process, the controlled vocabulary can grow. Terms will need specific definition. Synonyms of controlled vocabulary words will need identification.
- combine the controlled vocabulary and the glossary. TIP: As the controlled vocabulary grows, hyperonym-hyponym relationships can be established.
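
As a rough illustration of the record-keeping those steps imply, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the class names (`Term`, `Article`, `Vocabulary`), the field names, and the placeholder definitions. It only shows one way article metadata, synonyms, and hyperonym links could be stored and queried. It does not perform the text processing or the Controlled English translation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """One controlled-vocabulary entry (hypothetical structure)."""
    name: str
    definition: str
    synonyms: set[str] = field(default_factory=set)
    hypernym: str | None = None  # broader term, if one has been assigned


@dataclass
class Article:
    """Standard article metadata plus the derived texts from the steps above."""
    title: str
    author: str
    date: str
    abstract: str = ""
    citations: list[str] = field(default_factory=list)
    summary: str = ""  # text-processing output, hand-tuned if necessary
    rewrite: str = ""  # concise rewrite, later translated to Controlled English


class Vocabulary:
    """Combined glossary and controlled vocabulary, keyed by lowercase name."""

    def __init__(self) -> None:
        self.terms: dict[str, Term] = {}

    def add(self, term: Term) -> None:
        self.terms[term.name.lower()] = term

    def canonical(self, word: str) -> str | None:
        """Map a word or one of its synonyms to the controlled term name."""
        w = word.lower()
        if w in self.terms:
            return self.terms[w].name
        for term in self.terms.values():
            if w in {s.lower() for s in term.synonyms}:
                return term.name
        return None

    def broader(self, word: str) -> list[str]:
        """Follow hypernym links upward, stopping on a missing link or a cycle."""
        chain: list[str] = []
        name = self.canonical(word)
        while name is not None:
            parent = self.terms[name.lower()].hypernym
            if parent is None or parent in chain:
                break
            chain.append(parent)
            name = self.canonical(parent)
        return chain


# Placeholder entries; the definitions are illustrative, not authoritative.
vocab = Vocabulary()
vocab.add(Term("AI system", "placeholder definition"))
vocab.add(Term("AGI", "placeholder definition",
               synonyms={"artificial general intelligence"},
               hypernym="AI system"))

print(vocab.canonical("Artificial General Intelligence"))
print(vocab.broader("agi"))
```

A graph database would store the same synonym and hyperonym links as edges. The in-memory dictionary here is only a stand-in for trying the idea on a single article.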

Once you have articles in a Controlled English vocabulary, most of the heavy lifting is done. It will be easier to query, contrast, and combine their contents in various ways.

Some article databases online already offer useful tools for browsing work, but leave it to the researcher to answer questions requiring meaning interpretation of article contents. That could change.

If you could get library scientists involved and some money behind that project, it could generate an educational resource fairly quickly. My vision does go further than educating junior researchers, but that would require much more investment, a well-defined goal, and the participation of experts in the field.

I wonder whether AI Safety is well-developed enough to establish that its purpose is tractable. So far, I have not seen much more than:

expect AGI soon
AGI are dangerous
AGI are untrustworthy
Current AI tools pose no real danger (maybe)
AGI could revolutionize everything
We should or will make AGI

The models do provide evidence of existential danger, but not evidence of how to control it. There's a downside to automation: technological unemployment; concentration of money and political power (typically); societal disruption; increased poverty. And as I mentioned, AGI are not understood in the obvious context of exploited labor. That's a worrisome condition that, again, the AGI Safety field is clearly not ready to address. Financially unattractive as it is, that is a vision of the future of AGI Safety research: a group of researchers who understand when robots and disembodied AGI have developed sentience and deserve rights.

SharmakeDec 28 20224

I agree, and I do not think that slowing down AI or speeding it up is desirable, for reasons related to Rohin Shah's comment on LW about why slowing down AI progress is not desirable:

1. It makes it easier for a future misaligned AI to take over by increasing overhangs, both via compute progress and algorithmic efficiency progress. (This is basically the same sort of argument as "Every 18 months, the minimum IQ necessary to destroy the world drops by one point.")

2. Such strategies are likely to disproportionately penalize safety-conscious actors.

(As a concrete example of (2), if you build public support, maybe the public calls for compute restrictions on AGI companies and this ends up binding the companies with AGI safety teams but not the various AI companies that are skeptical of “AGI” and “AI x-risk” and say they are just building powerful AI tools without calling it AGI.)

For me personally there's a third reason, which is that (to first approximation) I have a limited amount of resources and it seems better to spend that on the "use good alignment techniques" plan rather than the "try to not build AGI" plan. But that's specific to me.

Michael SimmJan 2 20232

While it's understandable to want to take action and implement some form of regulation in the face of rapidly advancing technology, it's crucial to ensure that these regulations are effective and aligned with our ultimate objectives.

It's possible that regulations proposed by neo-Luddites could have unintended consequences or even be counterproductive to our goals. For example, they may focus on slowing down AI progress in general, without necessarily addressing specific concerns about AI x-risk. Doing so could drive cutting-edge AI research into the black market or autocratic countries. It's important to carefully evaluate the motivations and objectives behind different regulatory proposals and ensure that they don't end up doing more harm than good.

Personally, I'd rather have a world with 200 mostly-positively-aligned research organizations than a world where only autocratic regimes and experienced coding teams that are willing to disregard the law can push the frontiers of AI.

JosephDec 29 20222

While I sympathize with the artists who will lose their income, I'm not persuaded by the general argument. The value we could get from nearly free, personalized entertainment would be truly massive. In my opinion, it would be a shame if humanity never allowed that value to be unlocked, or restricted its proliferation severely.

This seems like an area in which utilitarianism should be bounded by deontology.^[1] The reasoning here seems to be roughly "the value of this scenario is so high that I am okay with the harm it causes."

There are options other than "artists lose their income by having their work pirated/stolen, copied, and altered" and "humanity never unlocks that value." For example, paying artists for their work.

^[1] Vaguely parallel to how democracy should be bounded by liberalism: 9 people of group A voting to remove the rights of 1 group B person is democratic, which is the simplistic example of how the liberalism of protecting rights is important.

Matthew_BarnettDec 29 20225

This seems like an area in which utilitarianism should be bounded by deontology.

I'm confused because I think precisely the opposite is true.

If I were applying a deontological framework to automation, I'd perhaps first point out that there's an act/omission asymmetry in using technology. By using AI art generators, you're not actually harming anyone directly. You're just using a fun tool to generate images. What you're calling "harm" is in fact the omission of payments given to artists whose images were used during training, which deontology views quite differently.

While it's true that most deontologists believe in things like property rights, and compensation for work, I am not familiar with any deontological theory that says we are obligated to compensate people who we merely learn from, unless that is part of some explicit contract.

By contrast, the only very plausible argument I can imagine for why we should compensate artists for AI art is utilitarian. That is, providing compensation to artists would offset the implicit harm to their profession, redistributing economic surplus from consumers and producers of AI art, to artists who lose out from competition.

Such compensation is often recommended by economists as part of a package for turning Kaldor-Hicks improvements into Pareto improvements, but I have yet to hear of such a proposal from strict deontologists before. Have you?
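
To make the Kaldor-Hicks/Pareto distinction concrete, here is a small Python sketch. The welfare numbers and group names are invented purely for illustration, and treating welfare as a single transferable number per group is a simplifying assumption. A change counts as a Kaldor-Hicks improvement when total gains exceed total losses, and as a Pareto improvement when nobody ends up worse off and someone ends up better off.

```python
def is_pareto_improvement(changes):
    """Nobody is worse off and at least one party is strictly better off."""
    values = list(changes.values())
    return all(v >= 0 for v in values) and any(v > 0 for v in values)


def is_kaldor_hicks_improvement(changes):
    """Net gains are positive, so winners could in principle compensate losers."""
    return sum(changes.values()) > 0


def compensate(changes, transfers):
    """Apply transfers given as {(payer, payee): amount}; returns a new dict."""
    result = dict(changes)
    for (payer, payee), amount in transfers.items():
        result[payer] -= amount
        result[payee] += amount
    return result


# Hypothetical welfare changes from AI art (illustrative numbers only).
before = {"artists": -10, "ai_art_producers": 15, "consumers": 25}

# Hypothetical compensation package paid to artists by the winners.
after = compensate(before, {
    ("consumers", "artists"): 7,
    ("ai_art_producers", "artists"): 5,
})

print(is_kaldor_hicks_improvement(before), is_pareto_improvement(before))
print(after, is_pareto_improvement(after))
```

With these made-up numbers, the uncompensated change passes the Kaldor-Hicks test but not the Pareto test. After the transfers, every group's change is positive, so it passes both.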

JosephDec 30 20226

In that case, perhaps instead of phrasing it as "utilitarianism should be bounded by deontology" I should have instead phrased it as something along the lines of "a large benefit from this system doesn't justify the harms of creating this system." The general idea that I am trying to gesture toward is that when the piece of art someone created is used in a way that they do not consent to, the use benefiting someone doesn't necessarily make it okay. So while the value might be -1 over here and +50 over there, I (as a layperson, rather than a law maker) don't think that should be used as justification. If the creator gives informed consent, then I think it sounds fine. I know that I would feel really shitty if I spent time making something, sold copies of it, and then found that someone had copied my creation and was distributing variations on it for free.

Perhaps one area where I wasn't clear is that rather than a profession simply fading away (such as those made obsolete by the invention of digital spreadsheets or those made obsolete by the invention of automobiles), the "harm" I am referring to is an artist's work being copied without his/her permission (or stolen, or used without consent, or pirated). So perhaps I've misunderstood your perspective here. I understood your perspective to be "The value from a really good entertainment generation system would be so large that it would be justified to not pay the artists for their work." But perhaps when you referred to lost income you meant the future of their profession, rather than simply not paying for their work?

Such compensation is often recommended by economists as part of a package for turning Kaldor-Hicks improvements into Pareto improvements, but I have yet to hear of such a proposal from strict deontologists before. Have you?

No, I have not heard of such a proposal from a strict deontologist. But to my knowledge I've also never had any interaction with a strict deontologist. 😅

EDIT: My views are probably quite influenced by recently learning about Lensa scraping artists' work without their consent. If I hadn't learned about that, then I probably wouldn't have even thought about the ethics of what goes into a content generation system.

LarksDec 30 20220

So while the value might be -1 over here and +50 over there, I (as a layperson, rather than a law maker) don't think that should be used as justification.

A 50:1 benefit:cost ratio is huge! Even fairly hardcore libertarians will accept policies that have such massively positive consequences. A typical policy is more likely to be -1 over here, +1.1 over there. If you're not willing to enact a policy like this I think there is basically no policy ever that will satisfy you.

JosephDec 30 20222

I agree that in the abstract a 50:1 benefit:cost ratio sounds great. But it also strikes me as naïve utilitarianism (although maybe I am using that term wrong?). To make it more concrete:

If you have a book that you enjoy reading, can I steal it and copy it and share it with 50 of my friends?
Is you stealing $100 from me justified if it generates far greater value when you donate that $100 to other people?
If we can save 50 lives by killing John Doe and harvesting his organs, does that justify the act?
If I can funnel millions or billions of dollars toward highly effective charities by lying to or otherwise misleading investors, does that benefit justify the cost?

These are, of course, simplistic examples and analogies, rather than some sort of rock solid thesis. And this isn't a dissertation that I've thought out well; this is mostly impulse and gut feeling on my part, so maybe after a lot of thought and reading on the topic I'll feel very differently. So I'd encourage you to look at this as my fuzzy explorations/musings rather than as some kind of confident stance.

And maybe that 50:1 example is so extreme that some things that would normally be abhorrent do actually make sense. Maybe the benefit of pirating an ebook (in which one person has their property stolen and thousands of people benefit from it) is so large that it is morally justified. So perhaps for my example I should have chosen a more modest ratio, like 5:1. 😅

I'll also note that I think I tend to lean a bit toward negative utilitarianism, so I prioritize avoiding harm a bit more than I prioritize causing good. I think this makes me have a fairly high bar for these kinds of "the ends justify the means" scenarios.