> How difficult should we expect AI alignment to be?
With many of the AI questions, one needs to reason backwards rather than pose the general question.
Suppose we all die because of unaligned AI. What form did the unaligned AI take? How did it work? Which things that exist now were progenitors of it, and what changed to make it dangerous? How could those problems have been avoided, technically? Organisationally?
I don't see how useful alignment research can be done quite separately from capabilities research. Otherwise what we'll get will be people coming in at the...
As someone with a mathematical background, I see a claim about a general implication (the RC) arising from Total Utilitarianism. I ask 'what is Total Utilitarianism?' I understand 'add up all the utilities'. I ask 'what would the utility functions have to look like for the claim to hold?' The answer is, 'quite special'.
I don't think any of us should be comfortable with not checking the claim works at a gears level. The claim here being, approximately, that the RC is implied under Total Utilitarianism regardless of the choice of utility function. Which is f...
Thanks for the considered reply :)
The crux I think lies in, "is not meant to be sensitive to how resources are allocated or how resources convert to wellbeing." I guess the point established here is that it is, in fact, sensitive to these parameters.
In particular if one takes this 'total utility' approach of adding up everyone's individual utility we have to ask what each individual's utility is a function of.
It seems easy to argue that the utility of existing individuals will be affected by expanding or contracting the total pool of individuals. There will...
The crux I think lies in, "is not meant to be sensitive to how resources are allocated or how resources convert to wellbeing." I guess the point established here is that it is, in fact, sensitive to these parameters.
In particular if one takes this 'total utility' approach of adding up everyone's individual utility we have to ask what each individual's utility is a function of.
Yes, that is a question that needs to be answered, but population ethics is not an attempt to answer it. This subdiscipline treats distributions of wellbeing across individuals in dif...
Well, on the basis of the description in the SEP article:
The idea behind this view is that the value of adding worthwhile lives to a population varies with the number of already existing lives in such a way that it has more value when the number of these lives is small than when it is large
It's not the same thing, since above we're saying that each individual's utility is a function of the whole setup. So when you add new people you change the existing population's utilities. The SEP description instead sounds like changing only what happens at the margin....
I think it's more of a comment that one would find the number of academics 'excited' about AIS would increase as the number of venues for publication grew.
This doesn't seem to have been said, so I will: $1m is enough to live off as an endowment. You can use this to work your entire life on any cause you want to, and then donate as much of it in your will as you wish to.
Upvoted because I think that this should not be downvoted without comment. However I think OP will get more engagement and generate a fuller response here if:
Would suggest at least forming a 'control group', performing the same analysis and looking at differences in the sets of popular feeds. Following Obama doesn't tell you much about a person.
One would also need to figure out what it is about those accounts that separates them from other, similar accounts that 'IPW's could have followed but didn't.
Might get hold of it and confirm my biases :D.
I feel fairly confident though that this argument doesn't hinge too much on a particular technology such as IoT (as I see in the blurb). To unnecessarily recapitulate: something like the above argument on GDP falls out as the consequence of marginal costs being driven to zero. By whatever means, and relying only on micro 101 theory. In the limit, GDP will provide very little information about utility. There'll be a lot of good, cool stuff in the world which will be free.
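A toy version of the marginal cost argument, as a minimal sketch rather than a model of any real economy. It assumes a single good with a linear demand curve and a price that tracks marginal cost; the function name `market` and the parameters `a` and `b` are made up for illustration.

```python
# Hypothetical linear demand curve: quantity = a - b * price.
# Assumption: competition pushes the price down to marginal cost.
def market(price, a=100.0, b=1.0):
    quantity = max(a - b * price, 0.0)
    spending = price * quantity      # the part a GDP-style measure records
    choke_price = a / b              # price at which demand falls to zero
    # Consumer surplus: area of the triangle between the demand curve and the price.
    consumer_surplus = 0.5 * (choke_price - price) * quantity
    return quantity, spending, consumer_surplus

# Let marginal cost (and so price) fall towards zero.
for price in (50.0, 10.0, 1.0, 0.0):
    quantity, spending, surplus = market(price)
    print(f"price={price:5.1f}  quantity={quantity:6.1f}  "
          f"spending={spending:7.1f}  surplus={surplus:7.1f}")
```

Under these assumptions recorded spending goes to zero as the price does, while the surplus people enjoy keeps rising. That is the sense in which the measure stops tracking utility.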
GDP is a very leaky measure for growth in this context. To see this, consider a utopian scenario with dirt-cheap, fusion-powered Star Trek replicators. The marginal cost of most traded goods drops to near zero and GDP tanks (setting aside services). Traditional industry writ large then faces a dynamic similar to the one Napster triggered for the music industry.
Assuming we don't all die sometime soon and things 'carry on' the solution is likely to lie at least in part, eventually, in giving up on trying to summarise all of technology in a s...
I don't think this is an accurate portrayal of what Dale was trying to say.
I don't see them actively recommending a particular policy in the post -- just noting that some studies of repressive behavior find that it may lead to a certain outcome. It can be true that repression sometimes quells riots while also being true that it has many other negative outcomes and should clearly be avoided. (Though I didn't see Dale say that, either, and I don't want to put words in their mouth.)
Of course, the vague term "repression" and the differing social context of the
...But collectively we are all better off if everyone stops holding protests for now.
Who is the 'we' here and by whose yardstick is the benefit measured?
Animal rights activists are not turning out in large numbers to get tear gassed and beaten for the cause. This is pretty good evidence that they are not in the set of 'everyone else who thinks their reason is as good as I think this one is'.
As usual, there are better alternatives being neglected here. Those who want more lockdown have, in this situation, two options to get it: more violence o...
Who is the 'we' here and by whose yardstick is the benefit measured?
Investigations into police brutality that follow viral footage have historically been quite harmful for all involved. The upside is a small reduction in police brutality. The downside is a massive increase in non-police brutality, as found in this recent paper:
all investigations that were preceded by "viral" incidents of deadly force have led to a large and statistically significant increase in homicides and total crime. We estimate that these investigations caused almo...
Now that is a big philosophical question.
One answer is that there is no difference between 'orders' of random variables in Bayesian statistics. You've either observed something or you haven't. If you haven't, then you figure out what distribution the variable has.
The relationship between that distribution and the real world is a matter of how assiduously you followed the scientific method in constructing the model.
Lack of a reported distribution on a probability, e.g. p=0.42, isn't the same as a lack of one. It could be taken as the asse...
One of the topics I hope to return to here is the importance of histograms. They're not a universal solvent. However they are easily accessible without background knowledge. And as a summary of results, they require fewer parametric assumptions.
I very much agree about the reporting of means and standard deviations, and how much a paper can sweep under the rug by that method.
Nice example, I see where you're going with that.
I share the intuition that the second case would be easier to get people motivated for, as it represents more of a confirmed loss.
However, as your example shows, the first case could actually lead to an 'in it together' effect on co-ordination, assuming the information is taken seriously. That is hard since, in advance, this kind of situation could encourage a 'roll the dice' mentality.
I also think it would be a lot more helpful to walk through how this mistake could happen in some real scenarios in the context of EA
Hopefully, we'll get there! It'll be mostly Bayesian though :)
Thanks - that last link was one I'd come across and liked when looking for previous coverage. My sole previous blog post was about Pascal's Wager. I'd found though when speaking about it that I was assuming too much for some of the audience I wanted to bring along; notwithstanding my sloppy writing :D So, I'm going to attempt to stay focused and incremental.
As long as the core focuses on unusual priorities – which using neglectedness as a heuristic for prioritization will mean is likely – there’s a risk that new members get surprised when they find out about these unusual priorities
Perhaps there are also some good reasons that people with different life experience both a) don't make it to 'core' and b) prioritize more near term issues.
There's an assumption here that weirdness alone is off-putting. But, for example, technologists are used to seeing weird startup ideas and considering the contents.
This suggests a next thing to find out is: who disengages and why.
TL;DR's for the EA Forum/Welcome: "Effective altruists are trying to figure out how to build a more effective AI, using paperclips, but we're not really sure how it's possible to do so."
Ouch.
Perhaps EA's roots in philosophy lead it more readily to this failure mode?
Take the diminishing marginal returns framework above. Total benefit is not likely to be a function of a single variable 'invested resources'. If we break 'invested resources' out into constituent parts we'll hit the buffers OP identifies.
Breaking into constituent parts would mean envisaging the scenario in which the intervention was effective and adding up the concrete things one spent money on to get there: does it need new PhDs minted? There'...
I would spend every penny unblocking the pathway to a vaccine.
"Our actions have dominating long-term effects that we cannot ignore."
To me, this is a strange intuition. Most actions by most people most of the time disappear like ripples in a stream.
If this were not the case, reality would tear under the weight of schemes past people had for the present. Perhaps it is actually hard to change the course of history?
This is a nice piece of accessible scholarship. It would perhaps benefit from an explicit note on why the question is interesting in this context and to this audience.
Ah, that's interesting and the nub of a difference.
The way I see it, a 'good' impact function would upweight the impact of low probability downside events and, perhaps, downweight low probability upside events. Maximising the expectation of such a function would push one toward policies which more reliably produce good outcomes.
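To make that concrete, here is a minimal sketch. The two policies, their probabilities and the particular shape of `impact` are all invented for illustration; the only point is that the ranking can flip once the expectation is taken over a function of this kind instead of the raw outcome.

```python
import math

# Hypothetical policies as (probability, outcome) pairs.
reliable = [(1.0, 1.0)]                              # always a modest good outcome
long_shot = [(0.1, 20.0), (0.8, 0.0), (0.1, -5.0)]   # rare big win, rare loss

def raw(outcome):
    # Plain expected returns: the outcome counts at face value.
    return outcome

def impact(outcome):
    # Assumed 'good' impact function: losses count triple,
    # gains are damped by a square root.
    if outcome < 0:
        return 3.0 * outcome
    return math.sqrt(outcome)

def expectation(policy, f):
    return sum(p * f(x) for p, x in policy)

for name, policy in (("reliable", reliable), ("long_shot", long_shot)):
    print(name, expectation(policy, raw), expectation(policy, impact))
```

With these made-up numbers the long shot has the higher raw expectation, but the reliable policy wins once the downside is upweighted and the upside damped.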
So, what do you think of the idea that aiming for high expected returns in long term investments might not be the best thing to do, given the skewed distribution? That is, we want to ensure that most futures are 'good'; not just a few that are 'excellent' lost in a mass of 'meh' or worse.
BTW, I did like the podcast - it does take something to make me tap out forum posts :)
Thanks for the response. To clarify: in the second model both the drift and the diffusion term impact on the expected returns. If you substitute in a model return e^{q + sz}, with z a standard normal:
E[V(1)] = E[e^{q + s z}] = E[e^{sz}]e^q = e^{s^2/2} e^q > e^q
So, if we have fixed from some source that E[V(1)]=1.07=e^r then we cannot set q=r in the model with randomness while maintaining the equality. Where the equality cashes out as 'the expected rate of return a year from now is 7%'.
Empirically estimated long run rates already take into acco...
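A quick numerical check of the identity above, as a sketch only. The volatility `s = 0.2`, the sample size and the helper names are assumptions chosen for illustration; the closed form E[e^{q + s z}] = e^{q + s^2/2} is the standard lognormal mean.

```python
import math
import random

r = math.log(1.07)   # fixed target: E[V(1)] = e^r = 1.07
s = 0.2              # hypothetical diffusion parameter

def expected_value(q, s):
    # Closed form for E[e^{q + s z}] with z a standard normal.
    return math.exp(q + 0.5 * s ** 2)

def simulated_value(q, s, n=200_000, seed=0):
    # Monte Carlo estimate of the same expectation.
    rng = random.Random(seed)
    return sum(math.exp(q + s * rng.gauss(0.0, 1.0)) for _ in range(n)) / n

naive = expected_value(r, s)                     # q = r overshoots the target
corrected = expected_value(r - 0.5 * s ** 2, s)  # q = r - s^2/2 restores it

print("q = r:        ", naive)
print("q = r - s^2/2:", corrected)
print("simulated:    ", simulated_value(r - 0.5 * s ** 2, s))
```

So, under this model, holding the expected return at 7% means the drift has to be set to q = r - s^2/2, not q = r.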
"People don’t want to be associated with something low status and are likely to subject anything they perceive as low status to a lot of scrutiny."
Ouch! Alas, it is true in general. However, I think it's a dangerous heuristic when not backed by the kinds of substantive comments made in 1-6.
I do think toning down 5 might foster a better culture. Perhaps there is more information here I don't know. But this kinda sounds like someone tried something, it didn't work out, and they don't get a second chance. That's not a great rubric to establish if you want people to take risks.
Ego depletion is quite a narrow psychological effect. If the idea that people's moment to moment fatigue saps moment to moment willpower is debunked, that's far from showing that akrasia isn't a thing in general.
In a world where general-sense akrasia was not a thing there would be a far higher rate of people being ripped like movie stars, a far lower rate of smoking, a much higher rate of personal savings, etc. than there is in the world we inhabit.
The willpower argument is actually quite good. There are ways to reduce the amount of willpower required, but the kernel of the argument applies.
My prediction for people who constantly feel bad for not living up to an exacting standard is that a majority will fall off the boat entirely.
Maximising paperclips is a misunderstood human value. Some lazy factory owner says, gee, wouldn't it be great if I could get an AI to make my paperclips for me? The owner then builds an AGI and asks it to make paperclips, and it turns everything into paperclips, its utility function being unreflective of its owner's true desire to also have a world.
If there is a flaw here it's probably somewhere in thinking that AGI will get built as some sort of intermediate tool and that it will be easy to rub the lamp and ask the genie to do something in easy to misunderstand natural language.
Nice point.
'I also wish we didn't accidentally make donating to AMF or GiveDirectly so uncool.'
This reminds me of the pattern where we want to do something original, so we don't take the obvious solution.
"Making rationality more accessible."
Sounds great, and I've thought about this too. But what does it look like?
How to assess what the main topics should be though? I feel the p...
My quick answer would be: since writing the comment I noticed plenty of people made first contact via hpmor :D
I still don't know the answer though. I'd guess a startupy algorithm to answer this might look like:
But obvs this is a pretty involved effort and perhaps something one would go for a grant for :o