I'm trying to create a website/organisation/community around exploring difficult problems and improving the decisions people make.
I've currently got an alpha website where people can interact with AI in different scenarios and record the decisions and reasoning they make, to inform others.
I'm curious how others would approach this endeavour (I don't have a broad network)
So I've been trying to think of ways to improve the software landscape. If we do this, it might make traditional software more aligned with human values, and also improve the models it provides for building more advanced systems.
One piece I've been looking at is software licensing.
Instead of traditional open source, have an easy-to-get license for a version of the software, based on a cryptographic identity. This could make it more frictional to be a bad actor.
This license is checked on startup to confirm it matches the version of the software that is running (git SHA stored somewhere). If it do...
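To make that concrete, here is a minimal sketch of what the startup check could look like, assuming the cryptographic identity is the licensor's Ed25519 public key and the license file holds the licensed git SHA plus a signature over it (the file layout, key handling and names are all hypothetical):

```python
# Hypothetical sketch: check at startup that the running version is licensed.
# Assumes the licensor signs the permitted git SHA with an Ed25519 key and the
# licensee ships a license.json next to the software.
import json
import subprocess

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

# Placeholder: the licensor's published public key (32 bytes, hex-encoded).
LICENSOR_PUBLIC_KEY = bytes.fromhex("00" * 32)


def running_git_sha() -> str:
    # The git SHA of the code actually running; in practice this would be
    # baked in at build time rather than read from a checkout.
    return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()


def check_license(path: str = "license.json") -> bool:
    with open(path) as f:
        license_data = json.load(f)

    signed_sha = license_data["git_sha"].encode()
    signature = bytes.fromhex(license_data["signature"])

    public_key = Ed25519PublicKey.from_public_bytes(LICENSOR_PUBLIC_KEY)
    try:
        public_key.verify(signature, signed_sha)      # is the license genuine?
    except InvalidSignature:
        return False
    return signed_sha.decode() == running_git_sha()   # and for this exact version?


if __name__ == "__main__":
    if not check_license():
        raise SystemExit("No valid license for this version; refusing to start.")
```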
I had an idea for a new concept in alignment that might allow nuanced and human-like goals (if it can be fully developed).
Has anyone explored using neural clusters found by mechanistic interpretability as part of a goal system?
So you would look for clusters corresponding to certain things, e.g. happiness or autonomy, and include those neural clusters in the goal system. If the system learned over time, it could refine those concepts.
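As a very rough sketch of what I mean (assuming you already have a way to pull activations out of the model, and an interpretability method that gives you a "concept direction" for something like happiness; all names here are made up):

```python
# Hypothetical sketch: use an interpretability-found activation direction
# ("concept vector") as one component of a goal signal, and let that vector
# be refined over time as new examples of the concept are observed.
import numpy as np


class ConceptGoal:
    def __init__(self, concept_vector: np.ndarray):
        # concept_vector: a direction in activation space found by e.g.
        # probing or sparse-autoencoder style interpretability work.
        self.v = concept_vector / np.linalg.norm(concept_vector)

    def score(self, activations: np.ndarray) -> float:
        # How strongly does this state express the concept (e.g. "happiness")?
        return float(activations @ self.v)

    def refine(self, new_examples: np.ndarray, lr: float = 0.1) -> None:
        # Nudge the concept toward newly observed positive examples, mirroring
        # the way human concepts drift over time.
        target = new_examples.mean(axis=0)
        self.v = (1 - lr) * self.v + lr * target / np.linalg.norm(target)
        self.v /= np.linalg.norm(self.v)


# Usage: combine several concepts (happiness, autonomy, ...) into one goal.
rng = np.random.default_rng(0)
happiness = ConceptGoal(rng.normal(size=512))
autonomy = ConceptGoal(rng.normal(size=512))
state_activations = rng.normal(size=512)
goal_value = 0.6 * happiness.score(state_activations) + 0.4 * autonomy.score(state_activations)
```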
This was inspired by how human goals seem to contain concepts that themselves change over time.
Here is a blog post, also written with Claude's help, that I hope to engage home-scale experimenters with.
I appreciate your views on space and AI. Working with ML systems in that way might be useful.
But I think that I am drawn to the base reality a lot because of threats to that from things like gamma ray bursts or aliens. These things can only be represented probabilistically in simulations because they are out of context. The branching tree explodes with possibilities.
I agree that we aren't ready for agents, but I would like to try to build intelligence augmentation that is non-static over time, as slowly as possible: starting with building systems to control and shape them, tested out with static ML systems; then testing them with people; then testing them inside simulations, etc.
I find your view of things interesting. A few questions: how do you deal with democracy when people might be inhabiting worlds unlike the real one and have forgotten the real one exists?
I think static AI models lack corrigibility: humans can't give them instructions that change how they act, so they might be a dead end in terms of day-to-day usefulness. They might be good as scientists, though, as they can be detached from human needs. So worth exploring.
There is a concept of utility, but I'm expecting these systems to be mainly user-focused, so not agents in their own right, so the utility is based on user feedback about the system. Ideally the system would be an extension of the feedback systems within humans.
There is also karma, which is separate from utility, and is given by one ML system to another if it has helped it out or hindered it in a non-economic fashion.
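A minimal sketch of how I imagine keeping the two separate, with all the names invented for illustration:

```python
# Hypothetical sketch: utility (from user feedback about a system) and karma
# (granted by one ML system to another for non-economic help or hindrance)
# kept as two deliberately separate ledgers.
from collections import defaultdict


class Ledgers:
    def __init__(self):
        self.utility = defaultdict(float)  # system -> accumulated user feedback
        self.karma = defaultdict(float)    # system -> accumulated peer karma

    def record_user_feedback(self, system: str, rating: float) -> None:
        # rating in [-1, 1], e.g. from an explicit thumbs up/down.
        self.utility[system] += rating

    def grant_karma(self, giver: str, receiver: str, amount: float) -> None:
        # Positive for help, negative for hindrance; not convertible into
        # utility, so it stays non-economic. In a fuller version the giver
        # would also be logged for auditing.
        self.karma[receiver] += amount


ledgers = Ledgers()
ledgers.record_user_feedback("summariser", 1.0)
ledgers.grant_karma("planner", "summariser", 0.5)
```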
I've been thinking that AGI will require a freely evolving multi-agent approach. So I want to try out the multi-agent control patterns on ML models without the evolution, which should prove them out in a less dangerous setting. The multi-agent control patterns I am thinking of are things like karma and market-based alignment patterns. More information on my blog.
I suppose I'm interested in questions around what counts as an existential threat. How bad would a nuclear winter have to be to cause the collapse of society (and how easily could society be rebuilt afterwards)? Both require robust models of agriculture in extreme situations and models of energy flows in economies where strategic elements might have been destroyed (to know how easy rebuilding would be). Since pandemics/climate change also have societal collapse as a threat, the models needed would apply to them too (they might trigger nuclear exchange or at leas...
It's true that all data and algorithms are biased in some way. But I suppose the question is: is the bias from this less than what you get from human experts, who often have a pay cheque that might lead them to think in a certain way?
I'd imagine that any system would not be trusted implicitly, to start with, but would have to build up a reputation of providing useful predictions.
In terms of implementation, I'm imagining people building complex models of the world, as in decision making under deep uncertainty, with the AI mainly providing a user-friendly interface for asking questions about the model.
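As a very rough sketch of the kind of query the AI interface might run against such a model (the policies, numbers, and the simple minimax-regret pass over many plausible futures are all my own stand-ins for a real DMDU analysis):

```python
# Hypothetical sketch: a robust-decision-making style query over a toy world
# model. The "AI interface" would translate a user's question into calls like
# this and explain the result back in plain language.
import numpy as np

rng = np.random.default_rng(42)

# Many plausible futures (deep uncertainty: no agreed probabilities), here
# parameterised by a single uncertain damage factor d in [0, 1].
futures = rng.uniform(0.0, 1.0, size=1000)

policies = {
    "do_nothing":  lambda d: -100 * d,       # outcome under damage factor d
    "adapt_early": lambda d: -20 - 40 * d,
    "adapt_late":  lambda d: -5 - 80 * d,
}

outcomes = {name: np.array([f(d) for d in futures]) for name, f in policies.items()}
best_per_future = np.max(np.column_stack(list(outcomes.values())), axis=1)

# Prefer the policy with the smallest worst-case regret across futures,
# rather than the best expected value under some single distribution.
regrets = {name: float(np.max(best_per_future - vals)) for name, vals in outcomes.items()}
print(min(regrets, key=regrets.get), regrets)
```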
Thanks, I did a MSc in this area back in the early 2000s, my system was similar to Tierra, so I'm familiar with evolutionary computation history. Definitely useful context. Learning classifier systems are also interesting to check out for aligning multi-agent evolutionary systems. It definitely informs where I am coming from.
Do you know anyone with this kind of background that might be interested in writing something long form on this? I'm happy to collaborate, but my mental health has not been the best. I might be able to fund this a small bit, if the right person needs it.
Thanks, I've had a quick skim of propositions, it does mention perhaps limiting rights of reproduction, but not the conditions under which it should be limited or how it should be controlled.
Another way of framing my question is: if natural selection favours AI over humans, what form of selection should we try to put in place for AI? Rights are just part of the question. Evolutionary dynamics, and what society needs from AI (and humans) to continue functioning, are the major part of the question.
Hi, I'm thinking about a possibly new approach to AI safety. Call it AI monitoring and safe shutdown.
Safe shutdown riffs on the idea of the big red button, but adapts it for use in simpler systems. If there were a big red button, who gets to press it and how? This involves talking to law enforcement, legal and policy. Big red buttons might be useful for non-learning systems; large autonomous drones and self-driving cars are two systems that might suffer from software failings and need to be shut down safely if possible (or precipitously if the risks fr...
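A toy sketch of what I have in mind for the non-learning case: only honour shutdown commands authenticated with a shared secret (so "who gets to press it" becomes a question of who holds the secret, which is where law enforcement and policy come in), and then walk the system through a staged safe stop rather than cutting power. Everything below is hypothetical:

```python
# Hypothetical sketch of a "big red button" for a non-learning autonomous
# system: authenticate the shutdown order, then perform a staged safe stop.
import hashlib
import hmac

SHARED_SECRET = b"replace-with-a-real-secret"  # held by whoever may press the button


def is_authorised(command: bytes, tag: bytes) -> bool:
    expected = hmac.new(SHARED_SECRET, command, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def safe_shutdown(vehicle) -> None:
    # Staged stop: the exact steps depend on the system (drone vs car).
    vehicle.signal_intent()        # warn other road users / air traffic
    vehicle.move_to_safe_state()   # pull over, or descend and land
    vehicle.power_down()


def on_command(vehicle, command: bytes, tag: bytes) -> None:
    if command == b"SHUTDOWN" and is_authorised(command, tag):
        safe_shutdown(vehicle)


class DummyVehicle:
    def signal_intent(self):        print("signalling intent to stop")
    def move_to_safe_state(self):   print("moving to a safe state")
    def power_down(self):           print("powered down")


if __name__ == "__main__":
    cmd = b"SHUTDOWN"
    tag = hmac.new(SHARED_SECRET, cmd, hashlib.sha256).digest()
    on_command(DummyVehicle(), cmd, tag)
```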
I found this report on adaptation, which suggests that adaptation with some forethought will be better than waiting for problems to get worse. It talks about things other than crops too. The headlines:
I've been thinking for a while that civilisational collapse scenarios affect some of the common assumptions about the expected value of movement building or saving for effective altruism. This has knock-on implications for when things are most hingey.
That said, I personally would be quite surprised if worldwide crop yields actually ended up decreasing by 10-30%. (Not an informed opinion, just vague intuitions about econ).
I hope they won't either, if we manage to develop the changes we need before we need them. Economics isn't magic, though.
But I wanted to point out that there will probably be costs to preventing deaths from food shortages through adaptation. Are they bigger or smaller than the costs of mitigation by reducing CO2 output, or of geoengineering?
This case hasn't been made either way to my knowledge and could help allocate resources effectively.
Current investment in solar geoengineering is roughly $10 million annually (this may have increased in the last few years), so by most metrics it's really neglected. The main project working on this is the Harvard Solar Geoengineering Research Program, which OPP funded with about $2.5 million for a few years in 2016. They've also funded a solar governance program in 2017 for about $2 million. Grants here. Recently, they don't appear to have made any climate-related grants in this space, and it's unclear to me what the funding situ...
I'm expecting the richer nations to adapt more easily, so I'm expecting a swing away from food production in the less rich nations, as poorer farmers would have a harder time adapting as their farms get less productive (and they have less food to sell). Also, farmers with now-unproductive land would struggle to buy food on the open market.
I'd be happy to be pointed to the people thinking about this and planning on having funding for solving this problem. Who are the people that will be funding the teaching of subsistence rice farmers (of all na...
On 1): I wasn't able to read the full text of the Impact Lab report, but it seems they just model the link between heat and mortality, not the impact of heat on crop production causing knock-on health problems. E.g. http://dels.nas.edu/resources/static-assets/materials-based-on-reports/booklets/warming_world_final.pdf suggests that each degree of warming would reduce current crop yields by 5-15%. So for 4 degrees of warming (the baseline according to https://climateactiontracker.org/global/temperatures/ ), this would be a 20-60% reduction in world food supply...
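The back-of-the-envelope behind that range, plus the slightly lower figure you get if the per-degree losses compound rather than simply add (the compounding assumption is mine, not from the report):

```python
# Back-of-the-envelope for the 20-60% figure: 5-15% yield loss per degree,
# 4 degrees of warming. Compounding the losses (my assumption) gives a
# somewhat lower range than simple addition.
per_degree = (0.05, 0.15)
degrees = 4

linear = tuple(d * degrees for d in per_degree)                 # (0.20, 0.60)
compounded = tuple(1 - (1 - d) ** degrees for d in per_degree)  # (~0.19, ~0.48)
print(linear, compounded)
```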
You'd need to think there was a very significant failure of markets to assume that food supplies wouldn't be adapted quickly enough to minimize this impact. That's not impossible, but you don't need central management to get people to adapt - this isn't a sudden change that we need to prep for, it's a gradual shift. That's not to say there aren't smart things that could significantly help, but there are plenty of people thinking about this, so I don't see it as neglected or likely to be high-impact.
As currently defined, longtermists have two possible choices.
There are however other actions that may be more beneficial.
Let us look again at the definition of 'influential'.
a time ti is more influential (from a longtermist perspective) than a time tj iff you would prefer to give an additional unit of resources,[1] that has to be spent doing direct work (rather than investment), to a longtermist altruist living at t...
Let's say they only mail you as much protein as one full human genome.
This doesn't make sense. Do you mean proteome? There is not a 1-1 mapping between genome and proteome. There are at least 20,000 different proteins in the human proteome; it might be quite noticeable (and tie up the expensive protein-producing machines) if there were 20,000 orders in a day. I don't know the size of the market, so I may be off about that.
I will be impressed if the AI manages to make a biological nanotech that is not immediately eaten up or accidentally sa...
There might be a further consideration: people might not start or fund impactful startups if there wasn't a good chance of getting investment. The initial investors (if not impact-oriented) might still be counting on impact-oriented people to buy the investment. So while each individual impact investor is not doing much in isolation, collectively they are creating a market for things that might not get funded otherwise. How you account for that, I'm not sure.
It might be worth looking at the domains where it might be less worthwhile (formal chaotic systems, or systems with many sign flipping crucial considerations). If you can show that trying to make cost-effectiveness based decisions in such environments is not worth it, that might strengthen your case.
Hi Gregory,
A couple of musings generated by your comment.
2: I don’t think there’s a neat distinction between ‘technical dangerous information’ and ‘broader ideas about possible risks’, with the latter being generally safe to publicise and discuss.
I have this idea of independent infrastructure: trying to make infrastructure (electricity/water/food/computing) that is on a smaller scale than current infrastructure. This is for a number of reasons, one of which is mitigating risks. How should I build broad-scale support for my ideas without talking ...
For people outside of EA, I think those who are in possession of info hazard-y content are much more likely to be embedded in some sort of larger institution (e.g., a research scientist or a journal editor looking to publish something), where perhaps the best leverage is setting up certain policies, rather than trying to teach everyone the unilateralist's curse.
There is a growing movement of makers and citizen scientists who are working on new technologies. It might be worth targeting them somewhat (although again probably without the math). I think t...
Ah right. I suppose the unilateralist's curse is only a problem insofar as there are a number of other actors also capable of releasing the information; if you are a single actor then the curse doesn't really apply. Although one wrinkle might be considering the unilateralist's curse with regards to different actors through time (i.e., erring on the side of caution with the expectation that other actors in the future will gain access to and might release the information), but coordination in this case might be more challenging.
Interesting idea. This may ...
My understanding is that it applies regardless of whether or not you expect others to have the same information. All it requires is a number of actors making independent decisions, with randomly distributed error, with a unilaterally made decision having potentially negative consequences for all.
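A toy simulation of that setup, just to show the mechanism (the numbers are arbitrary): each actor gets an unbiased but noisy estimate of the (actually negative) value of releasing the information, and the information gets out if any one estimate comes up positive, so the chance of release grows with the number of actors.

```python
# Toy model of the unilateralist's curse: N unbiased actors each estimate the
# value of releasing some information whose true value is negative; release
# happens if ANY estimate is positive. More actors -> more likely release.
import random

random.seed(0)
TRUE_VALUE = -1.0  # releasing is actually (mildly) harmful
NOISE_SD = 2.0     # spread of honest estimation error
TRIALS = 10_000

for n_actors in (1, 3, 10, 30):
    releases = 0
    for _ in range(TRIALS):
        estimates = [random.gauss(TRUE_VALUE, NOISE_SD) for _ in range(n_actors)]
        if max(estimates) > 0:
            releases += 1
    print(f"{n_actors:>2} actors: released in {releases / TRIALS:.0%} of trials")
```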
Information determines the decisions that can be made. For example you can't spread the knowledge of how to create effective nuclear fusion without the information on how to make it.
If there is a single person with the knowledge of how to create safe efficient n...
The unilateralists curse only applies if you expect other people to have the same information as you right?
You can figure out whether they have the same information as you, and whether they are concerned about the same things you are, by looking at the mitigations people are attempting. Altruists in a unilateralist's curse position should be attempting mitigations, because they should expect someone less cautious than them to unleash the information. Or they want to unleash the information themselves and are mitigating the downsides until they think it is safe.
...Thanks for writing this up! I've forwarded it to a friend who was interested in the happiness app space a while back.
I would add to the advice, from my experience, pick something not too far out of people's comfort zones for a startup or research idea. There seems to be a horizon beyond which you don't get feedback or help at all.
I think it's possible that blockchain can help us solve some coordination problems. However, it also introduces new ones (e.g. which fork of a chain/version of the protocol you should go with).
So I am torn. It would be good to see one successful use of, or solid proposal for, the technology solving our real-world coordination problems using Ethereum.
Something I am keeping an eye on is the Economic Space Agency.
I would add something like "Sensitivity" to the list of attributes needed to navigate the world.
This is different from Predictive Power. You can imagine two ships with the exact same compute power and Predictive Power, one with cameras on the outside and long-range sensors, one blind without. You'd expect the first to do a lot better moving about the world.
In Effective Altruism's case I suspect this would be things like the basic empirical research about the state of the world and the things important to their goals.
I'm thinking about radically more secure computer architectures as a cause area.
I'd be interested in doing an analysis of whether it is an effective altruist cause. I'm just doing it as a hobby at the moment. Anyone interested in the same region want to collaborate?
There are some systemic reforms that seem easier to reason about than others. Getting governments to agree on a tax scheme such that the Googles and Facebooks of the world can't hide their profits seems like a pretty good idea. Their money piles suggest that they aren't hurting for cash to invest in innovation. It is hard to see the downside.
The upside is going to be less in the developing world than in the developed world (due to more profits occurring in the developed world), so it may not be ideal. The Tax Justice Network is something I want to follow more....
I'm thinking about funding an analysis of the link between autonomy and happiness.
I have seen papers like
https://academic.oup.com/heapro/article/28/2/166/661129
and http://www.apa.org/pubs/journals/releases/psp-101-1-164.pdf
I am interested in how reproducible and reliable they are and I was wondering if I could convert money into an analysis of the methodology used in (some of) these papers.
As I respect EAs' analytical skills (and hope there is a shared interest in happiness and truth), I thought I would ask here.
In the context of the measurement problem: If the idea is that we may be able to explain the Born rule by revising our understanding of what the QM formalism corresponds to in reality (e.g., by saying that some hidden-variables theory is true and therefore the wave function may not be the whole story, may not be the kind of thing we'd naively think it is, etc.), then I'd be interested to hear more details.
Heh, I'm in danger of getting nerd-sniped into physics land, which would be a multi-year journey. I found myself trying to figure out whether the st...
Ah, it has been a while since I engaged with this stuff. That makes sense. I think we are talking past each other a bit though. I've adopted a moderately modest approach to QM since I've not touched it in a bit and I expect the debate has moved on a bit.
We started from a criticism of a particular position (the Copenhagen interpretation), which I think is a fair thing to do for both the modest and the immodest. The modest person might misunderstand a position and be able to update themselves better if they criticize it and get a better explanation.
The question is wh...
and Eliezer hasn't endorsed any solution either, to my knowledge)
Huh, he seemed fairly confident about endorsing MWI in his sequence here
Concerning QM: I think Eliezer's correct that Copenhagen-associated views like "objective collapse" and "quantum non-realism" are wrong, and that the traditional arguments for these views are variously confused or mistaken, often due to misunderstandings of principles like Ockham's razor. I'm happy to talk more about this too; I think the object-level discussions are important here.
I don't think the modest view (at least as presented by Gregory) would believe in any particular interpretation, as there is significant debate sti...
Is there any data on how likely EAs think explosive progress after HLMI is? I would have thought it more than 10%.
I would also have expected more debate about explosive progress, beyond just the recent Hanson-Yudkowsky flare-up, if there was as much doubt in the community as that survey suggests.
Another reason to not have too much modesty within society is that it makes expert opinion very appealing to subvert. I wrote a bit about that here.
Note that I don't think my views about the things I believe to be subverted/unmoored are necessarily correct, but that the first order of business would be to try to build a set of experts with better incentives.
Since I've not seen it mentioned here, unconferences seem like an inclusive type of event, as described above. I'm not sure how EAG compares.
I'm taking decision making under deep uncertainty as a base: being comfortable with making decisions across many viewpoints, while trying to avoid both any one dominant viewpoint and analysis paralysis.