Yes, the brain is sparse and semi-modularized, but it'd be hard to really call MoEs more 'brain-like' than dense models. Brains have all sorts of very long range connections in a small-world topology, where most of the connections may be local but there are still connections to distant parts, and those are important; distant brain regions can also communicate and be swapped in and out as the brain recurs and ponders. The current breed of MoEs along the lines of Switch Transformer don't do any of that. They do a single pass, and each module is completely local and firewalled from the others. This is what makes them so 'efficient': they are so separate they can be run and optimized easily in parallel with no communication and they handle only limited parts of the problem so they are still early in the scaling curve.
To continue Holden's analogy, it's not so much like gluing 100 mouse brains together (or in my expression, 'gluing a bunch of chihuahuas back to back and expecting them to hunt like a wolf'), it's like having one mouse brain as a harried overworked MBA manager who must send an email off to one or two of his 99 mouse employees, each of whom then must take care of the job entirely on their own that instant (and are not allowed to communicate or ask for clarification or delegate to any of the other mice).
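To make the 'manager emails one employee' picture concrete, here is a minimal sketch of top-1 routing in plain Python. It is a toy under stated assumptions, not the Switch Transformer implementation: `route_top1`, `moe_layer`, the router weights, and the two experts are all hypothetical names and values for illustration, and the gate-probability scaling and load-balancing loss used in real MoE layers are left out.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy types: an "expert" is just a function on one token vector.
Expert = Callable[[Sequence[float]], List[float]]


def route_top1(token: Sequence[float], router_weights: Sequence[Sequence[float]]) -> int:
    """Return the index of the expert whose router row has the highest dot product with the token."""
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    return max(range(len(scores)), key=scores.__getitem__)


def moe_layer(
    tokens: Sequence[Sequence[float]],
    router_weights: Sequence[Sequence[float]],
    experts: Sequence[Expert],
) -> List[Tuple[int, List[float]]]:
    """Single pass: each token is sent to exactly one expert, which never sees the other experts."""
    outputs = []
    for token in tokens:
        k = route_top1(token, router_weights)   # the 'manager' picks one 'employee'
        outputs.append((k, experts[k](token)))  # that expert handles the token alone
    return outputs


# Illustrative (made-up) experts and router.
experts: List[Expert] = [
    lambda t: [2.0 * x for x in t],  # expert 0 doubles its input
    lambda t: [-x for x in t],       # expert 1 negates its input
]
router_weights = [[1.0, 0.0], [0.0, 1.0]]  # expert 0 keys on feature 0, expert 1 on feature 1
tokens = [[3.0, 1.0], [0.5, 4.0]]

result = moe_layer(tokens, router_weights, experts)
print(result)
```

Nothing in `moe_layer` lets one expert consult another or revisit a token, which is the locality that makes this cheap to parallelize.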
The more you add recurrency or flexible composition of experts or long-range connections, the more you give up what made them cheap in the first place... I continue to be skeptical that MoEs as currently pursued are anything but a distracting pennywise-poundfoolish sort of diversion, settling for trying to ape GPT-3 at mere fractional savings. Sure, approaches like ERNIE 3.0 Titan look horrifically expensive, but at least they look like they're pushing into new territory.
(If anyone asks, say 'PASTA' was designed as an allusion to Strega Nona.)
Your example doesn't make sense to me. If Bob is not providing any money and cannot 'personally lose the cash' and is never 'any worse off' because he just resells it, what is he doing, exactly? Extending Anne some sort of disguised interest-free loan? (Guaranteed and risk-free how?) Why can't he be replaced by a smart contract if there are zero losses?
It seems like in any sensible Paul-like capitalist system, he must be providing money somewhere in the process - if only by eating the loss when no one shows up to buy it at the same or higher price! If Bob gets involved and does anything useful at all, he's personally losing cash, somehow, in expectation.
So, I don't see how this is any different from the ratchet system where the 'loss' is upfront and Bob buys half the tokens for the blog post and Chris buys the other half, or Bob+Chris pool in a DAO to jointly buy the post's NFT, or something. Maybe someone will show up to buy out those tokens, and they get their money back. Or they don't. Just like the capitalist system. But the 'loss' goes to Anne either way.
Yes, I regard this as a feature and not a bug, and a problem with capitalist CoI schemes. There is no difference between 'talent scouting' and 'speculative bubble unmoored from fundamentals', as this is implemented. It becomes a Keynesian beauty contest: buying CoIs because you think many someones will think it's a CoI to buy...
There is no ground truth which verifies the 'talent' which has been scouted. The only 'verification' is that there is a greater fool who does buy the CoI from you, so that means 'talent scout' here actually means 'snake oil salesman and marketer' as the scheme collapses under Goodhart, and Paul (or whoever outcompetes him in marketing rather than impacting) starts spending all his time shilling his NFTs on Instagram and talking about how EA CoIs are going to be auctioned at Christie's soon, and his followers DM you saying that their inside source says that a new Paul blog is going to drop at midnight on Thursday and if you join their Discord the dropbot can get you in on the token buy early to flip them for guaranteed profits! don't be a sucker or left holding the bag!...
CoIs should be about paying for past performance, and not playing at being covert prediction markets, and doing so poorly. Mixing pay for making more accurate predictions and pay for performance is an uneasy combination at the best of times. If PMs and CoIs are going to be con-fused into the same financial instrument, it needs to be thought through much more carefully. There is probably a role for PMs with subsidies on EA-relevant questions, which can then be used to help price CoIs of any type, but not by directly determining their prices as the answer to their prices, circularly.
Price discovery is implemented by new NFTs. As they reach equilibrium and stop trading, new NFTs have to come out (as one would hope, as the world needs new impacts every day). If an activity is discovered to be worthless, people will just stop buying the new NFTs involving that activity.
Note that new NFTs need to be issued under capitalist CoI too, because time marches on and the world changes: maybe an activity did have the impact back then, but that's not the same question as "today, I, a potential impacter, should do something; what should that something be?" A CoI for fighting iodine deficiency 20 years ago may have a high price, and may always trade at around that high price, and yet the value of further fighting iodine deficiency today may be ~$0. The price of the old CoI does not answer the current question; what does is... issuing a new CoI, which people can refuse to buy - "don't you know, iodization is solved, dude? Just check the last national nutrition survey! I'm not buying it, not at that price."
Buyers can buy the CoI of researchers of CoIs. :) Think of how much a CoI must be worth to an altruistic philanthropist when that research affects the purchase of hundreds of later CoIs by other altruists! So much impact.
The impact certificate is resold when someone wants to become the owner of it and pays more than the current owner paid for it; that's just built-in, like a Harberger tax except there's no ongoing 'tax'. (I thought about what if you made a tax that paid to the creator - sort of like an annuity? "Research X is great, as a reward, here's the NPV of it but in the form of a perpetuity." But the pros and cons were unclear to me.) The current owner has no choice about it, and if they want to keep owning it, well, they can then just buy it back at their real valuation of it, and then they are indifferent to any further transfers.
I don't see how CoI NFTs are any worse for coordination?
You do need capital upfront for the refund, but your capital loss is another EA's capital gain: on net, it cancels out. The person you just bought the CoI NFT from now has X ETH they can deploy to new CoI NFTs, if they wish.
If you're worried about lumpiness in prices, NFTs can be subdivided - they are just tokens, after all, there's no reason you couldn't have NFTs on scales anywhere from "a year of a nonprofit organization's work" to "the first paragraph of this blog post" or just 'shares' of each. Or pool funds in a DAO to buy them collectively. Plenty of options for that. (This would be set more by things like blockchain fees and mental accounting costs.)
I don't see why that wouldn't be the case? If the cost of 1 utilon is $1, what stops a creator from spending $1, issuing a new CoI NFT, and eventually receiving ~$1? People won't pay >$1 because then they could have bought more utilons by paying for a new NFT. The person who first paid $1 for the NFT keeps it, and then a new one gets made, which gets bid up to $1, and then a new one gets made, and so on and so forth.
A certificate can't be sold at a 'loss' by the terms of the smart contract. It just ratchets. If the price is not so low that someone is tempted to buy it, it just stops trading and remains with the last buyer and has reached its charitable equilibrium, as the creator has been paid in full by philanthropists based on their belief of the impact of that NFT. (A "loss" is a weird thing to talk about in a philanthropic or fan context like this; almost by definition, every single impact certificate is a 'loss' in the sense that you don't get back more money than you put into it, that's the point!)
It seems like your main goal is to avoid a scenario where creators sell their ICs for too little, thereby being exploited.
The main thing is to avoid the pathologies of NFTs as collectibles and speculative bubbles, where the price and activities have nothing whatsoever to do with any fundamentals. The "Beepleification" of EA, if you will. If certificates of impact are subject to the same dynamics as Beeple NFTs are, for example, then they are useless. What the creators skim off is not the main problem, and in fact, to the extent that creators successfully skim off more (the way Beeple has) while those financial dynamics remain intact, they worsen the problem.
Certificates seem like a nice match for NFTs because if you are serious about the status/prestige thing, you do want a global visible registry so you can brag about what impacts you retrocausally funded; and for creators, this makes a lot more sense than doing one-off negotiations over, like, email.* I was thinking about Harberger taxes on NFTs and how to ensure that NFT collectibles can always be transferred without needing a tax, using a ratcheting-up price as the mechanism. That doesn't work for collectibles because of wash trades with oneself (esp powered by flash loans), but something like it might make sense for certificate of impact NFTs.
A CoI NFT would be a NFT linked to a specific action or object, such as a research paper; it would be sold by the responsible agent; a CoI NFT contains the creator's address; a CoI NFT can be purchased/transferred at any time from its current owner by sending the last price + N ETH to the contract, where the last owner gets the last price as a refund and the creator gets the marginal N ETH as further payment for their impact.
So you might buy the NFT for Paul's latest blog post for 1 ETH from Paul, and then Jess decides it's actually more important, and buys it away from you for 1.1 ETH (you are then break-even, and Paul is at 1 + 0.1 ETH, and Jess is at -1.1 ETH); then pg decides he really likes it and buys it for 10 ETH, refunding Jess's 1.1 ETH and sending an additional 8.9 ETH to Paul... At that point, people collectively agree that the true worth of Paul's post is indeed about 10 ETH, and the NFT stops moving and pg gets the prestige of having the good philanthropic taste to have (retro-causally) patronized Paul & caused that post by commissioning it.
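The ratchet rule and the Paul/Jess/pg example can be sketched as a small ledger simulation. This is a plain-Python toy under stated assumptions, not a real smart contract: the class `RatchetCoI`, its `buy` method, and the use of integer milli-ETH units (to avoid floating-point rounding) are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import DefaultDict, Optional


@dataclass
class RatchetCoI:
    """Toy model of a ratcheting certificate-of-impact NFT (hypothetical, not on-chain code)."""

    creator: str
    owner: Optional[str] = None   # nobody holds the certificate until the first sale
    last_price: int = 0           # in milli-ETH (assumed unit for this sketch)
    balances: DefaultDict[str, int] = field(default_factory=lambda: defaultdict(int))

    def buy(self, buyer: str, price: int) -> None:
        """Transfer the certificate to `buyer`; the price can only go up."""
        if price <= self.last_price:
            raise ValueError("bid must exceed the last price; the certificate only ratchets up")
        self.balances[buyer] -= price
        if self.owner is not None:
            # The previous owner is refunded exactly what they paid: no gain, no loss.
            self.balances[self.owner] += self.last_price
        # The creator receives only the marginal increase over the last price.
        self.balances[self.creator] += price - self.last_price
        self.owner, self.last_price = buyer, price


coi = RatchetCoI(creator="Paul")
coi.buy("you", 1_000)    # 1 ETH, paid to Paul
coi.buy("Jess", 1_100)   # 1.1 ETH: you are refunded 1 ETH, Paul gets 0.1 ETH
coi.buy("pg", 10_000)    # 10 ETH: Jess is refunded 1.1 ETH, Paul gets 8.9 ETH

print(coi.owner, dict(coi.balances))
```

In this toy the balances always sum to zero, every past owner ends at break-even, and the creator's total equals the final price; a real contract would also have to deal with fees and whatever wash-trading concerns remain.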
The creator of impact gets all the revenue irreversibly so there's no pernicious speculative financial bubble problems; any person worldwide can contribute more at any time permissionlessly; only one person at a time 'owns' the collectible and gets the status & prestige of "I own and retroactively commissioned awesome thing X"; and the faster you bid it up to its true price (as you believe it), the more likely you are to win the game of musical chairs, incentivizing everyone to weigh in fast. (And to the extent there's a winner's curse, well, that's a good thing, since this is for public goods and other underincentivized things.)
* I turned down one or two impact requests for my own work because I couldn't decide if it was really a good idea to irrevocably sell this sort of nebulous right to my works, and if it was a good idea, didn't it then logically follow that I'd want to maximize my gains by some sort of public auction rather than negotiating one on one with the first buyer to come along & make an offer?
I mostly agree with that with the further caveat that I tend to think the low value reflects not that ML is useless but the inertia of a local optimum where the gains from automation are low because so little else is automated and vice-versa ("automation as colonization wave"). This is part of why, I think, we see the broader macroeconomic trends like big tech productivity pulling away: many organizations are just too incompetent to meaningfully restructure themselves or their activities to take full advantage. Software is surprisingly hard from a social and organizational point of view, and ML more so. A recent example is coronavirus/remote-work: it turns out that remote is in fact totally doable for all sorts of things people swore it couldn't work for - at least when you have a deadly global pandemic solving the coordination problem...
As for my specific tweet, I wasn't talking about making $$$ but just doing cool projects and research. People should be a little more imaginative about applications. Lots of people angst about how they can possibly compete with OA or GB or DM, but the reality is, as crowded as specific research topics like 'yet another efficient Transformer variant' may be, as soon as you add on a single qualifier like, 'DRL for dairy herd management' or 'for anime', you suddenly have the entire field to yourself. There's a big lag between what you see on Arxiv and what's out in the field. Even DL from 5 years ago, like CNNs, can be used for all sorts of things which they are not at present. (Making money or capturing value is, of course, an entirely different question; as fun as This Anime Does Not Exist may be, there's not really any good way to extract money. So it's a good thing we don't do it for the money.)
Lousy paper, IMO. There is much more relevant and informative research on compute scaling than that.
I think your confusion with the genetics papers is because they are talking about _effective_ population size (N~e~), which is not at all close to 'total population size'. Effective population size is a highly technical genetic statistic which has little to do with total population size except under conditions which definitely do not obtain for humans. It's vastly smaller for humans (such as 10^4) because populations have expanded so much, there are various demographic bottlenecks, and reproductive patterns have changed a great deal. It's entirely possible for effective population size to drop drastically even as the total population is growing rapidly. (For example, if one tribe with new technology genocided a distant tribe and replaced it; the total population might be growing rapidly due to the new tribe's superior agriculture, but the effective population size would have just shrunk drastically as a lot of genetic diversity gets wiped out. Ancient DNA studies indicate there has been an awful lot of population replacements going on during human history, and this is why effective population size has dropped so much.) I don't think you can get anything useful out of effective population size numbers for economics purposes without making so many assumptions and simplifications as to render the estimates far more misleading than whatever direct estimates you're trying to correct; they just measure something irrelevant but misleadingly similar sounding to what you want.
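One way to see how far effective and census sizes can diverge is the standard textbook approximation for a population whose size fluctuates across generations, where N~e~ is roughly the harmonic mean of the per-generation sizes. The sketch below assumes that approximation and uses made-up census numbers; it illustrates only the bottleneck effect, not the other factors mentioned above (such as changed reproductive patterns), and `effective_size` is a hypothetical helper name.

```python
from typing import Sequence


def effective_size(census_sizes: Sequence[int]) -> float:
    """Harmonic mean of per-generation population sizes (fluctuating-size approximation)."""
    return len(census_sizes) / sum(1.0 / n for n in census_sizes)


# Hypothetical census sizes over five generations: one sharp bottleneck, then rapid growth.
census = [10_000, 10_000, 100, 10_000, 1_000_000]

ne = effective_size(census)
arithmetic_mean = sum(census) / len(census)

print(f"arithmetic mean census size: {arithmetic_mean:,.0f}")
print(f"harmonic-mean effective size: {ne:,.0f}")
```

The single small generation dominates the harmonic mean, so the effective size stays in the hundreds even though the final census size is a million.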
This seems like a retread of Bostrom's argument that, despite astronomical waste, x-risk reduction is important regardless of whether it comes at the cost of growth. Does any part of this actually rely on Roodman's superexponential growth? It seems like it would be true for almost any growth rates (as long as it doesn't take like literally billions or hundreds of billions of years to reach the steady state).
“Recent GWASs on other complex traits, such as height, body mass index, and schizophrenia, demonstrated that with greater sample sizes, the SNP h2 increases. [...] we suspect that with greater sample sizes and better imputation and coverage of the common and rare allele spectrum, over time, SNP heritability in ASB [antisocial behavior] could approach the family based estimates.”
I don't know why Tielbeek says that, unless he's confusing SNP heritability with PGS: a SNP heritability estimate is unconnected to sample size. Increasing n will reduce the standard error but assuming you don't have a pathological case like GCTA computations diverging to a boundary of 0, it should not on average either increase or decrease the estimate... Better imputation and/or sequencing more will definitely yield a new, different, larger SNP heritability, but I am really doubtful that it will reach the family-based estimates: using pedigrees in GREML-KIN doesn't reach the family-based Neuroticism estimate, for example, even though it gets IQ close to the IQ lower bound.
For example, the meta-analysis by Polderman et al. (2015, Table 2) suggests that 93% of all studies on specific personality disorders “are consistent with a model where trait resemblance is solely due to additive genetic variation”. (Of note, for “social values” this fraction is still 63%).
Twin analysis can't distinguish between rare and common variants, AFAIK.
The SNP heritabilities I'm referring to are listed at https://en.wikipedia.org/w/index.php?title=Genome-wide_complex_trait_analysis&oldid=871623331#Psychological (there are quite low heritabilities across the board), and https://www.biorxiv.org/content/10.1101/106203v2 shows that the family-specific rare variants (which are still additive, just rare) are almost twice as large as the common variants. A common SNP heritability of 10% is still a serious limit, as it upper bounds the PGS which will be available anytime soon, and also hints at very small average effects making it even harder. Actually, 10% is much worse than it seems even if you compare to the quoted IQ's 30%, because personality is easy to measure compared to IQ, and the UKBB has better personality inventories than IQ measures (at least, substantially higher test-retest reliabilities IIRC).
Dominance...And what about epistasis? Is it just that there are quadrillions of possible combinations of interactions and so you would need astronomical sample sizes to achieve sufficient statistical power after correcting for multiple comparisons?
Yes. It is difficult to foresee any path towards cracking a reasonable amount of the epistasis, unless you have faith in neural net magic starting to work when you have millions or tens of millions of genomes, or something. So for the next decade, I'd predict, you can write off any hopes of exploiting epistasis to a degree remotely like we already can additivity. (Epistasis does make it a little harder to plan interventions: do you wind up in local optima? Does the intervention fall apart in the next generation after recombination? etc. But this is minor by comparison to the problem that no one knows what the epistasis is.) I'm less familiar with how well dominance can work.
So to summarize: the SNP heritabilities are all strikingly low, often <10%, and pretty much always <20%. These are real estimates and not anomalies driven by sampling error, nor largely deflated by measurement error. The PGSes, accordingly, are often near-zero and have no hits. The affordable increases in sample sizes using common SNP genotyping will push it up to the SNP heritability limit, hopefully; but for perspective, recall that IQ PGSes 2 years ago were *already* up to 11% (Allegrini et al 2018) and still have at least 20% to go, and IQ isn't even that big a GWAS success story (eg height is >40%). The 'huge success' story for personality research is that with another few million samples years and years from now, they can reach where a modestly successful trait was years ago before they hit a hard dead end and will need much more expensive sequencing technology in generally brand-new datasets, at which point the statistical power issues become far more daunting (because rare variants by definition are rare), and other sources of predictive power like epistatic variants will remain inaccessible (barring considerable luck in someone coming up with a method which can actually handle epistasis etc). The value of the possible selection for the foreseeable future will be very small, and is already exceeded by selection on many other traits, which will continue to progress more rapidly, increasing the delta, and making selection on personality traits an ever harder sell to parents since it will largely come at the expense of larger gains on other traits.
Could you select for personality traits? A little bit, yeah. But it's not going to work well compared to things selection does work well for, and it will continue not working well for a long time.