I have generally been quite skeptical about the view that we are on the cusp of a revolution that will lead us to artificial general intelligence in the next 50 years or so.
Aside from the fundamental limitations of current AI systems, and the flaws in extrapolating their remarkable ability at narrow tasks towards more general learning by appealing to "exponential" growth, there is another issue with the discourse on AI that I want to highlight.
One of the primary reasons to believe that AGI will happen in the near to mid term future comes from predictions of experts working in the field, the majority of whom seem to think that we will have AGI by 2100 at the latest.
While there is every reason to attach credence to their perspective, it should be noted that deep learning, the framework that underpins most recent developments in AI, including language models like BERT and GPT-3 and strategy-game champions like AlphaGo, is notoriously hard to decipher from a theoretical perspective.
It would be a mistake to assume that the people who design, develop and deploy these models necessarily understand why they happen to be as successful as they are. This may sound like a rather strange statement to make, but the reality is that despite the incredible pace of progress across various frontiers of AI with deep learning, our knowledge of why it works -- the mathematical theory of it -- lags behind immensely.
To be clear, I am not at all suggesting that research scientists at Google or DeepMind have no knowledge at all of why the models they design and deploy work. They are certainly guided by various ideas and heuristics when deciding on the loss function, the type of attention mechanism to use, the iterative update to the reward, the overall architecture of the network, etc. However, there are two things to note here: first, a lot of the design is based on experimenting with various functional forms, wiring combinations, convolution structures and parameter choices; second, the fact that there are heuristics and a high-level understanding of what is happening does not imply that there is a first-principles mathematical explanation for it.
There are people who study the theoretical side of deep learning, work towards establishing exact results, and aim to understand why the model training process is so incredibly successful. The progress there has been rather limited, and it is certainly well behind the state of the art in terms of performance. There are a lot of unusual things about deep learning, among them the fact that core concepts in conventional machine learning (such as overfitting) simply do not seem to apply. For a more technical view on this, watch this amazing talk by Sanjeev Arora where he explains how intriguing deep learning models and their training are.
This should be contrasted with physics, where our understanding of theories is much deeper and more fundamental. There is a very precise mathematical framework to characterize the physics of, say, electrons or quarks, and, at the other end of the spectrum, a model to understand cosmology. There is nothing even remotely comparable to that in deep learning.
Given all this, one should be more skeptical about prediction timelines for a qualitatively superior intelligence from experts in this field. The fact that there are considerable gaps in our understanding would suggest that expert opinion is perhaps guided less by some deeper insight into the learning and generalization process of AI models and more by a higher-level examination of the rapid progress of AI, i.e., their views may be relatively closer to those of a lay person. Couple this with the fact that we have a very limited understanding of human consciousness and how it is related to the electro-physiological properties of the brain. Such limitations make it considerably harder to predict timelines with any degree of certainty.
Great points again!
I have only cursorily examined the links you've shared (bookmarked them for later) but I hope the central thrust of what I am saying does not depend too strongly on being closely familiar with the contents of those.
A few clarifications are in order. I am really not sure about AGI timelines and that's why I am reluctant to attach any probability to it. For instance, the only reason I believe that there is a less than 50% chance that we will have AGI in the next 50 years is that we have not seen it yet and, IMO, it seems rather unlikely that the current directions will lead us there. But that is a very weak justification. What I do know is that there has to be some radical qualitative change for artificial agents to go from excelling in narrow tasks to developing general intelligence.
That said, it may seem like nit-picking but I do want to draw the distinction between "not significant progress" and "no progress at all" towards AGI. I am stating only the former; indeed, I have no doubt that we have made incredible progress with algorithms in general. I am less convinced about how much those algorithms help us get closer to AGI. (In hindsight, it may turn out that our current deep learning approaches such as GANs contain path-breaking proto-AGI ideas/principles, but I am unable to see it that way.)
If we consider a scale of 0-100 where 100 represents AGI attainment and 0 is some starting point in the 1950s, I have no clear idea whether the progress we've made thus far is close to 5 or 0.5 or even 0.05. I have no strong arguments to justify one or the other because I am way too uncertain about how far the final stage is.
There can also be no question with respect to the other categories of progress that you have highlighted, such as compute power, infrastructure and large datasets; indeed, I see these as central to the remarkable performance we have come to witness with deep learning models.
The perspective I have is that, while we have made plenty of progress in understanding several processes in the brain, such as signal propagation, the mapping of specific sensory stimuli to neuronal activity, and theories of how brain wiring at birth may have encoded several learning algorithms, these constitute piecemeal knowledge and still seem quite a few strides removed from the bigger question: how do we attain high-level cognition, develop abstract thinking, and become able to reason and solve complex mathematical problems?
I agree that we don't necessarily have to reproduce the exact wiring or the functional relation in order to create a general intelligence (which is why I mentioned the equivalence classes).
A finite number of genes implies finite steps/information/computation (and that is not disputable, of course), but the number of potential wiring options in the brain and functional forms between input and output is exponentially large. (It is, in principle, infinite if we want to reproduce the exact function, but we both agree that that may not be necessary.) Pure exploratory search may not be feasible, and one may make the case that with appropriate priors and assuming some modular structure of the brain, the search space will reduce considerably, but still, how much of a quantitative grip do we have on this? And how much rests on speculation?
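
To put a rough number on the "exponentially large" point, here is a minimal back-of-the-envelope sketch. It rests on a deliberately crude idealization that I am assuming purely for illustration: wiring is treated as a directed graph with binary connections and no self-loops, and the genome is treated as a raw bit budget of about 2 bits per base pair over roughly 3.1 billion base pairs. The function names and the `GENOME_BITS` constant are hypothetical, and none of this is meant as a model of how brains are actually specified.

```python
import math

# Assumption: ~3.1 billion base pairs, 4 possible bases => 2 bits each.
# This is a crude upper bound on raw genomic information, not a measurement
# of how much of it relates to brain wiring.
GENOME_BITS = 2 * 3_100_000_000


def log2_wiring_count(n_units: int) -> int:
    """log2 of the number of directed graphs (no self-loops) on n_units nodes.

    Each of the n*(n-1) ordered pairs is either connected or not, so there
    are 2**(n*(n-1)) possible wirings in this idealization.
    """
    return n_units * (n_units - 1)


def largest_fully_specified(bits: int) -> int:
    """Largest n whose wiring could be written out with one bit per possible
    connection, i.e. the largest n with n*(n-1) <= bits."""
    return (1 + math.isqrt(1 + 4 * bits)) // 2


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(f"{n:>9} units -> 2^{log2_wiring_count(n)} possible wirings")
    print("units explicitly specifiable from the bit budget:",
          largest_fully_specified(GENOME_BITS))
```

Under these assumptions, an explicit bit-per-connection description runs out at fewer than a hundred thousand units, which is one way of seeing why priors and modular structure would have to do most of the work. It says nothing about how much of the work they actually do, which is exactly the open question.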