Until recently, philosophy of artificial intelligence seemed to be mostly about arguments for why the Turing test was not a useful benchmark for intelligence. Pretty much everyone who had ever thought about the problem seriously had come to the same conclusion.
The fundamental issue was the assumption that general intelligence is an objective property that can be determined experimentally. It's better to consider intelligence an abstraction that may help us to understand the behavior of a system.
A system where a fixed LLM provides answers to prompts is little more than a Chinese room. If we give the system agency to interact with external systems on its own initiative, we get qualitatively different behavior. The same happens if we add memory that lets the system scale beyond the fixed context window. Now we definitely have some aspects of general intelligence, but something still seems to be missing.
Current AIs are essentially symbolic reasoning systems that rely on a fixed model to provide intuition. But the system never learns. It can't update its intuition based on its experiences.
Maybe the ability to learn in a useful way is the final obstacle on the way towards AGI. Or maybe once again, once we start thinking we are close to solving intelligence, we realize that there is more to intelligence than what we had thought so far.
The Turing test isn't as bad as people make it out to be. The naive version, where people just try to vibe out whether something is a human or not, is obviously wrong. On the other hand, if you set a good scientist loose on the Turing test, give them as many interactions as they want to come to a conclusion, and you let them build tools to assist in the analysis, it suddenly becomes quite interesting again.
For example, looking at the statistical distribution of the chat over long time horizons, and at input/output correlations in the same way, would out even the best current models in a "Pro Turing Test." Ironically, the biggest tell in such a scenario would be the excess capabilities an AI displays that a human could not match.
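As a rough sketch of what such tooling could look like (an illustration only, not an established test): compare the distribution of reply lengths from the subject against a known-human baseline, and check how strongly reply length tracks prompt length. The function names, the choice of message length as the feature, and the use of a Kolmogorov-Smirnov statistic are all assumptions made for the example.

```python
from bisect import bisect_right


def message_lengths(messages):
    """Length of each message in whitespace-separated words."""
    return [len(m.split()) for m in messages]


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the two empirical cumulative distribution functions (0.0 to 1.0)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # those points is enough to find the maximum gap.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences.
    Returns 0.0 if either sequence is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def length_profile(prompts, replies, human_baseline_replies):
    """Hypothetical summary of one subject: how far its reply-length
    distribution sits from a human baseline, and how strongly reply
    length correlates with prompt length."""
    prompt_len = message_lengths(prompts)
    reply_len = message_lengths(replies)
    baseline_len = message_lengths(human_baseline_replies)
    return {
        "ks_vs_human": ks_statistic(reply_len, baseline_len),
        "prompt_reply_corr": pearson(prompt_len, reply_len),
    }
```

An interrogator would feed thousands of exchanges into something like `length_profile` and look for a subject whose numbers sit far from those of known humans. Message length is only the crudest possible feature; the same comparison could be run on response timing, vocabulary, or error rates.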
I would consider something generally intelligent if it is capable of sustaining itself. So... self-sufficiency? I don't see why the bar would be much lower than that. And before people chime in that kids aren't self-sufficient, so by that definition I wouldn't consider them generally intelligent, which is obviously false... to that I would say they're still in pre-training.
To my knowledge, the Turing test has not been blown out of the water. The versions I saw were time-limited, and participants were not pushed to interrogate hard.
It's crystal-clear that a model that was trained specifically to fool expert interrogators in a Turing test would, in fact, be able to do so. You'd have to sandbag the model just to keep it from tipping its hand by being too good.
We don't have any such models right now, AFAIK, so we can't run such a test. They wouldn't be much good for anything else, and would likely spark ethical concerns due to potential for misuse. But I have no doubt that it's possible to train for the Turing test.
I don't think that was the intent of the comment. It was more that true AGI should be so useful and transformative that it unlocks enough value and efficiency to boost GDP, much like the Industrial Revolution or the harnessing of electricity, rather than being a fancy chatbot.
Not equivalent, but I do think a necessary byproduct of actual AGI is that it will be able to solve actual problems in the real world in a way that generates positive value on a large enough scale that it will show up in GDP.
Humans will never accept that we created AI; they'll go so far as to say we were never intelligent in the first place. That is the true power of the AI effect.
And yet another way to look at it is that maybe current LLM agents are AGI, but it turns out that AGI in this form is actually not that useful because of its many limitations, and solving those limitations will be a slow and gradual process.
What is the benchmark now that the Turing test has been blown out of the water?