Opinion The Turing test is about us, not the bots, and it has failed.
Fans of the slow burn mainstream media U-turn had a treat last week.
On Saturday, the news broke that Blake Lemoine, a Google engineer charged with monitoring a chatbot called LaMDA for nastiness, had been put on paid leave for revealing confidential information.
Lemoine had indeed gone public, but instead of something useful like Google's messaging strategy (a trade secret if ever there was one), he made the claim that LaMDA was alive.
Armed with a transcript in which LaMDA did indeed claim sentience, and with claims that it had passed the Turing test, Lemoine was the tech whistleblower from heaven for the media. By the time the news had filtered onto BBC radio news on Sunday evening, it was being reported as an event of some importance.
On Twitter, it had been torn apart in a few hours, but who trusts Twitter with its large and active AI R&D community?
A couple of days later, the story was still flying, but by now journalists had brought in expert comment, by way of a handful of academics who had the usual reservations about expressing opinions.
On the whole, no, it probably wasn't, but, you know, it's a fascinating area to talk about.
Finally, as the story fell off the radar at the end of the week, the few remaining outlets still covering it had found better experts who, one presumes, were as exasperated as the rest of us. No. Absolutely not. And you won't find anyone in AI who thinks otherwise. Even so, the conversation still revolved around sentience rather than around how interesting the question itself was.
Google still has to use humans to check its chatbot's output for hate speech, but at least we were back on the planet.
For future reference and to save time for everyone, here’s the killer tell that a story is android paranoia – “The Turing Test” as a touchstone for sentience. It isn’t.
It was never meant thus. Turing proposed it in a 1950 paper as a way of avoiding the question "can machines think?"
He sensibly characterized that as unanswerable until you sort out what thought is. We hadn’t then. We haven’t now.
Instead, the test – can a machine hold a convincingly human conversation? – was designed to be a thought experiment to check arguments that machine intelligence was impossible. It tests human perceptions and misconceptions, but like Google’s “Quantum Supremacy” claims, the test itself is tautologous: passing the test just means the test was passed. By itself, it proves nothing more.
Take a hungry Labrador dog, which is to say any Labrador neither asleep nor dead, who becomes aware of the possibility of food.
An animal of prodigious and insatiable appetite, at the merest hint of available calories, the Labrador puts on a superb show of deep longing and immense unrequited need. Does this reflect a changed cognitive state analogous to the lovesick human teenager it so strongly resembles? Or is it learned behavior that turns emotional blackmail into snacks? We may think we know, but without a much wider context, we cannot. We might be gullible. Passing the lab test means you get fed. By itself, nothing more.
The first system to arguably pass the Turing test, in spirit if not the letter of the various versions Turing proposed, was an investigation into the psychology of human-machine interaction. ELIZA, the progenitor chatbot, was a 1966 program by MIT computer researcher Joseph Weizenbaum.
It was designed to crudely mimic the therapeutic practice of echoing a patient’s questions back to them.
“I want to murder my editor.”
“Why do you want to murder your editor?”
“He keeps making me hit deadlines.”
“Why do you dislike hitting deadlines?” and so on.
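ELIZA's trick was little more than keyword matching and echoing. A minimal sketch of the technique in Python, with invented rules and pronoun reflections for illustration only, not Weizenbaum's original script:

```python
import re

# Pronoun reflections so the echo reads naturally ("my" -> "your").
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are"}

def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in fragment.split())

# Keyword rules: match part of the input, echo it back as a question.
RULES = [
    (re.compile(r"i want to (.+)", re.I),
     "Why do you want to {}?"),
    (re.compile(r"(?:he|she) keeps (.+)", re.I),
     "Why does it bother you that he keeps {}?"),
]

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(reflect(match.group(1).rstrip(".")))
    return "Tell me more."  # fallback when no rule matches

print(respond("I want to murder my editor."))
# -> Why do you want to murder your editor?
```

The program understands nothing; it rearranges the patient's own words. That was enough to convince some of its first users that a mind was present, which was precisely Weizenbaum's unsettling finding.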
Famously, Weizenbaum was amazed when his secretary, one of the first test subjects, imbued it with intelligence and asked to be left alone with the terminal.
The Google chatbot is a distant descendant of ELIZA, fed on large amounts of written data from the internet and turned into language models by machine learning. It is an automated method actor.
A human actor who can't add up can play Turing most convincingly – but quiz them on the Entscheidungsproblem and you'll soon find out they're no mathematician. Large language models are very good at simulating conversation, but unless you have the wherewithal to generate the context that will test whether it is what it appears to be, you can't say more than that.
We are nowhere near defining sentience, although our increasingly nuanced appreciation of animal cognition is showing it can take many forms.
At least three types – avian, mammalian, and cephalopodian – with significant evolutionary distance look like three very different systems indeed. If machine sentience does happen, it won’t be by a chatbot suddenly printing out a cyborg bill of rights. It will come after decades of directed research, building on models and tests and successes and failures. It will not be an imitation of ourselves.
And that is why the Turing test, fascinating and thought-provoking though it was, has outlived its shelf life. It does not do what people think it does, rather it has been traduced into serving as a Hollywood adjunct that focuses on a fantasy. It soaks up mainstream attention that should be spent on the real dangers of machine-created information. It is the astrology of AI, not the astronomy.
The very term “artificial intelligence” is just as bad, as everyone from Turing on has known. We’re stuck with that. But it’s time to move the conversation on, and say goodbye to the brilliant Alan Turing’s least useful legacy. ®