For many years, futurists and technologists have dreamed about having conversations with their computers. The amount of progress that has been made is astounding when you consider the state of the art in 2004. Billions of devices now sit in our hands, and our homes actively listen to our questions and answer as best they can. Despite all the time, money, and effort invested, chatbots of all kinds have not taken over the world as their developers had hoped. They are miraculous. They are also dull. It’s important to consider why.
The term “chatbot” covers a wide range of systems, from voice assistants to artificial intelligence and everything in between. In the bad old days, chatting with your computer meant typing into a window and watching as the machine attempted to simulate a real conversation instead of having one. Restating user inputs as a question was an old trick of ELIZA (1964–1967) that helped sell this performance, and the approach persisted all the way up to the SmarterChild chatbot from 2001. The other strand of this effort involved digitizing spoken input using voice-to-text engines, such as the sometimes excellent but often frustrating offering from Nuance.
In 2011, the ideas from that early work came together in Siri for the iPhone 4S, which was quietly built on Nuance’s work. Amazon founder Jeff Bezos saw Siri’s promise early and launched a large internal project to make a homegrown competitor. In 2014, Alexa arrived, with Cortana and Google Assistant following in subsequent years. Natural language computing was now available on countless smartphones and smart home devices.
Companies are largely reluctant to be specific about the cost of building new projects, but chat has been costly. Forbes reported in 2011 that buying the startup behind Siri cost Apple $200 million. In 2018, The Wall Street Journal quoted Dave Limp, who said Amazon’s Alexa team had more than 10,000 employees. A Business Insider story from 2022 suggested the company pegged more than $10 billion in losses on Alexa’s development. Last year, The Information claimed Apple is now spending a million dollars a day on AI development.
So, what do we use this costly technology for? Turning our smart bulbs on and off, playing music, answering the doorbell and maybe getting the sports scores. In the case of AI, perhaps getting poorly summarized web search results (or an image of human subjects with too many fingers). You’re certainly not having much in the way of meaningful conversation or pulling vital data out of these things. Because in pretty much every case, their comprehension sucks and they struggle with the nuances of human speech. And this isn’t isolated. In 2021, Bloomberg reported on internal Amazon data saying up to a quarter of buyers stop using their Alexa unit entirely in the second week of owning one.
The oft-cited goal has been to make these platforms conversationally intelligent, answering your questions and responding to your commands. But while they can do some basic things pretty well, like mostly understanding when you ask them to turn your lights down, everything else isn’t so smooth. Natural language tricks users into thinking the systems are more sophisticated than they actually are. So when it comes time to ask a complex question, you’re more likely to get the first few lines of a Wikipedia page, eroding any faith in their ability to do more than play music or crank the thermostat.
The assumption is that generative AIs bolted onto these natural language interfaces will solve all of the issues presently associated with voice. And yes, on one hand, these systems will be better at pantomiming a realistic conversation and trying to give you what you ask for. But, on the other hand, when you actually look at what comes out the other side, it’s often gibberish. These systems are making gestures toward surface-level interactions but can’t do anything more substantive. Don’t forget when Sports Illustrated tried to use AI-generated content that boldly claimed volleyball could be “tricky to get into, especially without an actual ball to practice with.” No wonder so many of these systems are, as Bloomberg reported last year, propped up by underpaid human labor.
Of course, the form’s boosters will suggest it’s early days and, as OpenAI CEO Sam Altman has said recently, we still need billions of dollars more in chip research and development. That makes a mockery of the decades of development and billions of dollars already spent to get where we are today. But it’s not just cash or chips that are the issue: Last year, The New York Times reported the power demands of AI alone could skyrocket to as much as 134 terawatt hours per year by 2027. Given the urgent need to curb power consumption and make things more efficient, that doesn’t bode well for either the future of AI development or our planet.
We’ve had 20 years of development, but chatbots still haven’t caught on in the ways we were told they would. At first, it was because they simply struggled to understand what we wanted, but even if that’s solved, would we suddenly embrace them? After all, the underlying problem remains: We simply don’t trust these platforms, both because we have no faith in their ability to do what we ask them to and because of the motivations of their creators.
One of the most enduring examples of natural language computing in fiction, and one often cited by real-world makers, is the computer from Star Trek: The Next Generation. But even there, with a voice assistant that seems to possess something close to general intelligence, it’s not trusted to run the ship on its own. A crew member still sits at every station, carrying out the orders of the captain and generally performing the mission. Even in a future so advanced it’s free of material need, beings still crave the sensation of control.