BLOOMINGTON, Ind. – Google’s launch of Bard, it’s search-integrated, AI-powered chatbot, went wrong when the bot’s first advertisement accidentally showed it was unable to find and present accurate information to users.
Research by professors at the Indiana University Kelley School of Business and the University of Minnesota’s Carlson School of Management explains why it may be harder for the creator of the world’s largest search engine to write off the situation as a temporary issue.
Although it isn’t uncommon for software vendors to release incomplete products and subsequently fix bugs and provide additional features, the research shows this may not be the best strategy for AI.
As seen through a one-day $100 billion decrease in market value for Alphabet, Google’s parent company, a botched demo can cause significant damage. Findings in an article published by the journal ACM Transactions on Computer-Human Interaction indicates that errors that occur early in users’ interactions with an algorithm can have a lasting negative impact on trust and reliance.
Antino Kim and Jingjing Zhang, associate professors of operations and decision technologies at Kelley, are co-authors of the paper, “When Algorithms Err: Differential Impact of Early vs. Late Errors on Users’ Reliance on Algorithms,” with Mochen Yang, assistant professor of information and decision sciences at Carlson. Zhang also is co-director of the Institute for Business Analytics at Kelley. Yang taught at Kelley in 2018-19.
Known as “algorithm aversion,” users tend to avoid using algorithms, particularly after encountering an error. The researchers found that giving users more control over AI results can alleviate some of the negative impacts of early errors.
Kim, Yang and Zhang examined the situation through the lens of their research and present their analysis below:
“Not long ago, search engines simply fetched existing content from the internet based on the keywords users provided. Then, in late 2022, ChatGPT, a conversational AI developed by OpenAI, took the internet by storm. Within just a couple of months, Microsoft announced its multibillion-dollar investment in OpenAI and integrated ChatGPT capabilities into Bing.
“Understandably, Google, the defending champion of search engines, was feeling the pressure, and it was quick to react. On Feb. 6, Google ran an advertisement showcasing its own conversational AI service, Bard. Unfortunately, in its first demo, Bard produced a factual error, and the market was not forgiving of Bard’s bad first impression. This error led to a $100 billion decrease in market value for Alphabet, Google’s parent company.
“In the aftermath, Google employees criticized the CEO for the ‘rushed, botched’ announcement of Bard, and Google is now asking staff to help fix the AI’s ‘bad responses’ manually.
“Predictive algorithms and generative AI — broadly referred to as “algorithms” in this article — operate using probabilistic processes instead of deterministic ones, meaning that even the best algorithms can sometimes make mistakes.
“However, users may not tolerate such mistakes, and the term ‘algorithm aversion’ refers to users’ tendency to avoid using algorithms, particularly after encountering an error.
“Not all errors have the same effect on users and, in Google’s case, the market. Our research suggests that errors occurring early on in users’ interactions with an algorithm, before they have had a chance to build trust through successful interactions, have a long-lasting negative impact on users’ trust and reliance.
“Essentially, early errors can create a bad first impression that persists for a long time. In fact, during our experiment, where participants repeatedly interacted with an algorithm, their trust levels following an early error never fully recovered to the level of no error.
“The situation was different, however, for errors that occurred after participants had already had enough successful interactions with the algorithm and built trust. In such cases, participants were more forgiving when algorithms made a mistake, treating it as a one-time fluke. As a result, the level of trust and reliance did not suffer significantly.
“To be fair to Google, it is not uncommon for traditional software vendors to release incomplete products and subsequently fix bugs and provide additional features. However, for AI, this may not be a wise strategy, as the damage from a botched demo can be significant. Our research suggests that Google’s road to recovery from the negative impact of the error may be long.
“So, what steps can AI systems take to mitigate the effects of errors like the one made by Google’s Bard? Our findings suggest that giving users control over how to use the algorithm’s results can alleviate some of the negative impacts of early errors.
“It is possible that Bard’s error had such a significant adverse effect because of the confidence with which the incorrect result was presented. When asked, ‘What new discoveries from the James Webb Space Telescope can I tell my 9-year-old about?,’ the chatbot responded with bullet points that the telescope took the very first pictures of exoplanets — a factually incorrect claim that Google could have verified by Googling it.
“For algorithms that involve probabilistic processes, there is typically a score marking the confidence level for the result. When the score is below a certain threshold, it may be wise to give users more control. One example could be reverting to the search engine mode, where several credible and relevant sources are presented for users to navigate.
“After all, that is what Google does best, and it may be a better approach than hastily releasing another AI that may confidently return an incorrect answer.”
Editor’s note: Professors Kim, Zhang and Mochen are available for interviews. Contact George Vlahakis of the Kelley School at firstname.lastname@example.org for assistance.