As ISC2 Security Congress 2024 got underway in Las Vegas, attendees learned that AI still gets a lot wrong, sometimes amusingly so. However, the underlying weaknesses of the technology need to be factored into the next generation of applications, or the industry risks a backlash against its use.

The world of artificial intelligence (AI) is living through the best of times but also the worst of times. On the one hand, AI is everywhere, an increasingly important enabler for many everyday processes, from customer service chatbots to solving protein folding, a feat that won computer scientists a Nobel Prize in Chemistry in the process.

However, as the opening ISC2 Security Congress keynote presentation, Lessons from the Lighter Side of Artificial Intelligence by AI speaker and humorist Janelle Shane, set out to remind us, AI still struggles with many things humans find so easy they aren’t even considered problems.

AI’s Convincing Alternate Reality

According to Shane, examples of this are all around us, a point she underlined with cases drawn from her own experiments with AI as well as from popular culture. Anyone who’s used generative AI will have experienced the way these platforms sometimes hallucinate answers, sometimes bizarrely so. What’s not always appreciated, however, is the scale of the parallel reality AI can build and how difficult it can be for humans to realize they are being fooled by it.

Shane mentioned several incidents, starting with a help bot on an automation platform that sent an enquiring user to a helpful video tutorial. Except no such video tutorial existed, and what the user saw when they clicked on the link was a video of Rick Astley singing Never Gonna Give You Up.

Or the Air Canada bot that offered a customer a discount on a flight under a policy that turned out not to exist when he tried to claim it (Air Canada disputed its responsibility for the error but lost the case).

“Their job is to come up with what is probable, not what has a connection to truth. They don’t know what customer service is,” said Shane of these bot problems.

AI has no idea what you’re asking it to do. Surprisingly, the tendency of AI to misinterpret tasks within its frame of reference has been an issue since the early days of machine learning. “And it keeps happening,” Shane pointed out. “It turns out to be easier to produce fluent text than to produce text that corresponds to reality.”

AI also struggles with exceptions, that is, unusual data that is possible within the data set it was trained on but which it has never encountered. Shane used the example of a spotless giraffe born in a U.S. zoo, an extremely unusual event. When an image of this rare creature was shown to three image-recognition AIs, all claimed, against the visible evidence, that the spotless giraffe had spots. “I can see that the AIs just memorized a giraffe,” Shane said. The AI lacked the context that spots were not a constant.

This makes assessing the performance of AI difficult, because you must be sure that the questions you are testing its answers against haven’t simply been memorized. Can AI be fixed? Shane noted several approaches that illustrate how developers have tried to cope with AI’s limitations.

Make the Problem Very Narrow

The first developer workaround is to train algorithms on a very specific data set. Shane mentioned the example of a Stanford research project that showed AI could reliably distinguish grains of sand from different environments, such as dunes, riverbeds, or glaciers.

“If you want to work well with AI you have to give it narrow chess problems,” said Shane.

Use Humans Instead

A second solution (or cheat) is to use humans to remotely control the AI, correcting performance gaps as they occur. Shane cited self-driving cars and home delivery robots under remote human control as examples of this approach. It works, but it removes the elegance of having machines do all the work.

Lower the Stakes

This approach is about making it acceptable for AI systems to give imprecise answers. Shane used the example of a phone app that can correctly flag a toxic mushroom but declines to say whether other mushrooms are edible. Essentially, this allows the AI to avoid being wrong by avoiding giving a precise answer.

Don’t Use AI

Perhaps there are problems that are too high stakes to risk using AI at all. Today, there is a growing number of examples where this principle should have been applied, including a personality-judgment AI that rated human subjects more highly simply because there was a bookcase in the background.

A second example is AIs designed to detect whether text has been written by a human or a machine, which have shown false positive rates of 5.1% for native speakers but 61% for non-native speakers.

“If the consequences of getting it wrong are serious for even some people, it’s not a good application of AI,” argued Shane.

Shane ended with a warning that not admitting to these failings could have damaging consequences for the industry building AI systems. A similar principle applies to the use of AI in cybersecurity.

“An AI backlash is coming. There will be a reckoning regarding which applications are worth it. The conversation needs to be based around what the limitations are.”