Artificial intelligence (AI) is rarely out of the news, particularly generative AI such as ChatGPT, Gemini and Copilot.
These systems can produce effective text, answer complex questions and generate stunning images. But the software also presents challenges for safe business use. In three posts we will explore how to deal with errors and hallucinations, environmental impact and copyright issues.
At first sight, generative AI’s abilities feel remarkable. It seems that many routine writing and research tasks can be handed over to the software, whether it’s copy for the company website, research into a legal challenge, or getting a better understanding of a complex business problem. And there is no doubt that AI can produce convincing-looking text. But there is good evidence that AI can easily fall into error, suffering from the entertainingly named phenomenon of hallucination.
I often write on science issues and was recently researching a telescope known as BICEP2. I don’t employ generative AI in my writing, but I do use search engines and wanted to see explanations of why the telescope was located at the South Pole – not the easiest place to build. Search engines now often lead with an AI summary, or offer AI expansion of results. My search engine of choice came up with this:
“The BICEP2 telescope was located at the South Pole to observe the cosmic microwave background in a clear, dark, and cold environment. This helped the telescope detect primordial gravitational waves, which could provide evidence of a theory of inflation.”
You don’t need to know anything about telescopes or gravitational waves to see how two major errors in this text illustrate the risk of relying on generative AI. Firstly, BICEP2 was a radio telescope. The AI had connected telescopes with good dark locations – perfectly reasonable for those using visible light, but irrelevant to radio reception. Worse, though, is the final sentence. While an announcement was indeed made in 2014 that BICEP2 had detected these gravitational waves, the claim soon had to be withdrawn, as the apparent discovery was caused by interstellar dust interfering with the signal.
What lies behind this ability to hallucinate is the inaccuracy of the label ‘artificial intelligence’. Intelligence implies an understanding of the subject – but generative AI has no comprehension of either the prompt provided by the user or the original sources it was trained on. Instead, what we see is the result of sophisticated pattern matching.
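To get a feel for how fluent text can emerge from statistics alone, here is a minimal sketch in Python. It is a crude toy, nothing like the scale or sophistication of a real large language model, and the training sentences and function names are simply my own illustrations – but it shows the same principle of predicting a likely next word from patterns, with no notion of truth.

```python
import random
from collections import defaultdict

# A toy 'model': it learns which word tends to follow which,
# with no understanding of what any of the words mean.
training_text = (
    "the telescope was sited at a dark location "
    "the telescope was sited at the south pole "
    "the telescope detected gravitational waves"
)

# Count which words follow each word (simple bigram statistics).
following = defaultdict(list)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    following[current].append(nxt)

def generate(start, length=8):
    """Produce plausible-looking text purely from word-pair statistics."""
    out = [start]
    for _ in range(length):
        options = following.get(out[-1])
        if not options:
            break
        # Pick a statistically likely next word; truth never enters into it.
        out.append(random.choice(options))
    return " ".join(out)

print(generate("the"))
```

Run it a few times and it will produce different fluent-sounding claims, including false ones such as ‘the telescope detected gravitational waves’, because statistically likely and factually correct are not the same thing.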
The first mistake probably arose because BICEP2 is rarely explicitly described as a radio telescope, while it is often said that a telescope is sited in a location with dark skies. Similar problems can arise when the same word has two different meanings – for example, whether ‘code’ refers to software or a cipher. The second error probably reflects the huge amount of online coverage that followed the initial announcement of the apparent discovery, which, had it been true, would have been very important to science. The nature of the internet is that incorrect statements (about businesses just as much as telescopes) tend to remain in the data an AI is trained on.
Other hallucinations can be caused by the phrasing of the prompt. For example, I asked ChatGPT ‘What are the five laptop operating systems used in all businesses?’ My prompt was intentionally ambiguous – but the reading I intended required all five to be used by every company. Technically, then, the correct answer was ‘There aren’t five’, but I was given a list. Admittedly, the generative AI emphasised that Windows was the most widely used OS, and said that most businesses ‘rely primarily on Windows and macOS’, but it still happily came up with five (interestingly, with variants of the prompt, the fifth example differed).
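That sensitivity to wording is easy to test for yourself. The sketch below repeats the experiment programmatically using the OpenAI Python client; the model name and prompt variants are illustrative choices of mine, and any provider’s API could be substituted.

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

# Variants of the same ambiguous question, to see how the answer shifts.
prompts = [
    "What are the five laptop operating systems used in all businesses?",
    "Which five laptop operating systems are used by every business?",
    "Name five laptop operating systems found in all companies.",
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice, not a recommendation
        messages=[{"role": "user", "content": prompt}],
    )
    print(prompt)
    print(response.choices[0].message.content)
    print("-" * 40)
```

Comparing the answers side by side quickly shows where the model is bending the facts to satisfy the question rather than answering it.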
Perhaps most worryingly, AI does not know the quality of the information it is using, nor does it understand the concept of correctly identifying a source. Lawyers have been caught out in court after citing case law based on fake precedents, complete with made-up references to legal databases.
Should businesses ignore generative AI, then? For written material where quality matters, a good human writer is still the best solution. When researching a topic, however, AI can provide a decent starting point – as long as any factual claims are tested against reliable sources. This may reduce the labour-saving benefits of generative AI, but it will lessen the chances of making a business faux pas.