In an interview with Daniel Hambury at London's Evening Standard newspaper, Wales chews over some of the issues inherent in the technology – particularly its tendency to "hallucinate," or flat-out make things up – but points out that "using AI to triple the number of Wikipedia entries wouldn’t increase our running costs by more than £1,000 a year."
One early use case, says Wales, might be to use a large language model (LLM) like GPT to compare multiple articles, looking for points that contradict one another, and use its findings to flag pieces that Wikipedia's army of human volunteers needs to work on.
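For the curious, here's roughly what that might look like in practice – a minimal sketch in Python, assuming the OpenAI client library; the model choice, prompt wording, and `find_contradictions` function are my own illustration, not anything Wikipedia has built:

```python
# Minimal sketch: ask an LLM to flag contradictions between two articles.
# Assumes the OpenAI Python client (openai>=1.0); model and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def find_contradictions(article_a: str, article_b: str) -> str:
    """Return the model's list of factual claims on which the two texts disagree."""
    prompt = (
        "Compare the two encyclopedia articles below. List any factual claims "
        "on which they contradict each other, quoting both passages. "
        "If there are no contradictions, say so.\n\n"
        f"ARTICLE A:\n{article_a}\n\nARTICLE B:\n{article_b}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the comparison as repeatable as possible
    )
    return response.choices[0].message.content
```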
But he's definitely considering just having these LLMs write pages.
"I think we’re still a way away from: ‘ChatGPT, please write a Wikipedia entry about the Empire State Building,'" he tells Hambury, "but I don’t know how far away we are from that, certainly closer than I would have thought two years ago."
One possible scenario is to have the AI comb through Wikipedia looking for its many gaps – potentially useful pages that have never been written – and attempt to create summary entries for them using information from the Web.
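Sketched the same way – and with the same caveats that the client, model, and prompt are my assumptions – a gap-filling pass might force the model to ground every sentence in supplied sources and hand its draft to a human editor rather than publishing it directly:

```python
# Sketch: draft a stub entry for a missing topic, clearly marked for human review.
# Assumes the OpenAI Python client; nothing here is an actual Wikipedia workflow.
from openai import OpenAI

client = OpenAI()


def draft_stub(topic: str, sources: list[str]) -> str:
    """Draft a short, source-grounded stub; a human editor must verify every claim."""
    prompt = (
        f"Write a three-sentence encyclopedia stub about '{topic}' using ONLY "
        "the source excerpts below. Cite which excerpt supports each sentence, "
        "and reply 'INSUFFICIENT SOURCES' instead of guessing if they don't cover it.\n\n"
        + "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return "DRAFT – NEEDS HUMAN REVIEW\n" + response.choices[0].message.content
```

Forcing the model to cite its inputs, and to say when it can't, doesn't eliminate hallucination – but it gives reviewers something concrete to check, which matters given the accuracy worry Wales raises next.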
But Wales is aware that Wikipedia's entire reputation rests on the perception of accuracy, and that accuracy is exactly where LLMs like GPT currently fall short.
"It has a tendency to just make stuff up out of thin air which is just really bad for Wikipedia," he says. "That’s just not OK. We’ve got to be really careful about that."
If LLMs begin writing a central knowledge repository like Wikipedia, hallucinations or lies that aren't immediately caught will begin to snowball. People will use those non-facts in their own writing, and subsequent AIs will be trained with these non-facts baked in, making it difficult to correct them in the longer term and driving us deeper into this "post-truth" era.
Wales is also concerned about whether using LLMs to expand the encyclopedia would ease or exacerbate Wikipedia's issues of systemic and unconscious bias. The site is currently written and maintained by volunteers, an overwhelming majority of whom are white males, so it tends to ignore topics that don't interest this group, and to cover others from a particular perspective.
ChatGPT has been explicitly designed to present a balanced perspective on topics where it can, in an effort to bring some nuance back to discussions where people on different sides increasingly struggle to find any common ground. But GPT has its own bias problems, inherited from its training data.
It's a thorny topic, and it's certainly got me considering whether I'd keep donating to the site if it goes down that road. But realistically, any organization that isn't re-orienting around the phenomenal capabilities of next-gen LLMs is putting itself at a huge disadvantage, and it's crazy to expect this stuff won't start getting rolled in everywhere.
Source: Evening Standard