How a computer sees history after "reading" 35 million news stories
So far, humans have relied on the written word to record what we know as history. When artificial intelligence researchers ran billions of those words from decades of news coverage through an automated analysis, however, further patterns and insights emerged.
A team from the University of Bristol ran 35 million articles from 100 local British newspapers spanning 150 years through both a simple content analysis and more sophisticated machine learning processes. With machines "reading" the nearly 30 billion words, the simple analysis allowed researchers to easily and accurately identify big events like wars and epidemics.
Perhaps most interesting, the techniques also allowed the researchers to see the rise and fall of different trends over the study period, 1800 to 1950. For example, they could track the decline of steam and the corresponding rise of electricity; the opposing trajectories crossed in 1898. Similarly, they saw that trains overtook horses in popularity in 1902.
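The article does not describe how the trajectories were computed, so the following is only a minimal sketch of one way such a crossover could be found: express each term's yearly count as a share of all words printed that year, then look for the first year in which the rising term overtakes the falling one. The function names, the placeholder year indices and all of the counts are hypothetical and are not taken from the study.

```python
from typing import Dict, Optional


def relative_frequency(term_counts: Dict[int, int],
                       total_words: Dict[int, int]) -> Dict[int, float]:
    """Share of all words in each year accounted for by one term."""
    return {
        year: term_counts.get(year, 0) / total_words[year]
        for year in sorted(total_words)
        if total_words[year] > 0  # skip years with no text at all
    }


def crossover_year(falling: Dict[int, float],
                   rising: Dict[int, float]) -> Optional[int]:
    """First year in which `rising` exceeds `falling` after being at or below it."""
    was_below = False
    for year in sorted(set(falling) & set(rising)):
        if rising[year] > falling[year]:
            if was_below:
                return year
        else:
            was_below = True
    return None


# Toy numbers for illustration only (NOT data from the study).
# Keys are placeholder year indices, not real dates.
total = {1: 1000, 2: 1000, 3: 2000, 4: 2000}
steam = {1: 50, 2: 40, 3: 60, 4: 40}
electricity = {1: 10, 2: 20, 3: 80, 4: 100}

crossing = crossover_year(
    relative_frequency(steam, total),
    relative_frequency(electricity, total),
)
print(crossing)
```

Dividing by the yearly word total matters because the amount of text printed changes over time: in the toy data the raw count for "steam" goes up in year 3, yet its share of all words still falls.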
By linking famous people to the news from their chosen profession, the team discovered that politicians and writers had the best chance of becoming well-known during their lifetimes. Scientists and mathematicians were less likely to achieve such fame, but the renown of those who did tended to last longer.
Not surprisingly, men are more present in the news of the day than women, but a slow increase in mentions of women can be seen after 1900. Progress appears to have remained slow even after the study period, as the researchers note that levels of gender bias in the news today aren't much different.
While large-scale analysis of the dataset can provide interesting additional insights into history, the researchers do not expect artificial intelligence to replace historians anytime soon.
"What cannot be automated is the understanding of the implications of these findings for people," said Dr. Tom Lansdall-Welfare, who led the computational part of the study. "That will always be the realm of the humanities and social sciences, and never that of machines."
The study was published in the Proceedings of the National Academy of Sciences.
Source: University of Bristol