The network mapping could also reveal a strategy by malicious actors to split their edit history between multiple accounts in order to avoid detection. By mixing politically sensitive edits with legitimate ones, these editors work to build up their reputation.
“The main message that I have taken away from all of this is that the main danger is not vandalism. It’s entryism,” Miller says.
The theory could be wrong, but if it holds, it means that state actors can take many years to put together a disinformation campaign capable of passing unnoticed.
“Russian influence operations can be quite sophisticated and go on for a long time, but it’s unclear to me whether the benefits would be that great,” says O’Neil.
Sometimes, governments also have more powerful tools. Over time, authoritarian leaders have blocked the site, taken its governing organization to court, and detained its editors.
Wikipedia has been fighting inaccuracies since 2001. One of the longest-running disinformation attempts went on for more than a decade after a group of ultra-nationalists gamed Wikipedia’s administrator rules to take over the Croatian-language community, rewriting history to rehabilitate the country’s World War II fascist leaders. The platform has also been vulnerable to “reputation management” efforts aimed at embellishing powerful people’s biographies. There are also hoaxes. In 2021, a Chinese Wikipedia editor was discovered to have written 200 articles of fabricated medieval Russian history, including imaginary states and battles.
Wikipedia created a set of rules and governing bodies to combat this. This self-organizing, self-governing group of 43 million users around the globe has also developed public discussion forums.
Nadee Gunasena, chief of staff and executive communications at the Wikimedia Foundation, says the organization “welcomes deep dives into the Wikimedia model and our projects,” particularly in the area of disinformation. But she also adds that the research covers only a part of the article’s edit history.
“Wikipedia content is protected through a combination of machine learning tools and rigorous human oversight from volunteer editors,” says Gunasena. All content, including each article’s edit history, is public, and sourcing is checked for reliability and neutrality.
The fact that the research focused on bad actors who were already found and rooted out may also show that Wikipedia’s system is working, adds O’Neil. But while the study did not produce a “smoking gun,” it could be invaluable to Wikipedia: “The study is really a first attempt at describing suspicious editing behavior so we can use those signals to find it elsewhere,” says Miller.
Victoria Doronina, a member of the Wikimedia Foundation’s board of trustees and a molecular biologist, says that Wikipedia has historically been targeted by coordinated attacks by “cabals” that aim to bias its content.
“While individual editors act in good faith, and a combination of different points of view allows the creation of neutral content, off-Wiki coordination of a specific group allows it to skew the narrative,” she says. If Miller and his researchers are correct in identifying state strategies for influencing Wikipedia, the next struggle on the horizon could be “Wikimedians versus state propaganda,” Doronina adds.
Miller believes that his analysis of the behavior of these bad actors can be used to build models that detect disinformation or determine how vulnerable a platform is to systematic manipulation. Such manipulation has already been exposed on Facebook, Twitter, YouTube, and Reddit.
One hundred and twenty-six administrators oversee Wikipedia’s English-language edition, which is also the largest. They monitor more than 6.5 million pages. Tracking down bad actors has mostly relied on someone reporting suspicious behavior, but many of these behaviors are not easily visible without the proper tools. Wikipedia’s data is also difficult to analyze with data science methods: unlike a post on Facebook or Twitter, a Wikipedia article does not exist as a single version of the text but as many successive versions.
As Miller explains it, “a human brain just simply can’t identify hundreds of thousands of edits across hundreds of thousands of pages to see what the patterns are like.”