After reports at the end of 2022 that hackers had stolen data from over 400 million Twitter users, researchers now believe that a widely circulated list of email addresses linked to more than 200 million users may be just a refined version of that larger trove, with the duplicate entries deleted. While the social network has yet to comment on the huge exposure, researchers say the large cache of data clarifies both the gravity of the leak and who may be most at risk as a result.
Between June 2021 and January 2022, a bug in a Twitter API allowed attackers to submit email addresses and receive the associated Twitter accounts, if any existed, in return. Before it was patched, attackers exploited the flaw to “scrape” data from the social network. While the bug didn’t give hackers access to passwords or other sensitive information such as DMs, it did expose the link between Twitter accounts, which are often pseudonymous, and the email addresses and telephone numbers tied to them, potentially identifying users.
While the vulnerability was still live, multiple actors appear to have exploited it to build different collections of data. One collection that circulated in criminal forums included the email addresses and telephone numbers of approximately 5.4 million Twitter users. The larger, newly surfaced trove appears to contain only email addresses. Even so, wide circulation of the data creates a risk that it will fuel phishing and identity theft.
WIRED asked Twitter for comment but did not receive a reply. The company wrote about the API vulnerability in an August disclosure: “When we learned about this, we immediately investigated and fixed it. At that time, we had no evidence to suggest someone had taken advantage of the vulnerability.” Seemingly, Twitter’s telemetry was insufficient to detect the malicious scraping.
Twitter isn’t the first platform to expose data to mass scraping through an API flaw. In such cases there is often confusion over how many data troves actually exist as a result of malicious exploitation. These incidents are nevertheless significant because they add more validation and connections to the vast body of stolen data about users that already circulates among criminals.
“Obviously, there are multiple people who were aware of this API vulnerability and multiple people who scraped it. Different people could have scraped different items. Are there any other troves out there? It kind of doesn’t matter,” says Troy Hunt, founder of the breach-tracking site HaveIBeenPwned. Hunt loaded the Twitter data set into HaveIBeenPwned and says that it contained information on more than 200 million accounts. Ninety-eight percent of the email addresses had already been exposed in previous breaches recorded by HaveIBeenPwned. Hunt says he also sent notification emails to almost 1,064,000 of his service’s roughly 4.4 million email subscribers.
“It’s the first time I’ve sent a seven-figure email,” he says. “Almost a quarter of my entire corpus of subscribers is really significant. So much of this was already out there, so I doubt this will have a huge impact. But it may de-anonymize individuals. The thing I’m more worried about is those individuals who wanted to maintain their privacy.”