2.7 billion email addresses exposed online

A huge database of more than 2.7 billion email addresses was left exposed on the web, accessible to anyone with a web browser. More than one billion of those records also contained a plain-text password associated with the email address.

Comparitech collaborated with security researcher Bob Diachenko to uncover the database on December 4, 2019. Although the database owner was not identified, Diachenko immediately alerted the US ISP hosting the IP address so that the database could be taken down.

The vast majority of emails were from Chinese domains including qq.com, 139.com, 126.com, gfan.com, and game.sohu.com. Those domains belong to some of China’s biggest internet companies including Tencent, Sina, Sohu, and NetEase.

A few email addresses had Yahoo and Gmail domains, as well as some Russian ones such as rambler.ru and mail.ru.

Upon verification, we concluded that all the emails with passwords originated from the so-called “Big Asian Leak,” first uncovered by HackRead. In January 2017, a dark web vendor was selling the records that included passwords.

Timeline of the leak

Upon discovering the database, Comparitech immediately took steps to have it taken down in order to mitigate harm to end users, but we don’t know whether anyone accessed it in the meantime. Here’s what we know:

  • December 1, 2019: The database was first indexed by the BinaryEdge search engine and was publicly accessible from then on.
  • December 4, 2019: Diachenko discovered the database and immediately took steps to notify responsible parties.
  • December 9, 2019: Access to the database was disabled.

In all, the data was exposed for more than a week, giving malicious parties sufficient time to find it and copy it for their own purposes.

The database appeared to be updating and getting larger in real time. The number of accounts increased from 2.6 to 2.7 billion between the time we sent notification and when the database was taken down.

What information was exposed?

The 1.5 TB of data contained an astonishing 2.7 billion records. More than 1 billion of those included passwords.

Because many Chinese people have difficulty reading Latin characters, they often use their phone numbers or other numerical identifiers as usernames. Therefore, we can assume many of these email addresses also contain phone numbers.

In addition to email addresses and passwords, the records contained MD5, SHA1, and SHA256 hashes of each email address. A hash is a fixed-length value computed from an input (the email address, in this case) by a one-way function; unlike encryption, it is not designed to be reversed. Hashes are often stored in place of plain-text data when keeping the original would be too dangerous. Their inclusion in this database doesn’t serve an obvious purpose, but they could be used to ease searches of relational databases.
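To illustrate what hashes of an email address look like, here is a minimal Python sketch that computes the three digests with the standard hashlib module. The function name and the sample address are hypothetical, and the normalization step (trimming and lowercasing) is our assumption; we do not know how the addresses in the exposed records were prepared before hashing.

```python
import hashlib


def email_digests(email: str) -> dict[str, str]:
    """Return hex digests of an email address under MD5, SHA1, and SHA256."""
    # Assumption for this sketch: trim and lowercase before hashing.
    # The exposed records may have hashed the address exactly as stored.
    data = email.strip().lower().encode("utf-8")
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


digests = email_digests("user@example.com")
for name, value in digests.items():
    # Each digest has a fixed length regardless of the input length.
    print(f"{name}: {len(value)} hex characters")
```

Because the same input always yields the same digest, a lookup by hash can stand in for a lookup by the address itself, which is consistent with the idea that the hashes were there to ease searches.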

Dangers of exposed data

A database like this is likely to be used for credential stuffing. Credential stuffing is an attack that attempts to log into various online accounts with known email and password combinations. Hackers take advantage of the fact that many people use the same email and password across multiple accounts. They use an automated system to attempt logins across several sites using the credentials stored in the database.

Once hackers gain access to an account, they can hijack it by changing the password and associated email. The account can then be used for a wide variety of purposes including spam, phishing, fraud, theft, and more.

Affected users should immediately change their email account passwords, as well as the passwords of any other accounts that share the same password.
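For readers who keep their own list of logins, finding the accounts that share a password can be automated. The following is a minimal Python sketch under stated assumptions: the function name and sample data are hypothetical, and the input is assumed to be a list of (account, password) pairs from your own records, processed locally on your own machine.

```python
from collections import defaultdict


def accounts_sharing_passwords(credentials):
    """Return groups of account names that use the same password.

    `credentials` is a hypothetical list of (account, password) pairs.
    """
    by_password = defaultdict(list)
    for account, password in credentials:
        by_password[password].append(account)
    # Report only the account names, never the passwords themselves.
    return [sorted(accounts) for accounts in by_password.values() if len(accounts) > 1]


sample = [
    ("mail.example", "hunter2"),
    ("shop.example", "hunter2"),
    ("bank.example", "correct horse battery staple"),
]
reused = accounts_sharing_passwords(sample)
print(reused)  # [['mail.example', 'shop.example']]
```

Every account in a returned group should get a new, unique password, since a leak of one exposes all of them.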

What is the “Big Asian Data leak”?

In January 2017, HackRead reported that a dark web vendor was selling 1 billion user accounts stolen from Chinese internet giants. The report mentions that more than 60 copies of the data had been sold at the time of writing for about $615 each in Bitcoin.

Most, but not all, of the records contained email addresses from Chinese domains:

  • NetEase: About 322 million records from NetEase-owned domains including 126.com, 163.com, 163.net, and yeah.net.
  • Tencent: About 130 million emails contained the qq.com domain. The company that owns WeChat also owns QQ, one of China’s most popular instant messaging platforms.
  • Sina: 31 million records included the sina.com domain, which belongs to the company that operates China’s Twitter-like social network, Sina Weibo.
  • Sohu: 23 million records contained sohu.com domains. Sohu operates a wide range of online services including a search engine, advertising, and online gaming.

Other notable domain owners whose users are impacted by the leak include: TOM Online (tom.com), Eyou (eyou.com), SK Communications (nate.com), Google (gmail.com), Yahoo (yahoo.com), and Hotmail (hotmail.com).
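A per-domain breakdown like the one above can be produced by tallying the domain part of each address. The following is a minimal Python sketch rather than the method behind the reported figures; the function name and sample addresses are hypothetical.

```python
from collections import Counter


def count_domains(emails):
    """Count addresses per domain, ignoring case and malformed entries."""
    counts = Counter()
    for email in emails:
        # Split on the last "@" so the domain is everything after it.
        local, sep, domain = email.strip().rpartition("@")
        if sep and local and domain:
            counts[domain.lower()] += 1
    return counts


sample = ["a@qq.com", "b@QQ.com", "c@126.com", "not-an-email", "d@sina.com"]
counts = count_domains(sample)
for domain, total in counts.most_common():
    print(domain, total)
```

Grouping the resulting domains by owner (for example, 126.com and 163.com under NetEase) would require a separate, manually maintained mapping.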

The vendor, DoubleFlag, is well known for selling high-profile breached data. The notches on his belt include Epic Games, uTorrent Forum, BitcoinTalk.org, Yandex.ru, Mail.ru, Dropbox, Brazzers, and Experian.

How and why we discovered this leak

Comparitech partners with security expert Bob Diachenko to scan the internet and discover databases that have been left exposed to the public. When we find one, we immediately take steps to notify responsible parties to shut it down or remove access.

Diachenko leverages his many years of cybersecurity experience to find and analyze these leaks. He makes every attempt to identify who is responsible for the data so they can secure it.

We then investigate the exposed data to find out whose personal data was leaked, what it contained, for how long it was exposed, and what threats victims might face. We compile our findings into a report like this one to raise awareness among those affected. Our hope is to limit access to and misuse of personal data by malicious parties.

Previous reports

This is the largest data exposure Comparitech has discovered to date. Some of our other reports include: