In 2005, AOL suffered a data breach involving 92 million email addresses and screen names after an ex-employee stole the list. While astronomical, this data breach figure pales in comparison to the more recent breaches of 10.88 billion CAM4 records, 5 billion Cognyte records (which, ironically, were from previous data breach victims), and 3 billion Yahoo user accounts (discovered in 2016).
Counting only breaches that affected more than 10m records, the biggest US data breaches since 2005 have impacted over 50bn records. Some of the biggest have occurred in the last few years.
So how have data breaches developed over the years? How many records have been affected? And what industries have been most impacted?
Check out our interactive dashboards below to find out where the biggest data breaches in the US and worldwide have occurred:
Breach type definitions: Disc (unintentional disclosure, e.g. leaving a database unsecured), Hack (attacked by an external source or with malware), Insd (theft by an employee, contractor, or third party), Port (loss or theft of a portable device, e.g. laptop), Rans (loss of data via a ransomware attack), and Unknwn (unknown source of data loss).
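For anyone working with the breach data outside the dashboards, the codes above could be held in a simple lookup table. This is only a minimal sketch: the names `BREACH_TYPES` and `describe` are hypothetical, and falling back to the "Unknwn" description for unrecognized codes is an assumption rather than something the dashboards are known to do.

```python
# Breach type codes as defined in the article.
# BREACH_TYPES and describe() are illustrative names, not part of any published tool.
BREACH_TYPES = {
    "Disc": "Unintentional disclosure, e.g. leaving a database unsecured",
    "Hack": "Attacked by an external source or with malware",
    "Insd": "Theft by an employee, contractor, or third party",
    "Port": "Loss or theft of a portable device, e.g. laptop",
    "Rans": "Loss of data via a ransomware attack",
    "Unknwn": "Unknown source of data loss",
}


def describe(code: str) -> str:
    """Return the description for a breach type code.

    Assumption: codes that are not recognized are treated as "Unknwn".
    """
    return BREACH_TYPES.get(code, BREACH_TYPES["Unknwn"])


print(describe("Insd"))
```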
The top 10 biggest US-based data breaches
- Yahoo — 3 billion records affected: In August 2013, hackers attacked Yahoo and compromised user accounts. In its initial acknowledgment of the breach, which came only in December 2016, Yahoo said 1bn user accounts had been affected. But in 2017, it updated this to say it believed all 3bn of its users’ accounts had been impacted. These updated figures make it the largest breach in US history.
- Dropbox, LinkedIn, et al. — 2.2 billion records affected: More than 2.2bn records were stolen from a number of large websites, including Dropbox and LinkedIn. Hackers dumped the stolen records on the dark web in 2019 in an attempt to sell them. Dubbed “Collection #1”, the data appeared to have been collected over a number of years and included usernames and passwords.
- Comcast — 1.5 billion records affected: A total of 1,507,301,521 records, including Comcast email addresses, client IPs, and hashed passwords, were found in a non-password-protected database. Security researchers discovered it in December 2020. It wasn’t the first Comcast data breach, either. A data incident in 2018 saw 26.5m Comcast Xfinity users’ social security numbers and home addresses exposed. And, in 2014, an employee mistakenly gave two unauthorized people access to a tool that led to the theft of 24.5m records that contained personally identifiable information (PII).
- River City Media — 1.34 billion records affected: An improperly configured backup led to the exposure of 1.34 billion email addresses in 2017. Some records also contained IP addresses, physical addresses, and names, while River City Media’s sensitive business information, e.g. accounts and HipChat logs, was also available for everyone to see. Some good did come from the leak, however, as it exposed the illicit IP hijacking techniques that River City Media had used to run spam campaigns.
- Evite, et al. — 932 million records affected: During the first few months of 2019, hacker “Gnosticplayers” uploaded almost 1 billion records from 44 companies, including Evite, MindJolt, and Wanelo. The data uploaded included usernames, passwords, email addresses, and IP addresses.
- First American — 885 million records affected: For over two years, nearly 900 million First American customers’ sensitive files were left exposed. The data, which was discovered in May 2019, included social security numbers, bank account numbers and statements, driver’s license images, and more. This gave would-be identity thieves more than enough information to steal money from the victims (although nothing was confirmed). In June 2019, First American settled a $500,000 fine for the breach.
- DreamHost — 815 million records affected: In May 2021, one of the biggest web hosts in the world was reported as having exposed 815 million records in an unprotected database. According to DreamHost, the database was only available for around 12 hours before it was removed, but this still gave threat actors enough time to potentially steal clients’ data, which included names and usernames.
- LinkedIn — 700 million records affected: Just two months after a data breach of 500 million LinkedIn users’ records was exposed, the personal data of 700 million of its users (almost 93 percent of its total users) was posted online. All of the data was available to buy for a mere $5,000. Although the data had been scraped from the website (rather than breached), the information contained full names, email addresses, physical addresses, phone numbers, geolocation records, and more.
- Dubsmash, et al. — 616.7 million records affected: After 16 websites were hacked, 617 million online account details were put up for sale on the dark web in 2019 for less than $20,000 in Bitcoin. Records came from multiple companies and websites, including Dubsmash (162 million), MyFitnessPal (151 million), MyHeritage (92 million), ShareThis (41 million), HauteLook (28 million), EyeEm (22 million), 8fit (20 million), and Whitepages (18 million).
- Facebook — 540 million records affected: In April 2019, nearly 540 million Facebook users’ records were found on unsecured Amazon servers used by Cultura Colectiva, a Mexican social media firm. Facebook confirmed the data, which contained account names, ID numbers, reactions, and comments, had been removed. Just two years later, Facebook’s user data was exposed again, with around 533 million user records from 106 countries being found on a hacking forum in April 2021. The data had been scraped before August 2019.
The top 10 biggest non-US data breaches
While the US is home to some of the biggest data breaches in history, a couple of breaches from outside the US eclipse them.
- CAM4 — 10.88 billion records affected: In the biggest-ever breach of data, CAM4, an adult website, left an Elasticsearch database unsecured before it was found by security researchers in March 2020. The database held 7TB of data, a total of 10.88 billion records. Names, email addresses, payment logs, IP addresses, sexual preferences, and chat transcripts were all part of the data set. Experts believe around 6.6m US users, 5.4m Brazilian users, 4.9m Italian users, and 4.2m French users were part of the breach. CAM4 said there was no indication bad actors had accessed the database before it was taken down.
- Cognyte — 5 billion records affected: In May 2021, Bob Diachenko, who leads Comparitech’s security research team, discovered an exposed database that was accessible to all users without any form of authentication. Ironically, the database was stored by cybersecurity analytics firm, Cognyte. It formed part of its cyber intelligence service, which would alert users if their data was part of third-party data exposure. Included within the 5bn records were names, passwords, email addresses, and the original source of the leak.
- Verifications.io — 2.07 billion records affected: Another unsecured database was discovered by Bob Diachenko in February 2019. It contained 808.5m records which, as well as email addresses, also included personally identifiable information. Upon further analysis, researchers suggested as many as 2.07 billion records had been exposed in total. The database was traced back to verifications.io, an email marketing company.
- Aadhaar — 1.1 billion records affected: In 2018, the Indian government’s ID database, Aadhaar, was impacted by a number of breaches which left the 1.1bn citizens registered on the database vulnerable to exploitation. Reports stated that in January 2018, criminals were granting access to the database for 10 minutes at a cost of Rs500 (around $8 at the time).
- Taobao (Alibaba) — 1.1 billion records affected: Tied with the Aadhaar breach is the scraping of Alibaba’s shopping website, Taobao. For eight months (from November 2019), a developer used web-crawling software to gather customers’ information, including mobile numbers and user IDs.
- Microloans Database (Organization Unknown) — 870 million records affected: Discovered by Safety Detectives in December 2021, the unsecured Elasticsearch server contained 870 million records that appeared to relate to Russian, Ukrainian, and Kazakh users who had applied for microloans with various companies. The entity responsible for the leak has not been identified but is believed to be based in Russia. Estimates suggest 10 million individual people may have been impacted by the breach.
- Bykea — 400 million records affected: The entire user database of Pakistan-based Bykea (a parcel delivery and vehicle hire company) was left publicly exposed in late 2020 without any encryption or password protection. This gave anyone access to more than 200 GB of data, which featured 400m records including names, locations, and more.
- Airtel — 325 million records affected: A bug in the mobile app for India’s biggest telecom provider, Airtel, meant millions of its users’ personal data was exposed. The 2019 security flaw gave would-be thieves access to names, email addresses, IMEI numbers, addresses, subscription information, and other sensitive data.
- Truecaller — 300 million records affected: In May 2019, a security researcher reported that 300 million Indians’ data from the Truecaller caller ID app was available for sale on the dark web. The Sweden-based company denied that its data had been breached. However, a year later, 47.5m records from Truecaller were leaked on the dark web and Truecaller attributed this data to the theft in May 2019.
- Indian Citizens Database (MongoDB) — 275.3 million records affected: For more than two weeks, a vast MongoDB database that contained the records of 275,265,298 Indians was left exposed. Bob Diachenko discovered the database and found it had first been indexed in April 2019. Personal data included names, email addresses, phone numbers, professional information, education details, salary data, and more. Diachenko was unable to find any information which would link the database to the owner. But, on May 8, hackers ‘Unistellar’ removed the database.
To collate this list of the biggest data breaches in the US and worldwide, we’ve searched through industry news and company announcements from across the globe. We’ve logged any breaches impacting over 10 million records from 2005 to present.
Some of the users impacted by these data breaches may have been located in other countries but we have used the companies’ headquarters as the location for the maps and data. These locations are just for illustrative purposes, however, and may not be the precise location.
Equally, the number of “records” doesn’t necessarily indicate the number of people impacted by the breach. Records often include a multitude of things, e.g. email addresses, documents, bank account details, social security numbers, and so on. Therefore, one user may have a number of records included, or a record may be a business-related document that discloses private business data rather than user data.
Because the location-based map relies on each company’s headquarters, the attacks on “Dropbox, et al.” and “Evite, et al.” haven’t been included in the map, as they involve multiple locations.
The date of the breach is often the date it has been reported/discovered.
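The inclusion rule described above (breaches of over 10 million records, from 2005 onward) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method actually used to build the dataset: the `Breach` class, the function names, and the "Hypothetical small breach" entry are invented for the example, while the Yahoo and Comcast figures are taken from the list above.

```python
from dataclasses import dataclass

RECORD_THRESHOLD = 10_000_000  # "over 10 million records", per the methodology
START_YEAR = 2005              # breaches are logged from 2005 to present


@dataclass
class Breach:
    name: str
    year: int      # often the year the breach was reported or discovered
    records: int   # records exposed, not necessarily people affected


def qualifying(breaches):
    """Keep breaches from START_YEAR onward with more than RECORD_THRESHOLD records."""
    return [
        b for b in breaches
        if b.year >= START_YEAR and b.records > RECORD_THRESHOLD
    ]


def total_records(breaches):
    """Sum the record counts. One person may account for several records."""
    return sum(b.records for b in breaches)


sample = [
    Breach("Yahoo", 2013, 3_000_000_000),
    Breach("Comcast", 2020, 1_507_301_521),
    Breach("Hypothetical small breach", 2015, 2_000_000),  # invented, below threshold
]

kept = qualifying(sample)
print([b.name for b in kept])   # ['Yahoo', 'Comcast']
print(total_records(kept))      # 4507301521
```

Note that the total counts records, so it can exceed the number of individuals affected when one user appears in several records.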
Data researcher: George Moody
For a full list of sources, please request access here.