If you expose it, they will come: data honeypot draws hundreds of attackers

We put a MongoDB honeypot on the web for three weeks to see who would attempt to view, steal, and destroy exposed data.

Paul Bischoff Tech Writer, Privacy Advocate and VPN Expert

@pabischoff Updated: March 15, 2022

Comparitech’s security research team, led by cybersecurity expert Bob Diachenko, routinely scans for unsecured or misconfigured servers that leak sensitive user data on the web. Vulnerable databases are a prevalent problem among organizations that store data online, allowing unauthorized third parties to find, access, and even modify exposed data without a password or any other authentication. Such missteps often lead to data breaches, putting user privacy and security at risk.

We were interested to find out the nature of these unauthorized requests and where they come from. To do so, Comparitech researchers created a MongoDB database on an unsecured server to use as a honeypot. The honeypot contained fake user data to lure in unauthorized users. The team recorded 428 unauthorized connections over a three-week period between June 12 and July 3, 2020.

The experiment is similar to an earlier one in which we exposed an Elasticsearch database honeypot to see how long it would take unauthorized users to find and attack it. This study focuses on the origin and nature of those attacks.

Here’s a breakdown of what we found:

Half of all requests came from the USA

Of the 428 unauthorized requests to our MongoDB honeypot, 218 of them came from IP addresses registered to the USA.

The next closest country was the Netherlands (51), followed by France (34), Singapore, and Russia. Check out the map below to see hotspots from which requests originated.

Note that just because an IP address is associated with a certain country, it doesn’t necessarily mean the attacker is in that country. Requests can be sent remotely from virtual machines and servers, and through proxies.

Diachenko surmises that US IP addresses are popular among attackers because they’re less likely to be flagged as malicious. Many of the world’s legal web scanners are based in the USA, for example.

40% of requests stole, modified, and/or destroyed data

Researchers categorized each of the unauthorized requests into one of four types:

127 legitimate scans: Requests from IP addresses registered as internet scanners that are clear about their purpose. One example is Intrinsec, a French IT security company that maps open source data on the internet to provide security services for customers. It made 34 requests to our honeypot, none of which were malicious, accounting for all of the requests from France.
130 status checks: Mostly benign requests that check the server and connection statuses. These are similar to legitimate scans but the unauthorized party’s motive and identity are not known. No data is accessed, modified, or removed.
137 data thefts: Requests to view, scrape, and download data from the unsecured server without authorization.
34 destructive requests: Malicious requests that modify and/or destroy the data on the server. In many cases, the data is downloaded by the attacker before being deleted. The attacker then leaves a note demanding ransom for safe return of the data.
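
The 40% figure in the heading follows from the four counts above: data thefts plus destructive requests, divided by all requests. A minimal Python sketch of that arithmetic, using only the numbers reported in this article (the variable names are illustrative, not part of the researchers' tooling):

```python
# Request counts by category, as reported in the article.
counts = {
    "legitimate scans": 127,
    "status checks": 130,
    "data thefts": 137,
    "destructive requests": 34,
}

# All four categories together should add up to the 428 recorded requests.
total = sum(counts.values())

# Requests that stole, modified, and/or destroyed data.
malicious = counts["data thefts"] + counts["destructive requests"]
malicious_share = malicious / total

print(f"total requests: {total}")
print(f"malicious requests: {malicious} ({malicious_share:.0%})")
```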

Most IPs had been blacklisted

Several organizations maintain blacklists of IP addresses known to be malicious. These include Cisco Talos, AbuseIPDB, and CBL.

In total, Comparitech researchers recorded requests from 108 unique IP addresses. All but five appeared in at least one of these blacklists, and the blacklisted IPs accounted for about 85% of all the unauthorized requests recorded.

Three of the five unlisted IPs were from Singapore, and two were from the USA. All three Singapore IPs made multiple requests that fall into the “data theft” category above.

Bots play a large role in data theft

Of all the malicious requests our honeypot received, a substantial number appear to come from bots. The IP address that sent the most malicious requests was, according to CBL’s blacklist, part of the Conficker botnet. It made 26 connections to steal content and metadata from the unsecured database.

Two other IP addresses that sent a large number of requests were infected by Conficker and Gamarue botnets.

Two of the IP addresses that hadn’t been blacklisted were associated with personal websites belonging to a web developer and photographer. It’s likely that these were infected by a botnet and the owners are not aware that their sites are being used to find and steal data.

20 unauthorized requests per day

The first attack came 7 hours and 31 minutes after exposing the honeypot. This is in line with our previous honeypot experiment using an Elasticsearch server, which received its first unauthorized request 8 hours and 35 minutes after deployment.

For the entire observation period, the MongoDB honeypot received an average of 20 unauthorized requests per day, versus the Elasticsearch honeypot’s 18 per day.
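
The daily average can be checked from the dates and total given earlier in the article. The sketch below is a rough check, assuming the observation window runs from June 12 to July 3, 2020 and counting elapsed days between those two dates; the variable names are illustrative.

```python
from datetime import date

# Observation window and total, as reported in the article.
start = date(2020, 6, 12)
end = date(2020, 7, 3)
total_requests = 428

# Elapsed days between the start and end dates (three weeks).
days = (end - start).days

# Average number of unauthorized requests per day.
per_day = total_requests / days

print(f"{days} days, about {per_day:.1f} requests per day")
```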

The most requests in a single day, 45, came on June 19, 2020, when the database was indexed by search engines.

Correction: An earlier version of this article incorrectly stated that the honeypot was open for three months. It was open for three weeks.
