Best PII Scanning Tools

PII is Personally Identifiable Information. It specifically relates to collections of data fields that can identify a private individual.

Legislation in many countries lays down specifications of what is considered to be an abuse of PII. Generally, companies can only hold PII for specific purposes, and that data must be kept accurate and confidential.

Failure to follow the guidelines on proper usage of PII can result in a hefty fine, and those whose data you failed to protect can all sue your company. The financial consequences of misuse or disclosure of PII can be disastrous.

Here is our list of the best PII scanning tools:

  1. ManageEngine DataSecurity Plus EDITOR’S CHOICE A system auditing, compliance, and data loss protection package for Windows Server. This tool will track sensitive data and block or log its movement, while also blocking and deleting junk data. Start a 30-day free trial.
  2. ManageEngine Endpoint DLP Plus (FREE TRIAL) This on-premises software package discovers sensitive data and then manages data exfiltration channels to block its movement. Runs on Windows Server. Start a 30-day free trial.
  3. Endpoint Protector A PII scanner embedded in a combined threat detection and data loss prevention package. Available as a cloud service or as a virtual appliance.
  4. Digital Guardian DLP A DLP delivered from the cloud that includes data discovery and classification for PII. Endpoint agents for Windows, macOS, and Linux.
  5. Teramind DLP A cloud-based user productivity tracking, insider threat detection, and DLP package that includes eDiscovery.

You can read more about each of these options in the following sections.

Related post: What is PII Compliance?

What is PII?

It isn’t sufficiently precise to say that PII identifies a person. To be clear, PII is data that, taken in its totality, can lead someone to find or impersonate a specific person. To understand the concept of data that identifies a person, consider a database with a customer table. Columns in that table might be:

  • First Name
  • Last Name
  • Credit Card Number
  • CVV
  • Card Expiry Date
  • Social Security Number
  • Address Line 1
  • Address Line 2
  • City
  • Postal Code
  • Email Address
  • Telephone Number

Imagine what a con artist could do with that information. If a hacker broke into your database and had just enough time to select out one column of that table, which would it be? Depending on what type of scam the data would be used for, a Social Security Number or Email Address would be the best single field to steal. However, most places that ask for a social security number for identity verification would also expect the person to provide a first and last name and possibly a date of birth. So, although that single field precisely identifies a person, it probably can’t be used effectively without more information.

First name and last name individually do not constitute PII. A list of 400 instances of Dave and 300 cases of Jane doesn’t identify anyone. The first and last names together give a better target. However, this still isn’t enough to identify one person in the world. First name, last name, and social security number would do it, as would first name, last name, and email address.
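
The idea that single columns are harmless while certain combinations identify a person can be sketched in a few lines of Python. The column names and the list of identifying combinations below are illustrative assumptions taken from the example table, not a legal definition of PII.

```python
# Hypothetical combinations of columns that, taken together, identify one person.
# These sets mirror the examples in the text and are not a legal definition.
IDENTIFYING_COMBINATIONS = [
    {"first_name", "last_name", "social_security_number"},
    {"first_name", "last_name", "email_address"},
]


def identifies_person(columns):
    """Return True if the given columns include a full identifying combination."""
    present = set(columns)
    return any(combo <= present for combo in IDENTIFYING_COMBINATIONS)


print(identifies_person(["first_name"]))                                # False
print(identifies_person(["first_name", "last_name"]))                   # False
print(identifies_person(["first_name", "last_name", "email_address"]))  # True
```

A real scanner would work from a much longer list of combinations, usually supplied by the data privacy standard that you need to follow.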

Credit card scammers would need at least the first name, last name, credit card number, CVV, and card expiry date to stand a chance of putting through a transaction online. As most online payment processors also expect an address and telephone number for verification, the credit card thief would need just about all of the columns from the table.

The whole table gives a data thief a lot of data resale opportunities. Several combinations of columns in this table are valuable, and each grouping can supply a different type of thief or con artist.

While database tables are goldmines for data thieves, files containing documents, images of documents and forms, and other images can also be valuable to them. Unfortunately, images of documents are almost impossible to search with a standard scan, such as the search utility on a Web page or in Windows Explorer. This task requires optical character recognition (OCR).

In any standard letter, the data fields that a thief would want are spread throughout the text. Thus, when looking for PII, you need to identify combinations of data fields. An effective PII scanner needs to detect the presence of these separate fields and spot that they sit close to each other, though not necessarily in adjacent positions. In the data security sector, the technique of identifying a PII instance from these scattered fields is called “fingerprinting.”

So, a PII scanning tool needs to include OCR and fingerprinting.
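
As a rough illustration of fingerprinting, the Python sketch below looks for different types of fields that occur near each other in a block of text. The regular expressions, the 200-character window, and the function names are simplifying assumptions made for this example; commercial scanners use far more robust detection, and the OCR step for images is not shown.

```python
import re

# Simplified, illustrative patterns; real scanners use far more robust detection.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def find_fields(text):
    """Return a sorted list of (position, field_type) for every pattern match."""
    hits = []
    for field_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), field_type))
    return sorted(hits)


def has_pii_fingerprint(text, window=200, min_types=2):
    """Return True when at least min_types different field types start
    within `window` characters of each other (near, not necessarily adjacent)."""
    hits = find_fields(text)
    for i, (start, _) in enumerate(hits):
        types_nearby = {t for pos, t in hits[i:] if pos - start <= window}
        if len(types_nearby) >= min_types:
            return True
    return False


letter = "Dear Jane, your SSN 123-45-6789 is on file. Reply to jane@example.com."
print(has_pii_fingerprint(letter))  # True
```

A single email address on its own does not trigger the check, which matches the point that one field in isolation is rarely enough to count as PII.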

One more vital point about PII is that the term only applies to people’s lives outside work. So, if you keep a business contact database that includes names, business addresses, and contact data related to the workplace, that isn’t counted as PII. However, suppose a supplier’s sales rep gives you her out-of-hours telephone number and a personal email address that isn’t on the corporate domain. That can cause problems because those pieces of information cross over into the definition of PII.

Data management strategies

If you have a small business, you probably don’t have that many places to store data, so you probably have a good idea of where to locate all of the PII that your company holds. Large businesses have long been alive to the importance of tracking all types of data. However, even those IT departments with a fully documented data management strategy don’t always know where all PII is.

The complication of PII tracking arises from applications that keep their own stores of data. Even if you have a specific file server or cloud storage service for all of your data, the software is still possibly storing PII locally on the server upon which it is installed. Additionally, employees often copy over details when they are working on a specific task or project. For example, a Customer Care operator might note down information in a file when compiling an incident report or composing a letter to a client. The final document might get stored in the right place, but the notes file containing PII might continue to exist on the operator’s local computer.

You might get a surprise when you run a PII scanner for the first time. The data that you know you have might be in a location that you didn’t know about.

PII scanning tools

PII scanning involves three tasks:

  • Searching all endpoints and devices for data locations
  • Searching through the contents of those data locations for indicators that signal PII
  • Classifying PII by sensitivity and type

So, you will end up with a list of computers and devices that contain any kind of data and then a list of locations that specifically hold PII and the type of PII.
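
The three tasks can be sketched as a small Python script. This is only a minimal illustration under stated assumptions: it treats every file under one folder as a data location, it reads files as plain text, and the two detectors and their sensitivity ranks are invented for the example.

```python
import os
import re

# Illustrative detectors only; the type names and sensitivity ranks are assumptions.
DETECTORS = {
    "social_security_number": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "high"),
    "email_address": (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "medium"),
}


def find_data_locations(root):
    """Task 1: list every file under root as a candidate data location."""
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(folder, name)


def classify_file(path):
    """Tasks 2 and 3: search the contents and classify any PII by type and sensitivity."""
    try:
        with open(path, "r", encoding="utf-8", errors="ignore") as handle:
            text = handle.read()
    except OSError:
        return {}  # unreadable files are skipped in this sketch
    return {
        pii_type: sensitivity
        for pii_type, (pattern, sensitivity) in DETECTORS.items()
        if pattern.search(text)
    }


def scan(root):
    """Return {path: {pii_type: sensitivity}} for the files that hold PII."""
    report = {}
    for path in find_data_locations(root):
        findings = classify_file(path)
        if findings:
            report[path] = findings
    return report
```

Calling scan on a folder returns only the locations that hold PII, together with the type and sensitivity of what was found there.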

Once those data locations have been recorded, you need to implement a data management strategy. If you don’t have a centralized storage facility, now is the time to implement it. Even if you do have such a strategy in place, you will still probably discover that there are copies on local devices and computers as well as the original held in a central location.

You can take steps to institute either automated procedures or instructions on working practices to get all data stored in one location and copies deleted from local devices. Those definitive data instances can then be encrypted to control access.
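
One simple way to find stray copies is to group files by a hash of their contents. The sketch below is an assumption about how such a check could work, not a description of any product on this list, and it only catches exact byte-for-byte duplicates.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots):
    """Group files under the given root folders by content hash and
    return only the groups that contain more than one path."""
    by_digest = defaultdict(list)
    for root in roots:
        for folder, _subfolders, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(folder, name)
                by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Passing the central store and a local folder to find_duplicates returns the groups of identical files, which shows you which local copies are candidates for deletion.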

PII scanning is only concerned with locating sensitive data related to the company’s specific security policies. What you do to manage and protect that data is out of scope for the PII discovery task. However, most PII scanners are part of broader systems that implement data security. These packages are called Data Loss Prevention (DLP) services.

The Best Data Loss Prevention Systems

When searching for a PII tool, you are really in the market for a DLP system. This is because just finding sensitive data is not enough; you also need to organize and protect it. That task involves redefining access rights to provide a finer granularity of access permissions. For example, one general user classification is not good enough, and you need to split those users into groups defined by department and role.
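
Finer-grained permissions of this kind can be pictured as a lookup keyed by department and role. The groups and data categories in this Python sketch are hypothetical and only show the shape of the idea.

```python
# Hypothetical permission table keyed by (department, role); the names are illustrative.
PERMISSIONS = {
    ("customer_care", "operator"): {"customer_contact"},
    ("finance", "clerk"): {"customer_contact", "payment_card"},
    ("finance", "manager"): {"customer_contact", "payment_card", "social_security_number"},
}


def can_access(department, role, data_category):
    """Return True if the group defined by department and role may access the category.
    Unknown groups get no access (deny by default)."""
    return data_category in PERMISSIONS.get((department, role), set())


print(can_access("finance", "clerk", "payment_card"))           # True
print(can_access("customer_care", "operator", "payment_card"))  # False
```

Denying access by default means that a newly created group sees nothing until someone deliberately grants it a category.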

Different industries are required to protect different types of PII and, in the case of location-designated legislation, such as EU’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), you also need to define in which physical location data can be stored and the location of the people who access it.

You also need tools that protect data stores directly through file integrity monitoring (FIM). These systems log all user actions on files, recording the name of the account that performed each action. You also need to implement system-wide security to ensure that the data isn’t stolen. These measures include peripheral device controls and scanning of communications systems, such as email and file transfer utilities.
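
The change detection at the heart of file integrity monitoring can be sketched by comparing file hashes against a baseline. This Python example is a simplified assumption of how that works; it only reports that a file changed and does not record which account made the change, which a real FIM service gets from operating system audit logs.

```python
import hashlib
import os


def snapshot(paths):
    """Record a SHA-256 digest for each listed file that currently exists."""
    state = {}
    for path in paths:
        if os.path.isfile(path):
            with open(path, "rb") as handle:
                state[path] = hashlib.sha256(handle.read()).hexdigest()
    return state


def compare(baseline, current):
    """Return a sorted list of (path, event) pairs describing what changed."""
    events = []
    for path in baseline.keys() | current.keys():
        if path not in current:
            events.append((path, "deleted"))
        elif path not in baseline:
            events.append((path, "created"))
        elif baseline[path] != current[path]:
            events.append((path, "modified"))
    return sorted(events)
```

Taking a snapshot on a schedule and comparing it with the previous one gives a basic list of created, modified, and deleted files.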

Data loss prevention systems include security policy creation, data discovery and classification, and data movement controls.

Our methodology for selecting PII scanning tools  

We reviewed the market for data loss prevention systems and analyzed the tools based on the following criteria:

  • A data discovery and classification service
  • An access rights auditor
  • A security policy management system that can be preset with templates to cover a specific data privacy standard
  • File integrity monitoring with file encryption for access controls
  • Constant monitoring of peripheral devices, email, and file transfer systems for data movements
  • A free assessment period or demo system
  • Value for money in a package that combines all data protection functions at a fair price

With these selection criteria in mind, we have created a list of suitable DLP packages.

The Best PII Scanning Tools

1. ManageEngine DataSecurity Plus (FREE TRIAL)

ManageEngine DataSecurity Plus offers file server auditing, compliance monitoring, and data loss prevention. This system includes a PII scanner that categorizes sensitive data. It will also assess and reorganize your access rights management structure. With pre-set security policy templates for the significant data privacy standards, this service will help you enforce compliance with ease.

Key Features:

  • File Activity Tracking: Monitors file access and modifications to ensure data integrity.
  • Data Classification: Automatically identifies and labels sensitive information.
  • Data Leakage Prevention: Implements robust measures to prevent unauthorized data access or loss.
  • Storage Optimization: Analyzes and reorganizes disk space to improve efficiency.
  • Web Traffic Security: Offers protective measures for online data transactions.

Why do we recommend it?

ManageEngine DataSecurity Plus is a large package of data protection tools that is divided into four modules, each charged for individually. The PII scanning part of the tool is its Data Risk Assessment unit. This will adapt its searches to the requirements of specific standards.

Policies are enforced through file integrity monitoring with encryption and also data exfiltration channel control. This service allows you to permit certain actions on files for specific user groups while blocking those actions from others. In addition, DataSecurity Plus monitors wireless networks and LAN activity as well as endpoints.

The ManageEngine system traces through all of your servers, looking for sensitive data, and while it is scanning, it also looks for duplicated and junk data. This is a great help in reducing storage space needs, and it also clears up multiple stores of sensitive data, which can be a security weakness.

The data protection measures in the ManageEngine DataSecurity Plus package also extend to cloud services. The tool will also inspect cloud platforms and it can help you to block malware and data thieves from getting into your system.

Who is it recommended for?

Many ManageEngine tools can be installed on cloud accounts or they are offered as SaaS packages as well as being available for on-site installation, but this tool is only available for self-hosting. The software package runs on Windows Server, so if you only have Linux servers, you won’t be able to use this tool.

Pros:

  • Efficient PII Handling: Excellently discovers and categorizes personally identifiable information.
  • Enhanced Access Management: Refines data access rights to bolster security measures.
  • Storage Space Reduction: Identifies and eliminates redundant or unnecessary data to free up space.

Cons:

  • Limited Deployment Options: Lacks support for cloud-hosted environments, confining it to on-premise use only.

ManageEngine DataSecurity Plus runs on Windows Server, and it is offered for a 30-day free trial.

EDITOR'S CHOICE

ManageEngine DataSecurity Plus is an extensive package of data management services that identifies and protects sensitive data while also logging activity and blocking malicious actors. The ManageEngine service will provide insider threat and intruder detection systems and it also lets you impose content filters to stop data being taken out of the network and block incoming phishing attempts. The service is able to check on disks, USB devices, email systems, and cloud platforms.

Official Site: https://www.manageengine.com/data-security

OS: Windows Server

2. ManageEngine Endpoint DLP Plus (FREE TRIAL)

ManageEngine Endpoint DLP Plus is a sensitive data management service that provides a range of tools to protect PII and other sensitive data, such as intellectual property. The duties of the Endpoint DLP system begin with a sweep of all endpoints. Discovered data is classified according to its sensitivity ranking as defined by a specific data protection standard, which you specify in the settings of the package.

Key Features:

  • Sensitive Data Scanning: Conducts thorough scans on endpoints to identify sensitive information.
  • Data Categorization: Classifies discovered data based on sensitivity levels, aiding in compliance.
  • Activity Monitoring: Keeps detailed logs of how data is accessed and used across endpoints.
  • Data Movement Restriction: Implements strict controls on how sensitive data is transferred or accessed.

Why do we recommend it?

ManageEngine Endpoint DLP Plus is another data protection tool from ManageEngine that will discover and classify sensitive data. The tool will adapt to specific PII definitions, according to whichever standard you need to follow. The PII scanner is integrated into a wider data loss prevention system.

The primary method for data protection in the Endpoint DLP Plus package is containerization. Rather than encrypting the contents of the file, this process puts a wrapper around it, blocking access. The file’s contents can be accessed but only through an application, and the system administrator defines which applications are trusted to access files containing each category of data.

The trusted applications that are allowed to access sensitive data files need to be picked carefully. They will need to be protected by authentication, with only specific users allowed to access the application and, therefore, the sensitive data file.

The Endpoint DLP Plus system identifies data exfiltration points. These include peripherals, cloud upload facilities, email systems, and file transfer utilities. The tool doesn’t completely block these channels, but examines each file that a user attempts to move and selectively blocks or allows the action depending on the user’s permissions for that type of action with that category of data.
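
The selective decision described above can be pictured as a rule lookup on three values: the user’s group, the channel, and the data category. The following Python sketch is hypothetical; the rule structure, group names, and channel names are assumptions for illustration and do not reflect how Endpoint DLP Plus stores its policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    group: str      # user group the rule applies to
    channel: str    # exfiltration channel, e.g. "usb", "email", "cloud_upload"
    category: str   # data category assigned by the classifier


# Hypothetical allow-list; any sensitive movement not listed here is blocked.
ALLOWED = {
    Rule("finance", "email", "payment_card"),
    Rule("hr", "usb", "employee_record"),
}


def decide(group, channel, category):
    """Return "allow" or "block" for one attempted file movement."""
    if category == "public":
        return "allow"  # public data is not restricted in this sketch
    return "allow" if Rule(group, channel, category) in ALLOWED else "block"


print(decide("finance", "email", "payment_card"))  # allow
print(decide("finance", "usb", "payment_card"))    # block
```

The same user can therefore email a file but be stopped from copying it to a memory stick, which is the graded control that the tool aims for.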

All of the activities on the system that relate to identified sensitive data files are logged. This is an important requirement of data protection standards and lays down the records for compliance auditing and reporting.

Unusual activity that relates to sensitive data stores will raise an alert, which is triggered by searches through the log messages that Endpoint DLP Plus generates. This is like a host-based intrusion detection system or a SIEM.
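
A log search of this kind can be as simple as counting events per user and flagging anyone who crosses a threshold. The log format, the sample data, and the threshold in this Python sketch are assumptions made for the example and are not the actual log layout of any product.

```python
from collections import Counter

# Assumed log format for this sketch: "<timestamp> <user> <action> <path>"
SAMPLE_LOG = """\
2024-01-01T09:00:00 alice read /data/customers.csv
2024-01-01T09:01:00 bob copy /data/customers.csv
2024-01-01T09:02:00 bob copy /data/cards.csv
2024-01-01T09:03:00 bob copy /data/ssn.csv
"""


def users_over_threshold(log_text, action="copy", threshold=3):
    """Return the users whose count of `action` events reaches the threshold."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split(maxsplit=3)
        if len(parts) == 4 and parts[2] == action:
            counts[parts[1]] += 1
    return sorted(user for user, n in counts.items() if n >= threshold)


print(users_over_threshold(SAMPLE_LOG))  # ['bob']
```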

The paid version of Endpoint DLP Plus can be extended to cover multiple sites, reporting to a single console.

Who is it recommended for?

ManageEngine offers a Free edition of Endpoint DLP Plus that provides all the functions of the paid version but is limited to managing data on 25 endpoints, which is great for small businesses. The tool provides graded data management controls rather than outright access blocks and is suitable for any size and type of business.

Pros:

  • Comprehensive IP Protection: Safeguards intellectual property and personally identifiable information effectively.
  • Standard-Specific Adaptability: Configures easily to meet various data protection standards.
  • Detailed Compliance Auditing: Generates logs critical for auditing and compliance verification.
  • Selective Data Access: Enables precise control over who can access or move sensitive data.

Cons:

  • Limited Deployment Flexibility: Only available for on-premise deployment, lacking SaaS options.

The software for ManageEngine Endpoint DLP Plus installs on Windows Server. There are two editions available: Free, which is limited to monitoring files on 25 devices, and Professional, which can operate for a single LAN or a corporate WAN. You can get a 30-day free trial of the Professional edition.

3. Endpoint Protector

Endpoint Protector is a complete DLP solution that provides all of the requirements that we looked for in a PII scanning tool, and it also includes a threat detection system.

Key Features:

  • Automated Data Identification: Efficiently locates sensitive data across endpoints automatically.
  • Policy-Driven Security: Enables customized security policy enforcement for data handling.
  • Behavioral Analysis: Monitors user activities to establish normal patterns and detect anomalies.
  • Flexible Deployment: Offers various deployment options to fit different organizational needs.

Why do we recommend it?

Endpoint Protector is particularly good at controlling the movement of sensitive data. However, an essential part of this task is to identify and categorize PII and that forms part of the package. The tool doesn’t completely block access or usage of protected data. It allows some accounts to perform some actions on some data and logs all activities.

After specifying your security policies in the dashboard of Endpoint Protector, the service swings into action by installing agents on all of your enrolled endpoints. It is possible to create those policies by applying a template from the Endpoint Protector library.

The endpoint agents implement your security policy. First, they sweep each endpoint and discover all data stores. Next, all data instances are categorized to identify the sensitive data. This service is continuous. The central server also audits your access rights management systems to create improved data access permissions. Endpoint Protector will then encrypt those identified data locations and monitor access.

In addition to the data identification service, the system profiles all user activities to create a baseline of everyday activities. The endpoint’s system routines are also examined to find a standard pattern. The service will raise an alert if activity deviates from this norm. This is a helpful way to spot intruder activity, account takeover, or insider threats.
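
Baselining can be illustrated with a basic statistical check: record a user’s normal daily activity and flag a day that falls far outside it. The z-score method, the threshold of three standard deviations, and the sample counts below are assumptions chosen for this Python sketch; the vendor does not publish the method that Endpoint Protector uses.

```python
from statistics import mean, stdev


def is_anomalous(history, today, z_threshold=3.0):
    """Flag today's count if it lies more than z_threshold standard
    deviations away from the mean of the historical counts."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold


# Hypothetical daily counts of files opened by one user.
baseline = [20, 22, 19, 21, 23, 20, 22]
print(is_anomalous(baseline, 21))   # False
print(is_anomalous(baseline, 400))  # True
```

A sudden jump from around 20 files a day to 400 is the sort of deviation that could point to an account takeover or an insider collecting data.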

The service also controls all data movements onto peripheral devices or networked printers and fax machines. In addition, the system scans all emails and watches file transfer utilities to block unauthorized data activities.

Endpoint Protector also includes data management and threat protection controls. The Endpoint Protector allows any business to become fully compliant with data privacy standards and can be used effectively by those with no technical skills or legal expertise.

Who is it recommended for?

The Endpoint Protector system includes powerful controls over USB ports and ensures that memory sticks are encrypted if they receive PII. The tool also looks at file transfer systems, Web and cloud upload facilities, and emails. So, it provides extensive scanning of data exfiltration channels.

Pros:

  • Proactive Data Management: Automatically discovers and classifies personally identifiable information.
  • Enhanced Access Auditing: Reviews and improves access rights for tighter security.
  • Peripheral Control: Manages data transfer through USB and other peripherals to prevent leaks.
  • Encryption and Logging: Secures files through encryption and maintains access logs for integrity.

Cons:

  • Missing SIEM Integration: Lacks built-in SIEM functionality for centralized security management.

Endpoint Protector can be accessed as a paid service on Azure, GCP, and AWS. It is also offered as a SaaS package hosted by CoSoSys. Those who want an on-premises solution can install the software as a virtual appliance.  The service also includes endpoint agents for Windows, macOS, and Linux. You can assess Endpoint Protector by accessing a demo.

4. Digital Guardian DLP

Digital Guardian DLP is a Software-as-a-Service platform that offers both data leak prevention and threat detection. The system implements data controls through endpoint agents that are available for Windows, macOS, and Linux.

Key Features:

  • Cloud-Based Solution: Provides a comprehensive SaaS platform for data leak prevention.
  • Enhanced Access Rights: Refines user permissions for more secure data access and management.
  • Intellectual Property Safeguard: Protects sensitive corporate data including intellectual property.

Why do we recommend it?

Digital Guardian DLP works a little like an XDR. It imposes controls locally but manages strategy centrally. This hybrid solution provides a global overview of a company, but it isn’t so remote that people can slip data out through the cracks. The service has its own PII discovery and classification routines.

Like the other DLPs on this list, Digital Guardian DLP ties access rights management to sensitive data access through security policies. There are security policy templates that take care of the necessary system settings for specific data privacy standards.

The service examines and improves your user account permissions and performs an eDiscovery sweep, which operates continuously. This searches out all data stores and then identifies the PII.

Controls are implemented to watch over USB ports, printers, faxes, emails, and file transfer systems to prevent unauthorized movement of data. This service also protects intellectual property. The Digital Guardian system profiles user activity and system tasks to spot anomalies that indicate intrusion, account takeover, and insider threats.

Who is it recommended for?

The structure of Digital Guardian DLP lends itself to the implementation of sensitive data management for large organizations. There isn’t a free edition and although the architecture of the system could scale down for small businesses, it works best when scaled up. The system also provides threat hunting.

Pros:

  • Sensitivity Awareness: Identifies and categorizes sensitive data for better protection strategies.
  • Integrated Threat Detection: Adds a layer of security by detecting potential threats alongside DLP.
  • Data Transfer Oversight: Controls the movement of sensitive information to minimize risk.

Cons:

  • Opaque Pricing: Does not provide upfront pricing details, requiring direct contact for quotes.

You can assess the platform with a demo account.

5. Teramind DLP

Teramind DLP is a data loss prevention system that includes employee productivity profiling and insider threat detection. This system can be tailored towards GDPR, HIPAA, ISO 27001, and PCI DSS requirements.

Key Features:

  • Policy Customization: Enables the creation of detailed security policies tailored to specific standards.
  • Regulatory Adherence: Facilitates compliance with major data protection regulations like GDPR and HIPAA.
  • Versatile Implementation: Offers deployment flexibility through cloud-based and on-premise solutions.

Why do we recommend it?

Teramind DLP is very similar to Digital Guardian DLP. Both include PII discovery and classification and then control data access. The big difference between these two systems is that the threat hunting in the Teramind system focuses on insider threats. This extends to account takeovers, which can render user access controls worthless.

An essential task performed by this DLP is the discovery and classification of PII. Features in the package also offer data usage analysis and a risk assessor. Security policies are enforced through file integrity monitoring, which is backed by encryption. The tool also controls all data exfiltration points.

Teramind DLP is offered as a SaaS platform or as a virtual appliance. The cloud-based edition requires endpoint agents on enrolled devices. These are available for Windows and macOS but not for Linux. However, you can run the endpoint agents over a VM.

Who is it recommended for?

Like the Digital Guardian system, this package is more appropriate for use by large businesses. Teramind provides its customers with deployment options. The system is offered as a combination of a SaaS system with local agents or as an on-premises virtual appliance.

Pros:

  • Insider Threat Insights: Specializes in identifying and mitigating risks from within the organization.
  • Sensitive Data Identification: Efficiently locates and classifies personal and sensitive information.
  • Standards-Ready Templates: Provides templates for quick compliance with various security standards.

Cons:

  • Linux Support Lacking: Does not offer endpoint agents for Linux, potentially limiting coverage.

You can access a 14-day free trial of Teramind DLP to assess the system.

PII Scanning FAQs

What is PII scanning?

PII scanning is an automated detection service that looks for instances of personally identifiable information (PII). Some PII discovery systems can identify separated fields of data that, when extracted together, can identify a person even though they don’t count as PII when viewed individually.

What is PII compliance?

PII is personally identifiable information and its theft or misuse can create major problems for the people to whom it relates. The importance of security for PII is enforced by government and industry body standards, such as GDPR, PCI DSS, and HIPAA. Companies can lose business or get fined if they can’t show that they are enforcing these rules. Sticking to data protection standards is known as “compliance”.

Is PII a legal concept?

The term “personally identifiable information” (PII) was coined in legal circles and, because the enforcement of PII protection falls to IT systems, the PII abbreviation has become a commonly used term in IT.