PII stands for Personally Identifiable Information. The term refers to collections of data fields that, taken together, can identify a private individual.
Legislation in many countries specifies what is considered to be an abuse of PII. Generally, companies can only hold PII for specific purposes, and that data must be kept accurate and confidential.
Failure to follow the guidelines on proper usage of PII can result in a hefty fine, and those whose data you failed to protect can all sue your company. The financial consequences of misuse or disclosure of PII can be disastrous.
Here is our list of the best PII scanning tools:
- ManageEngine DataSecurity Plus A system auditing, compliance, and data loss protection package for Windows Server.
- Endpoint Protector A PII scanner embedded in a combined threat detection and data loss prevention package. Available as a cloud service or as a virtual appliance.
- Digital Guardian DLP A DLP service delivered from the cloud that includes data discovery and classification for PII. Endpoint agents are available for Windows, macOS, and Linux.
- Teramind DLP A cloud-based user productivity tracking, insider threat detection, and DLP package that includes eDiscovery.
You can read more about each of these options in the following sections.
What is PII?
It isn’t sufficiently precise to say that PII identifies a person. More exactly, PII is data that, taken in its totality, enables someone to find or impersonate a specific person. To understand the concept of data that identifies a person, consider a database with a customer table. Columns in that table might be:
- First Name
- Last Name
- Credit Card Number
- Card Expiry Date
- Social Security Number
- Address Line 1
- Address Line 2
- Postal Code
- Email Address
- Telephone Number
Imagine what a con artist could do with that information. If a hacker broke into your database and had just enough time to select out one column of that table, which would it be? Depending on what type of scam the data would be used for, a Social Security Number or Email Address would be the best single field to steal. However, most places that ask for a social security number for identity verification would also expect the person to provide a first and last name and possibly a date of birth. So, although that single field precisely identifies a person, it probably can’t be used effectively without more information.
First name and last name individually do not constitute PII. A list of 400 instances of Dave and 300 cases of Jane doesn’t identify anyone. The first and last names together give a better target. However, this still isn’t enough to identify one person in the world. First name, last name, and social security number would do it, as would first name, last name, and email address.
Credit card scammers would need at least the first name, last name, credit card number, CVV, and card expiry date to stand a chance of putting through a transaction online. As most online payment processors also expect an address and telephone number for verification, the credit card thief would need just about all of the columns from the table, plus the CVV, which this example table does not hold.
The whole table gives a data thief a lot of data resale opportunities. Several combinations of columns in this table are valuable, and each grouping can supply a different type of thief or con artist.
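The idea that individual columns are harmless while certain groupings are identifying can be expressed as a simple set check. The following is a minimal sketch: the column names and the two combinations are assumptions taken from the examples above, not a legal definition of PII.

```python
# Hypothetical rule set: each frozenset is a group of columns that, taken
# together, is treated as identifying one person (based on the examples above).
IDENTIFYING_COMBINATIONS = [
    frozenset({"first_name", "last_name", "social_security_number"}),
    frozenset({"first_name", "last_name", "email_address"}),
]


def exposed_combinations(columns):
    """Return the identifying combinations fully contained in `columns`."""
    present = set(columns)
    return [combo for combo in IDENTIFYING_COMBINATIONS if combo <= present]


def is_pii(columns):
    """True when the columns include at least one identifying combination."""
    return bool(exposed_combinations(columns))
```

With these rules, a stolen extract containing only first_name and last_name would not be flagged, while an extract that adds email_address would be.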
While database tables are goldmines for data thieves, files containing documents, images of documents and forms, and other images can be just as useful to them. Unfortunately, images of documents are almost impossible to search with a standard scan, such as the search utility on a Web page or in Windows Explorer. This task requires optical character recognition (OCR).
In any standard letter, the data fields that a thief would want are spread throughout the text. Thus, when looking for PII, you need to identify combinations of data fields. An effective PII scanner needs to detect these separate fields and spot when they occur close to one another, though not necessarily adjacent. In the data security sector, the technique of using these scattered fields together to identify a PII instance is called “fingerprinting.”
So, a PII scanning tool needs to include OCR and fingerprinting.
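To make the proximity idea concrete, here is a rough sketch of how a scanner might flag text in which different PII field types appear near each other. It assumes the text has already been extracted (OCR is not shown), and the patterns, the 200-character window, and the function names are all illustrative assumptions rather than the method used by any product on this list.

```python
import re

# Illustrative patterns only (US-style formats); a real scanner would need
# many more field types and locale-specific rules.
FIELD_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_fields(text):
    """Return a position-sorted list of (offset, field_type) for every match."""
    hits = []
    for field_type, pattern in FIELD_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), field_type))
    return sorted(hits)


def has_fingerprint(text, window=200, min_types=2):
    """True if at least `min_types` different field types start within
    `window` characters of each other."""
    hits = find_fields(text)
    for i, (start, _) in enumerate(hits):
        types_nearby = {ftype for pos, ftype in hits[i:] if pos - start <= window}
        if len(types_nearby) >= min_types:
            return True
    return False
```

A letter that mentions a Social Security Number and an email address in the same paragraph would be flagged, whereas a page with a single support telephone number would not.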
One more vital point about PII is that the term only applies to people’s lives outside work. So, if you keep a business contact database that includes names, business addresses, and workplace contact details, that isn’t counted as PII. However, suppose a supplier’s sales rep gives you her out-of-hours telephone number and a personal email address that isn’t on the corporate domain. That can cause problems because those pieces of information cross over into the definition of PII.
Data management strategies
If you have a small business, you probably don’t have that many places to store data, so you probably have a good idea of where to locate all of the PII that your company holds. Large businesses have long been alive to the importance of tracking all types of data. However, even those IT departments with a fully documented data management strategy don’t always know where all PII is.
The complication of PII tracking arises from applications that keep their own stores of data. Even if you have a specific file server or cloud storage service for all of your data, software may still be storing PII locally on the server on which it is installed. Additionally, employees often copy over details when they are working on a specific task or project. For example, a Customer Care operator might note down information in a file when compiling an incident report or composing a letter to a client. The final document might get stored in the right place, but the notes file containing PII might continue to exist on the operator’s local computer.
You might get a surprise when you run a PII scanner for the first time. The data that you know you have might be in a location that you didn’t know about.
PII scanning tools
PII scanning involves three tasks:
- Searching all endpoints and devices for data locations
- Searching through the contents of those data locations for indicators that signal PII
- Classifying PII by sensitivity and type
So, you will end up with a list of computers and devices that contain any kind of data and then a list of locations that specifically hold PII and the type of PII.
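The three tasks can be sketched for a single machine as a small script. This is only an illustration under stated assumptions: it looks at plain-text files in one directory tree, and the patterns, file extensions, and sensitivity labels are hypothetical choices for the example, not a standard taxonomy.

```python
import os
import re

# Illustrative patterns and sensitivity labels; both are assumptions made for
# this sketch, not a standard taxonomy.
PATTERNS = {
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "high"),
    "email": (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "medium"),
}


def discover_files(root, extensions=(".txt", ".csv", ".log")):
    """Task 1: list candidate data locations under `root`."""
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if name.lower().endswith(extensions):
                yield os.path.join(folder, name)


def scan_file(path):
    """Tasks 2 and 3: find PII indicators in a file and label each type
    with its sensitivity."""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        text = handle.read()
    return {
        pii_type: sensitivity
        for pii_type, (pattern, sensitivity) in PATTERNS.items()
        if pattern.search(text)
    }


def scan_tree(root):
    """Return {path: {pii_type: sensitivity}} for files that contain PII."""
    report = {}
    for path in discover_files(root):
        findings = scan_file(path)
        if findings:
            report[path] = findings
    return report
```

The returned dictionary corresponds to the second list described above: the locations that hold PII and the type found in each. A commercial scanner would also cover databases, document formats, and images.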
Once those data locations have been recorded, you need to implement a data management strategy. If you don’t have a centralized storage facility, now is the time to implement it. Even if you do have such a strategy in place, you will still probably discover that there are copies on local devices and computers as well as the original held in a central location.
You can take steps to institute either automated procedures or instructions on working practices to get all data stored in one location and copies deleted from local devices. Those definitive data instances can then be encrypted to control access.
PII scanning is only concerned with locating sensitive data related to the company’s specific security policies. What you do to manage and protect that data is out of scope for the PII discovery task. However, most PII scanners are part of broader systems that implement data security. These packages are called Data Loss Prevention (DLP) services.
The best data loss prevention systems
When searching for a PII tool, you are really in the market for a DLP system. However, just finding sensitive data is not enough; you also need to organize and protect it. That task involves redefining access rights to provide a finer granularity of access permissions. For example, one general user classification is not good enough, and you need to split those users into groups defined by department and role.
Different industries are required to protect different types of PII and, in the case of location-designated legislation, such as the EU’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), you also need to define in which physical location data can be stored and the location of the people who access it.
You also need tools that protect data stores directly through file integrity monitoring (FIM). These systems log all user actions on files, recording the name of the account that performed each action. You also need to implement system-wide security to ensure that the data isn’t stolen. These measures include peripheral device controls and scanning of communications systems, such as email and file transfer utilities.
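As a rough illustration of the change-detection side of file integrity monitoring, the sketch below records a hash baseline for a set of files and reports what has been added, removed, or modified since. It is a simplified stand-in: it does not capture which account made a change, which would require the operating system’s audit facilities, and the function names are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Record a baseline {path: digest} for the files that currently exist."""
    return {path: file_digest(path) for path in paths if os.path.isfile(path)}


def compare(baseline, current):
    """Classify differences between two snapshots."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            path for path in set(baseline) & set(current)
            if baseline[path] != current[path]
        ),
    }
```

Taking a snapshot on a schedule and comparing it with the previous one gives a basic record of which protected files have changed.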
Data loss prevention systems include security policy creation, data discovery and classification, and data movement controls.
What should you look for in PII scanning tools?
We reviewed the market for data loss prevention systems and analyzed the tools based on the following criteria:
- A data discovery and classification service
- An access rights auditor
- A security policy management system that can be preset with templates to cover a specific data privacy standard
- File integrity monitoring with file encryption for access controls
- Constant monitoring of peripheral devices, email, and file transfer systems for data movements
- A free assessment period or demo system
- Value for money in a package that combines all data protection functions at a fair price
With these selection criteria in mind, we have created a list of suitable DLP packages.
The Best PII Scanning Tools
ManageEngine DataSecurity Plus offers file server auditing, compliance monitoring, and data loss prevention. This system includes a PII scanner that categorizes sensitive data. It will also assess and reorganize your access rights management structure. With pre-set security policy templates for the significant data privacy standards, this service will help you enforce compliance with ease.
Policies are enforced through file integrity monitoring with encryption and also data exfiltration channel control. This service allows you to permit certain actions on files for specific user groups while blocking those actions from others. In addition, DataSecurity Plus monitors wireless networks and LAN activity as well as endpoints.
- PII discovery and classification
- Data access controls
- Access rights management improvement
- No cloud version
ManageEngine DataSecurity Plus runs on Windows Server, and it is offered for a 30-day free trial.
Endpoint Protector is a complete DLP solution that provides all of the requirements that we looked for in a PII scanning tool, and it also includes a threat detection system.
After specifying your security policies in the dashboard of Endpoint Protector, the service swings into action by installing agents on all of your enrolled endpoints. It is possible to create those policies by applying a template from the Endpoint Protector library.
The endpoint agents implement your security policy. First, they sweep each endpoint and discover all data stores. Next, all data instances are categorized to identify the sensitive data. This service is continuous. The central server also audits your access rights management systems to create improved data access permissions. Endpoint Protector will then encrypt those identified data locations and monitor access.
In addition to the data identification service, the system profiles all user activities to create a baseline of everyday activities. The endpoint’s system routines are also examined to find a standard pattern. The service will raise an alert if activity deviates from this norm. This is a helpful way to spot intruder activity, account takeover, or insider threats.
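The general principle of baselining can be illustrated with a few lines of code. This is a generic sketch, not a description of how Endpoint Protector works internally: it assumes activity is reduced to a daily count per user (for example, files opened) and flags a day that falls more than three standard deviations from that user’s average. The threshold and function names are assumptions.

```python
from statistics import mean, stdev


def build_baseline(daily_counts):
    """Summarize a user's normal activity as (mean, standard deviation).

    Needs at least two observations, because stdev() is undefined for one.
    """
    return mean(daily_counts), stdev(daily_counts)


def is_anomalous(count, baseline, threshold=3.0):
    """Flag a count more than `threshold` standard deviations from the mean."""
    average, spread = baseline
    if spread == 0:
        # Perfectly constant history: any different value is a deviation.
        return count != average
    return abs(count - average) / spread > threshold
```

For a user who normally opens around 20 files a day, a day with 22 would pass unnoticed, while a day with 400 would raise an alert.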
The service also controls all data movements onto peripheral devices or networked printers and fax machines. In addition, the system scans all emails and watches file transfer utilities to block unauthorized data activities.
Endpoint Protector also includes data management and threat protection controls. Endpoint Protector allows any business to become fully compliant with data privacy standards and can be used effectively by those with no technical skills or legal expertise.
- A discovery and classification service for PII
- Access rights auditing
- Control of data exfiltration channels
- File integrity monitoring through encryption and logging
- Standards compliance enforcement
- Could easily also have a SIEM function added to it
Endpoint Protector can be accessed as a paid service on Azure, GCP, and AWS. It is also offered as a SaaS package hosted by CoSoSys. Those who want an on-premises solution can install the software as a virtual appliance. The service also includes endpoint agents for Windows, macOS, and Linux. You can assess Endpoint Protector by accessing a demo.
Digital Guardian DLP is a Software-as-a-Service platform that offers both data leak prevention and threat detection. The system implements data controls through endpoint agents that are available for Windows, macOS, and Linux.
Like the other DLPs on this list, the strategy of Digital Guardian DLP ties together access rights management with sensitive data access through security policies. There are security policy templates that take care of the necessary system settings for specific data privacy standards.
The service examines and improves your user account permissions and performs an eDiscovery sweep, which operates continuously. This searches out all data stores and then identifies the PII.
Controls are implemented to watch over USB ports, printers, faxes, emails, and file transfer systems to prevent unauthorized movement of data. This service also protects intellectual property. The Digital Guardian system profiles user activity and system tasks to spot anomalies that indicate intrusion, account takeover, and insider threats.
- Sensitive data discovery and classification
- Additional threat detection
- Data movement controls
- The provider does not publish a price list
You can assess the platform with a demo account.
Teramind DLP is a data loss prevention system that includes employee productivity profiling and insider threat detection. This system can be tailored towards GDPR, HIPAA, ISO 27001, and PCI DSS requirements.
An essential task performed by this DLP is the discovery and classification of PII. Features in the package also offer data usage analysis and a risk assessor. Security policies are enforced through file integrity monitoring, which is backed by encryption. The tool also controls all data exfiltration points.
Teramind DLP is offered as a SaaS platform or as a virtual appliance. The cloud-based edition requires endpoint agents on enrolled devices. These are available for Windows and macOS but not for Linux. However, you can run the endpoint agents over a VM.
- Includes insider threat detection and employee performance assessors as well as a DLP
- Discovers and categorizes PII
- Includes pre-set templates for standards compliance
- No endpoint agent for Linux
You can access a 14-day free trial of Teramind DLP to assess the system.