What is Threat Hunting

Whether software-based or human, malicious threats don’t declare their presence. Cybersecurity tools have to search for malware, intruders, and malicious insiders and there is a range of techniques that can be used for this purpose.

Each security software provider has a favored technique and they stick with it because it works. Although there are several different threat hunting strategies, there isn’t a league table of them – there isn’t one that is universally regarded as better than the others. We will look at the main methods of threat detection and explain how they work.

Threat hunting terminology

Like any discipline of IT, cybersecurity and threat hunting have their own terminology. Here are some of the important terms that you should know when studying threat hunting:

  • EDR – Endpoint detection and response: Monitors endpoints (desktops, servers, mobile devices, IoT gadgets) for threats and implements automated responses on detection.
  • XDR – Extended detection and response: A package of security tools that combine to provide threat detection and automated response. The main elements should be delivered by SaaS.
  • IDS – Intrusion detection system: A threat hunting package that works either on log files, in the case of a host-based IDS (HIDS), or on live activity monitoring data, in the case of a network-based IDS (NIDS).
  • IPS – Intrusion prevention system: An IDS with automated response rules.
  • SIEM – Security information and event management: Searches log files and network monitoring data and raises an alert on threat detection. This is a combination of security information management (SIM), which is a HIDS, and security event management (SEM), which is a NIDS.
  • SOAR – Security orchestration, automation, and response: Data exchanges between security software packages that carry either threat information or response instructions.
  • SOC – Security operations center: A data center that provides monitoring and management of security software, adding specialist human analysis.
  • CTI – Cyber threat intelligence: Data about attacks that have occurred elsewhere and how they proceeded. It can also be a warning of a data leak or of chatter on hacker sites indicating that an attack on a country, a sector, a business, or an individual is imminent.
  • IoA – Indicator of attack: A sign that an attack is underway. Examples include the discovery of a phishing email or an infected email attachment, an RDP connection attempt, or excessive failed login attempts for an account.
  • IoC – Indicator of compromise: A remnant of a recent attack. These can be formulated into a sequence of factors, malware signatures, and malicious addresses. They are communicated as threat intelligence. However, small variations in methods can thwart detection.
  • TTP – Tactics, techniques, and procedures: Threat strategies that are communicated as warnings in threat intelligence feeds. They detail recently detected hacker group activity, such as the group’s name and its methods of investigation and ingress.
  • UBA – User behavior analytics: The tracking of activity per user account to establish a pattern of normal behavior. This provides a baseline for anomaly detection, which looks for deviation from the user account’s standard behavior.
  • UEBA – User and entity behavior analytics: The same as UBA but with a record of standard activity on a device, which is usually an endpoint.
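
To make one of these terms concrete, the indicator of attack described above as "excessive failed login attempts for an account" can be sketched as a sliding-window counter. This is only an illustration: the class name, the threshold, and the window length are assumptions, and a real product would tune them to the environment.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flags an account that exceeds max_failures within window_seconds."""

    def __init__(self, max_failures=5, window_seconds=60):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # account -> recent failure times

    def record_failure(self, account, timestamp):
        """Record one failed login; return True if the account looks under attack."""
        window = self._failures[account]
        window.append(timestamp)
        # Discard failures that are older than the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_failures


monitor = FailedLoginMonitor(max_failures=3, window_seconds=60)
flags = [monitor.record_failure("alice", t) for t in (0, 10, 20, 30)]
print(flags)  # [False, False, False, True]
```

The fourth failure inside one minute trips the indicator; a later isolated failure would not, because the older entries will have aged out of the window.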

Threat hunting data sources

Threat hunting can take place at two levels: in the enterprise or globally. Global searches rely on data uploaded from enterprises.

The large volumes of data involved in such systems can be greatly reduced by installing pre-processing modules on each contributing system. The risk of this strategy is that it filters out information that the local analytical engine doesn’t recognize as a potential indicator, so the quality of the source data depends greatly on the algorithm used by the threat intelligence group. The search speed and capacity of the central hunting unit also influence how much filtering the data collectors need to perform.

Enterprise threat hunting relies on three main sources for input data:

  • Log messages
  • System monitoring
  • Observability

All three types of data need to be gathered from every component of the system – both hardware and software – to gain a complete picture of an attack.

Log messages

The main source of data for threat hunting is log messages. Every operating system and just about every application generates log messages, and collecting them provides a rich source of information about system activities.
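
As a rough illustration of why log messages are such useful raw material, the sketch below pulls structured fields out of a failed-login line. The sample line and the regular expression assume an OpenSSH-style syslog layout; real log formats vary between systems and versions, so treat the pattern as an example rather than a reference.

```python
import re
from typing import Optional

# Assumed layout of an OpenSSH "Failed password" syslog line; real logs vary.
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port (?P<port>\d+)"
)


def parse_failed_login(line: str) -> Optional[dict]:
    """Return the user, source IP, and port from a failed-login line, or None."""
    match = FAILED_LOGIN.search(line)
    if match is None:
        return None
    return {"user": match["user"], "ip": match["ip"], "port": int(match["port"])}


sample = (
    "Jan 12 03:14:07 web01 sshd[2211]: Failed password for invalid user admin "
    "from 203.0.113.9 port 52144 ssh2"
)
print(parse_failed_login(sample))
```

Once a line has been reduced to fields like these, it can be counted, grouped, and compared against threat intelligence.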

System monitoring

Just as log messages are constantly circulating in an IT system, so is monitoring data. Again, it just needs to be collected. This information includes live network performance data available through the Simple Network Management Protocol (SNMP). Other sources of information that are readily available for very little effort include process data from the Task Manager in Windows or the ps utility in Linux, Unix, and macOS.

Observability

Serverless systems and fileless activities are harder to track and require active data gathering with distributed tracing telemetry systems. The information derived from tracing running processes can be supplemented by code profiling where activities are implemented in plain-text scripting languages. Memory dumps and string scanning can also reveal indications of attack through the presence of DLL functions that are known to be used in malicious attacks, such as memory and register manipulating utilities.
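
String scanning of a memory dump can be sketched in a few lines. The example below extracts runs of printable ASCII from a byte buffer and checks them against a short watchlist. The three names on the watchlist are real Windows API functions that are commonly associated with process injection, but the list itself, the function names, and the sample buffer are illustrative assumptions, and a match on its own is not proof of an attack.

```python
import re
from typing import List, Set

# Illustrative watchlist only: Windows API names often seen in process injection.
SUSPICIOUS_IMPORTS = {b"VirtualAllocEx", b"WriteProcessMemory", b"CreateRemoteThread"}


def extract_strings(dump: bytes, min_len: int = 6) -> List[bytes]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return re.findall(pattern, dump)


def suspicious_hits(dump: bytes) -> Set[bytes]:
    """Return the watchlist names that appear in the dump's printable strings."""
    found = set()
    for text in extract_strings(dump):
        for name in SUSPICIOUS_IMPORTS:
            if name in text:
                found.add(name)
    return found


# Fabricated buffer standing in for a real memory dump.
dump = b"\x00\x01kernel32.dll\x00WriteProcessMemory\x00\xff\xfeabc\x00CreateRemoteThread\x00"
print(sorted(suspicious_hits(dump)))
```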

Data flows for threat hunting

While some distributed threat hunting systems are available, mainly for endpoint protection, threat hunting services are usually centralized. A next-generation AV performs threat hunting on the device, but all other systems, including IDS, EDR, XDR, and SIEM tools, gather data from multiple devices on the system into a centralized pool.

Software-as-a-Service threat hunters can unify the data lakes used for all hosted clients into a service-wide threat detection package. This flow of data into a cloud collection is also the main source of information used for threat intelligence feeds.

In all cases, data is gathered from different types of sources, such as log files or system monitoring records, and translated into a common format so that the different record layouts can be collected into a uniform list with the same data fields in the same locations. Threat hunting is then implemented as a series of searches along with data filtering and grouping.
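
The translation into a common format might look something like the sketch below. The two input layouts, the field names, and the target schema are all hypothetical; the point is only that once every record has the same fields, a hunt reduces to filtering and grouping.

```python
from datetime import datetime, timezone


# Hypothetical common schema: timestamp (UTC, ISO 8601), host, source, user, action.
def from_auth_log(record: dict) -> dict:
    """Normalize a record from an assumed authentication-log layout."""
    return {
        "timestamp": datetime.fromtimestamp(record["epoch"], tz=timezone.utc).isoformat(),
        "host": record["hostname"],
        "source": "auth_log",
        "user": record["user"],
        "action": record["event"],
    }


def from_process_monitor(record: dict) -> dict:
    """Normalize a record from an assumed process-monitoring layout."""
    return {
        "timestamp": record["time"],  # already ISO 8601 in this assumed feed
        "host": record["device"],
        "source": "process_monitor",
        "user": record["owner"],
        "action": "process_start:" + record["image"],
    }


events = [
    from_auth_log({"epoch": 1700000000, "hostname": "web01",
                   "user": "alice", "event": "login_failed"}),
    from_process_monitor({"time": "2023-11-14T22:14:00+00:00", "device": "web01",
                          "owner": "alice", "image": "powershell.exe"}),
]

# With uniform fields, grouping activity per user is a simple loop.
by_user = {}
for event in events:
    by_user.setdefault(event["user"], []).append(event["action"])
print(by_user)
```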

Source data for threat hunting flows towards a central pool. Threat intelligence, which informs threat detection, travels in the other direction – from a central location out to individual corporate accounts and then on to local data processors, such as device-based AVs. Response instructions also travel from the central system out to local devices.

Threat hunting strategies

All threat hunting systems fall into two strategy categories: signature-based and anomaly-based searches.

Signature-based threat hunting

Signature-based detection is the oldest strategy used in cybersecurity products. The original anti-virus systems used this approach, scanning systems for the presence of specific files, which are usually identified by a hash signature rather than by name.

Although signature-based threat hunting is an old strategy, it isn’t outdated – this is how systems that work with threat intelligence feeds operate. The CTI feed supplies a list of processes, files, or addresses to look for (TTPs) and the threat hunter scans its data lake for their presence.
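
A minimal sketch of hash-based signature matching follows. The "feed" here is a set built by hand from a placeholder payload; in practice, the hashes would arrive from a CTI provider. The function names and sample files are assumptions made for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def scan(files: dict, known_bad: set) -> list:
    """Return the names of files whose SHA-256 digest is in the known-bad set."""
    return [name for name, content in files.items() if sha256_hex(content) in known_bad]


# Hypothetical feed entry, derived from a placeholder payload for the demo.
known_bad = {sha256_hex(b"pretend-malware-payload")}

files = {
    "report.docx": b"quarterly numbers",
    "update.exe": b"pretend-malware-payload",
    "renamed.tmp": b"pretend-malware-payload",  # same content, different name
}
print(scan(files, known_bad))  # ['update.exe', 'renamed.tmp']
```

Because the match is on content rather than file name, renaming a file does not hide it. Changing even one byte of the content does, which is why small variations in an attack can defeat signature matching.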

Threat hunting based on threat intelligence is also referred to as hypothesis-driven investigation. Signature-based threat hunting occurs both in real time, scanning data as it arrives from distributed collectors, and retrospectively, looking through stores of event records. A retrospective search is needed whenever a new TTP is delivered. This is because previous intelligence did not include these attack strategies, and the intrusion detection system needs to find out whether the system has already been compromised by hackers using the newly discovered methods.
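
The two modes of searching can be sketched with a small class that keeps both a set of indicators and a store of past events. The class, its method names, and the use of a remote IP address as the indicator are assumptions chosen to keep the example short.

```python
class Hunter:
    """Toy hunter that matches events against indicators, live and in retrospect."""

    def __init__(self):
        self.indicators = set()  # e.g. IP addresses reported as malicious
        self.history = []        # stored event records

    def ingest(self, event: dict) -> bool:
        """Real-time path: store the event and report whether it matches now."""
        self.history.append(event)
        return event["remote_ip"] in self.indicators

    def add_indicator(self, ip: str) -> list:
        """Retrospective path: register a new indicator and re-scan stored events."""
        self.indicators.add(ip)
        return [event for event in self.history if event["remote_ip"] == ip]


hunter = Hunter()
old_event = {"host": "db01", "remote_ip": "198.51.100.7"}

print(hunter.ingest(old_event))               # False: indicator not yet known
print(hunter.add_indicator("198.51.100.7"))   # the earlier event is now found
print(hunter.ingest({"host": "web01", "remote_ip": "198.51.100.7"}))  # True
```

The first event passes unnoticed when it arrives and is only caught by the retrospective scan that runs when the new indicator is delivered.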

IoAs and IoCs, delivered by the threat hunting system provider, also implement signature-based detection. These information sources are used to search through live and historical data.

Anomaly-based threat hunting

Rather than looking for the existence of something, anomaly-based threat hunting identifies irregularities. Specifically, it looks for changes in the pattern of activity in the system. This type of detection is particularly important for guarding against account takeover and insider threats.

Events such as data theft don’t require hackers to install any new software – the facilities that you already have available for authorized users are good enough to help data thieves extract valuable information. User behavior analytics (UBA) and user and entity behavior analytics (UEBA) aim to combat the unauthorized use of authorized applications.

Anomaly-based detection first has to work out what is normal behavior before it can spot deviations. Therefore, UBA and UEBA systems use machine learning to register a baseline of regular activity for each user or device. Machine learning is a discipline of Artificial Intelligence (AI).
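
The baseline-then-deviation idea can be shown with a simple statistical stand-in. The sketch below is not machine learning in the sense that commercial UBA and UEBA products use it: it only compares a new value against the mean and standard deviation of past values. The metric (megabytes downloaded per day), the threshold, and the function name are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, value, threshold=3.0):
    """Flag value if it is more than threshold standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough observations to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return value != baseline
    return abs(value - baseline) / spread > threshold


# Hypothetical baseline: megabytes downloaded per day by one user account.
daily_mb = [48, 52, 50, 47, 53, 51, 49]

print(is_anomalous(daily_mb, 55))   # False: within the normal range
print(is_anomalous(daily_mb, 900))  # True: far outside the baseline
```

The early return for a short history reflects a limitation of any baseline approach: until enough activity has been observed, the system cannot tell what is unusual.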

Automated detection vs manual threat analysis

Threat hunting is a constant process and involves searching through large volumes of data. This is a task that computers are very well suited for. For all but the smallest business, the idea of manually searching for threats is just a non-starter.

Manual analysis of data does have a role in threat hunting as a supplement to constant automated detection. The best combination of computerized and human analysis is to let the automated system sort through data and then flag borderline cases for human assessment.

A big problem with rule-based threat detection systems that include automated responses is that they can misclassify normal behavior as malicious and lock out legitimate users. UBA and UEBA were applied to threat hunting in an attempt to cut down this false-positive reporting. No matter how finely tuned a detection system gets, there is always the possibility that a user will suddenly perform a legitimate action that is not part of their regular work routine.

The acceptable anomaly might be an infrequent task that the user is expected to perform but that the machine learning system hasn’t seen before because it hasn’t been running long enough.

The type of manual intervention in a threat hunting system depends on the business’s security policy. For example, an unusual and rarely occurring event could be referred to a human, with the issuing of an alert as the only response. Alternatively, the settings of an IPS could mandate a security-first strategy that blocks accounts involved in anomalous behavior and then leaves the assessment of the activity and possible reversal of the response action to an administrator.

The role of security analyst isn’t a job that can be allocated at random to a technician in the support team. This is a highly specialized skill and qualified analysts are hard to come by. When they can be found, they are very expensive. This is one reason why automated cybersecurity tools are worth buying.

A solution to the need for human expertise in cybersecurity is to contract a managed threat hunting service. This includes a SaaS-based package of the security software and the server to run it and adds on a team of security experts to analyze those unusual anomalies that the computer system cannot categorize.

Threat Response

Although it is out of scope for this article, threat response needs a mention. Endpoint Detection and Response (EDR), Extended Detection and Response (XDR), and Intrusion Prevention System (IPS) tools all link responses to threat detection. These automated systems can be set up to trigger specific actions for different threats. For example, a network threat detection system that identifies unusual traffic can be set up to write a firewall rule that blocks packets from the mistrusted source.

Heimdal provides a threat hunting service that can be linked to automated responses through its Action Center module. The mechanism behind systems like Heimdal is a rule base. This is a list of instructions, each with a condition that triggers an action. The importance of linking responses to threats is illustrated by Heimdal’s treatment of “Detect and Respond” as a single category of cybersecurity services on its website.
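
A rule base of this kind, where each entry pairs a condition with an action, can be sketched generically as follows. This is not Heimdal’s implementation or any vendor’s API: the rule names, event fields, thresholds, and action strings are all invented for the illustration, and the actions only return a description instead of changing a firewall or an account.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # decides whether the rule fires
    action: Callable[[dict], str]      # describes the response to take


def evaluate(rules, event):
    """Return the response of every rule whose condition matches the event."""
    return [rule.action(event) for rule in rules if rule.condition(event)]


# Hypothetical rules with made-up thresholds.
rules = [
    Rule(
        "block noisy source",
        condition=lambda e: e["type"] == "network" and e["packets_per_sec"] > 10_000,
        action=lambda e: f"firewall: drop traffic from {e['src_ip']}",
    ),
    Rule(
        "lock brute-forced account",
        condition=lambda e: e["type"] == "auth" and e["failed_logins"] >= 10,
        action=lambda e: f"identity: suspend {e['account']}",
    ),
]

event = {"type": "network", "src_ip": "203.0.113.50", "packets_per_sec": 25_000}
print(evaluate(rules, event))  # ['firewall: drop traffic from 203.0.113.50']
```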

Threat hunting in cybersecurity tools

You can read our recommendations on systems for threat hunting in The Best Threat Hunting Tools. For an illustration of how different tools can perform threat hunting individually and as part of a suite of services, we can look at the packages offered by CrowdStrike.

CrowdStrike’s cybersecurity tools are offered from a SaaS platform called Falcon. One of their services is a next-generation anti-virus service, and this is the one application of the Falcon suite that runs on endpoints instead of on the Falcon cloud platform.

Here is how the tools fit together:

  • CrowdStrike Falcon Prevent – on-premises: Performs threat hunting locally and also acts as a data collection agent for cloud-based Falcon systems.
  • CrowdStrike Falcon Insight – cloud-based: An EDR that coordinates the activities of all Falcon Prevent instances for a business and searches through uploaded data in the manner of a SIEM.
  • CrowdStrike Falcon XDR – cloud-based: Operates in the same manner as Falcon Insight but also draws activity data from third-party tools and can communicate automated responses to Falcon Discover or other on-premises security systems.
  • CrowdStrike Falcon Intelligence – cloud-based: Threat intelligence compiled by CrowdStrike from all of the EDR and XDR systems operating for clients plus third-party application experiences.
  • CrowdStrike Falcon Overwatch – cloud-based: A managed service that provides the Falcon Insight system along with technicians and analysts to run the system and provide a manual assessment of data.