Threat Hunting Series: The Threat Hunting Process
Originally posted on July 17, 2022 on Medium.com
In the previous posts of the series, I covered the basics of threat hunting and the core competencies a threat hunter should have. This post will show you the structured process I follow for threat hunting. Anyone who works solely as a threat hunter understands how chaotic the task can get when there is no structure. The threat hunting process doesn’t have to be complicated. The aim of having a process is to guide us through every step of the way, from building the initial hypothesis, through analyzing the data, to reporting the final findings.
Something I haven’t mentioned in the previous posts, which I think is a good idea to do now, is that all the information I am trying to communicate is neither new nor revolutionary. I used the resources I posted in “Threat Hunting Series: What Makes a Good Threat Hunter” to learn and then apply this knowledge to my day job. We all stand on the shoulders of giants, and Chris Sanders is one of them, from whom I learned a ton. Some of the concepts I will cover below are based on his methodology.
The threat hunting mental models
Before diving into the actual process, I want to cover the two different mental models that a threat hunter can apply to the threat hunting process. Based on the initial hypothesis, I use these two models to separate the different types of threat hunting.
Attack-based hunting is applicable when hunting for a certain attack technique. In most cases, I find myself hunting for an attack that is known and documented. Attack-based hunting is faster, especially if Indicators of Attack (IOAs) are readily available through third-party research.
If the attack I am hunting for is not well researched, I have to spend more time emulating the technique in my lab environment. Doing so gives me a better understanding of the attack technique and lets me see the artifacts generated upon successful execution in a Windows/Linux/macOS environment.
Attack emulation in threat hunting is an important topic that deserves a post on its own. Keep an eye out for that post coming up.
Below are examples of attack-based threat hunting. The good example takes a proactive approach: it looks for a specific technique and is not IOC-based. The bad examples are reactive tasks, such as hunting for specific IOCs, which do not fit what we know so far about threat hunting and the attack-based model.
Good example:
Hunting for suspicious process execution activity originating from Microsoft Word documents.
MITRE ATT&CK ID: T1204.002 (Execution)
Bad examples:
Hunting for domains/IPs associated with the recent campaign of <insert fav malware here>. Checking for pre-defined malicious IOCs is not threat hunting. Check out my previous posts in this series to find out why.
“Hunting” (not really) across all hosts in our environment for a malicious Word document that was detected on one of the hosts. A detection event cannot be the trigger of a threat hunting operation.
Data-based hunting is the second mental model. Unlike attack-based threat hunting, it is more advanced since it does not follow a predetermined path. During data-based hunting, the threat hunter is not searching for specific evidence of an attack technique but instead looking for abnormal activity in the dataset of interest.
When using data-based threat hunting, the threat hunter should be familiar with various attack techniques and how they can manifest within the available data sources. Once the data is collected, threat hunters can use certain data analysis techniques depending on how they would like to view and analyze the information at hand. As with the attack-based mental model, I put together a couple of examples to show what data-based threat hunting should be based on.
Good example:
Search for suspicious process execution of unknown binaries launched from non-system directories.
Bad example:
Search for suspicious process execution of PowerShell that downloads and executes the payload in memory.
This could be a good example of attack-based threat hunting, but the hypothesis is too specific to be considered data-based threat hunting.
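To make the good data-based example more concrete, here is a minimal sketch of one possible approach: stacking binaries that run outside system directories and keeping the ones seen on very few hosts. The event layout (`host`, `image`), the `SYSTEM_DIRS` list, and the `max_hosts` threshold are all assumptions made for illustration, not fields from any particular product.

```python
# Hypothetical list of directories treated as "system" locations.
SYSTEM_DIRS = (
    r"c:\windows",
    r"c:\program files",
    r"c:\program files (x86)",
)


def is_system_path(image_path):
    """Return True when the binary lives under one of SYSTEM_DIRS."""
    path = image_path.lower()
    return any(path.startswith(directory + "\\") for directory in SYSTEM_DIRS)


def rare_non_system_binaries(events, max_hosts=1):
    """Stack non-system binaries by how many hosts ran them; keep the rare ones."""
    hosts_per_image = {}
    for event in events:
        image = event["image"].lower()
        if is_system_path(image):
            continue
        hosts_per_image.setdefault(image, set()).add(event["host"])
    return sorted(
        image for image, hosts in hosts_per_image.items() if len(hosts) <= max_hosts
    )


# Made-up sample telemetry.
events = [
    {"host": "ws01", "image": r"C:\Windows\System32\svchost.exe"},
    {"host": "ws02", "image": r"C:\Windows\System32\svchost.exe"},
    {"host": "ws01", "image": r"C:\Users\alice\AppData\Local\Temp\upd.exe"},
    {"host": "ws01", "image": r"C:\Tools\agent.exe"},
    {"host": "ws02", "image": r"C:\Tools\agent.exe"},
]

for image in rare_non_system_binaries(events):
    print(image)
```

The sketch does not look for any particular technique. It only surfaces what is unusual in the dataset, which is the point of the data-based model; the analyst still has to decide whether each rare binary is malicious.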
The threat hunting process
The steps involved in threat hunting are listed below. I’ll go through each one, explain how they work, and then give some examples.
1. Establish a hypothesis
The hypothesis drives the threat hunt. This is where threat hunters decide what they will hunt for in the environment. As was already established, the threat hunter assumes that this malicious activity has occurred within the network.
2. Establish evidence
Based on the hypothesis, the threat hunter should research the evidence of the expected malicious activity. Searching for existing write-ups from other researchers could be enough to collect the IOAs needed to start hunting.
However, on some occasions, the attacks are not well-documented, and the reports that describe them don’t have enough information. This makes it challenging to understand the attack technique and create threat hunting queries.
In these cases, the threat hunter should be able to emulate the attack in a lab environment and establish the evidence based on the generated telemetry.
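One possible way to turn lab telemetry into evidence is to compare what was recorded before the emulation with what was recorded during it. The sketch below assumes process events with hypothetical `parent` and `image` fields and made-up sample data; it is an illustration of the idea, not a description of any specific tool.

```python
def new_behaviour(baseline, emulation, keys=("parent", "image")):
    """Return the key combinations seen during emulation but never in the baseline."""
    seen = {tuple(event[k].lower() for k in keys) for event in baseline}
    observed = {tuple(event[k].lower() for k in keys) for event in emulation}
    return sorted(observed - seen)


# Made-up telemetry captured in the lab before running the technique.
baseline = [
    {"parent": "explorer.exe", "image": "WINWORD.EXE"},
    {"parent": "services.exe", "image": "svchost.exe"},
]

# Made-up telemetry captured while emulating the technique.
emulation = [
    {"parent": "explorer.exe", "image": "WINWORD.EXE"},
    {"parent": "WINWORD.EXE", "image": "cmd.exe"},
    {"parent": "cmd.exe", "image": "whoami.exe"},
]

for parent, image in new_behaviour(baseline, emulation):
    print(f"{parent} -> {image}")
```

The parent-child pairs that only appear during the emulation are candidate IOAs. They still need a manual review, since a lab baseline is rarely as noisy as a production environment.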
3. Identify sources
Identify the data sources that should contain evidence of the malicious activity. Some examples of data sources are:
Network traffic logs
Process execution logs
4. Identify fields
After establishing the type of attack, or the specific IOAs, that our hunting operation targets, we can concentrate on the specific fields we should query. Whether the data source holds network or process execution logs, we can choose the individual fields that will help us spot the malicious activity.
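The first four steps can be captured in a small record that ties the hypothesis to its evidence, data sources, and fields. This is only a sketch: the `HuntPlan` class, the `project` helper, and the field names (`ParentImage`, `Image`, `CommandLine`, and so on) are hypothetical and will differ depending on the logging platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HuntPlan:
    """Hypothetical container for the output of steps 1 to 4."""
    hypothesis: str
    evidence: List[str]
    # Maps each data source to the fields we intend to query in it.
    sources: Dict[str, List[str]] = field(default_factory=dict)

    def fields_for(self, source: str) -> List[str]:
        return self.sources.get(source, [])


def project(event: dict, fields: List[str]) -> dict:
    """Keep only the chosen fields of a raw event."""
    return {name: event.get(name) for name in fields}


plan = HuntPlan(
    hypothesis="Word documents are spawning command interpreters",
    evidence=["winword.exe starts cmd.exe or powershell.exe"],
    sources={
        "process_execution": ["ParentImage", "Image", "CommandLine"],
        "network_traffic": ["SourceIp", "DestinationIp", "DestinationPort"],
    },
)

# Made-up raw event with more fields than the hunt needs.
raw_event = {
    "ParentImage": "WINWORD.EXE",
    "Image": "cmd.exe",
    "CommandLine": "cmd.exe /c whoami",
    "User": "alice",
    "ProcessId": 4242,
}

trimmed = project(raw_event, plan.fields_for("process_execution"))
print(trimmed)
```

Writing the plan down before querying keeps the hunt tied to the hypothesis and makes it obvious when a required data source or field is not being collected.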
5. Query the data
We now have all of the information we need to build our queries. We can adjust two core variables when forming these queries. The first variable is the time frame. This is how far back we choose to search in the available data.
The second one is the scope of the search. We can make the query more or less specific. For example, we could be specific and focus on the tools of the attack or be less specific and focus on the attack technique itself. In detection engineering, this is known as “capability-abstraction”. SpecterOps has a lot of resources that explain what capability-abstraction is and how one could use it to create well-informed detection rules. We can use capability-abstraction in threat hunting to structure our hypothesis based on the attack techniques. Hunting for the technique could help us uncover other tools that may have been used to compromise our network.
For example, a specific query would include the name of the DLL that threat actors load into memory. On the other hand, a less specific query would focus on the process execution method without including any command line details.
A picture is worth a thousand words, so the example below could help illustrate the difference between a specific and a less specific query. The first example shows the targeted query with the specific command line arguments. The second example shows a less specific query that captures the execution flow of a Word document running commands on the host using cmd.exe.
As we can see from this example, the second, less specific query showed more activity linked to the attack technique we are hunting for.
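The same contrast can be sketched in code. The example below runs a targeted filter and a less specific filter over a handful of made-up process events. The field names (`parent`, `image`, `cmdline`) and the command lines are assumptions for illustration only and are not taken from the original queries.

```python
# Made-up process execution events.
events = [
    {"parent": "WINWORD.EXE", "image": "cmd.exe", "cmdline": "cmd.exe /c whoami"},
    {"parent": "WINWORD.EXE", "image": "cmd.exe", "cmdline": "cmd.exe /c net user"},
    {"parent": "WINWORD.EXE", "image": "cmd.exe", "cmdline": "cmd.exe /c ipconfig /all"},
    {"parent": "explorer.exe", "image": "cmd.exe", "cmdline": "cmd.exe"},
]


def word_spawns_cmd(event):
    """Technique level: Word starting cmd.exe, regardless of arguments."""
    return (
        event["parent"].lower() == "winword.exe"
        and event["image"].lower() == "cmd.exe"
    )


def specific_query(events):
    """Targeted: the execution flow plus one specific command line argument."""
    return [e for e in events if word_spawns_cmd(e) and "whoami" in e["cmdline"].lower()]


def less_specific_query(events):
    """Broader: only the execution flow, no command line details."""
    return [e for e in events if word_spawns_cmd(e)]


print(len(specific_query(events)))
print(len(less_specific_query(events)))
```

On this sample data the targeted filter matches one event, while the broader filter matches all three events where Word launched cmd.exe, including commands the targeted filter knew nothing about.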
The more flexible the query is, the more false-positive results we may have. In contrast, the more targeted the query is, the fewer false-positive results we will have, although a targeted query may cause us to overlook instances where the attack technique we are hunting for manifests differently. A balance between the two is key when deciding on the final version of the query.
I usually start with a broader query and a short time frame (<1 day). Depending on the returned results, I will either make the query more specific or keep it the same and expand the time frame. Applying this method makes analyzing the results easier (there is less data, and false positives become apparent), and we avoid potential impact on the back-end databases serving the requested data.
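This iterative approach could be sketched roughly as follows. The loop starts with the broad filter over the shortest window and switches to the narrower filter once the result count exceeds a threshold. The `max_hits` threshold, the window sizes, the event fields, and the sample data are all hypothetical; in practice the decision to narrow or expand is a judgment call rather than a fixed number.

```python
from datetime import datetime, timedelta


def run_query(events, predicate, end, lookback):
    """Return events inside the time window that match the predicate."""
    start = end - lookback
    return [e for e in events if start <= e["time"] <= end and predicate(e)]


def hunt(events, broad, narrow, end, lookbacks, max_hits):
    """Widen the time frame step by step; narrow the query once it gets too noisy."""
    predicate, label = broad, "broad"
    history = []
    for lookback in lookbacks:
        hits = run_query(events, predicate, end, lookback)
        if label == "broad" and len(hits) > max_hits:
            predicate, label = narrow, "narrow"
            hits = run_query(events, predicate, end, lookback)
        history.append((lookback, label, len(hits)))
    return history


def broad(event):
    return event["parent"].lower() == "winword.exe"


def narrow(event):
    return broad(event) and event["image"].lower() == "cmd.exe"


end = datetime(2022, 7, 17, 12, 0)
# Made-up events spread over several days.
events = [
    {"time": end - timedelta(hours=2), "parent": "WINWORD.EXE", "image": "cmd.exe"},
    {"time": end - timedelta(hours=5), "parent": "WINWORD.EXE", "image": "powershell.exe"},
    {"time": end - timedelta(days=3), "parent": "WINWORD.EXE", "image": "cmd.exe"},
    {"time": end - timedelta(days=4), "parent": "WINWORD.EXE", "image": "cmd.exe"},
    {"time": end - timedelta(hours=1), "parent": "explorer.exe", "image": "cmd.exe"},
]

history = hunt(
    events, broad, narrow, end,
    lookbacks=(timedelta(hours=12), timedelta(days=7)),
    max_hits=3,
)
for lookback, label, count in history:
    print(lookback, label, count)
```

Keeping the first pass short also limits how much data each query pulls from the back end, which is the second benefit mentioned above.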
6. Analyze the data
Once we have the results from our queries, we can start manipulating the data to make it as easy as possible to analyze and spot anomalies. We can apply several analysis techniques depending on what we are hunting for and which mental model we follow. This article from CyborgSecurity —