I often quote Spaf, who says, "A system is good if it does what it's supposed to do, and secure if it doesn't do anything else." Making our systems secure requires a few things. We first have to know what the system is supposed to do, but that's usually not where things start in cybersecurity, probably because it's hard and requires a bit of Know Thyself, at which we are terrible. Instead we start at "well, what do we know FOR SURE it's not supposed to do?" Obviously, systems shouldn't be executing malware. Detecting malware via hash, signature, or known behavior is looking for that "known bad". This is not threat hunting. This is detection. This is putting up a most wanted list to catch criminals.


Detection is very important, but it doesn't really use defender's advantage. Detection at one company is going to look just like detection at another company. There is risk for companies that rely solely on detection, and as companies mature they want to add threat hunting capabilities. You don't have to read very much Sun Tzu to know that failing to utilize defender's advantage is going to result in losing the battle. The famous quote doesn't even explain what happens if you know your enemy but not yourself, probably because it's unthinkable.

 

Threat hunting is a little closer to knowing what a system is supposed to do in order to make sure it isn't doing anything else. A great example to illustrate this point is DNS. Anyone who has had to admin a system or network knows that "it's always DNS" isn't just a meme phrase. Practically everything relies on DNS to operate. Thus, data exfiltration or C2 traffic running over DNS can be one of the more reliable techniques for attackers. Once malicious DNS is identified, it's certainly easy to blacklist a domain or alert when traffic is seen, but that's back in the "known bad" area. What about an attacker's first activity? How easy is it for an attacker to change the domain they are using? What about activity spread over a broad time range? Threat hunting is required to answer these questions. This is true for other "living off the land" techniques as well.

 

For threat hunting to be successful, analysts need multiple data sources and they need them in one spot. Threat hunting involves a lot of fusing of weak signals to identify and remediate attacker activity. Back to DNS as an example: a threat hunter might identify a spike in activity for a new domain, or notice an increase in record entropy, or even get anomaly alerts from machine learning trained against her environment's DNS (ML is only as good as its training data, so be careful about getting sold ML trained by some 3rd party in an environment that's not yours, but that's a topic for another post). All of these are weak signals, but threat hunters, equipped with solid tools and management backing, can create and test hypotheses to find attackers and reduce the (appalling) average attacker dwell time.
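To make the "weak signals" idea a bit more concrete, here is a rough Python sketch of how a hunter might combine three of the DNS signals mentioned above: a never-before-seen domain, a burst of queries to that new domain, and high-entropy record names. Everything in it is an assumption for illustration: the function names, the 3.5 bits-per-character entropy threshold, the query-count threshold, and the naive "last two labels" way of finding the base domain are all made up for this example and would need tuning against your own environment. It is not how any particular product does it.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(text):
    """Shannon entropy of the characters in `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def split_name(query_name):
    """Split a query name into (subdomain, base_domain).

    Naive assumption: the base domain is the last two labels. Real code
    would consult the Public Suffix List instead.
    """
    labels = query_name.rstrip(".").lower().split(".")
    return ".".join(labels[:-2]), ".".join(labels[-2:])


def weak_signals(query_names, known_domains, entropy_bits=3.5, min_queries=3):
    """Map each base domain to the list of weak signals it triggered.

    `known_domains` is a baseline of domains already seen in this environment.
    Both thresholds are illustrative guesses, not recommended values.
    """
    subdomains = defaultdict(list)
    for name in query_names:
        sub, base = split_name(name)
        subdomains[base].append(sub)

    findings = {}
    for base, subs in subdomains.items():
        signals = []
        is_new = base not in known_domains
        if is_new:
            signals.append("new_domain")
        if is_new and len(subs) >= min_queries:
            signals.append("new_domain_spike")
        if max(shannon_entropy(s) for s in subs) >= entropy_bits:
            signals.append("high_entropy")
        if signals:
            findings[base] = signals
    return findings


if __name__ == "__main__":
    # Hypothetical query log using reserved example domains.
    queries = [
        "www.example.com",
        "mail.example.com",
        "k3j9x0q2vbn7zt5w1rym.example.net",
        "p8d2m4c6x1z9q7w5e3r0.example.net",
        "t6y1u8i3o5a2s9d4f7g0.example.net",
        "blog.example.org",
    ]
    findings = weak_signals(queries, known_domains={"example.com"})
    # Domains that trip the most signals go to the top of the hunter's list.
    for domain, signals in sorted(findings.items(), key=lambda kv: -len(kv[1])):
        print(domain, signals)
```

No single signal here is worth an alert on its own. A new domain is usually just a new domain. The point is that a domain tripping all three at once is a decent hypothesis to go test, and that only works if the baseline comes from your own data.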

 

As I talk to potential customers about their current tool usage, I hear plenty of frustration about an inability to conduct hunting activity or root-cause forensics, due to a lack of tooling that lets analysts really dig in and sculpt with their data. "Pre-fabbed" search and IOC-centric dashboarding aren't sufficient; they need threat hunting capability. Gravwell is a structure-on-read platform that operates on raw data records, which is fantastic for threat hunters because they never have to stop asking questions due to data limitations. Data Fusion takes that a step further as we help analysts correlate these weak signals, automate steps of threat hunting activity, and reap the benefits of defender's advantage. Combine all that with unlimited data ingestion pricing models and a platform that actually scales into the hundreds of terabytes of data per day, and it's not too daunting to "Know Thyself".

 

P.S. - This post started out as a tweet to add to a recent thread by Richard Bejtlich discussing hunting as IOC-free analysis, but it turned into a bit more.