Introduction

The lookup module is a powerful tool for enriching data from static data sets. But did you know that it can also be used as a filter? When you use the “-s” (strict) or “-v” (invert) flags, you can omit any enrichment options, causing lookup to behave like a filter. With “-s”, any entry whose match value isn’t in the lookup data is dropped!

This quick blog post walks through an example of inverted filtering: finding IP addresses that appeared in our data last week but not this week. But first, let’s talk about inverted matching.

Inverted matching with lookup

The only thing you need to do to have lookup invert matches is add the “-v” flag. To see the difference, start with an ordinary strict lookup:

tag=data json ip | lookup -s -r data ip ip

This query uses lookup as a “filter”: only entries whose IP matches the lookup data are allowed through. We can invert this with “-v”, which causes lookup to drop entries that match the lookup data and pass all others:

tag=data json ip | lookup -v -s -r data ip ip
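To make the difference between the two queries concrete, here is a minimal Python sketch of the filtering behavior described above. It is a model of the semantics only, not how lookup is implemented: the function name `lookup_filter`, its parameters, and the sample IPs are all hypothetical, and it assumes every entry carries the field being matched.

```python
def lookup_filter(entries, resource_column, field, invert=False):
    """Model lookup used purely as a filter (no enrichment).

    entries: list of dicts standing in for pipeline entries.
    resource_column: values of the matched column in the lookup data.
    field: name of the value to match on (assumed present in every entry).
    invert: False models "-s"; True models "-v -s".
    """
    known = set(resource_column)
    kept = []
    for entry in entries:
        matched = entry[field] in known
        # Strict keeps only matches; inverted keeps only non-matches.
        if matched != invert:
            kept.append(entry)
    return kept


# Hypothetical sample data, not taken from the post's screenshots.
entries = [{"ip": "192.0.2.1"}, {"ip": "192.0.2.2"}, {"ip": "192.0.2.3"}]
resource_ips = ["192.0.2.1", "192.0.2.3"]

strict = lookup_filter(entries, resource_ips, "ip")
inverted = lookup_filter(entries, resource_ips, "ip", invert=True)

print([e["ip"] for e in strict])    # ['192.0.2.1', '192.0.2.3']
print([e["ip"] for e in inverted])  # ['192.0.2.2']
```

The two results are complements of each other: every entry lands in exactly one of them.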

With this in hand, let’s walk through our example.

Example: IPs we saw last week but not this week

Let’s pretend we have a dataset containing IP addresses that accessed some imaginary server, along with the time the access occurred.

[Image: last week’s IP accesses]

[Image: this week’s IP accesses]

This is a pretty small list, and we can tell by inspection that 10.25.209.9 and 10.25.211.1 are the only two IPs we saw last week that weren’t present this week. But at scale, comparing lists by eye quickly becomes impractical, so we want to automate it.

Our goal is to get a list of IPs (the two mentioned above in this example) that we saw last week, but not this week. We’ll accomplish this by building a lookup resource from this week’s data, and then using it in an inverted lookup in a second query run over last week’s data.

We’ll start by looking at the IPs that accessed the server this week, and we’ll use the “-save” option in the table module to create a resource for us:

[Image: query over this week’s IPs using table “-save”]

The table module simply creates a resource (named “ips_this_week”) containing a list of the IPs in the table.

Now we can go look at the list of IPs from last week:

[Image: the list of IPs from last week]

If we add an inverted lookup module to this, matching on the “ip” column, we’ll be asking lookup to drop any entries that match the list of IPs from this week, effectively leaving just the IPs not seen this week:

[Image: last week’s query with the inverted lookup added]

Voilà!
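The two-step workflow above can also be sketched in plain Python, purely to illustrate the logic. This is a model under stated assumptions, not the queries from the screenshots: the helper names `save_ip_resource` and `inverted_lookup` are hypothetical stand-ins for table “-save” and the inverted lookup, and apart from 10.25.209.9 and 10.25.211.1 the IPs are made up.

```python
def save_ip_resource(entries):
    """Stand-in for saving the table as a resource: the distinct IPs seen."""
    return {entry["ip"] for entry in entries}


def inverted_lookup(entries, resource):
    """Stand-in for the inverted lookup: drop entries whose IP is in the resource."""
    return [entry for entry in entries if entry["ip"] not in resource]


# Hypothetical access records. Only 10.25.209.9 and 10.25.211.1 come from
# the post; the other addresses are illustrative.
last_week = [
    {"ip": ip}
    for ip in ("10.25.209.9", "10.25.210.4", "10.25.211.1", "10.25.210.4")
]
this_week = [{"ip": ip} for ip in ("10.25.210.4", "10.25.212.7")]

# Step 1: build the resource from this week's data.
ips_this_week = save_ip_resource(this_week)

# Step 2: run over last week's data, dropping anything seen this week.
survivors = inverted_lookup(last_week, ips_this_week)
missing = sorted({entry["ip"] for entry in survivors})

print(missing)  # ['10.25.209.9', '10.25.211.1']
```

Note the direction of the comparison: 10.25.212.7 appears only this week, so it never shows up in the result. Swapping the two data sets would answer the opposite question, namely which IPs are new this week.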

Conclusion


When you have complex filtering requirements in your dataset, consider all the tools in the toolkit, including the lookup module, to get you to an answer quickly.