Tips for KQL Data Sampling as part of Azure Sentinel Investigations

When you’re working against the data ingested in your Azure Sentinel Log Analytics workspace, you sometimes don’t know right away exactly where the data exists or even what data is available. For example, what if you simply want to figure out if ‘zoom.exe’ exists in your data store?

Often, someone has already created the query you’re looking for, so check with your colleagues and teammates and search blogs, GitHub, and other resources first. But where nothing exists, you’ll need to start building your own queries.

The process of building your KQL queries for Hunting and Investigations starts with data sampling. This lets you identify what data exists in the tables available to you through ingestion and understand where that data resides. Data sampling returns smaller result sets that you can review as you begin honing your queries to produce more specific data. Every journey starts with the first step.

So, how do you do this? Here are a few methods.

Limit, Take and Top Operators

The Limit and Take operators are synonyms. They retrieve and display an arbitrary set of records (which records you get is not guaranteed, and the selection is not truly random) and are a great way to test new queries or view data (data sampling).

Example:

SecurityEvent
| limit 10

The previous query reads the SecurityEvent table but returns only 10 arbitrary results, allowing you to sift through the rows and columns that come back.
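
If you need rows picked at random rather than whichever rows the engine happens to return, the sample operator is one option. A minimal sketch, assuming the same SecurityEvent table as above:

Example:

SecurityEvent
| where TimeGenerated > ago(1d)
| sample 10

The time filter is only there to keep the scan small and is an assumption, not a requirement; adjust or drop it to suit your workspace.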

The Top operator returns the first N records sorted by the specified columns.

Example:

SecurityEvent
| top 100 by TimeGenerated desc

In the case of the Top example above, this returns the 100 most recent records in the dataset, sorted in descending order by the time the data was generated.
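
As far as I know, Top behaves like a sort followed by a take, so the query above could also be sketched as shown here. Adding a project line (the column names below are assumptions about what you want to see) keeps the preview narrow:

Example:

SecurityEvent
| sort by TimeGenerated desc
| take 100
| project TimeGenerated, Computer, Account, EventID, Activity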

These operators are great for sampling data to determine the types of data you can find in specific tables. But what happens when, as in the zoom.exe example, you have no clue which table the data might exist in? That’s where Search comes in.

Search

The Search operator is my go-to data sampling tool. I use it quite a lot, and it is especially useful when you have no idea where a string of data might exist, particularly when you have a lot of different data sources connected and a multitude of tables listed in the Azure Sentinel schema tab.

Search is easy. The following example simply searches all known tables for the specific string, and the results show which tables it exists in.

search "zoom.exe"
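
When a string turns up in many tables, the raw results can be a lot to scroll through. One way to get a quick table-by-table tally, sketched here on the assumption that your search results include the $table column that the Search operator normally adds, is to summarize by it:

Example:

search "zoom.exe"
| summarize Hits = count() by $table
| sort by Hits desc

Keeping the time range picker short (or adding a TimeGenerated filter) helps here, since searching every table can be slow.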

You can also continue to hone your query using search by specifying a table to search in, and then limit the results much like with the Limit, Take, and Top operators. The format for that would look like this:

search in (SecurityEvent) "zoom.exe"
| take 10
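
Once Search has pointed you to a table, a natural next step is to move from a free-text search to a filter on a specific column. A minimal sketch, assuming the match was found in the Process column of SecurityEvent (your data may hold it in a different column, such as NewProcessName or CommandLine):

Example:

SecurityEvent
| where Process =~ "zoom.exe"   // =~ is a case-insensitive equality match
| project TimeGenerated, Computer, Account, Process, NewProcessName
| take 10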

The Eyeball

Here’s the fun part of data sampling. Built into Azure Sentinel (Log Analytics) is the ability to generate a very quick preview of data using the “eyeball.”

Hover over any table in the table list and an eyeball icon will pop up. Click the eyeball and you’ll get a quick data sample: a preview showing 10 results (limit 10) from the last 24 hours (| where TimeGenerated > ago(24h)).
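
Written out as a query you could run or adapt yourself, the preview described above would look roughly like this (using SecurityEvent as a stand-in for whichever table you clicked):

Example:

SecurityEvent
| where TimeGenerated > ago(24h)
| limit 10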

Once you have gone through the process of data sampling and identifying where and what data exists, you can start to hone your KQL queries to produce very specific results for digging deep into security matters in the environment.
