---
name: kql
description: "KQL language expertise for writing correct, efficient Kusto Query Language queries. Covers syntax gotchas, join patterns, dynamic types, datetime pitfalls, regex patterns, serialization, memory management, result-size discipline, and advanced functions (geo, vector, graph). USE THIS SKILL whenever writing, debugging, or reviewing KQL queries — even simple ones — because the gotchas section prevents the most common errors that waste tool calls and cause expensive retry cascades. Trigger on: KQL, Kusto, ADX, Azure Data Explorer, Fabric Real-Time Intelligence, EventHouse, Log Analytics, log analysis, data exploration, time series, anomaly detection, summarize, where clause, join, extend, project, let statement, parse operator, extract function, any mention of pipe-forward query syntax."
---

# KQL Mastery

> **Try it yourself**: All `✅` examples in this skill can be run against the public help cluster:
> `https://help.kusto.windows.net`, database `Samples` (contains `StormEvents`, `SimpleGraph_Nodes`/`Edges`, `nyc_taxi`, and more).

## 1. KQL Basics

Kusto Query Language (KQL) is a pipe-forward query language for exploring data. It is the native query language for Azure Data Explorer (ADX), Microsoft Fabric Real-Time Intelligence (EventHouse), Azure Monitor Log Analytics, Microsoft Sentinel, and other Microsoft data services.

### Pipe-forward syntax

KQL queries are a chain of operators separated by `|`. Data flows left to right:

```kql
StormEvents                          // start with a table
| where State == "TEXAS"             // filter rows
| summarize count() by EventType     // aggregate
| top 5 by count_ desc              // limit results
```

### Query vs management commands

KQL has two execution planes:

| Plane | Starts with | Examples |
|-------|-------------|----------|
| **Query** | Table name, `let`, `print`, `datatable` | `StormEvents \| where State == "TEXAS"` |
| **Management** | `.show`, `.create`, `.set`, `.drop`, `.alter` | `.show tables`, `.show table T schema` |

Management commands can be followed by query operators (the output is tabular), but the entire request runs on the management plane. You cannot start with a query and pipe into a management command.

```kql
// ✅ WORKS — management command piped to query operators
.show tables | project TableName | where TableName has "Events"

// ❌ WRONG — query piped into management command
StormEvents | take 5 | .show tables
```

When in doubt: if the first token starts with `.`, it's a management command. For a full catalog of schema exploration commands, see `references/discovery-queries.md`.

## 2. Dynamic Type Discipline

KQL's `dynamic` type is flexible but strict in certain contexts. A common mistake is using a dynamic column in `summarize by`, `order by`, or `join on` without casting.

**The rule**: Any time you use a dynamic-typed column in `by`, `on`, or `order by`, wrap it in an explicit cast.

```kql
// ❌ ERROR: "Summarize group key ... is of a 'dynamic' type"
StormEvents | summarize count() by StormSummary.Details.Location

// ✅ FIX
StormEvents | summarize count() by tostring(StormSummary.Details.Location)
```

```kql
// ❌ ERROR: "order operator: key can't be of dynamic type"
StormEvents | order by StormSummary.TotalDamages desc

// ✅ FIX
StormEvents | order by tolong(StormSummary.TotalDamages) desc
```

```kql
// ❌ ERROR in join: dynamic join key
StormEvents | join kind=inner (PopulationData) on $left.StormSummary == $right.State

// ✅ FIX — cast both sides
StormEvents
| extend State_str = tostring(StormSummary.Details.Location)
| join kind=inner (PopulationData) on $left.State_str == $right.State
```

**Self-correction**: When you see "is of a 'dynamic' type" in an error, add `tostring()`, `tolong()`, or `todouble()`.

## 3. Join Patterns & Pitfalls

KQL joins have constraints that differ from SQL.

### Equality only
KQL join conditions support **only `==`**. No `<`, `>`, `!=`, or function calls in join predicates.

```kql
// ❌ ERROR: "Only equality is allowed in this context"
StormEvents | join (nyc_taxi) on geo_distance_2points(BeginLon, BeginLat, pickup_longitude, pickup_latitude) < 1000

// ✅ WORKAROUND — pre-bucket into spatial cells, then join on cell ID
StormEvents
| extend cell = geo_point_to_s2cell(BeginLon, BeginLat, 8)
| join kind=inner (nyc_taxi | extend cell = geo_point_to_s2cell(pickup_longitude, pickup_latitude, 8)) on cell
```

For range joins, pre-bin values: `| extend bin_val = bin(Value, 100)`, then join on `bin_val`. Note: values near bin boundaries may land in adjacent bins — consider checking neighboring bins or overlapping the range for precision.

### Left/right attribute matching
Both sides of a join `on` clause must reference **column entities only** — not expressions, not aggregates.

```kql
// ❌ ERROR: "for each left attribute, right attribute should be selected"
StormEvents | join kind=inner (PopulationData) on $left.State

// ✅ FIX — specify both sides explicitly
StormEvents | join kind=inner (PopulationData) on $left.State == $right.State
```

### Cardinality check before large joins
**Always** check cardinality before joining tables with >10K rows. A cross-join explosion was the source of the single `E_RUNAWAY_QUERY` error (25K × 195 = potential 4.8M rows).

```kql
// Before joining, check how many rows each side contributes
StormEvents | summarize dcount(State)        // → 67 distinct states
PopulationData | summarize dcount(State)     // → 52 — safe to join
```

## 4. Regex in KQL

KQL handles regex natively — no need for Python.

### The `extract_all` gotcha
Unlike Python's `re.findall()`, KQL's `extract_all` **requires capturing groups** in the regex:

```kql
// ❌ ERROR: "extractall(): argument 2 must be a valid regex with [1..16] matching groups"
StormEvents | extend words = extract_all(@"[a-zA-Z]{3,}", EventNarrative)

// ✅ FIX — add parentheses around the pattern
StormEvents | extend words = extract_all(@"([a-zA-Z]{3,})", EventNarrative)
```

### Regex toolkit — don't fall back to Python
| Function | Use case | Example |
|----------|----------|---------|
| `extract(regex, group, source)` | Single match | `extract(@"User '([^']+)'", 1, Msg)` |
| `extract_all(regex, source)` | All matches (needs `()`) | `extract_all(@"(\w+)", Text)` |
| `parse` | Structured extraction | `parse Msg with * "User '" Sender "' sent" *` |
| `matches regex` | Boolean filter | `where Url matches regex @"^https?://"` |
| `replace_regex` | Find and replace | `replace_regex(Text, @"\s+", " ")` |

## 5. Serialization Requirements

Window functions need serialized (ordered) input.

```kql
// ❌ ERROR: "Function 'row_cumsum' cannot be invoked. The row set must be serialized."
StormEvents
| where State == "TEXAS"
| summarize DailyCount = count() by bin(StartTime, 1d)
| extend CumulativeCount = row_cumsum(DailyCount)

// ✅ FIX — add | serialize (or | order by, which implicitly serializes)
StormEvents
| where State == "TEXAS"
| summarize DailyCount = count() by bin(StartTime, 1d)
| order by StartTime asc
| extend CumulativeCount = row_cumsum(DailyCount)
```

Functions requiring serialization: `row_number()`, `row_cumsum()`, `prev()`, `next()`, `row_window_session()`.

## 6. Memory-Safe Query Patterns

The most common memory error. Caused by scanning too much data without pre-filtering.

### The progression of safety
```
Safest ──────────────────────────────────────────────── Most dangerous
| count    | take 10    | where + summarize    | summarize (no filter)    | full scan
```

### Rules for large tables (>1M rows)

1. **Always start with `| count`** to understand table size
2. **Always `| where` before `| summarize`** — filter time range, partition key, or category first
3. **Never `dcount()` on high-cardinality columns** without pre-filtering
4. **Check join cardinality** before executing (see Section 3)
5. **Use `materialize()`** for subqueries referenced multiple times

```kql
// ❌ OUT OF MEMORY — large table, no filter, many group-by columns
StormEvents
| summarize dcount(EventType), count() by StartTime, State, Source
| where dcount_EventType > 1

// ✅ SAFE — filter first, then aggregate
StormEvents
| where StartTime between (datetime(2007-04-15) .. datetime(2007-04-16))
| summarize dcount(EventType) by State, Source
| where dcount_EventType > 1
```

### When you see `E_LOW_MEMORY_CONDITION`
The query touched too much data. Your options:
- Add `| where` filters (time range, partition key)
- Reduce the number of `by` columns in `summarize`
- Break into smaller time windows and union results
- Use `| sample 10000` for exploratory work instead of full scans

### When you see `E_RUNAWAY_QUERY`
A join or aggregation produced too many output rows. Check join cardinality — one or both sides is too large.

## 7. Result Size Discipline

Large results slow down analysis. Prevention:

| Query type | Safeguard |
|-----------|-----------|
| Exploratory | Always end with `\| take 10` or `\| take 20` |
| Aggregation | Use `\| top 20 by ...` not unbounded `summarize` |
| Wide rows (vectors, JSON) | `\| project` only needed columns |
| `make_list()` / `make_set()` | Avoid on high-cardinality groups (produces huge cells) |
| Unknown size | Run `\| count` first |

**The vector trap**: Tables with embedding columns (1536-dim float arrays) produce ~30KB per row. Even `| take 20` yields 600KB. Always `| project` away vector columns unless you specifically need them.

## 8. String Comparison Strictness

KQL sometimes requires explicit casts when comparing computed string values — even when both sides are already strings.

```kql
// ❌ ERROR: "Cannot compare values of types string and string. Try adding explicit casts"
StormEvents | where geo_point_to_s2cell(BeginLon, BeginLat, 16) == other_cell

// ✅ FIX — wrap both sides in tostring()
StormEvents | where tostring(geo_point_to_s2cell(BeginLon, BeginLat, 16)) == tostring(other_cell)
```

This is most common with computed values from `geo_point_to_s2cell()` and `strcat()` comparisons. When in doubt, cast with `tostring()`.

## 9. Advanced Functions

KQL handles these natively — no need for Python:

### Vector similarity
```kql
// try it! — cosine similarity on Iris feature vectors
let target = pack_array(5.1, 3.5, 1.4, 0.2);
Iris
| extend Vec = pack_array(SepalLength, SepalWidth, PetalLength, PetalWidth)
| extend sim = series_cosine_similarity(Vec, target)
| top 5 by sim desc
```

### Geo operations
```kql
// Distance between two points (meters)
StormEvents | extend dist = geo_distance_2points(BeginLon, BeginLat, EndLon, EndLat)

// Spatial bucketing for joins
StormEvents | extend cell = geo_point_to_s2cell(BeginLon, BeginLat, 8)
```

### Graph queries
```kql
// Persistent graph model — try it on the help cluster!
graph("Simple")
| graph-match (src)-[e*1..3]->(dst)
  where src.name == "Alice"
  project src.name, dst.name, path_length = array_length(e)

// Transient graph — build inline with make-graph
SimpleGraph_Edges
| make-graph source --> target with SimpleGraph_Nodes on id
| graph-match (src)-[e*1..5]->(dst)
  where src.name == "Alice"
  project src.name, dst.name, path_length = array_length(e)
```

### Time series
```kql
// try it! — create a time series and detect anomalies
StormEvents
| make-series count() default=0 on StartTime step 1d
| extend anomalies = series_decompose_anomalies(count_)
```

For detailed examples and patterns, consult `references/advanced-patterns.md`.

## 10. Self-Correction Lookup Table

When you encounter an error, look it up here before retrying:

| Error message contains | Likely cause | Fix |
|---|---|---|
| `is of a 'dynamic' type` | Dynamic column in `by`/`on`/`order by` | Wrap in `tostring()`/`tolong()` |
| `Only equality is allowed` | Range predicate in join condition | Pre-bucket with S2/H3 cells or `bin()` |
| `extractall(): matching groups` | Missing `()` in regex | Add `()`: `@"(\w+)"` not `@"\w+"` |
| `row set must be serialized` | Window function on unsorted data | Add `\| serialize` or `\| order by` before it |
| `Cannot compare values of types string and string` | Computed string comparison | Add `tostring()` on both sides |
| `Failed to resolve column named 'X'` | Wrong column name or wrong table | Run `.show table T schema` to check column names |
| `E_LOW_MEMORY_CONDITION` | Query touched too much data | Add `\| where` filters, reduce time range, break into steps |
| `E_RUNAWAY_QUERY` | Join/aggregation produced too many rows | Check cardinality before joining; add pre-filters |
| `for each left attribute, right attribute` | Join `on` clause incomplete | Use explicit form: `on $left.X == $right.Y` |
| `needs to be bracketed` | Reserved word used as identifier | Use `['keyword']` syntax |
| `plugin doesn't exist` | Unavailable plugin on this cluster | Fall back to equivalent function or Python |
| `Expected string literal in datetime()` | Bare integer in datetime literal | Use `datetime(2024-01-01)` not `datetime(2024)` |
| `Unexpected token` after `by` | Complex expression in summarize by-clause | `extend` the expression first, then `summarize by` the column |
| `not recognized` / `unknown operator` | Operator not available on this engine | Check operator support; try equivalent (`order by` = `sort by`) |

## 11. Datetime Pitfalls

Datetime literals are a common source of errors. A wrong literal format can cascade into completely different approaches instead of fixing the small issue.

### Literal format
```kql
// ❌ WRONG — bare year is not a valid datetime
StormEvents | where StartTime > datetime(2007)

// ✅ RIGHT — always use full date format
StormEvents | where StartTime > datetime(2007-01-01)
```

### Filtering by year, month, or hour
```kql
// ❌ WRONG — comparing datetime column to integer
StormEvents | where StartTime == 2007

// ✅ RIGHT — use datetime_part() to extract components
StormEvents | where datetime_part("year", StartTime) == 2007

// ✅ ALSO RIGHT — use between with datetime range
StormEvents | where StartTime between (datetime(2007-01-01) .. datetime(2007-12-31T23:59:59))
```

### Time bucketing in summarize
```kql
// This works, but can be harder to read and reuse in complex queries
StormEvents | summarize count() by startofmonth(StartTime)

// Clearer — extend first, then summarize by the computed column
StormEvents
| extend Month = startofmonth(StartTime)
| summarize count() by Month
| order by Month asc
```

### Useful datetime functions
| Function | Purpose | Example |
|----------|---------|---------|
| `bin(ts, 1h)` | Round down to bucket boundary | `bin(Timestamp, 1d)` |
| `startofmonth(ts)` | First day of month | `startofmonth(Timestamp)` |
| `datetime_part("hour", ts)` | Extract component | `datetime_part("year", Timestamp)` |
| `format_datetime(ts, fmt)` | Format as string | `format_datetime(Timestamp, "yyyy-MM")` |
| `ago(1d)` | Relative time | `where Timestamp > ago(1d)` |
| `between(a .. b)` | Range filter (inclusive) | `where Timestamp between (datetime(2024-01-01) .. datetime(2024-01-31T23:59:59))` |
| `todatetime(str)` | Parse string → datetime | `todatetime("2024-01-15T10:30:00Z")` |
| `totimespan(str)` | Parse string → timespan | `totimespan("01:30:00")` |

## 12. Operator Naming & Equality

KQL has subtle differences from SQL syntax.

### Naming conventions

| Entity | Convention | Example |
|--------|-----------|---------|
| Tables | UpperCamelCase | `StormEvents`, `NetworkLogs` |
| Columns | UpperCamelCase | `StartTime`, `EventType` |
| Variables (`let`) | snake_case | `let filtered_events = ...` |
| Built-in functions | snake_case | `format_bytes()`, `geo_distance_2points()` |
| Stored functions | UpperCamelCase | `.create function GetTopUsers` |

### Equality operators
```kql
// In where clauses, == is case-sensitive, =~ is case-insensitive
StormEvents | where State == "TEXAS" | count        // exact match
StormEvents | where State =~ "texas" | count        // case-insensitive

// In joins, use == only
StormEvents | join kind=inner (PopulationData) on State
```

### sort vs order
Both `sort by` and `order by` work identically in KQL — they are aliases. Use whichever you prefer, but be consistent.

### contains vs has
```kql
// contains: substring match (slower)
StormEvents | where EventNarrative contains "tree"   // finds "trees", "treetop" too

// has: term/word match (faster, uses index)
StormEvents | where EventNarrative has "tree"        // matches word boundaries only

// For exact prefix/suffix
StormEvents | where EventType startswith "Thunder"
StormEvents | where Source endswith "Spotter"
```

## 13. Error Recovery Strategy

When a first KQL query fails, the temptation is to abandon the entire approach and try something completely different. The correct response is almost always to **fix the specific error**, not change strategy.

### The pattern to avoid
```
Query 1: extract(@"pattern", 1, col)  → Parse error
Query 2: todynamic(col)               → Different error  
Query 3: parse_json(col)              → Another error
Query 4: Python script                → Works but 10x tokens
```

### The correct pattern
```
Query 1: extract(@"pattern", 1, col)  → Parse error (bad escaping)
Query 2: extract(@"pattern", 1, col)  → Fix the specific escaping issue → Success
```

**Rules for error recovery:**
1. Read the error message carefully — it almost always tells you exactly what's wrong
2. Fix the **specific** syntax/escaping issue, don't switch approaches
3. Use the self-correction table (Section 10) to map errors to fixes
4. Only switch approaches after 2 failed fixes of the same query
5. The `parse` operator is often simpler than `extract()` for structured text:

```kql
// Instead of complex regex on TraceLogs:
// extract(@"file path: \"\"([^\"]+)\"\"", 1, Message)

// Use parse for structured extraction (try it on help cluster, SampleLogs db):
cluster("help").database("SampleLogs").TraceLogs
| where Message has "file path"
| parse Message with * "file path: \"\"" FilePath "\"\"" *
| project Timestamp, FilePath
| take 5
```

## 14. Query Writing Checklist

Before running any KQL query, mentally check:

1. **Pre-filtered?** Large tables have a `| where` before any `| summarize`
2. **Result bounded?** Exploratory queries end with `| take N` or `| top N`
3. **Dynamic columns cast?** Any dynamic column in `by`/`on`/`order by` is wrapped
4. **Regex has groups?** `extract_all` patterns have `()` around what you want to capture
5. **Join cardinality safe?** Both sides checked with `dcount()` before joining
6. **Needed columns only?** Wide tables get `| project` to drop unneeded columns
7. **Datetime literals valid?** Using `datetime(2024-01-01)` not `datetime(2024)` or bare integers
8. **Complex by-expressions?** Use `| extend` first, then `| summarize by` the computed column
9. **Error recovery plan?** If a query fails, fix the specific error — don't change strategy