---
append_help_link: true
slug: pattern-syntax
title: Rule pattern syntax
description: "Learn Semgrep's pattern syntax to search code for a given code pattern."
tags:
- Rule writing
---
# Rule pattern syntax
:::tip
Getting started with rule writing? Try the [Semgrep Tutorial](https://semgrep.dev/learn) 🎓
:::
This document describes Semgrep’s pattern syntax. You can also see pattern [examples by language](/writing-rules/pattern-examples). On the command line, patterns are specified with the flag `--pattern` (or `-e`). Multiple
coordinating patterns may be specified in a configuration file. See
[rule syntax](/writing-rules/rule-syntax) for more information.
## Pattern matching
Pattern matching searches code for a given pattern. For example, the
expression pattern `1 + func(42)` can match a full expression or be
part of a subexpression:
```python
foo(1 + func(42)) + bar()
```
In the same way, the statement pattern `return 42` can match a top
statement in a function or any nested statement:
```python
def foo(x):
    if x > 1:
        if x > 2:
            return 42
    return 42
```
## Ellipsis operator
The `...` ellipsis operator abstracts away a sequence of zero or more
items such as arguments, statements, parameters, fields, or characters.
The `...` ellipsis can also match any single item that is not part of
a sequence when the context allows it.
See the use cases in the subsections below.
### Function calls
Use the ellipsis operator to search for function calls or
function calls with specific arguments. For example, the pattern `insecure_function(...)` finds calls regardless of their arguments.
```python
insecure_function("MALICIOUS_STRING", arg1, arg2)
```
Functions and classes can be referenced by their fully qualified name, e.g.,
- `django.utils.safestring.mark_safe(...)` or `mark_safe(...)`
- `System.out.println(...)` or `println(...)`
You can also search for calls with arguments after a match. The pattern `func(1, ...)` will match both:
```python
func(1, "extra stuff", False)
func(1) # Matches no arguments as well
```
Or find calls with arguments before a match with `func(..., 1)`:
```python
func("extra stuff", False, 1)
func(1) # Matches no arguments as well
```
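As a rough mental model of how `...` behaves in an argument list, the following Python sketch matches a pattern list against a list of argument values. It is a toy illustration, not Semgrep's implementation; `ELLIPSIS` and `match_args` are hypothetical names, and keyword arguments are left out for brevity.

```python
ELLIPSIS = "..."  # hypothetical stand-in for the `...` operator


def match_args(pattern, args):
    """Return True if the argument list `args` fits `pattern`.

    Each pattern item is either ELLIPSIS (zero or more arguments)
    or a literal value that must equal exactly one argument.
    """
    if not pattern:
        # An empty pattern only fits an empty argument list.
        return not args
    head, rest = pattern[0], pattern[1:]
    if head == ELLIPSIS:
        # Let the ellipsis absorb 0, 1, 2, ... leading arguments.
        return any(match_args(rest, args[i:]) for i in range(len(args) + 1))
    return bool(args) and args[0] == head and match_args(rest, args[1:])


print(match_args([1, ELLIPSIS], [1, "extra stuff", False]))
print(match_args([ELLIPSIS, 1], ["extra stuff", False, 1]))
print(match_args([1, ELLIPSIS], [1]))
```

All three calls print `True`, mirroring the `func(1, ...)` and `func(..., 1)` examples, including the case where the ellipsis matches no arguments.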
The pattern `requests.get(..., verify=False, ...)` finds calls where an argument appears anywhere:
```python
requests.get(verify=False, url=URL)
requests.get(URL, verify=False, timeout=3)
requests.get(URL, verify=False)
```
Match the keyword argument value with the pattern `$FUNC(..., $KEY=$VALUE, ...)`.
### Method calls
The ellipsis operator can also be used to search for method calls.
For example, the pattern `$OBJECT.extractall(...)` matches:
```python
tarball.extractall('/path/to/directory') # Oops, potential arbitrary file overwrite
```
You can also use the ellipsis in chains of method calls. For example,
the pattern `$O.foo(). ... .bar()` will match:
```python
obj = MakeObject()
obj.foo().other_method(1,2).again(3,4).bar()
```
### Function definitions
The ellipsis operator can be used in function parameter lists or in the function
body. To find function definitions with [mutable default arguments](https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments):
```text
pattern: |
  def $FUNC(..., $ARG={}, ...):
    ...
```
```python
def parse_data(parser, data={}): # Oops, mutable default arguments
    pass
```
:::tip
The YAML `|` operator allows for [multiline strings](https://yaml-multiline.info/).
:::
The ellipsis operator can also stand in for the function name.
Use `...` in place of the name to match any function definition:
regular functions, methods, and anonymous functions (such as lambdas).
For example, in JavaScript the pattern `function ...($X) { ... }` matches
any function with one parameter:
```javascript
function foo(a) {
  return a;
}
var bar = function (a) {
  return a;
};
```
### Class definitions
The ellipsis operator can be used in class definitions. To find classes that
inherit from a certain parent:
```text
pattern: |
  class $CLASS(InsecureBaseClass):
    ...
```
```python
class DataRetriever(InsecureBaseClass):
    def __init__(self):
        pass
```
:::tip
The YAML `|` operator allows for [multiline strings](https://yaml-multiline.info/).
:::
#### Ellipsis operator scope
The `...` ellipsis operator matches everything in its current scope. The current scope of this operator is defined by the patterns that precede `...` in a rule. Consider a rule whose pattern opens the definition of a function `foo` followed by `...`, run against test code in which `bar` and `baz` appear twice: once inside the definition of `foo` and once after it.
Semgrep matches the first occurrence of `bar` and `baz` in the test code because these objects fall under the scope of `foo` and `...`. The ellipsis operator does not match the second occurrence of `bar` and `baz` because they are outside the function definition and therefore outside the scope of the ellipsis operator.
### Strings
The ellipsis operator can be used to search for strings containing any data. The pattern `crypto.set_secret_key("...")` matches:
```python
crypto.set_secret_key("HARDCODED SECRET")
```
This also works with [constant propagation](#constants).
In languages where regular expressions use a special syntax
(for example JavaScript), the pattern `/.../` will match
any regular expression construct:
```javascript
re1 = /foo|bar/;
re2 = /a.*b/;
```
### Binary operations
The ellipsis operator can match any number of arguments to binary operations. The pattern `$X = 1 + 2 + ...` matches:
```python
foo = 1 + 2 + 3 + 4
```
### Containers
The ellipsis operator can match inside container data structures like lists, arrays, and key-value stores.
The pattern `user_list = [..., 10]` matches:
```python
user_list = [8, 9, 10]
```
The pattern `user_dict = {...}` matches:
```python
user_dict = {'username': 'password'}
```
The pattern `user_dict = {..., $KEY: $VALUE, ...}` matches the following and allows for further metavariable queries:
```python
user_dict = {'username': 'password', 'address': 'zipcode'}
```
You can also match just a key-value pair in
a container, for example in JSON the pattern `"foo": $X` matches
just a single line in:
```json
{ "bar": true,
  "name": "self",
  "foo": 42
}
```
### Conditionals and loops
The ellipsis operator can be used inside conditionals or loops. The pattern:
```text
pattern: |
  if $CONDITION:
    ...
```
:::tip
The YAML `|` operator allows for [multiline strings](https://yaml-multiline.info/).
:::
matches:
```python
if can_make_request:
    check_status()
    make_request()
    return
```
A metavariable can match a conditional or loop body if the body statement information is re-used later. The pattern:
```text
pattern: |
  if $CONDITION:
    $BODY
```
matches:
```python
if can_make_request:
    single_request_statement()
```
:::tip
Half or partial statements can't be matched; both of the examples above must specify the contents of the condition’s body (e.g., `$BODY` or `...`), otherwise they are not valid patterns.
:::
### Matching single items with an ellipsis
Ellipsis `...` is generally used to match sequences of similar elements.
However, you can also match a single item using the ellipsis `...` operator.
The following pattern is valid in languages with a C-like
syntax even though `...` matches a single Boolean value rather
than a sequence:
```java
if (...)
  return 42;
```
Another example where a single expression is matched by an ellipsis is
the right-hand side of assignments:
```java
foo = ...;
```
However, matching a sequence of items remains the default meaning of an
ellipsis. For example, the pattern `bar(...)` matches `bar(a)`,
but also `bar(a, b)` and `bar()`. To force a match on a single item,
use a metavariable as in `bar($X)`.
## Metavariables
Metavariables are an abstraction to match code when you don’t know the value or contents ahead of time, similar to [capture groups](https://regexone.com/lesson/capturing_groups) in regular expressions.
Metavariables can be used to track values across a specific code scope. This
includes variables, functions, arguments, classes, object methods, imports,
exceptions, and more.
Metavariables look like `$X`, `$WIDGET`, or `$USERS_2`. They begin with a `$` and can only
contain uppercase characters, `_`, or digits. Names like `$x` or `$some_value` are invalid.
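The naming rule can be checked mechanically. The following Python sketch validates candidate names with a regular expression; `is_valid_metavariable` is a hypothetical helper, and the assumption that the first character after `$` is not a digit goes slightly beyond the rule as stated above.

```python
import re

# Assumed grammar: "$", then an uppercase letter or underscore, then any
# mix of uppercase letters, digits, and underscores.
METAVARIABLE_NAME = re.compile(r"\$[A-Z_][A-Z0-9_]*")


def is_valid_metavariable(name: str) -> bool:
    """Return True if `name` follows the metavariable naming rule."""
    return METAVARIABLE_NAME.fullmatch(name) is not None


for candidate in ["$X", "$WIDGET", "$USERS_2", "$x", "$some_value"]:
    print(candidate, is_valid_metavariable(candidate))
```

The first three names are accepted and the last two are rejected because they contain lowercase characters.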
### Expression metavariables
The pattern `$X + $Y` matches the following code examples:
```python
foo() + bar()
```
```python
current + total
```
### Import metavariables
Metavariables can also be used to match imports. For example, `import $X` matches:
```python
import random
```
### Reoccurring metavariables
Re-using metavariables shows their true power. Detect useless assignments:
```text
pattern: |
  $X = $Y
  $X = $Z
```
Useless assignment detected:
```python
initial_value = 10 # Oops, useless assignment
initial_value = get_initial_value()
```
:::tip
The YAML `|` operator allows for [multiline strings](https://yaml-multiline.info/).
:::
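To make the idea of a reoccurring metavariable concrete, here is a small sketch that uses Python's standard `ast` module instead of Semgrep. It reports two consecutive assignments to the same plain name, which is the situation the `$X = $Y` / `$X = $Z` pattern describes. The function names are hypothetical, and the sketch is narrower than the pattern: it only handles simple names and only looks at `body` blocks.

```python
import ast


def _single_target(stmt):
    """Return the assigned name for `name = value` statements, else None."""
    if (
        isinstance(stmt, ast.Assign)
        and len(stmt.targets) == 1
        and isinstance(stmt.targets[0], ast.Name)
    ):
        return stmt.targets[0].id
    return None


def find_useless_assignments(source):
    """Return (name, line) for each assignment overwritten by the next statement."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            continue  # for example, a lambda body is an expression, not a block
        for first, second in zip(body, body[1:]):
            name = _single_target(first)
            # Both statements must assign to the same name, like a reused $X.
            if name is not None and name == _single_target(second):
                findings.append((name, first.lineno))
    return findings


code = "initial_value = 10\ninitial_value = get_initial_value()\n"
print(find_useless_assignments(code))
```

Requiring the same name in both statements plays the role of `$X` appearing twice in the pattern.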
### Literal Metavariables
You can use `"$X"` to match any string literal. This is similar
to using `"..."`, but the content of the string is stored in the
metavariable `$X`, which can then be used in a message
or in a [`metavariable-regex`](/writing-rules/rule-syntax/#metavariable-regex).
You can also use `/$X/` and `:$X` to respectively match
any regular expressions or atoms (in languages that support
those constructs, e.g., Ruby).
:::info
Because literal metavariables bind to strings that may not be valid code, if you want to match them in more detail with a [`metavariable-pattern`](/writing-rules/rule-syntax/#metavariable-pattern), you must [specify `generic` language](/writing-rules/rule-syntax#metavariable-pattern-with-nested-language) inside the `metavariable-pattern`. For example:
```yaml
rules:
  - id: match-literal-string
    languages:
      - python
    severity: LOW
    message: Found "$STRING"
    patterns:
      - pattern: '"$STRING"'
      - metavariable-pattern:
          language: generic
          metavariable: $STRING
          pattern: "literal string contents"
```
:::
### Typed metavariables
#### Syntax
Typed metavariables only match a metavariable if it’s declared as a specific type.
##### Java:
For example, suppose we want to find calls to the `log` method on `Logger` objects.
A simple pattern for this purpose could use a metavariable for the `Logger` object.
```text
pattern: $LOGGER.log(...)
```
However, this pattern also matches calls to the `Math.log()` method. To exclude them, we can use a typed metavariable to put a type constraint on the `$LOGGER` metavariable.
```text
pattern: (java.util.logging.Logger $LOGGER).log(...)
```
Alternatively, to capture more logger types, such as custom logger types, we could instead constrain the type of the argument in this method call.
```text
pattern: $LOGGER.log(java.util.logging.LogRecord $RECORD)
```
##### C:
In this example in C, we want to capture all cases where something is compared to a char array.
We start with a simple pattern that looks for comparison between two variables.
```text
pattern: $X == $Y
```
We can then put a type constraint on one of the metavariables used in this pattern by turning it into a typed metavariable.
```text
pattern: $X == (char *$Y)
```
```c
int main() {
    char *a = "Hello";
    int b = 1;
    // Matched
    if (a == "world") {
        return 1;
    }
    // Not matched
    if (b == 2) {
        return -1;
    }
    return 0;
}
```
##### Go:
The syntax for a typed metavariable in Go looks different from the syntax for Java.
In this Go example we look for calls to the `Open` function, but only on an object of the `zip.Reader` type.
```text
pattern: |
  ($READER : *zip.Reader).Open($INPUT)
```
```go
func read_file(reader *zip.Reader, filename string) {
    // Matched
    reader.Open(filename)
    dir := http.Dir("/")
    // Not matched
    f, err := dir.Open(c.Param("file"))
}
```
:::caution
For Go, Semgrep currently does not recognize the type of all variables that are declared on the same line. That is, the following will not take both `a` and `b` as `int`s: `var a, b = 1, 2`
:::
##### TypeScript:
In this example, we want to look for calls to the `sanitize` method on `DomSanitizer` objects.
```text
pattern: ($X: DomSanitizer).sanitize(...)
```
```typescript
constructor(
  private _activatedRoute: ActivatedRoute,
  private sanitizer: DomSanitizer,
) { }
ngOnInit() {
  // Not matched
  this.sanitizer.bypassSecurityTrustHtml(DOMPurify.sanitize(this._activatedRoute.snapshot.queryParams['q']))
  // Matched
  this.sanitizer.bypassSecurityTrustHtml(this.sanitizer.sanitize(this._activatedRoute.snapshot.queryParams['q']))
}
```
#### Using typed metavariables
Type inference applies to the entire file! One common way to use typed metavariables is to check for a function called on a specific type of object. For example, let's say you're looking for calls to a potentially unsafe logger in a class like this:
```java
class Test {
    static Logger logger;
    public static void run_test(String input, int num) {
        logger.log("Running a test with " + input);
        test(input, Math.log(num));
    }
}
```
If you search for `$X.log(...)`, you also match `Math.log(num)`. Instead, you can search for `(Logger $X).log(...)`, which gives you only the call on `logger`. See the rule [`logger_search`](https://semgrep.dev/playground/s/lgAo).
:::caution
Since matching happens within a single file, this is only guaranteed to work for local variables and arguments. Additionally, Semgrep currently understands types on a shallow level. For example, if you have `int[] A`, it will not recognize `A[0]` as an integer. If you have a class with fields, you will not be able to use typechecking on field accesses, and it will not recognize the class’s field as the expected type. Literal types are understood to a limited extent. Expanded type support is under active development.
:::
### Ellipsis metavariables
You can combine ellipses and metavariables to match a sequence
of arguments and store the matched sequence in a metavariable.
For example the pattern `foo($...ARGS, 3, $...ARGS)` will
match:
```python
foo(1,2,3,1,2)
```
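The requirement that both occurrences of `$...ARGS` bind to the same sequence can be sketched in a few lines of Python. This is a toy model for this one pattern shape, not Semgrep's matcher, and `bind_ellipsis_metavariable` is a hypothetical name.

```python
def bind_ellipsis_metavariable(args, separator=3):
    """Model `foo($...ARGS, 3, $...ARGS)`: return the sequence bound to $...ARGS.

    Returns None when no split around `separator` gives equal sequences.
    """
    for i, value in enumerate(args):
        # The arguments before and after the separator must be identical.
        if value == separator and args[:i] == args[i + 1:]:
            return args[:i]
    return None


print(bind_ellipsis_metavariable([1, 2, 3, 1, 2]))
print(bind_ellipsis_metavariable([1, 2, 3, 1]))
```

The first call returns `[1, 2]`, the value bound to `$...ARGS`; the second returns `None` because the sequences on either side of `3` differ.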
When referencing an ellipsis metavariable in a rule message or [metavariable-pattern](/writing-rules/rule-syntax#metavariable-pattern), include the ellipsis:
```yaml
- message: Call to foo($...ARGS)
```
### Anonymous metavariables
Anonymous metavariables are used to specify that a metavariable exists in the pattern you want to capture.
An anonymous metavariable always takes the form `$_`. Variables such as `$_1` or `$_2` are **not** anonymous. You can use more than one anonymous metavariable in a rule definition.
For example, if you want to specify that a function should **always** have 3 arguments, then you can use anonymous metavariables:
```yaml
- pattern: def function($_, $_, $_)
```
An anonymous metavariable does not produce any binding to the code it matched. This means it does not enforce that it matches the same code at each place it is used. The pattern:
```yaml
- pattern: def function($A, $B, $C)
```
is not equivalent to the former example, as `$A`, `$B`, and `$C` bind to the code that matched the pattern. You can then use `$A` or any other metavariable in your rule definition to specify that specific code. Anonymous metavariables cannot be used this way.
Anonymous metavariables also communicate to the reader that their values are not relevant, but rather their occurrence in the pattern.
### Metavariable unification
For search mode rules, metavariables with the same name are treated as the same metavariable within the `patterns` operator. This is called metavariable unification.
For taint mode rules, patterns defined **within** `pattern-sinks` and `pattern-sources` still unify. However, metavariable unification **between** `pattern-sinks` and `pattern-sources` is **not** enabled by default.
To enforce unification, set `taint_unify_mvars: true` under the rule `options` key. When `taint_unify_mvars: true` is set, a metavariable defined in `pattern-sinks` and `pattern-sources` with the same name is treated as the same metavariable. See [Metavariables, rule message, and unification](/writing-rules/data-flow/taint-mode/advanced#metavariables-rule-messages-and-unification) for more information.
### Display matched metavariables in rule messages
To display the value of a matched metavariable in a rule message, add the metavariable to the message (for example, `Found $X`). Semgrep replaces it with the code that the metavariable matched.
1. Find the metavariable used in the Semgrep rule. See the following example of a part of a Semgrep rule (formula):
    ```yaml
    - pattern: $MODEL.set_password(...)
    ```
    This formula uses `$MODEL` as a metavariable.
2. Insert the metavariable into the rule message:
    ```yaml
    - message: Setting a password on $MODEL
    ```
3. Use the formula displayed above against the following code:
    ```python
    user.set_password(new_password)
    ```
The resulting message is:
```
Setting a password on user
```
:::info
If you're using Semgrep's advanced dataflow features, see documentation of experimental feature [Displaying propagated value of metavariable](/writing-rules/experiments/display-propagated-metavariable).
:::
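The substitution described in this section can be pictured with a short Python sketch. It is only an illustration of the idea, not how Semgrep renders messages; `render_message` and the `bindings` dictionary are hypothetical, and leaving unbound metavariables untouched is a choice made for this sketch.

```python
import re


def render_message(message, bindings):
    """Replace each bound metavariable in `message` with the code it matched."""

    def substitute(match):
        name = match.group(0)
        # Unbound metavariables are left as written (an assumption of this sketch).
        return bindings.get(name, name)

    return re.sub(r"\$[A-Z_][A-Z0-9_]*", substitute, message)


print(render_message("Setting a password on $MODEL", {"$MODEL": "user"}))
```

This prints `Setting a password on user`, the same message as in the steps above.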
## Equivalences
Semgrep automatically searches for code that is semantically equivalent.
### Imports
Equivalent imports using aliasing or submodules are matched.
The pattern `subprocess.Popen(...)` matches:
```python
from subprocess import Popen as sub_popen
sub_popen('ls')
```
The pattern `foo.bar.baz.qux(...)` matches:
```python
from foo.bar import baz
baz.qux()
```
### Constants
Semgrep performs constant propagation.
The pattern `set_password("password")` matches:
```python
HARDCODED_PASSWORD = "password"
def update_system():
    set_password(HARDCODED_PASSWORD)
```
Basic constant propagation support like in the example above is a stable feature.
Experimentally, Semgrep also supports [intraprocedural flow-sensitive constant propagation](/writing-rules/data-flow/constant-propagation).
The pattern `set_password("...")` also matches:
```python
def update_system():
    if cond():
        password = "abc"
    else:
        password = "123"
    set_password(password)
```
:::tip
It is possible to disable constant propagation on a per-rule basis via the [`options` rule field](/writing-rules/rule-syntax#options).
:::
### Associative and commutative operators
Semgrep performs associative-commutative (AC) matching. For example, `... && B && C` will match both `B && C` and `(A && B) && C` (i.e., `&&` is associative). Also, `A | B | C` will match `A | B | C`, and `B | C | A`, and `C | B | A`, and any other permutation (i.e., `|` is associative and commutative).
Under AC-matching, metavariables behave similarly to `...`. For example, `A | $X` can match `A | B | C` in three different ways (`$X` can bind to `B`, or `C`, or `B | C`). To avoid a combinatorial explosion, Semgrep only performs AC-matching with metavariables if the number of potential matches is _small_; otherwise it produces just one match (if possible) where each metavariable is bound to a single operand.
Using [`options`](/writing-rules/rule-syntax#options) it is possible to entirely disable AC-matching. It is also possible to treat Boolean AND and OR operators (e.g., `&&` and `||` in C-family languages) as commutative, which can be useful despite not being semantically accurate.
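The bindings in the `A | $X` example above can be enumerated with a short Python sketch. It models the operands of `|` as an unordered collection and lists every non-empty group of the remaining operands that a single metavariable could bind to. This is a toy model, and `ac_bindings` is a hypothetical name.

```python
from itertools import combinations


def ac_bindings(operands, fixed):
    """List the operand groups one metavariable could bind to next to `fixed`."""
    rest = [op for op in operands if op != fixed]
    return [
        " | ".join(group)
        for size in range(1, len(rest) + 1)
        for group in combinations(rest, size)
    ]


print(ac_bindings(["A", "B", "C"], "A"))
```

For `A | B | C` this yields `B`, `C`, and `B | C`. With `n` remaining operands there are `2**n - 1` such groups, which is the combinatorial growth that the limit on AC-matching is meant to avoid.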
## Deep expression operator
Use the deep expression operator `<... [your_pattern] ...>` to match an expression that could be deeply nested within another expression. An example is looking for a pattern anywhere within an `if` statement. The deep expression operator matches your pattern in the current expression context and recursively in any subexpressions.
For example, this pattern:
```yaml
pattern: |
  if <... $USER.is_admin() ...>:
    ...
```
matches:
```python
if user.authenticated() and user.is_admin() and user.has_group(gid):
    [ CONDITIONAL BODY ]
```
The deep expression operator works in:
- `if` statements: `if <... $X ...>:`
- nested calls: `sql.query(<... $X ...>)`
- operands of a binary expression: `"..." + <... $X ...>`
- any other expression context
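The "current expression and, recursively, any subexpression" behavior resembles a tree walk over the condition. The sketch below uses Python's standard `ast` module rather than Semgrep to find `if` statements whose condition contains a zero-argument method call at any depth. It is an analogy only, and `if_conditions_calling` is a hypothetical name.

```python
import ast


def if_conditions_calling(source, method):
    """Return line numbers of `if` statements whose condition calls `.method()`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.If):
            continue
        # Walk the whole condition, including nested subexpressions.
        for sub in ast.walk(node.test):
            if (
                isinstance(sub, ast.Call)
                and isinstance(sub.func, ast.Attribute)
                and sub.func.attr == method
                and not sub.args
                and not sub.keywords
            ):
                hits.append(node.lineno)
                break
    return hits


code = (
    "if user.authenticated() and user.is_admin() and user.has_group(gid):\n"
    "    pass\n"
)
print(if_conditions_calling(code, "is_admin"))
```

The call to `is_admin()` is found even though it is one operand of a larger `and` expression, which is what `<... $USER.is_admin() ...>` expresses in a pattern.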
## Limitations
### Statements types
Semgrep handles some statement types differently than others, particularly when searching for fragments inside statements. For example, the pattern `foo` will match these statements:
```python
x += foo()
return bar + foo
foo(1, 2)
```
But `foo` will not match the following statement (`import foo` will match it though):
```python
import foo
```
#### Statements as expressions
Many programming languages differentiate between expressions and statements. Expressions can appear inside `if` conditions, in function call arguments, and so on. Statements cannot appear everywhere; they are sequences of operations (in many languages using `;` as a separator or terminator) or special control flow constructs (`if`, `while`, and so on).
`foo()` is an expression (in most languages).
`foo();` is a statement (in most languages).
If your search pattern is a statement, Semgrep will automatically try to search for it as _both_ an expression and a statement.
When you write the expression `foo()` in a pattern, Semgrep will visit every expression and sub-expression in your program and try to find a match.
Many programmers don't really see the difference between `foo()` and `foo();`. This is why, when one looks for `foo()`, Semgrep assumes the user also wants to match statements like `a = foo();` or `print(foo());`.
:::info
Note that in some programming languages such as Python, which does not use semicolons as a separator or terminator, the difference between expressions and statements is even more confusing. Indentation in Python matters, and a newline after `foo()` is really the same as `foo();` in other programming languages such as C.
:::
### Partial expressions
Partial expressions are not valid patterns. For example, the following is invalid:
```text
pattern: 1+
```
A complete expression is needed (like `1 + $X`).
### Ellipses and statement blocks
The [ellipsis operator](#ellipsis-operator) does _not_ jump from inner to outer statement blocks.
For example, this pattern:
```text
foo()
...
bar()
```
matches:
```python
foo()
baz()
bar()
```
and also matches:
```python
foo()
baz()
if cond:
    bar()
```
but it does _not_ match:
```python
if cond:
    foo()
baz()
bar()
```
because `...` cannot jump from the inner block where `foo()` is, to the outer block where `bar()` is.
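The block rule can be modeled with Python's standard `ast` module. In the sketch below, a `foo()` statement only pairs with a `bar()` call that appears in a later statement of the same block, possibly nested inside that later statement. This is a simplified model built to mirror the three examples above, not Semgrep's algorithm, and all function names are hypothetical.

```python
import ast


def _is_call_to(node, name):
    """True for an expression statement that is a bare call like `name()`."""
    return (
        isinstance(node, ast.Expr)
        and isinstance(node.value, ast.Call)
        and isinstance(node.value.func, ast.Name)
        and node.value.func.id == name
    )


def _blocks(tree):
    """Yield every statement list (block) in the tree."""
    for node in ast.walk(tree):
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if isinstance(block, list):
                yield block


def foo_then_bar(source):
    """Model the pattern `foo()`, `...`, `bar()` under the block rule."""
    for block in _blocks(ast.parse(source)):
        for i, stmt in enumerate(block):
            if not _is_call_to(stmt, "foo"):
                continue
            # Only later statements of the SAME block are in scope; the
            # ellipsis may descend into them but never climbs outward.
            for later in block[i + 1:]:
                if any(_is_call_to(n, "bar") for n in ast.walk(later)):
                    return True
    return False


print(foo_then_bar("foo()\nbaz()\nif cond:\n    bar()\n"))
print(foo_then_bar("if cond:\n    foo()\nbaz()\nbar()\n"))
```

The first call prints `True` and the second prints `False`, because in the second snippet `foo()` sits in an inner block and `bar()` in the outer one.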
### Partial statements
Partial statements are partially supported. For example,
you can just match the header of a conditional with `if ($E)`,
or just the try part of an exception statement with `try { ... }`.
This is especially useful when used in a
[pattern-inside](/writing-rules/rule-syntax#pattern-inside) to restrict the
context in which to search for other things.
### Other partial constructs
It is possible to just match the header of a function (without its body),
for example `int foo(...)` to match just the header part of the
function `foo`. In the same way, you can just match a class header
(e.g., with `class $A`).
## Deprecated features
### String matching
:::warning
String matching has been deprecated. You should use [`metavariable-regex`](/writing-rules/rule-syntax#metavariable-regex) instead.
:::
Search string literals within code with [Perl Compatible Regular Expressions (PCRE)](https://learnxinyminutes.com/docs/pcre/).
The pattern `requests.get("=~/dev\./i")` matches:
```python
requests.get("api.dev.corp.com") # Oops, development API left in
```
To search for specific strings, use the syntax `"=~/<regexp>/"`. Advanced regexp features are available, such as case-insensitive regexps with `'/i'` (e.g., `"=~/foo/i"`). Matching occurs anywhere in the string unless the regexp `^` anchor character is used: `"=~/^foo.*/"` checks if a string begins with `foo`.
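The matching rules above (match anywhere unless anchored, optional case-insensitivity) can be tried out with Python's `re` module. Python's engine is not PCRE, but it behaves the same way for simple patterns like these. `string_matches` is a hypothetical helper, not part of Semgrep.

```python
import re


def string_matches(regexp, literal, ignore_case=False):
    """Return True if `regexp` matches anywhere in the string `literal`."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.search scans the whole string, so an unanchored pattern can match
    # anywhere; a leading ^ restricts the match to the start.
    return re.search(regexp, literal, flags) is not None


print(string_matches(r"dev\.", "api.dev.corp.com", ignore_case=True))
print(string_matches(r"^foo.*", "foobar"))
print(string_matches(r"^foo.*", "a foo"))
```

The first two calls print `True`; the third prints `False` because the `^` anchor requires the string to begin with `foo`.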