--- layout: docu title: Profiling --- Profiling is important to help understand why certain queries exhibit specific performance characteristics. DuckDB contains several built-in features to enable query profiling that will be explained on this page. For the examples on this page we will use the following example data set: ```sql CREATE TABLE students (sid INTEGER PRIMARY KEY, name VARCHAR); CREATE TABLE exams (cid INTEGER, sid INTEGER, grade INTEGER, PRIMARY KEY (cid, sid)); INSERT INTO students VALUES (1, 'Mark'), (2, 'Hannes'), (3, 'Pedro'); INSERT INTO exams VALUES (1, 1, 8), (1, 2, 8), (1, 3, 7), (2, 1, 9), (2, 2, 10); ``` ## `EXPLAIN` Statement The first step to profiling a database engine is figuring out what execution plan the engine is using. The `EXPLAIN` statement allows you to peek into the query plan and see what is going on under the hood. The `EXPLAIN` statement displays the physical plan, i.e., the query plan that will get executed. To demonstrate, see the below example: ```sql CREATE TABLE students (name VARCHAR, sid INTEGER); CREATE TABLE exams (eid INTEGER, subject VARCHAR, sid INTEGER); INSERT INTO students VALUES ('Mark', 1), ('Joe', 2), ('Matthew', 3); INSERT INTO exams VALUES (10, 'Physics', 1), (20, 'Chemistry', 2), (30, 'Literature', 3); EXPLAIN SELECT name FROM students JOIN exams USING (sid) WHERE name LIKE 'Ma%'; ``` ```text ┌─────────────────────────────┐ │┌───────────────────────────┐│ ││ Physical Plan ││ │└───────────────────────────┘│ └─────────────────────────────┘ ┌───────────────────────────┐ │ PROJECTION │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ name │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ HASH_JOIN │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ INNER │ │ sid = sid ├──────────────┐ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ │ EC: 1 │ │ └─────────────┬─────────────┘ │ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ │ SEQ_SCAN ││ FILTER │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ exams ││ prefix(name, 'Ma') │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ sid ││ EC: 1 │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ │ │ EC: 3 ││ │ └───────────────────────────┘└─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ SEQ_SCAN │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ students │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ sid │ │ name │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ Filters: name>=Ma AND name│ │ =Ma AND name│ │ This file is overwritten with each query that is issued. If you want to store the profile output for later it should be copied to a different file. Now let us run the query that we inspected before: ```sql SELECT name FROM students JOIN exams USING (sid) WHERE name LIKE 'Ma%'; ``` After the query is completed, the JSON file containing the profiling output has been written to the specified file. We can then render the query graph using the Python script, provided we have the `duckdb` python module installed. This script will generate a HTML file and open it in your web browser. ```batch python -m duckdb.query_graph /path/to/file.json ``` ## Notation in Query Plans In query plans, the [hash join](https://en.wikipedia.org/wiki/Hash_join) operators adhere to the following convention: the _probe side_ of the join is the left operand, while the _build side_ is the right operand. Join operators in the query plan show the join type used: * Inner joins are denoted as `INNER`. * Left outer joins and right outer joins are denoted as `LEFT` and `RIGHT`, respectively. * Full outer joins are denoted as `FULL`.