--- layout: docu redirect_from: - /docs/api/python/relational_api - /docs/api/python/relational_api/ - /docs/clients/python/relational_api title: Relational API --- The Relational API is an alternative API that can be used to incrementally construct queries. The API is centered around `DuckDBPyRelation` nodes. The relations can be seen as symbolic representations of SQL queries. They do not hold any data – and nothing is executed – until a method that triggers execution is called. ## Constructing Relations Relations can be created from SQL queries using the `duckdb.sql` method. Alternatively, they can be created from the various data ingestion methods (`read_parquet`, `read_csv`, `read_json`). For example, here we create a relation from a SQL query: ```python import duckdb rel = duckdb.sql("SELECT * FROM range(100_000) tbl(id)") rel.show() ``` ```text ┌────────────────────────┐ │ id │ │ int64 │ ├────────────────────────┤ │ 0 │ │ 1 │ │ 2 │ │ 3 │ │ 4 │ │ 5 │ │ 6 │ │ 7 │ │ 8 │ │ 9 │ │ · │ │ · │ │ · │ │ 9990 │ │ 9991 │ │ 9992 │ │ 9993 │ │ 9994 │ │ 9995 │ │ 9996 │ │ 9997 │ │ 9998 │ │ 9999 │ ├────────────────────────┤ │ ? rows │ │ (>9999 rows, 20 shown) │ └────────────────────────┘ ``` Note how we are constructing a relation that computes an immense amount of data (10B rows or 74 GB of data). The relation is constructed instantly – and we can even print the relation instantly. When printing a relation using `show` or displaying it in the terminal, the first `10K` rows are fetched. If there are more than `10K` rows, the output window will show `>9999 rows` (as the amount of rows in the relation is unknown). ## Data Ingestion Outside of SQL queries, the following methods are provided to construct relation objects from external data. * `from_arrow` * `from_df` * `read_csv` * `read_json` * `read_parquet` ## SQL Queries Relation objects can be queried through SQL through [replacement scans]({% link docs/stable/clients/c/replacement_scans.md %}). If you have a relation object stored in a variable, you can refer to that variable as if it was a SQL table (in the `FROM` clause). This allows you to incrementally build queries using relation objects. ```python import duckdb rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") duckdb.sql("SELECT sum(id) FROM rel").show() ``` ```text ┌──────────────┐ │ sum(id) │ │ int128 │ ├──────────────┤ │ 499999500000 │ └──────────────┘ ``` ## Operations There are a number of operations that can be performed on relations. These are all short-hand for running the SQL queries – and will return relations again themselves. ### `aggregate(expr, groups = {})` Apply an (optionally grouped) aggregate over the relation. The system will automatically group by any columns that are not aggregates. ```python import duckdb rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") rel.aggregate("id % 2 AS g, sum(id), min(id), max(id)") ``` ```text ┌───────┬──────────────┬─────────┬─────────┐ │ g │ sum(id) │ min(id) │ max(id) │ │ int64 │ int128 │ int64 │ int64 │ ├───────┼──────────────┼─────────┼─────────┤ │ 0 │ 249999500000 │ 0 │ 999998 │ │ 1 │ 250000000000 │ 1 │ 999999 │ └───────┴──────────────┴─────────┴─────────┘ ``` ### `except_(rel)` Select all rows in the first relation, that do not occur in the second relation. The relations must have the same number of columns. ```python import duckdb r1 = duckdb.sql("SELECT * FROM range(10) tbl(id)") r2 = duckdb.sql("SELECT * FROM range(5) tbl(id)") r1.except_(r2).show() ``` ```text ┌───────┐ │ id │ │ int64 │ ├───────┤ │ 5 │ │ 6 │ │ 7 │ │ 8 │ │ 9 │ └───────┘ ``` ### `filter(condition)` Apply the given condition to the relation, filtering any rows that do not satisfy the condition. ```python import duckdb rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") rel.filter("id > 5").limit(3).show() ``` ```text ┌───────┐ │ id │ │ int64 │ ├───────┤ │ 6 │ │ 7 │ │ 8 │ └───────┘ ``` ### `intersect(rel)` Select the intersection of two relations – returning all rows that occur in both relations. The relations must have the same number of columns. ```python import duckdb r1 = duckdb.sql("SELECT * FROM range(10) tbl(id)") r2 = duckdb.sql("SELECT * FROM range(5) tbl(id)") r1.intersect(r2).show() ``` ```text ┌───────┐ │ id │ │ int64 │ ├───────┤ │ 0 │ │ 1 │ │ 2 │ │ 3 │ │ 4 │ └───────┘ ``` ### `join(rel, condition, type = "inner")` Combine two relations, joining them based on the provided condition. ```python import duckdb r1 = duckdb.sql("SELECT * FROM range(5) tbl(id)").set_alias("r1") r2 = duckdb.sql("SELECT * FROM range(10, 15) tbl(id)").set_alias("r2") r1.join(r2, "r1.id + 10 = r2.id").show() ``` ```text ┌───────┬───────┐ │ id │ id │ │ int64 │ int64 │ ├───────┼───────┤ │ 0 │ 10 │ │ 1 │ 11 │ │ 2 │ 12 │ │ 3 │ 13 │ │ 4 │ 14 │ └───────┴───────┘ ``` ### `limit(n, offset = 0)` Select the first *n* rows, optionally offset by *offset*. ```python import duckdb rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") rel.limit(3).show() ``` ```text ┌───────┐ │ id │ │ int64 │ ├───────┤ │ 0 │ │ 1 │ │ 2 │ └───────┘ ``` ### `order(expr)` Sort the relation by the given set of expressions. ```python import duckdb rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") rel.order("id DESC").limit(3).show() ``` ```text ┌────────┐ │ id │ │ int64 │ ├────────┤ │ 999999 │ │ 999998 │ │ 999997 │ └────────┘ ``` ### `project(expr)` Apply the given expression to each row in the relation. ```python import duckdb rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") rel.project("id + 10 AS id_plus_ten").limit(3).show() ``` ```text ┌─────────────┐ │ id_plus_ten │ │ int64 │ ├─────────────┤ │ 10 │ │ 11 │ │ 12 │ └─────────────┘ ``` ### `union(rel)` Combine two relations, returning all rows in `r1` followed by all rows in `r2`. The relations must have the same number of columns. ```python import duckdb r1 = duckdb.sql("SELECT * FROM range(5) tbl(id)") r2 = duckdb.sql("SELECT * FROM range(10, 15) tbl(id)") r1.union(r2).show() ``` ```text ┌───────┐ │ id │ │ int64 │ ├───────┤ │ 0 │ │ 1 │ │ 2 │ │ 3 │ │ 4 │ │ 10 │ │ 11 │ │ 12 │ │ 13 │ │ 14 │ └───────┘ ``` ## Result Output The result of relations can be converted to various types of Python structures, see the [result conversion page]({% link docs/stable/clients/python/conversion.md %}) for more information. The result of relations can also be directly written to files using the below methods. * [`write_csv`]({% link docs/stable/clients/python/reference/index.md %}#duckdb.DuckDBPyRelation.write_csv) * [`write_parquet`]({% link docs/stable/clients/python/reference/index.md %}#duckdb.DuckDBPyRelation.write_parquet)