---
blurb: The FROM clause can contain a single table, a combination of multiple tables
  that are joined together, or another SELECT query inside a subquery node.
layout: docu
railroad: query_syntax/from.js
redirect_from:
- /docs/sql/query_syntax/from
title: FROM and JOIN Clauses
---

The `FROM` clause specifies the *source* of the data on which the remainder of the query should operate. Logically, the `FROM` clause is where the query starts execution. The `FROM` clause can contain a single table, a combination of multiple tables that are joined together using `JOIN` clauses, or another `SELECT` query inside a subquery node. DuckDB also has an optional `FROM`-first syntax which enables you to also query without a `SELECT` statement.

## Examples

Select all columns from the table called `table_name`:

```sql
SELECT *
FROM table_name;
```

Select all columns from the table using the `FROM`-first syntax:

```sql
FROM table_name
SELECT *;
```

Select all columns using the `FROM`-first syntax and omitting the `SELECT` clause:

```sql
FROM table_name;
```

Select all columns from the table called `table_name` through an alias `tn`:

```sql
SELECT tn.*
FROM table_name tn;
```

Use a prefix alias:

```sql
SELECT tn.*
FROM tn: table_name;
```

Select all columns from the table `table_name` in the schema `schema_name`:

```sql
SELECT *
FROM schema_name.table_name;
```

Select the column `i` from the table function `range`, where the first column of the range function is renamed to `i`:

```sql
SELECT t.i
FROM range(100) AS t(i);
```

Select all columns from the CSV file called `test.csv`:

```sql
SELECT *
FROM 'test.csv';
```

Select all columns from a subquery:

```sql
SELECT *
FROM (SELECT * FROM table_name);
```

Select the entire row of the table as a struct:

```sql
SELECT t
FROM t;
```

Select the entire row of the subquery as a struct (i.e., a single column):

```sql
SELECT t
FROM (SELECT unnest(generate_series(41, 43)) AS x, 'hello' AS y) t;
```

Join two tables together:

```sql
SELECT *
FROM table_name
JOIN other_table
  ON table_name.key = other_table.key;
```

Select a 10% sample from a table:

```sql
SELECT *
FROM table_name
TABLESAMPLE 10%;
```

Select a sample of 10 rows from a table:

```sql
SELECT *
FROM table_name
TABLESAMPLE 10 ROWS;
```

Use the `FROM`-first syntax with `WHERE` clause and aggregation:

```sql
FROM range(100) AS t(i)
SELECT sum(t.i)
WHERE i % 2 = 0;
```

## Joins

Joins are a fundamental relational operation used to connect two tables or relations horizontally.
The relations are referred to as the _left_ and _right_ sides of the join
based on how they are written in the join clause.
Each result row has the columns from both relations.

A join uses a rule to match pairs of rows from each relation.
Often this is a predicate, but there are other implied rules that may be specified.

### Outer Joins

Rows that do not have any matches can still be returned if an `OUTER` join is specified.
Outer joins can be one of:

* `LEFT` (All rows from the left relation appear at least once)
* `RIGHT` (All rows from the right relation appear at least once)
* `FULL` (All rows from both relations appear at least once)

A join that is not `OUTER` is `INNER` (only rows that get paired are returned).

When an unpaired row is returned, the attributes from the other table are set to `NULL`.

### Cross Product Joins (Cartesian Product)

The simplest type of join is a `CROSS JOIN`.
There are no conditions for this type of join,
and it just returns all the possible pairs.

Return all pairs of rows:

```sql
SELECT a.*, b.*
FROM a
CROSS JOIN b;
```

This is equivalent to omitting the `JOIN` clause:

```sql
SELECT a.*, b.*
FROM a, b;
```

### Conditional Joins

Most joins are specified by a predicate that connects
attributes from one side to attributes from the other side.
The conditions can be explicitly specified using an `ON` clause
with the join (clearer) or implied by the `WHERE` clause (old-fashioned).

We use the `l_regions` and the `l_nations` tables from the TPC-H schema:

```sql
CREATE TABLE l_regions (
    r_regionkey INTEGER NOT NULL PRIMARY KEY,
    r_name      CHAR(25) NOT NULL,
    r_comment   VARCHAR(152)
);

CREATE TABLE l_nations (
    n_nationkey INTEGER NOT NULL PRIMARY KEY,
    n_name      CHAR(25) NOT NULL,
    n_regionkey INTEGER NOT NULL,
    n_comment   VARCHAR(152),
    FOREIGN KEY (n_regionkey) REFERENCES l_regions(r_regionkey)
);
```

Return the regions for the nations:

```sql
SELECT n.*, r.*
FROM l_nations n
JOIN l_regions r ON (n_regionkey = r_regionkey);
```

If the column names are the same and are required to be equal,
then the simpler `USING` syntax can be used:

```sql
CREATE TABLE l_regions (regionkey INTEGER NOT NULL PRIMARY KEY,
                        name      CHAR(25) NOT NULL,
                        comment   VARCHAR(152));

CREATE TABLE l_nations (nationkey INTEGER NOT NULL PRIMARY KEY,
                        name      CHAR(25) NOT NULL,
                        regionkey INTEGER NOT NULL,
                        comment   VARCHAR(152),
                        FOREIGN KEY (regionkey) REFERENCES l_regions(regionkey));
```

Return the regions for the nations:

```sql
SELECT n.*, r.*
FROM l_nations n
JOIN l_regions r USING (regionkey);
```

The expressions do not have to be equalities – any predicate can be used:

Return the pairs of jobs where one ran longer but cost less:

```sql
SELECT s1.t_id, s2.t_id
FROM west s1, west s2
WHERE s1.time > s2.time
  AND s1.cost < s2.cost;
```

### Natural Joins

Natural joins join two tables based on attributes that share the same name.

For example, take the following example with cities, airport codes and airport names. Note that both tables are intentionally incomplete, i.e., they do not have a matching pair in the other table.

```sql
CREATE TABLE city_airport (city_name VARCHAR, iata VARCHAR);
CREATE TABLE airport_names (iata VARCHAR, airport_name VARCHAR);
INSERT INTO city_airport VALUES
    ('Amsterdam', 'AMS'),
    ('Rotterdam', 'RTM'),
    ('Eindhoven', 'EIN'),
    ('Groningen', 'GRQ');
INSERT INTO airport_names VALUES
    ('AMS', 'Amsterdam Airport Schiphol'),
    ('RTM', 'Rotterdam The Hague Airport'),
    ('MST', 'Maastricht Aachen Airport');
```

To join the tables on their shared [`IATA`](https://en.wikipedia.org/wiki/IATA_airport_code) attributes, run:

```sql
SELECT *
FROM city_airport
NATURAL JOIN airport_names;
```

This produces the following result:

| city_name | iata |        airport_name         |
|-----------|------|-----------------------------|
| Amsterdam | AMS  | Amsterdam Airport Schiphol  |
| Rotterdam | RTM  | Rotterdam The Hague Airport |

Note that only rows where the same `iata` attribute was present in both tables were included in the result.

We can also express query using the vanilla `JOIN` clause with the `USING` keyword:

```sql
SELECT *
FROM city_airport
JOIN airport_names
USING (iata);
```

### Semi and Anti Joins

Semi joins return rows from the left table that have at least one match in the right table.
Anti joins return rows from the left table that have _no_ matches in the right table.
When using a semi or anti join the result will never have more rows than the left hand side table.
Semi joins provide the same logic as the [`IN` operator]({% link docs/stable/sql/expressions/in.md %}) statement.
Anti joins provide the same logic as the `NOT IN` operator, except anti joins ignore `NULL` values from the right table.

#### Semi Join Example

Return a list of city–airport code pairs from the `city_airport` table where the airport name **is available** in the `airport_names` table:

```sql
SELECT *
FROM city_airport
SEMI JOIN airport_names
    USING (iata);
```

| city_name | iata |
|-----------|------|
| Amsterdam | AMS  |
| Rotterdam | RTM  |

This query is equivalent with:

```sql
SELECT *
FROM city_airport
WHERE iata IN (SELECT iata FROM airport_names);
```

#### Anti Join Example

Return a list of city–airport code pairs from the `city_airport` table where the airport name **is not available** in the `airport_names` table:

```sql
SELECT *
FROM city_airport
ANTI JOIN airport_names
    USING (iata);
```

| city_name | iata |
|-----------|------|
| Eindhoven | EIN  |
| Groningen | GRQ  |

This query is equivalent with:

```sql
SELECT *
FROM city_airport
WHERE iata NOT IN (SELECT iata FROM airport_names WHERE iata IS NOT NULL);
```

### Lateral Joins

The `LATERAL` keyword allows subqueries in the `FROM` clause to refer to previous subqueries. This feature is also known as a _lateral join_.

```sql
SELECT *
FROM range(3) t(i), LATERAL (SELECT i + 1) t2(j);
```

<div class="center_aligned_header_table"></div>

| i | j |
|--:|--:|
| 0 | 1 |
| 2 | 3 |
| 1 | 2 |

Lateral joins are a generalization of correlated subqueries, as they can return multiple values per input value rather than only a single value.

```sql
SELECT *
FROM
    generate_series(0, 1) t(i),
    LATERAL (SELECT i + 10 UNION ALL SELECT i + 100) t2(j);
```

<div class="center_aligned_header_table"></div>

| i |  j  |
|--:|----:|
| 0 | 10  |
| 1 | 11  |
| 0 | 100 |
| 1 | 101 |

It may be helpful to think about `LATERAL` as a loop where we iterate through the rows of the first subquery and use it as input to the second (`LATERAL`) subquery.
In the examples above, we iterate through table `t` and refer to its column `i` from the definition of table `t2`. The rows of `t2` form column `j` in the result.

It is possible to refer to multiple attributes from the `LATERAL` subquery. Using the table from the first example:

```sql
CREATE TABLE t1 AS
    SELECT *
    FROM range(3) t(i), LATERAL (SELECT i + 1) t2(j);

SELECT *
    FROM t1, LATERAL (SELECT i + j) t2(k)
    ORDER BY ALL;
```

<div class="center_aligned_header_table"></div>

| i | j | k |
|--:|--:|--:|
| 0 | 1 | 1 |
| 1 | 2 | 3 |
| 2 | 3 | 5 |

> DuckDB detects when `LATERAL` joins should be used, making the use of the `LATERAL` keyword optional.

### Positional Joins

When working with data frames or other embedded tables of the same size,
the rows may have a natural correspondence based on their physical order.
In scripting languages, this is easily expressed using a loop:

```cpp
for (i = 0; i < n; i++) {
    f(t1.a[i], t2.b[i]);
}
```

It is difficult to express this in standard SQL because
relational tables are not ordered, but imported tables such as [data frames]({% link docs/stable/clients/python/data_ingestion.md %}#pandas-dataframes-–-object-columns)
or disk files (like [CSVs]({% link docs/stable/data/csv/overview.md %}) or [Parquet files]({% link docs/stable/data/parquet/overview.md %})) do have a natural ordering.

Connecting them using this ordering is called a _positional join:_

```sql
CREATE TABLE t1 (x INTEGER);
CREATE TABLE t2 (s VARCHAR);

INSERT INTO t1 VALUES (1), (2), (3);
INSERT INTO t2 VALUES ('a'), ('b');

SELECT *
FROM t1
POSITIONAL JOIN t2;
```

<div class="center_aligned_header_table"></div>

| x |  s   |
|--:|------|
| 1 | a    |
| 2 | b    |
| 3 | NULL |

Positional joins are always `FULL OUTER` joins, i.e., missing values (the last values in the shorter column) are set to `NULL`.

### As-Of Joins

A common operation when working with temporal or similarly-ordered data
is to find the nearest (first) event in a reference table (such as prices).
This is called an _as-of join:_

Attach prices to stock trades:

```sql
SELECT t.*, p.price
FROM trades t
ASOF JOIN prices p
       ON t.symbol = p.symbol AND t.when >= p.when;
```

The `ASOF` join requires at least one inequality condition on the ordering field.
The inequality can be any inequality condition (`>=`, `>`, `<=`, `<`)
on any data type, but the most common form is `>=` on a temporal type.
Any other conditions must be equalities (or `NOT DISTINCT`).
This means that the left/right order of the tables is significant.

`ASOF` joins each left side row with at most one right side row.
It can be specified as an `OUTER` join to find unpaired rows
(e.g., trades without prices or prices which have no trades.)

Attach prices or NULLs to stock trades:

```sql
SELECT *
FROM trades t
ASOF LEFT JOIN prices p
            ON t.symbol = p.symbol
           AND t.when >= p.when;
```

`ASOF` joins can also specify join conditions on matching column names with the `USING` syntax,
but the *last* attribute in the list must be the inequality,
which will be greater than or equal to (`>=`):

```sql
SELECT *
FROM trades t
ASOF JOIN prices p USING (symbol, "when");
```

Returns symbol, trades.when, price (but NOT prices.when):

If you combine `USING` with a `SELECT *` like this,
the query will return the left side (probe) column values for the matches,
not the right side (build) column values.
To get the `prices` times in the example, you will need to list the columns explicitly:

```sql
SELECT t.symbol, t.when AS trade_when, p.when AS price_when, price
FROM trades t
ASOF LEFT JOIN prices p USING (symbol, "when");
```

### Self-Joins

DuckDB allows self-joins for all types of joins.
Note that tables need to be aliased, using the same table name without aliases will result in an error:

```sql
CREATE TABLE t (x int);
SELECT * FROM t JOIN t USING(x);
```

```console
Binder Error:
Duplicate alias "t" in query!
```

Adding the aliases allows the query to parse successfully:

```sql
SELECT * FROM t AS t t1 JOIN t t2 USING(x);
```

## `FROM`-First Syntax

DuckDB's SQL supports the `FROM`-first syntax, i.e., it allows putting the `FROM` clause before the `SELECT` clause or completely omitting the `SELECT` clause. We use the following example to demonstrate it:

```sql
CREATE TABLE tbl AS
    SELECT *
    FROM (VALUES ('a'), ('b')) t1(s), range(1, 3) t2(i);
```

### `FROM`-First Syntax with a `SELECT` Clause

The following statement demonstrates the use of the `FROM`-first syntax:

```sql
FROM tbl
SELECT i, s;
```

This is equivalent to:

```sql
SELECT i, s
FROM tbl;
```

<div class="center_aligned_header_table"></div>

| i | s |
|--:|---|
| 1 | a |
| 2 | a |
| 1 | b |
| 2 | b |

### `FROM`-First Syntax without a `SELECT` Clause

The following statement demonstrates the use of the optional `SELECT` clause:

```sql
FROM tbl;
```

This is equivalent to:

```sql
SELECT *
FROM tbl;
```

<div class="center_aligned_header_table"></div>

| s | i |
|---|--:|
| a | 1 |
| a | 2 |
| b | 1 |
| b | 2 |

## Syntax

<div id="rrdiagram"></div>