---
layout: docu
redirect_from:
- /docs/data/parquet
- /docs/data/parquet/
- /docs/extensions/parquet
- /docs/extensions/parquet/
- /docs/data/parquet/overview
title: Reading and Writing Parquet Files
---

## Examples

Read a single Parquet file:

```sql
SELECT * FROM 'test.parquet';
```

Figure out which columns/types are in a Parquet file:

```sql
DESCRIBE SELECT * FROM 'test.parquet';
```

Create a table from a Parquet file:

```sql
CREATE TABLE test AS
    SELECT * FROM 'test.parquet';
```

If the file does not end in `.parquet`, use the `read_parquet` function:

```sql
SELECT *
FROM read_parquet('test.parq');
```

Use a list parameter to read three Parquet files and treat them as a single table:

```sql
SELECT *
FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']);
```

Read all files that match the glob pattern:

```sql
SELECT *
FROM 'test/*.parquet';
```

Read all files that match the glob pattern, and include a `filename` column that specifies which file each row came from:

```sql
SELECT *
FROM read_parquet('test/*.parquet', filename = true);
```

Use a list of globs to read all Parquet files from two specific folders:

```sql
SELECT *
FROM read_parquet(['folder1/*.parquet', 'folder2/*.parquet']);
```

Read over HTTPS:

```sql
SELECT *
FROM read_parquet('https://some.url/some_file.parquet');
```

Query the [metadata of a Parquet file]({% link docs/stable/data/parquet/metadata.md %}#parquet-metadata):

```sql
SELECT *
FROM parquet_metadata('test.parquet');
```

Query the [file metadata of a Parquet file]({% link docs/stable/data/parquet/metadata.md %}#parquet-file-metadata):

```sql
SELECT *
FROM parquet_file_metadata('test.parquet');
```

Query the [key-value metadata of a Parquet file]({% link docs/stable/data/parquet/metadata.md %}#parquet-key-value-metadata):

```sql
SELECT *
FROM parquet_kv_metadata('test.parquet');
```

Query the [schema of a Parquet file]({% link docs/stable/data/parquet/metadata.md %}#parquet-schema):

```sql
SELECT *
FROM parquet_schema('test.parquet');
```

Write the results of a query to a Parquet file using the default compression (Snappy):

```sql
COPY
    (SELECT * FROM tbl)
    TO 'result-snappy.parquet'
    (FORMAT parquet);
```

Write the results from a query to a Parquet file with specific compression and row group size:

```sql
COPY
    (FROM generate_series(100_000))
    TO 'test.parquet'
    (FORMAT parquet, COMPRESSION zstd, ROW_GROUP_SIZE 100_000);
```

Export the table contents of the entire database as Parquet:

```sql
EXPORT DATABASE 'target_directory' (FORMAT parquet);
```

## Parquet Files

Parquet files are compressed columnar files that are efficient to load and process. DuckDB provides support for both reading and writing Parquet files in an efficient manner, as well as support for pushing filters and projections into the Parquet file scans.

> Parquet data sets differ based on the number of files, the size of individual files, the compression algorithm used, row group size, etc. These have a significant effect on performance. Please consult the [Performance Guide]({% link docs/stable/guides/performance/file_formats.md %}) for details.

## `read_parquet` Function

| Function | Description | Example |
|:--|:--|:-----|
| `read_parquet(path_or_list_of_paths)` | Read Parquet file(s)     | `SELECT * FROM read_parquet('test.parquet');` |
| `parquet_scan(path_or_list_of_paths)` | Alias for `read_parquet` | `SELECT * FROM parquet_scan('test.parquet');` |

If your file ends in `.parquet`, the function syntax is optional. The system will automatically infer that you are reading a Parquet file:

```sql
SELECT * FROM 'test.parquet';
```

Multiple files can be read at once by providing a glob or a list of files. Refer to the [multiple files section]({% link docs/stable/data/multiple_files/overview.md %}) for more information.

### Parameters

There are a number of options exposed that can be passed to the `read_parquet` function or the [`COPY` statement]({% link docs/stable/sql/statements/copy.md %}).

| Name | Description | Type | Default |
|:--|:-----|:-|:-|
| `binary_as_string` | Parquet files generated by legacy writers do not correctly set the `UTF8` flag for strings, causing string columns to be loaded as `BLOB` instead. Set this to true to load binary columns as strings. | `BOOL` | `false` |
| `encryption_config` | Configuration for [Parquet encryption]({% link docs/stable/data/parquet/encryption.md %}). | `STRUCT` | - |
| `filename` | Whether or not an extra `filename` column should be included in the result. | `BOOL` | `false` |
| `file_row_number` | Whether or not to include the `file_row_number` column. | `BOOL` | `false` |
| `hive_partitioning` | Whether or not to interpret the path as a [Hive partitioned path]({% link docs/stable/data/partitioning/hive_partitioning.md %}). | `BOOL` | `true` |
| `union_by_name` | Whether the columns of multiple schemas should be [unified by name]({% link docs/stable/data/multiple_files/combining_schemas.md %}), rather than by position. | `BOOL` | `false` |
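
Multiple parameters can be combined in a single `read_parquet` call. As a sketch (the glob is illustrative), the following reads a set of files with potentially differing schemas, unifying columns by name and recording which file each row came from:

```sql
SELECT *
FROM read_parquet(
    'test/*.parquet',
    union_by_name = true, -- unify the schemas of the files by column name
    filename = true       -- add a column with the source file of each row
);
```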

## Partial Reading

DuckDB supports projection pushdown into the Parquet file itself. That is, when querying a Parquet file, only the columns required for the query are read, so only the parts of the file you are interested in are fetched. DuckDB does this automatically.
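
For example, a query that selects a single column only reads that column's data from the file (the column name `id` below is hypothetical):

```sql
-- projection pushdown: only the 'id' column is read from the Parquet file
SELECT id
FROM 'test.parquet';
```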

DuckDB also supports filter pushdown into the Parquet reader. When you apply a filter to a column that is scanned from a Parquet file, the filter is pushed down into the scan and can even be used to skip parts of the file using the built-in zonemaps (row group min/max statistics). Note that this depends on whether your Parquet file contains such statistics.
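
As a sketch (the `n` column is hypothetical), the filter below is evaluated inside the Parquet scan, and row groups whose statistics exclude matching values can be skipped entirely:

```sql
-- filter pushdown: the predicate is applied during the Parquet scan,
-- allowing row groups to be skipped based on their min/max statistics
SELECT *
FROM 'test.parquet'
WHERE n > 100;
```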

Filter and projection pushdown provide significant performance benefits. See [our blog post “Querying Parquet with Precision Using DuckDB”]({% post_url 2021-06-25-querying-parquet %}) for more information.

## Inserts and Views

You can also insert the data into a table or create a table from the Parquet file directly. This will load the data from the Parquet file and insert it into the database.

Insert the data from the Parquet file into the table:

```sql
INSERT INTO people
    SELECT * FROM read_parquet('test.parquet');
```

Create a table directly from a Parquet file:

```sql
CREATE TABLE people AS
    SELECT * FROM read_parquet('test.parquet');
```

If you wish to keep the data stored inside the Parquet file, but want to query it directly, you can create a view over the `read_parquet` function. You can then query the Parquet file as if it were a built-in table.

Create a view over the Parquet file:

```sql
CREATE VIEW people AS
    SELECT * FROM read_parquet('test.parquet');
```

Query the Parquet file:

```sql
SELECT * FROM people;
```

## Writing to Parquet Files

DuckDB also has support for writing to Parquet files using the `COPY` statement syntax. See the [`COPY` Statement page]({% link docs/stable/sql/statements/copy.md %}) for details, including all possible parameters for the `COPY` statement.

Write a query to a Snappy-compressed Parquet file:

```sql
COPY
    (SELECT * FROM tbl)
    TO 'result-snappy.parquet'
    (FORMAT parquet);
```

Write `tbl` to a zstd-compressed Parquet file:

```sql
COPY tbl
    TO 'result-zstd.parquet'
    (FORMAT parquet, COMPRESSION zstd);
```

Write `tbl` to a zstd-compressed Parquet file with the lowest compression level yielding the fastest compression:

```sql
COPY tbl
    TO 'result-zstd.parquet'
    (FORMAT parquet, COMPRESSION zstd, COMPRESSION_LEVEL 1);
```

Write to a Parquet file with [key-value metadata]({% link docs/stable/data/parquet/metadata.md %}):

```sql
COPY (
    SELECT
        42 AS number,
        true AS is_even
) TO 'kv_metadata.parquet' (
    FORMAT parquet,
    KV_METADATA {
        number: 'Answer to life, universe, and everything',
        is_even: 'not ''odd''' -- single quotes in values must be escaped
    }
);
```

Write a CSV file to an uncompressed Parquet file:

```sql
COPY
    'test.csv'
    TO 'result-uncompressed.parquet'
    (FORMAT parquet, COMPRESSION uncompressed);
```

Write a query to a Parquet file with zstd compression and a custom row group size:

```sql
COPY
    (FROM generate_series(100_000))
    TO 'row-groups-zstd.parquet'
    (FORMAT parquet, COMPRESSION zstd, ROW_GROUP_SIZE 100_000);
```

Write data to an LZ4-compressed Parquet file:

```sql
COPY
    (FROM generate_series(100_000))
    TO 'result-lz4.parquet'
    (FORMAT parquet, COMPRESSION lz4);
```

Or, equivalently:

```sql
COPY
    (FROM generate_series(100_000))
    TO 'result-lz4.parquet'
    (FORMAT parquet, COMPRESSION lz4_raw);
```

Write data to a Brotli-compressed Parquet file:

```sql
COPY
    (FROM generate_series(100_000))
    TO 'result-brotli.parquet'
    (FORMAT parquet, COMPRESSION brotli);
```

DuckDB's `EXPORT` command can be used to export an entire database to a series of Parquet files. See the [“`EXPORT` statement” page]({% link docs/stable/sql/statements/export.md %}) for more details.

Export the table contents of the entire database as Parquet:

```sql
EXPORT DATABASE 'target_directory' (FORMAT parquet);
```

## Encryption

DuckDB supports reading and writing [encrypted Parquet files]({% link docs/stable/data/parquet/encryption.md %}).
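
A minimal sketch, assuming a 256-bit key registered under the (hypothetical) name `key256`; see the linked page for the full set of options:

```sql
-- register a 256-bit encryption key under a name (the key value is a placeholder)
PRAGMA add_parquet_key('key256', '0123456789112345678921234567893*');

-- write an encrypted Parquet file using that key for the footer
COPY tbl TO 'encrypted.parquet'
    (FORMAT parquet, ENCRYPTION_CONFIG {footer_key: 'key256'});

-- read it back, supplying the same key name
SELECT *
FROM read_parquet('encrypted.parquet', encryption_config = {footer_key: 'key256'});
```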

## Supported Features

The list of supported Parquet features is available in the [Parquet documentation's “Implementation status” page](https://parquet.apache.org/docs/file-format/implementationstatus/).

## Installing and Loading the Parquet Extension

Support for Parquet files is enabled via an extension. The `parquet` extension is bundled with almost all clients. However, if your client does not bundle it, the extension must be installed separately:

```sql
INSTALL parquet;
```
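
If your client does not load the extension automatically, it can be loaded explicitly:

```sql
LOAD parquet;
```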