---
layout: docu
title: Text Functions
---

<!-- markdownlint-disable MD001 -->

## Text Functions and Operators

This section describes functions and operators for examining and manipulating [`STRING` values]({% link docs/1.3/sql/data_types/text.md %}).

<!-- Start of section generated by scripts/generate_sql_function_docs.py; categories: [string, regex] -->
<!-- markdownlint-disable MD056 -->

| Name | Description |
|:--|:-------|
| [`string[index]`](#stringindex) | Extracts a single character using a (1-based) index. |
| [`string[begin:end]`](#stringbeginend) | Extracts a string using [slice conventions]({% link docs/1.3/sql/functions/list.md %}#slicing) similar to Python. Missing `begin` or `end` arguments are interpreted as the beginning or end of the list respectively. Negative values are accepted. |
| [`string LIKE target`](#string-like-target) | Returns `true` if the `string` matches the like specifier (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})). |
| [`string SIMILAR TO regex`](#string-similar-to-regex) | Returns `true` if the `string` matches the `regex` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})). |
| [`string ^@ search_string`](#string--search_string) | Returns `true` if `string` begins with `search_string`. |
| [`arg1 || arg2`](#arg1--arg2) | Concatenates two strings, lists, or blobs. Any `NULL` input results in `NULL`. See also [`concat(arg1, arg2, ...)`](#concatvalue-) and [`list_concat(list1, list2)`]({% link docs/1.3/sql/functions/list.md %}#list_concatlist1-list2). |
| [`array_extract(string, index)`](#array_extractstring-index) | Extracts a single character from a `string` using a (1-based) `index`. |
| [`array_slice(list, begin, end)`](#array_slicelist-begin-end) | Extracts a sublist or substring using [slice conventions]({% link docs/1.3/sql/functions/list.md %}#slicing). Negative values are accepted. |
| [`ascii(string)`](#asciistring) | Returns an integer that represents the Unicode code point of the first character of the `string`. |
| [`bar(x, min, max[, width])`](#barx-min-max-width) | Draws a band whose width is proportional to (`x - min`) and equal to `width` characters when `x` = `max`. `width` defaults to 80. |
| [`base64(blob)`](#base64blob) | Converts a `blob` to a base64 encoded string. |
| [`bin(string)`](#binstring) | Converts the `string` to binary representation. |
| [`bit_length(string)`](#bit_lengthstring) | Number of bits in a `string`. |
| [`char_length(string)`](#char_lengthstring) | Number of characters in `string`. |
| [`character_length(string)`](#character_lengthstring) | Number of characters in `string`. |
| [`chr(code_point)`](#chrcode_point) | Returns a character which is corresponding the ASCII code value or Unicode code point. |
| [`concat(value, ...)`](#concatvalue-) | Concatenates multiple strings, lists, or blobs. `NULL` inputs are skipped. See also [operator `||`](#arg1--arg2). |
| [`concat_ws(separator, string, ...)`](#concat_wsseparator-string-) | Concatenates many strings, separated by `separator`. `NULL` inputs are skipped. |
| [`contains(string, search_string)`](#containsstring-search_string) | Returns `true` if `search_string` is found within `string`. Note that [collations]({% link docs/1.3/sql/expressions/collations.md %}) are not supported. |
| [`ends_with(string, search_string)`](#ends_withstring-search_string) | Returns `true` if `string` ends with `search_string`. |
| [`format(format, ...)`](#formatformat-) | Formats a string using the [fmt syntax](#fmt-syntax). |
| [`formatReadableDecimalSize(integer)`](#formatreadabledecimalsizeinteger) | Converts `integer` to a human-readable representation using units based on powers of 10 (KB, MB, GB, etc.). |
| [`formatReadableSize(integer)`](#formatreadablesizeinteger) | Converts `integer` to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.). |
| [`format_bytes(integer)`](#format_bytesinteger) | Converts `integer` to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.). |
| [`from_base64(string)`](#from_base64string) | Converts a base64 encoded `string` to a character string (`BLOB`). |
| [`from_binary(value)`](#from_binaryvalue) | Converts a `value` from binary representation to a blob. |
| [`from_hex(value)`](#from_hexvalue) | Converts a `value` from hexadecimal representation to a blob. |
| [`greatest(arg1, ...)`](#greatestarg1-) | Returns the largest value in lexicographical order. Note that lowercase characters are considered larger than uppercase characters and [collations]({% link docs/1.3/sql/expressions/collations.md %}) are not supported. |
| [`hash(value, ...)`](#hashvalue-) | Returns a `UBIGINT` with the hash of the `value`. Note that this is not a cryptographic hash. |
| [`hex(string)`](#hexstring) | Converts the `string` to hexadecimal representation. |
| [`ilike_escape(string, like_specifier, escape_character)`](#ilike_escapestring-like_specifier-escape_character) | Returns `true` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| [`instr(string, search_string)`](#instrstring-search_string) | Returns location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. |
| [`lcase(string)`](#lcasestring) | Converts `string` to lower case. |
| [`least(arg1, ...)`](#leastarg1-) | Returns the smallest value in lexicographical order. Note that uppercase characters are considered smaller than lowercase characters and [collations]({% link docs/1.3/sql/expressions/collations.md %}) are not supported. |
| [`left(string, count)`](#leftstring-count) | Extracts the left-most count characters. |
| [`left_grapheme(string, count)`](#left_graphemestring-count) | Extracts the left-most count grapheme clusters. |
| [`len(string)`](#lenstring) | Number of characters in `string`. |
| [`length(string)`](#lengthstring) | Number of characters in `string`. |
| [`length_grapheme(string)`](#length_graphemestring) | Number of grapheme clusters in `string`. |
| [`like_escape(string, like_specifier, escape_character)`](#like_escapestring-like_specifier-escape_character) | Returns `true` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| [`lower(string)`](#lowerstring) | Converts `string` to lower case. |
| [`lpad(string, count, character)`](#lpadstring-count-character) | Pads the `string` with the `character` on the left until it has `count` characters. Truncates the `string` on the right if it has more than `count` characters. |
| [`ltrim(string[, characters])`](#ltrimstring-characters) | Removes any occurrences of any of the `characters` from the left side of the `string`. `characters` defaults to `space`. |
| [`md5(string)`](#md5string) | Returns the MD5 hash of the `string` as a `VARCHAR`. |
| [`md5_number(string)`](#md5_numberstring) | Returns the MD5 hash of the `string` as a `HUGEINT`. |
| [`md5_number_lower(string)`](#md5_number_lowerstring) | Returns the lower 64-bit segment of the MD5 hash of the `string` as a `UBIGINT`. |
| [`md5_number_upper(string)`](#md5_number_upperstring) | Returns the upper 64-bit segment of the MD5 hash of the `string` as a `UBIGINT`. |
| [`nfc_normalize(string)`](#nfc_normalizestring) | Converts `string` to Unicode NFC normalized string. Useful for comparisons and ordering if text data is mixed between NFC normalized and not. |
| [`not_ilike_escape(string, like_specifier, escape_character)`](#not_ilike_escapestring-like_specifier-escape_character) | Returns `false` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| [`not_like_escape(string, like_specifier, escape_character)`](#not_like_escapestring-like_specifier-escape_character) | Returns `false` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| [`ord(string)`](#ordstring) | Returns an `INTEGER` representing the `unicode` codepoint of the first character in the `string`. |
| [`parse_dirname(path[, separator])`](#parse_dirnamepath-separator) | Returns the top-level directory name from the given `path`. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| [`parse_dirpath(path[, separator])`](#parse_dirpathpath-separator) | Returns the head of the `path` (the pathname until the last slash) similarly to Python's [`os.path.dirname`](https://docs.python.org/3.7/library/os.path.html#os.path.dirname). `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| [`parse_filename(string[, trim_extension][, separator])`](#parse_filenamestring-trim_extension-separator) | Returns the last component of the `path` similarly to Python's [`os.path.basename`](https://docs.python.org/3.7/library/os.path.html#os.path.basename) function. If `trim_extension` is `true`, the file extension will be removed (defaults to `false`). `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| [`parse_path(path[, separator])`](#parse_pathpath-separator) | Returns a list of the components (directories and filename) in the `path` similarly to Python's [`pathlib.parts`](https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.parts) function. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| [`position(search_string IN string)`](#positionsearch_string-in-string) | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. |
| [`prefix(string, search_string)`](#prefixstring-search_string) | Returns `true` if `string` starts with `search_string`. |
| [`printf(format, ...)`](#printfformat-) | Formats a `string` using [printf syntax](#printf-syntax). |
| [`read_text(source)`](#read_textsource) | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `VARCHAR`. The file content is first validated to be valid UTF-8. If `read_text` attempts to read a file with invalid UTF-8 an error is thrown suggesting to use `read_blob` instead. See the [`read_text` guide]({% link docs/1.3/guides/file_formats/read_file.md %}#read_text) for more details. |
| [`regexp_escape(string)`](#regexp_escapestring) | Escapes special patterns to turn `string` into a regular expression similarly to Python's [`re.escape` function](https://docs.python.org/3/library/re.html#re.escape). |
| [`regexp_extract(string, regex[, group][, options])`](#regexp_extractstring-regex-group-options) | If `string` contains the `regex` pattern, returns the capturing group specified by optional parameter `group`; otherwise, returns the empty string. The `group` must be a constant value. If no `group` is given, it defaults to 0. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`regexp_extract(string, regex, name_list[, options])`](#regexp_extractstring-regex-name_list-options) | If `string` contains the `regex` pattern, returns the capturing groups as a struct with corresponding names from `name_list`; otherwise, returns a struct with the same keys and empty strings as values. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`regexp_extract_all(string, regex[, group][, options])`](#regexp_extract_allstring-regex-group-options) | Finds non-overlapping occurrences of the `regex` in the `string` and returns the corresponding values of the capturing `group`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`regexp_full_match(string, regex[, col2])`](#regexp_full_matchstring-regex-col2) | Returns `true` if the entire `string` matches the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`regexp_matches(string, regex[, options])`](#regexp_matchesstring-regex-options) | Returns `true` if `string` contains the `regex`, `false` otherwise. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`regexp_replace(string, regex, replacement[, options])`](#regexp_replacestring-regex-replacement-options) | If `string` contains the `regex`, replaces the matching part with `replacement`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`regexp_split_to_array(string, regex[, options])`](#regexp_split_to_arraystring-regex-options) | Splits the `string` along the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`regexp_split_to_table(string, regex)`](#regexp_split_to_tablestring-regex) | Splits the `string` along the `regex` and returns a row for each part. |
| [`repeat(string, count)`](#repeatstring-count) | Repeats the `string` `count` number of times. |
| [`replace(string, source, target)`](#replacestring-source-target) | Replaces any occurrences of the `source` with `target` in `string`. |
| [`reverse(string)`](#reversestring) | Reverses the `string`. |
| [`right(string, count)`](#rightstring-count) | Extract the right-most `count` characters. |
| [`right_grapheme(string, count)`](#right_graphemestring-count) | Extracts the right-most `count` grapheme clusters. |
| [`rpad(string, count, character)`](#rpadstring-count-character) | Pads the `string` with the `character` on the right until it has `count` characters. Truncates the `string` on the right if it has more than `count` characters. |
| [`rtrim(string[, characters])`](#rtrimstring-characters) | Removes any occurrences of any of the `characters` from the right side of the `string`. `characters` defaults to `space`. |
| [`sha1(value)`](#sha1value) | Returns a `VARCHAR` with the SHA-1 hash of the `value`. |
| [`sha256(value)`](#sha256value) | Returns a `VARCHAR` with the SHA-256 hash of the `value` |
| [`split(string, separator)`](#splitstring-separator) | Splits the `string` along the `separator`. |
| [`split_part(string, separator, index)`](#split_partstring-separator-index) | Splits the `string` along the `separator` and returns the data at the (1-based) `index` of the list. If the `index` is outside the bounds of the list, return an empty string (to match PostgreSQL's behavior). |
| [`starts_with(string, search_string)`](#starts_withstring-search_string) | Returns `true` if `string` begins with `search_string`. |
| [`str_split(string, separator)`](#str_splitstring-separator) | Splits the `string` along the `separator`. |
| [`str_split_regex(string, regex[, options])`](#str_split_regexstring-regex-options) | Splits the `string` along the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`string_split(string, separator)`](#string_splitstring-separator) | Splits the `string` along the `separator`. |
| [`string_split_regex(string, regex[, options])`](#string_split_regexstring-regex-options) | Splits the `string` along the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| [`string_to_array(string, separator)`](#string_to_arraystring-separator) | Splits the `string` along the `separator`. |
| [`strip_accents(string)`](#strip_accentsstring) | Strips accents from `string`. |
| [`strlen(string)`](#strlenstring) | Number of bytes in `string`. |
| [`strpos(string, search_string)`](#strposstring-search_string) | Returns location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. |
| [`substr(string, start[, length])`](#substrstring-start-length) | Extracts substring starting from character `start` up to the end of the string. If optional argument `length` is set, extracts a substring of `length` characters instead. Note that a `start` value of `1` refers to the first character of the `string`. |
| [`substring(string, start[, length])`](#substringstring-start-length) | Extracts substring starting from character `start` up to the end of the string. If optional argument `length` is set, extracts a substring of `length` characters instead. Note that a `start` value of `1` refers to the first character of the `string`. |
| [`substring_grapheme(string, start[, length])`](#substring_graphemestring-start-length) | Extracts substring starting from grapheme clusters `start` up to the end of the string. If optional argument `length` is set, extracts a substring of `length` grapheme clusters instead. Note that a `start` value of `1` refers to the `first` character of the `string`. |
| [`suffix(string, search_string)`](#suffixstring-search_string) | Returns `true` if `string` ends with `search_string`. |
| [`to_base(number, radix[, min_length])`](#to_basenumber-radix-min_length) | Converts `number` to a string in the given base `radix`, optionally padding with leading zeros to `min_length`. |
| [`to_base64(blob)`](#to_base64blob) | Converts a `blob` to a base64 encoded string. |
| [`to_binary(string)`](#to_binarystring) | Converts the `string` to binary representation. |
| [`to_hex(string)`](#to_hexstring) | Converts the `string` to hexadecimal representation. |
| [`translate(string, from, to)`](#translatestring-from-to) | Replaces each character in `string` that matches a character in the `from` set with the corresponding character in the `to` set. If `from` is longer than `to`, occurrences of the extra characters in `from` are deleted. |
| [`trim(string[, characters])`](#trimstring-characters) | Removes any occurrences of any of the `characters` from either side of the `string`. `characters` defaults to `space`. |
| [`ucase(string)`](#ucasestring) | Converts `string` to upper case. |
| [`unbin(value)`](#unbinvalue) | Converts a `value` from binary representation to a blob. |
| [`unhex(value)`](#unhexvalue) | Converts a `value` from hexadecimal representation to a blob. |
| [`unicode(string)`](#unicodestring) | Returns an `INTEGER` representing the `unicode` codepoint of the first character in the `string`. |
| [`upper(string)`](#upperstring) | Converts `string` to upper case. |
| [`url_decode(string)`](#url_decodestring) | Decodes a URL from a representation using [Percent-Encoding](https://datatracker.ietf.org/doc/html/rfc3986#section-2.1). |
| [`url_encode(string)`](#url_encodestring) | Encodes a URL to a representation using [Percent-Encoding](https://datatracker.ietf.org/doc/html/rfc3986#section-2.1). |

<!-- markdownlint-enable MD056 -->

#### `string[index]`

<div class="nostroke_table"></div>

| **Description** | Extracts a single character using a (1-based) index. |
| **Example** | `'DuckDB'[4]` |
| **Result** | `k` |
| **Alias** | `array_extract` |

#### `string[begin:end]`

<div class="nostroke_table"></div>

| **Description** | Extracts a string using [slice conventions]({% link docs/1.3/sql/functions/list.md %}#slicing) similar to Python. Missing `begin` or `end` arguments are interpreted as the beginning or end of the list respectively. Negative values are accepted. |
| **Example** | `'DuckDB'[:4]` |
| **Result** | `Duck` |
| **Alias** | `array_slice` |

#### `string LIKE target`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if the `string` matches the like specifier (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})). |
| **Example** | `'hello' LIKE '%lo'` |
| **Result** | `true` |

#### `string SIMILAR TO regex`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if the `string` matches the `regex` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})). |
| **Example** | `'hello' SIMILAR TO 'l+'` |
| **Result** | `false` |
| **Alias** | `regexp_full_match` |

#### `string ^@ search_string`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if `string` begins with `search_string`. |
| **Example** | `'abc' ^@ 'a'` |
| **Result** | `true` |
| **Alias** | `starts_with` |

#### `arg1 || arg2`

<div class="nostroke_table"></div>

| **Description** | Concatenates two strings, lists, or blobs. Any `NULL` input results in `NULL`. See also [`concat(arg1, arg2, ...)`](#concatvalue-) and [`list_concat(list1, list2)`]({% link docs/1.3/sql/functions/list.md %}#list_concatlist1-list2). |
| **Example 1** | `'Duck' || 'DB'` |
| **Result** | `DuckDB` |
| **Example 2** | `[1, 2, 3] || [4, 5, 6]` |
| **Result** | `[1, 2, 3, 4, 5, 6]` |
| **Example 3** | `'\xAA'::BLOB || '\xBB'::BLOB` |
| **Result** | `\xAA\xBB` |

#### `array_extract(string, index)`

<div class="nostroke_table"></div>

| **Description** | Extracts a single character from a `string` using a (1-based) `index`. |
| **Example** | `array_extract('DuckDB', 2)` |
| **Result** | `u` |

#### `array_slice(list, begin, end)`

<div class="nostroke_table"></div>

| **Description** | Extracts a sublist or substring using [slice conventions]({% link docs/1.3/sql/functions/list.md %}#slicing). Negative values are accepted. |
| **Example 1** | `array_slice('DuckDB', 3, 4)` |
| **Result** | `ck` |
| **Example 2** | `array_slice('DuckDB', 3, NULL)` |
| **Result** | `NULL` |
| **Example 3** | `array_slice('DuckDB', 0, -3)` |
| **Result** | `Duck` |
| **Alias** | `list_slice` |

#### `ascii(string)`

<div class="nostroke_table"></div>

| **Description** | Returns an integer that represents the Unicode code point of the first character of the `string`. |
| **Example** | `ascii('Ω')` |
| **Result** | `937` |

#### `bar(x, min, max[, width])`

<div class="nostroke_table"></div>

| **Description** | Draws a band whose width is proportional to (`x - min`) and equal to `width` characters when `x` = `max`. `width` defaults to 80. |
| **Example** | `bar(5, 0, 20, 10)` |
| **Result** | `██▌       ` |

#### `base64(blob)`

<div class="nostroke_table"></div>

| **Description** | Converts a `blob` to a base64 encoded string. |
| **Example** | `base64('A'::BLOB)` |
| **Result** | `QQ==` |
| **Alias** | `to_base64` |

#### `bin(string)`

<div class="nostroke_table"></div>

| **Description** | Converts the `string` to binary representation. |
| **Example** | `bin('Aa')` |
| **Result** | `0100000101100001` |
| **Alias** | `to_binary` |

#### `bit_length(string)`

<div class="nostroke_table"></div>

| **Description** | Number of bits in a `string`. |
| **Example** | `bit_length('abc')` |
| **Result** | `24` |

#### `char_length(string)`

<div class="nostroke_table"></div>

| **Description** | Number of characters in `string`. |
| **Example** | `char_length('Hello🦆')` |
| **Result** | `6` |
| **Aliases** | `character_length`, `len`, `length` |

#### `character_length(string)`

<div class="nostroke_table"></div>

| **Description** | Number of characters in `string`. |
| **Example** | `character_length('Hello🦆')` |
| **Result** | `6` |
| **Aliases** | `char_length`, `len`, `length` |

#### `chr(code_point)`

<div class="nostroke_table"></div>

| **Description** | Returns a character which is corresponding the ASCII code value or Unicode code point. |
| **Example** | `chr(65)` |
| **Result** | `A` |

#### `concat(value, ...)`

<div class="nostroke_table"></div>

| **Description** | Concatenates multiple strings, lists, or blobs. `NULL` inputs are skipped. See also [operator `||`](#arg1--arg2). |
| **Example** | `concat('Hello', ' ', 'World')` |
| **Result** | `Hello World` |

#### `concat_ws(separator, string, ...)`

<div class="nostroke_table"></div>

| **Description** | Concatenates many strings, separated by `separator`. `NULL` inputs are skipped. |
| **Example** | `concat_ws(', ', 'Banana', 'Apple', 'Melon')` |
| **Result** | `Banana, Apple, Melon` |

#### `contains(string, search_string)`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if `search_string` is found within `string`. |
| **Example** | `contains('abc', 'a')` |
| **Result** | `true` |

#### `ends_with(string, search_string)`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if `string` ends with `search_string`. Note that [collations]({% link docs/1.3/sql/expressions/collations.md %}) are not supported. |
| **Example** | `ends_with('abc', 'bc')` |
| **Result** | `true` |
| **Alias** | `suffix` |

#### `format(format, ...)`

<div class="nostroke_table"></div>

| **Description** | Formats a string using the [fmt syntax](#fmt-syntax). |
| **Example** | `format('Benchmark "{}" took {} seconds', 'CSV', 42)` |
| **Result** | `Benchmark "CSV" took 42 seconds` |

#### `formatReadableDecimalSize(integer)`

<div class="nostroke_table"></div>

| **Description** | Converts `integer` to a human-readable representation using units based on powers of 10 (KB, MB, GB, etc.). |
| **Example** | `formatReadableDecimalSize(16_000)` |
| **Result** | `16.0 kB` |

#### `formatReadableSize(integer)`

<div class="nostroke_table"></div>

| **Description** | Converts `integer` to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.). |
| **Example** | `formatReadableSize(16_000)` |
| **Result** | `15.6 KiB` |
| **Alias** | `format_bytes` |

#### `format_bytes(integer)`

<div class="nostroke_table"></div>

| **Description** | Converts `integer` to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.). |
| **Example** | `format_bytes(16_000)` |
| **Result** | `15.6 KiB` |
| **Alias** | `formatReadableSize` |

#### `from_base64(string)`

<div class="nostroke_table"></div>

| **Description** | Converts a base64 encoded `string` to a character string (`BLOB`). |
| **Example** | `from_base64('QQ==')` |
| **Result** | `A` |

#### `from_binary(value)`

<div class="nostroke_table"></div>

| **Description** | Converts a `value` from binary representation to a blob. |
| **Example** | `from_binary('0110')` |
| **Result** | `\x06` |
| **Alias** | `unbin` |

#### `from_hex(value)`

<div class="nostroke_table"></div>

| **Description** | Converts a `value` from hexadecimal representation to a blob. |
| **Example** | `from_hex('2A')` |
| **Result** | `*` |
| **Alias** | `unhex` |

#### `greatest(arg1, ...)`

<div class="nostroke_table"></div>

| **Description** | Returns the largest value in lexicographical order. Note that lowercase characters are considered larger than uppercase characters and [collations]({% link docs/1.3/sql/expressions/collations.md %}) are not supported. |
| **Example 1** | `greatest(42, 84)` |
| **Result** | `84` |
| **Example 2** | `greatest('abc', 'bcd', 'cde', 'EFG')` |
| **Result** | `cde` |

#### `hash(value, ...)`

<div class="nostroke_table"></div>

| **Description** | Returns a `UBIGINT` with the hash of the `value`. Note that this is not a cryptographic hash. |
| **Example** | `hash('🦆')` |
| **Result** | `4164431626903154684` |

#### `hex(string)`

<div class="nostroke_table"></div>

| **Description** | Converts the `string` to hexadecimal representation. |
| **Example** | `hex('Hello')` |
| **Result** | `48656C6C6F` |
| **Alias** | `to_hex` |

#### `ilike_escape(string, like_specifier, escape_character)`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| **Example** | `ilike_escape('A%c', 'a$%C', '$')` |
| **Result** | `true` |

#### `instr(string, search_string)`

<div class="nostroke_table"></div>

| **Description** | Returns location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. |
| **Example** | `instr('test test', 'es')` |
| **Result** | `2` |
| **Aliases** | `position`, `strpos` |

#### `lcase(string)`

<div class="nostroke_table"></div>

| **Description** | Converts `string` to lower case. |
| **Example** | `lcase('Hello')` |
| **Result** | `hello` |
| **Alias** | `lower` |

#### `least(arg1, ...)`

<div class="nostroke_table"></div>

| **Description** | Returns the smallest value in lexicographical order. Note that uppercase characters are considered smaller than lowercase characters and [collations]({% link docs/1.3/sql/expressions/collations.md %}) are not supported. |
| **Example 1** | `least(42, 84)` |
| **Result** | `42` |
| **Example 2** | `least('abc', 'bcd', 'cde', 'EFG')` |
| **Result** | `EFG` |

#### `left(string, count)`

<div class="nostroke_table"></div>

| **Description** | Extracts the left-most count characters. |
| **Example** | `left('Hello🦆', 2)` |
| **Result** | `He` |

#### `left_grapheme(string, count)`

<div class="nostroke_table"></div>

| **Description** | Extracts the left-most count grapheme clusters. |
| **Example** | `left_grapheme('🤦🏼‍♂️🤦🏽‍♀️', 1)` |
| **Result** | `🤦🏼‍♂️` |

#### `len(string)`

<div class="nostroke_table"></div>

| **Description** | Number of characters in `string`. |
| **Example** | `length('Hello🦆')` |
| **Result** | `6` |
| **Aliases** | `char_length`, `character_length`, `length` |

#### `length(string)`

<div class="nostroke_table"></div>

| **Description** | Number of characters in `string`. |
| **Example** | `length('Hello🦆')` |
| **Result** | `6` |
| **Aliases** | `char_length`, `character_length`, `len` |

#### `length_grapheme(string)`

<div class="nostroke_table"></div>

| **Description** | Number of grapheme clusters in `string`. |
| **Example** | `length_grapheme('🤦🏼‍♂️🤦🏽‍♀️')` |
| **Result** | `2` |

#### `like_escape(string, like_specifier, escape_character)`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| **Example** | `like_escape('a%c', 'a$%c', '$')` |
| **Result** | `true` |

#### `lower(string)`

<div class="nostroke_table"></div>

| **Description** | Converts `string` to lower case. |
| **Example** | `lower('Hello')` |
| **Result** | `hello` |
| **Alias** | `lcase` |

#### `lpad(string, count, character)`

<div class="nostroke_table"></div>

| **Description** | Pads the `string` with the `character` on the left until it has `count` characters. Truncates the `string` on the right if it has more than `count` characters. |
| **Example** | `lpad('hello', 8, '>')` |
| **Result** | `>>>hello` |

#### `ltrim(string[, characters])`

<div class="nostroke_table"></div>

| **Description** | Removes any occurrences of any of the `characters` from the left side of the `string`. `characters` defaults to `space`. |
| **Example 1** | `ltrim('    test  ')` |
| **Result** | `test  ` |
| **Example 2** | `ltrim('>>>>test<<', '><')` |
| **Result** | `test<<` |

#### `md5(string)`

<div class="nostroke_table"></div>

| **Description** | Returns the MD5 hash of the `string` as a `VARCHAR`. |
| **Example** | `md5('abc')` |
| **Result** | `900150983cd24fb0d6963f7d28e17f72` |

#### `md5_number(string)`

<div class="nostroke_table"></div>

| **Description** | Returns the MD5 hash of the `string` as a `HUGEINT`. |
| **Example** | `md5_number('abc')` |
| **Result** | `152195979970564155685860391459828531600` |

#### `md5_number_lower(string)`

<div class="nostroke_table"></div>

| **Description** | Returns the lower 64-bit segment of the MD5 hash of the `string` as a `UBIGINT`. |
| **Example** | `md5_number_lower('abc')` |
| **Result** | `8250560606382298838` |

#### `md5_number_upper(string)`

<div class="nostroke_table"></div>

| **Description** | Returns the upper 64-bit segment of the MD5 hash of the `string` as a `UBIGINT`. |
| **Example** | `md5_number_upper('abc')` |
| **Result** | `12704604231530709392` |

#### `nfc_normalize(string)`

<div class="nostroke_table"></div>

| **Description** | Converts `string` to Unicode NFC normalized string. Useful for comparisons and ordering if text data is mixed between NFC normalized and not. |
| **Example** | `nfc_normalize('ardèch')` |
| **Result** | `ardèch` |

#### `not_ilike_escape(string, like_specifier, escape_character)`

<div class="nostroke_table"></div>

| **Description** | Returns `false` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| **Example** | `not_ilike_escape('A%c', 'a$%C', '$')` |
| **Result** | `false` |

#### `not_like_escape(string, like_specifier, escape_character)`

<div class="nostroke_table"></div>

| **Description** | Returns `false` if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/1.3/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
| **Example** | `not_like_escape('a%c', 'a$%c', '$')` |
| **Result** | `false` |

#### `ord(string)`

<div class="nostroke_table"></div>

| **Description** | Returns an `INTEGER` representing the `unicode` codepoint of the first character in the `string`. |
| **Example** | `[unicode('âbcd'), unicode('â'), unicode(''), unicode(NULL)]` |
| **Result** | `[226, 226, -1, NULL]` |
| **Alias** | `unicode` |

#### `parse_dirname(path[, separator])`

<div class="nostroke_table"></div>

| **Description** | Returns the top-level directory name from the given `path`. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| **Example** | `parse_dirname('path/to/file.csv', 'system')` |
| **Result** | `path` |

#### `parse_dirpath(path[, separator])`

<div class="nostroke_table"></div>

| **Description** | Returns the head of the `path` (the pathname until the last slash) similarly to Python's [`os.path.dirname`](https://docs.python.org/3.7/library/os.path.html#os.path.dirname). `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| **Example** | `parse_dirpath('path/to/file.csv', 'forward_slash')` |
| **Result** | `path/to` |

#### `parse_filename(string[, trim_extension][, separator])`

<div class="nostroke_table"></div>

| **Description** | Returns the last component of the `path` similarly to Python's [`os.path.basename`](https://docs.python.org/3.7/library/os.path.html#os.path.basename) function. If `trim_extension` is `true`, the file extension will be removed (defaults to `false`). `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| **Example** | `parse_filename('path/to/file.csv', true, 'forward_slash')` |
| **Result** | `file` |

#### `parse_path(path[, separator])`

<div class="nostroke_table"></div>

| **Description** | Returns a list of the components (directories and filename) in the `path` similarly to Python's [`pathlib.parts`](https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.parts) function. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. |
| **Example** | `parse_path('path/to/file.csv', 'system')` |
| **Result** | `[path, to, file.csv]` |

#### `position(search_string IN string)`

<div class="nostroke_table"></div>

| **Description** | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. |
| **Example** | `position('b' IN 'abc')` |
| **Result** | `2` |
| **Aliases** | `instr`, `strpos` |

#### `prefix(string, search_string)`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if `string` starts with `search_string`. |
| **Example** | `prefix('abc', 'ab')` |
| **Result** | `true` |

#### `printf(format, ...)`

<div class="nostroke_table"></div>

| **Description** | Formats a `string` using [printf syntax](#printf-syntax). |
| **Example** | `printf('Benchmark "%s" took %d seconds', 'CSV', 42)` |
| **Result** | `Benchmark "CSV" took 42 seconds` |

#### `read_text(source)`

<div class="nostroke_table"></div>

| **Description** | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `VARCHAR`. The file content is first validated to be valid UTF-8. If `read_text` attempts to read a file with invalid UTF-8 an error is thrown suggesting to use `read_blob` instead. See the [`read_text` guide]({% link docs/1.3/guides/file_formats/read_file.md %}#read_text) for more details. |
| **Example** | `read_text('hello.txt')` |
| **Result** | `hello\n` |

#### `regexp_escape(string)`

<div class="nostroke_table"></div>

| **Description** | Escapes special patterns to turn `string` into a regular expression similarly to Python's [`re.escape` function](https://docs.python.org/3/library/re.html#re.escape). |
| **Example** | `regexp_escape('https://duckdb.org')` |
| **Result** | `https\:\/\/duckdb\.org` |

#### `regexp_extract(string, regex[, group][, options])`

<div class="nostroke_table"></div>

| **Description** | If `string` contains the `regex` pattern, returns the capturing group specified by optional parameter `group`; otherwise, returns the empty string. The `group` must be a constant value. If no `group` is given, it defaults to 0. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `regexp_extract('ABC', '([a-z])(b)', 1, 'i')` |
| **Result** | `A` |

#### `regexp_extract(string, regex, name_list[, options])`

<div class="nostroke_table"></div>

| **Description** | If `string` contains the `regex` pattern, returns the capturing groups as a struct with corresponding names from `name_list`; otherwise, returns a struct with the same keys and empty strings as values. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `regexp_extract('John Doe', '([a-z]+) ([a-z]+)', ['first_name', 'last_name'], 'i')` |
| **Result** | `{'first_name': John, 'last_name': Doe}` |

#### `regexp_extract_all(string, regex[, group][, options])`

<div class="nostroke_table"></div>

| **Description** | Finds non-overlapping occurrences of the `regex` in the `string` and returns the corresponding values of the capturing `group`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `regexp_extract_all('Peter: 33, Paul:14', '(\w+):\s*(\d+)', 2)` |
| **Result** | `[33, 14]` |

#### `regexp_full_match(string, regex[, col2])`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if the entire `string` matches the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `regexp_full_match('anabanana', '(an)*')` |
| **Result** | `false` |

#### `regexp_matches(string, regex[, options])`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if `string` contains the `regex`, `false` otherwise. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `regexp_matches('anabanana', '(an)*')` |
| **Result** | `true` |

#### `regexp_replace(string, regex, replacement[, options])`

<div class="nostroke_table"></div>

| **Description** | If `string` contains the `regex`, replaces the matching part with `replacement`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `regexp_replace('hello', '[lo]', '-')` |
| **Result** | `he-lo` |

#### `regexp_split_to_array(string, regex[, options])`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `regexp_split_to_array('hello world; 42', ';? ')` |
| **Result** | `[hello, world, 42]` |
| **Aliases** | `str_split_regex`, `string_split_regex` |

#### `regexp_split_to_table(string, regex)`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `regex` and returns a row for each part. |
| **Example** | `regexp_split_to_table('hello world; 42', ';? ')` |
| **Result** | Multiple rows: `'hello'`, `'world'`, `'42'` |

#### `repeat(string, count)`

<div class="nostroke_table"></div>

| **Description** | Repeats the `string` `count` number of times. |
| **Example** | `repeat('A', 5)` |
| **Result** | `AAAAA` |

#### `replace(string, source, target)`

<div class="nostroke_table"></div>

| **Description** | Replaces any occurrences of the `source` with `target` in `string`. |
| **Example** | `replace('hello', 'l', '-')` |
| **Result** | `he--o` |

#### `reverse(string)`

<div class="nostroke_table"></div>

| **Description** | Reverses the `string`. |
| **Example** | `reverse('hello')` |
| **Result** | `olleh` |

#### `right(string, count)`

<div class="nostroke_table"></div>

| **Description** | Extract the right-most `count` characters. |
| **Example** | `right('Hello🦆', 3)` |
| **Result** | `lo🦆` |

#### `right_grapheme(string, count)`

<div class="nostroke_table"></div>

| **Description** | Extracts the right-most `count` grapheme clusters. |
| **Example** | `right_grapheme('🤦🏼‍♂️🤦🏽‍♀️', 1)` |
| **Result** | `🤦🏽‍♀️` |

#### `rpad(string, count, character)`

<div class="nostroke_table"></div>

| **Description** | Pads the `string` with the `character` on the right until it has `count` characters. Truncates the `string` on the right if it has more than `count` characters. |
| **Example** | `rpad('hello', 10, '<')` |
| **Result** | `hello<<<<<` |

#### `rtrim(string[, characters])`

<div class="nostroke_table"></div>

| **Description** | Removes any occurrences of any of the `characters` from the right side of the `string`. `characters` defaults to `space`. |
| **Example 1** | `rtrim('    test  ')` |
| **Result** | `    test` |
| **Example 2** | `rtrim('>>>>test<<', '><')` |
| **Result** | `>>>>test` |

#### `sha1(value)`

<div class="nostroke_table"></div>

| **Description** | Returns a `VARCHAR` with the SHA-1 hash of the `value`. |
| **Example** | `sha1('🦆')` |
| **Result** | `949bf843dc338be348fb9525d1eb535d31241d76` |

#### `sha256(value)`

<div class="nostroke_table"></div>

| **Description** | Returns a `VARCHAR` with the SHA-256 hash of the `value` |
| **Example** | `sha256('🦆')` |
| **Result** | `d7a5c5e0d1d94c32218539e7e47d4ba9c3c7b77d61332fb60d633dde89e473fb` |

#### `split(string, separator)`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `separator`. |
| **Example** | `split('hello-world', '-')` |
| **Result** | `[hello, world]` |
| **Aliases** | `str_split`, `string_split`, `string_to_array` |

#### `split_part(string, separator, index)`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `separator` and returns the data at the (1-based) `index` of the list. If the `index` is outside the bounds of the list, return an empty string (to match PostgreSQL's behavior). |
| **Example** | `split_part('a;b;c', ';', 2)` |
| **Result** | `b` |

#### `starts_with(string, search_string)`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if `string` begins with `search_string`. |
| **Example** | `starts_with('abc', 'a')` |
| **Result** | `true` |
| **Alias** | `^@` |

#### `str_split(string, separator)`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `separator`. |
| **Example** | `str_split('hello-world', '-')` |
| **Result** | `[hello, world]` |
| **Aliases** | `split`, `string_split`, `string_to_array` |

#### `str_split_regex(string, regex[, options])`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `str_split_regex('hello world; 42', ';? ')` |
| **Result** | `[hello, world, 42]` |
| **Aliases** | `regexp_split_to_array`, `string_split_regex` |

#### `string_split(string, separator)`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `separator`. |
| **Example** | `string_split('hello-world', '-')` |
| **Result** | `[hello, world]` |
| **Aliases** | `split`, `str_split`, `string_to_array` |

#### `string_split_regex(string, regex[, options])`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `regex`. A set of optional [regex `options`]({% link docs/1.3/sql/functions/regular_expressions.md %}#options-for-regular-expression-functions) can be set. |
| **Example** | `string_split_regex('hello world; 42', ';? ')` |
| **Result** | `[hello, world, 42]` |
| **Aliases** | `regexp_split_to_array`, `str_split_regex` |

#### `string_to_array(string, separator)`

<div class="nostroke_table"></div>

| **Description** | Splits the `string` along the `separator`. |
| **Example** | `string_to_array('hello-world', '-')` |
| **Result** | `[hello, world]` |
| **Aliases** | `split`, `str_split`, `string_split` |

#### `strip_accents(string)`

<div class="nostroke_table"></div>

| **Description** | Strips accents from `string`. |
| **Example** | `strip_accents('mühleisen')` |
| **Result** | `muhleisen` |

#### `strlen(string)`

<div class="nostroke_table"></div>

| **Description** | Number of bytes in `string`. |
| **Example** | `strlen('🦆')` |
| **Result** | `4` |

#### `strpos(string, search_string)`

<div class="nostroke_table"></div>

| **Description** | Returns location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. |
| **Example** | `strpos('test test', 'es')` |
| **Result** | `2` |
| **Aliases** | `instr`, `position` |

#### `substr(string, start[, length])`

<div class="nostroke_table"></div>

| **Description** | Extracts substring starting from character `start` up to the end of the string. If optional argument `length` is set, extracts a substring of `length` characters instead. Note that a `start` value of `1` refers to the first character of the `string`. |
| **Example 1** | `substring('Hello', 2)` |
| **Result** | `ello` |
| **Example 2** | `substring('Hello', 2, 2)` |
| **Result** | `el` |
| **Alias** | `substring` |

#### `substring(string, start[, length])`

<div class="nostroke_table"></div>

| **Description** | Extracts substring starting from character `start` up to the end of the string. If optional argument `length` is set, extracts a substring of `length` characters instead. Note that a `start` value of `1` refers to the first character of the `string`. |
| **Example 1** | `substring('Hello', 2)` |
| **Result** | `ello` |
| **Example 2** | `substring('Hello', 2, 2)` |
| **Result** | `el` |
| **Alias** | `substr` |

#### `substring_grapheme(string, start[, length])`

<div class="nostroke_table"></div>

| **Description** | Extracts substring starting from grapheme clusters `start` up to the end of the string. If optional argument `length` is set, extracts a substring of `length` grapheme clusters instead. Note that a `start` value of `1` refers to the `first` character of the `string`. |
| **Example 1** | `substring_grapheme('🦆🤦🏼‍♂️🤦🏽‍♀️🦆', 3)` |
| **Result** | `🤦🏽‍♀️🦆` |
| **Example 2** | `substring_grapheme('🦆🤦🏼‍♂️🤦🏽‍♀️🦆', 3, 2)` |
| **Result** | `🤦🏽‍♀️🦆` |

#### `suffix(string, search_string)`

<div class="nostroke_table"></div>

| **Description** | Returns `true` if `string` ends with `search_string`. |
| **Example** | `suffix('abc', 'bc')` |
| **Result** | `true` |
| **Alias** | `ends_with` |

#### `to_base(number, radix[, min_length])`

<div class="nostroke_table"></div>

| **Description** | Converts `number` to a string in the given base `radix`, optionally padding with leading zeros to `min_length`. |
| **Example** | `to_base(42, 16, 5)` |
| **Result** | `0002A` |

#### `to_base64(blob)`

<div class="nostroke_table"></div>

| **Description** | Converts a `blob` to a base64 encoded string. |
| **Example** | `to_base64('A'::BLOB)` |
| **Result** | `QQ==` |
| **Alias** | `base64` |

#### `to_binary(string)`

<div class="nostroke_table"></div>

| **Description** | Converts the `string` to binary representation. |
| **Example** | `to_binary('Aa')` |
| **Result** | `0100000101100001` |
| **Alias** | `bin` |

#### `to_hex(string)`

<div class="nostroke_table"></div>

| **Description** | Converts the `string` to hexadecimal representation. |
| **Example** | `to_hex('Hello')` |
| **Result** | `48656C6C6F` |
| **Alias** | `hex` |

#### `translate(string, from, to)`

<div class="nostroke_table"></div>

| **Description** | Replaces each character in `string` that matches a character in the `from` set with the corresponding character in the `to` set. If `from` is longer than `to`, occurrences of the extra characters in `from` are deleted. |
| **Example** | `translate('12345', '143', 'ax')` |
| **Result** | `a2x5` |

#### `trim(string[, characters])`

<div class="nostroke_table"></div>

| **Description** | Removes any occurrences of any of the `characters` from either side of the `string`. `characters` defaults to `space`. |
| **Example 1** | `trim('    test  ')` |
| **Result** | `test` |
| **Example 2** | `trim('>>>>test<<', '><')` |
| **Result** | `test` |

#### `ucase(string)`

<div class="nostroke_table"></div>

| **Description** | Converts `string` to upper case. |
| **Example** | `ucase('Hello')` |
| **Result** | `HELLO` |
| **Alias** | `upper` |

#### `unbin(value)`

<div class="nostroke_table"></div>

| **Description** | Converts a `value` from binary representation to a blob. |
| **Example** | `unbin('0110')` |
| **Result** | `\x06` |
| **Alias** | `from_binary` |

#### `unhex(value)`

<div class="nostroke_table"></div>

| **Description** | Converts a `value` from hexadecimal representation to a blob. |
| **Example** | `unhex('2A')` |
| **Result** | `*` |
| **Alias** | `from_hex` |

#### `unicode(string)`

<div class="nostroke_table"></div>

| **Description** | Returns an `INTEGER` representing the `unicode` codepoint of the first character in the `string`. |
| **Example** | `[unicode('âbcd'), unicode('â'), unicode(''), unicode(NULL)]` |
| **Result** | `[226, 226, -1, NULL]` |
| **Alias** | `ord` |

#### `upper(string)`

<div class="nostroke_table"></div>

| **Description** | Converts `string` to upper case. |
| **Example** | `upper('Hello')` |
| **Result** | `HELLO` |
| **Alias** | `ucase` |

#### `url_decode(string)`

<div class="nostroke_table"></div>

| **Description** | Decodes a URL from a representation using [Percent-Encoding](https://datatracker.ietf.org/doc/html/rfc3986#section-2.1). |
| **Example** | `url_decode('https%3A%2F%2Fduckdb.org%2Fwhy_duckdb%23portable')` |
| **Result** | `https://duckdb.org/why_duckdb#portable` |

#### `url_encode(string)`

<div class="nostroke_table"></div>

| **Description** | Encodes a URL to a representation using [Percent-Encoding](https://datatracker.ietf.org/doc/html/rfc3986#section-2.1). |
| **Example** | `url_encode('this string has/ special+ characters>')` |
| **Result** | `this%20string%20has%2F%20special%2B%20characters%3E` |

<!-- End of section generated by scripts/generate_sql_function_docs.py -->

## Text Similarity Functions

These functions are used to measure the similarity of two strings using various [similarity measures](https://en.wikipedia.org/wiki/Similarity_measure).

<!-- Start of section generated by scripts/generate_sql_function_docs.py; categories: [text_similarity] -->
<!-- markdownlint-disable MD056 -->

| Name | Description |
|:--|:-------|
| [`damerau_levenshtein(s1, s2)`](#damerau_levenshteins1-s2) | Extension of Levenshtein distance to also include transposition of adjacent characters as an allowed edit operation. In other words, the minimum number of edit operations (insertions, deletions, substitutions or transpositions) required to change one string to another. Characters of different cases (e.g., `a` and `A`) are considered different. |
| [`editdist3(s1, s2)`](#editdist3s1-s2) | The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. |
| [`hamming(s1, s2)`](#hammings1-s2) | The Hamming distance between to strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. |
| [`jaccard(s1, s2)`](#jaccards1-s2) | The Jaccard similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. |
| [`jaro_similarity(s1, s2[, score_cutoff])`](#jaro_similaritys1-s2-score_cutoff) | The Jaro similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. For similarity < `score_cutoff`, 0 is returned instead. `score_cutoff` defaults to 0. |
| [`jaro_winkler_similarity(s1, s2[, score_cutoff])`](#jaro_winkler_similaritys1-s2-score_cutoff) | The Jaro-Winkler similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. For similarity < `score_cutoff`, 0 is returned instead. `score_cutoff` defaults to 0. |
| [`levenshtein(s1, s2)`](#levenshteins1-s2) | The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. |
| [`mismatches(s1, s2)`](#mismatchess1-s2) | The Hamming distance between to strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. |

<!-- markdownlint-enable MD056 -->

#### `damerau_levenshtein(s1, s2)`

<div class="nostroke_table"></div>

| **Description** | Extension of Levenshtein distance to also include transposition of adjacent characters as an allowed edit operation. In other words, the minimum number of edit operations (insertions, deletions, substitutions or transpositions) required to change one string to another. Characters of different cases (e.g., `a` and `A`) are considered different. |
| **Example** | `damerau_levenshtein('duckdb', 'udckbd')` |
| **Result** | `2` |

#### `editdist3(s1, s2)`

<div class="nostroke_table"></div>

| **Description** | The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. |
| **Example** | `editdist3('duck', 'db')` |
| **Result** | `3` |
| **Alias** | `levenshtein` |

#### `hamming(s1, s2)`

<div class="nostroke_table"></div>

| **Description** | The Hamming distance between to strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. |
| **Example** | `hamming('duck', 'luck')` |
| **Result** | `1` |
| **Alias** | `mismatches` |

#### `jaccard(s1, s2)`

<div class="nostroke_table"></div>

| **Description** | The Jaccard similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. |
| **Example** | `jaccard('duck', 'luck')` |
| **Result** | `0.6` |

#### `jaro_similarity(s1, s2[, score_cutoff])`

<div class="nostroke_table"></div>

| **Description** | The Jaro similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. For similarity < `score_cutoff`, 0 is returned instead. `score_cutoff` defaults to 0. |
| **Example** | `jaro_similarity('duck', 'duckdb')` |
| **Result** | `0.8888888888888888` |

#### `jaro_winkler_similarity(s1, s2[, score_cutoff])`

<div class="nostroke_table"></div>

| **Description** | The Jaro-Winkler similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. For similarity < `score_cutoff`, 0 is returned instead. `score_cutoff` defaults to 0. |
| **Example** | `jaro_winkler_similarity('duck', 'duckdb')` |
| **Result** | `0.9333333333333333` |

#### `levenshtein(s1, s2)`

<div class="nostroke_table"></div>

| **Description** | The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. |
| **Example** | `levenshtein('duck', 'db')` |
| **Result** | `3` |
| **Alias** | `editdist3` |

#### `mismatches(s1, s2)`

<div class="nostroke_table"></div>

| **Description** | The Hamming distance between to strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. |
| **Example** | `mismatches('duck', 'luck')` |
| **Result** | `1` |
| **Alias** | `hamming` |

<!-- End of section generated by scripts/generate_sql_function_docs.py -->

## Formatters

### `fmt` Syntax

The `format(format, parameters...)` function formats strings, loosely following the syntax of the [{fmt} open-source formatting library](https://fmt.dev/latest/syntax/).

Format without additional parameters:

```sql
SELECT format('Hello world'); -- Hello world
```

Format a string using {}:

```sql
SELECT format('The answer is {}', 42); -- The answer is 42
```

Format a string using positional arguments:

```sql
SELECT format('I''d rather be {1} than {0}.', 'right', 'happy'); -- I'd rather be happy than right.
```

#### Format Specifiers

| Specifier | Description | Example |
|:-|:------|:---|
| `{:d}`   | integer                                | `654321`       |
| `{:E}`   | scientific notation                    | `3.141593E+00` |
| `{:f}`   | float                                  | `4.560000`     |
| `{:o}`   | octal                                  | `2375761`      |
| `{:s}`   | string                                 | `asd`          |
| `{:x}`   | hexadecimal                            | `9fbf1`        |
| `{:tX}`  | integer, `X` is the thousand separator | `654 321`      |

#### Formatting Types

Integers:

```sql
SELECT format('{} + {} = {}', 3, 5, 3 + 5); -- 3 + 5 = 8
```

Booleans:

```sql
SELECT format('{} != {}', true, false); -- true != false
```

Format datetime values:

```sql
SELECT format('{}', DATE '1992-01-01'); -- 1992-01-01
SELECT format('{}', TIME '12:01:00'); -- 12:01:00
SELECT format('{}', TIMESTAMP '1992-01-01 12:01:00'); -- 1992-01-01 12:01:00
```

Format BLOB:

```sql
SELECT format('{}', BLOB '\x00hello'); -- \x00hello
```

Pad integers with 0s:

```sql
SELECT format('{:04d}', 33); -- 0033
```

Create timestamps from integers:

```sql
SELECT format('{:02d}:{:02d}:{:02d} {}', 12, 3, 16, 'AM'); -- 12:03:16 AM
```

Convert to hexadecimal:

```sql
SELECT format('{:x}', 123_456_789); -- 75bcd15
```

Convert to binary:

```sql
SELECT format('{:b}', 123_456_789); -- 111010110111100110100010101
```

#### Print Numbers with Thousand Separators

Integers:

```sql
SELECT format('{:,}',  123_456_789); -- 123,456,789
SELECT format('{:t.}', 123_456_789); -- 123.456.789
SELECT format('{:''}', 123_456_789); -- 123'456'789
SELECT format('{:_}',  123_456_789); -- 123_456_789
SELECT format('{:t }', 123_456_789); -- 123 456 789
SELECT format('{:tX}', 123_456_789); -- 123X456X789
```

Float, double and decimal:

```sql
SELECT format('{:,f}',    123456.789); -- 123,456.78900
SELECT format('{:,.2f}',  123456.789); -- 123,456.79
SELECT format('{:t..2f}', 123456.789); -- 123.456,79
```

### `printf` Syntax

The `printf(format, parameters...)` function formats strings using the [`printf` syntax](https://cplusplus.com/reference/cstdio/printf/).

Format without additional parameters:

```sql
SELECT printf('Hello world');
```

```text
Hello world
```

Format a string using arguments in a given order:

```sql
SELECT printf('The answer to %s is %d', 'life', 42);
```

```text
The answer to life is 42
```

Format a string using positional arguments `%position$formatter`, e.g., the second parameter as a string is encoded as `%2$s`:

```sql
SELECT printf('I''d rather be %2$s than %1$s.', 'right', 'happy');
```

```text
I'd rather be happy than right.
```

#### Format Specifiers

| Specifier | Description | Example |
|:-|:------|:---|
| `%c`   | character code to character                                    | `a`            |
| `%d`   | integer                                                        | `654321`       |
| `%Xd`  | integer with thousand seperarator `X` from `,`, `.`, `''`, `_` | `654_321`      |
| `%E`   | scientific notation                                            | `3.141593E+00` |
| `%f`   | float                                                          | `4.560000`     |
| `%hd`  | integer                                                        | `654321`       |
| `%hhd` | integer                                                        | `654321`       |
| `%lld` | integer                                                        | `654321`       |
| `%o`   | octal                                                          | `2375761`      |
| `%s`   | string                                                         | `asd`          |
| `%x`   | hexadecimal                                                    | `9fbf1`        |

#### Formatting Types

Integers:

```sql
SELECT printf('%d + %d = %d', 3, 5, 3 + 5); -- 3 + 5 = 8
```

Booleans:

```sql
SELECT printf('%s != %s', true, false); -- true != false
```

Format datetime values:

```sql
SELECT printf('%s', DATE '1992-01-01'); -- 1992-01-01
SELECT printf('%s', TIME '12:01:00'); -- 12:01:00
SELECT printf('%s', TIMESTAMP '1992-01-01 12:01:00'); -- 1992-01-01 12:01:00
```

Format BLOB:

```sql
SELECT printf('%s', BLOB '\x00hello'); -- \x00hello
```

Pad integers with 0s:

```sql
SELECT printf('%04d', 33); -- 0033
```

Create timestamps from integers:

```sql
SELECT printf('%02d:%02d:%02d %s', 12, 3, 16, 'AM'); -- 12:03:16 AM
```

Convert to hexadecimal:

```sql
SELECT printf('%x', 123_456_789); -- 75bcd15
```

Convert to binary:

```sql
SELECT printf('%b', 123_456_789); -- 111010110111100110100010101
```

#### Thousand Separators

Integers:

```sql
SELECT printf('%,d',  123_456_789); -- 123,456,789
SELECT printf('%.d',  123_456_789); -- 123.456.789
SELECT printf('%''d', 123_456_789); -- 123'456'789
SELECT printf('%_d',  123_456_789); -- 123_456_789
```

Float, double and decimal:

```sql
SELECT printf('%,f',   123456.789); -- 123,456.789000
SELECT printf('%,.2f', 123456.789); -- 123,456.79
```