# Available functions & operators
## Summary
- [Operators](#operators)
- [Unary operators](#unary-operators)
- [Numerical comparison](#numerical-comparison)
- [String/sequence comparison](#stringsequence-comparison)
- [Arithmetic operators](#arithmetic-operators)
- [String/sequence operators](#stringsequence-operators)
- [Logical operators](#logical-operators)
- [Indexing & slicing operators](#indexing--slicing-operators)
- [Pipeline operator](#pipeline-operator)
- [Boolean operations & branching](#boolean-operations--branching)
- [Comparison](#comparison)
- [Arithmetics](#arithmetics)
- [Formatting](#formatting)
- [Strings](#strings)
- [Strings, lists and maps](#strings-lists-and-maps)
- [Lists](#lists)
- [Maps](#maps)
- [Dates & time](#dates--time)
- [Urls & web-related](#urls--webrelated)
- [Fuzzy matching & information retrieval](#fuzzy-matching--information-retrieval)
- [Utils](#utils)
- [IO & path wrangling](#io--path-wrangling)
- [Randomness & hashing](#randomness--hashing)
## Operators
### Unary operators
```txt
!x - boolean negation
-x - numerical negation
```
### Numerical comparison
Warning: these operators always treat their operands as numbers or dates and will try to cast them accordingly. For string/sequence comparison, use the operators in the next section.
```txt
x == y - numerical equality
x != y - numerical inequality
x < y - numerical less than
x <= y - numerical less than or equal
x > y - numerical greater than
x >= y - numerical greater than or equal
```
### String/sequence comparison
Warning: these operators always treat their operands as strings or sequences and will try to cast them accordingly. For numerical comparison, use the operators in the previous section.
```txt
x eq y - string equality
x ne y - string inequality
x lt y - string less than
x le y - string less than or equal
x gt y - string greater than
x ge y - string greater than or equal
```
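
To make the two warnings above concrete, here is a small Python sketch of the two comparison families. It only models the behavior described in this section and is not the tool's implementation; `num_lt` and `str_lt` are hypothetical helper names.

```python
def num_lt(x, y):
    """Model of `x < y`: cast both operands to numbers, then compare."""
    return float(x) < float(y)


def str_lt(x, y):
    """Model of `x lt y`: cast both operands to strings, then compare."""
    return str(x) < str(y)


# "10" is greater than "9" as a number, but sorts before it as a string.
print(num_lt("10", "9"))  # False
print(str_lt("10", "9"))  # True
```

Picking the wrong family silently changes results on columns holding digits, which is why both sections carry a warning.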
### Arithmetic operators
```txt
x + y - numerical addition
x - y - numerical subtraction
x * y - numerical multiplication
x / y - numerical division
x % y - numerical remainder
x // y - numerical integer division
x ** y - numerical exponentiation
```
### String/sequence operators
```txt
x ++ y - string concatenation
```
### Logical operators
```txt
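x && y - logical and
x and y - logical and (same as &&)
x || y - logical or
x or y - logical or (same as ||)
x in y - test whether x is found in y
x not in y - test whether x is not found in y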
```
### Indexing & slicing operators
Negative indices are accepted and mean the same thing as with the Python language.
```txt
x[y] - get y from x (string or list index, map key)
x[start:end] - slice x from start index to end index
x[:end] - slice x from start to end index
x[start:] - slice x from start index to end
```
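
Since negative indices follow Python conventions, the plain Python snippet below shows what to expect. It illustrates Python semantics only; whether every Python slicing form (steps, for instance) is supported is not stated here and should not be assumed.

```python
name = "Michael"
letters = ["a", "b", "c", "d"]

last_char = name[-1]     # last character of the string
last_two = letters[-2:]  # slice from the second-to-last item to the end
middle = name[1:-1]      # drop the first and last characters
head = letters[:2]       # slice from the start up to (not including) index 2

print(last_char)  # l
print(last_two)   # ['c', 'd']
print(middle)     # ichae
print(head)       # ['a', 'b']
```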
### Pipeline operator
The pipeline operator `|` chains expressions, using `_` to substitute the result of the left-hand side.
```txt
trim(name) | len(_) - Same as len(trim(name))
trim(name) | len - Supports elision for unary functions
trim(name) | add(1, len(_)) - Can be nested
add(trim(name) | len, 2) - Can be used anywhere
```
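
The equivalences listed above can be modeled in Python as follows. This is only a sketch of the semantics: `pipe` is a hypothetical helper, and a Python lambda stands in for an expression using the `_` placeholder.

```python
def pipe(value, *steps):
    """Feed `value` through each step in turn, like `value | step1 | step2`."""
    for step in steps:
        value = step(value)
    return value


name = "  John  "

nested = len(name.strip())                                  # len(trim(name))
piped = pipe(name.strip(), len)                             # trim(name) | len
with_placeholder = pipe(name.strip(), lambda _: 1 + len(_)) # trim(name) | add(1, len(_))

print(nested, piped, with_placeholder)  # 4 4 5
```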
## Boolean operations & branching
- **and**(*a*, *b*, *\*n*) -> `T`: Perform boolean AND operation on two or more values.
- **if**(*cond*, *then*, *else?*) -> `T`: Evaluate condition and switch to correct branch.
- **unless**(*cond*, *then*, *else?*) -> `T`: Shorthand for `if(not(cond), then, else?)`
- **not**(*a*) -> `bool`: Perform boolean NOT operation.
- **or**(*a*, *b*, *\*n*) -> `T`: Perform boolean OR operation on two or more values.
- **try**(*T*) -> `T`: Attempt to evaluate given expression and return null if it raised an error.
## Comparison
- **eq**(*s1*, *s2*) -> `bool`: Test string or list equality.
- **ne**(*s1*, *s2*) -> `bool`: Test string or list inequality.
- **gt**(*s1*, *s2*) -> `bool`: Test string or list s1 > s2.
- **ge**(*s1*, *s2*) -> `bool`: Test string or list s1 >= s2.
- **lt**(*s1*, *s2*) -> `bool`: Test string or list s1 < s2.
- **le**(*s1*, *s2*) -> `bool`: Test string or list s1 <= s2.
## Arithmetics
- **abs**(*x*) -> `number`: Return absolute value of number.
- **add**(*x*, *y*, *\*n*) -> `number`: Add two or more numbers.
- **argmax**(*numbers*, *labels?*) -> `any`: Return the index or label of the largest number in the list.
- **argmin**(*numbers*, *labels?*) -> `any`: Return the index or label of the smallest number in the list.
- **ceil**(*x*) -> `number`: Return the smallest integer greater than or equal to x.
- **div**(*x*, *y*, *\*n*) -> `number`: Divide two or more numbers.
- **idiv**(*x*, *y*) -> `number`: Integer division of two numbers.
- **int**(*any*) -> `int`: Cast value as int and raise an error if impossible.
- **float**(*any*) -> `float`: Cast value as float and raise an error if impossible.
- **floor**(*x*) -> `number`: Return the largest integer lower than or equal to x.
- **log**(*x*, *base?*) -> `number`: Return the natural or custom base logarithm of x.
- **log2**(*x*) -> `number`: Return the base 2 logarithm of x.
- **log10**(*x*) -> `number`: Return the base 10 logarithm of x.
- **max**(*x*, *y*, *\*n*) -> `number`: Return the maximum number.
- **max**(*list_of_numbers*) -> `number`: Return the maximum number.
- **min**(*x*, *y*, *\*n*) -> `number`: Return the minimum number.
- **min**(*list_of_numbers*) -> `number`: Return the minimum number.
- **mod**(*x*, *y*) -> `number`: Return the remainder of x divided by y.
- **mul**(*x*, *y*, *\*n*) -> `number`: Multiply two or more numbers.
- **neg**(*x*) -> `number`: Return -x.
- **pow**(*x*, *y*) -> `number`: Raise x to the power of y.
- **round**(*x*) -> `number`: Return x rounded to the nearest integer.
- **sqrt**(*x*) -> `number`: Return the square root of x.
- **sub**(*x*, *y*, *\*n*) -> `number`: Subtract two or more numbers.
- **trunc**(*x*) -> `number`: Truncate the number by removing its decimal part.
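
The rounding functions above differ only on negative numbers, which the Python sketch below makes visible using the standard `math` module. The `argmax` function is a hypothetical model of the described behavior; zero-based indices and returning the first maximum on ties are assumptions, not documented facts.

```python
import math

x = -2.5
print(math.floor(x))  # -3 (largest integer <= x)
print(math.ceil(x))   # -2 (smallest integer >= x)
print(math.trunc(x))  # -2 (decimal part removed)


def argmax(numbers, labels=None):
    """Return the index of the largest number, or its label if labels are given."""
    best = max(range(len(numbers)), key=lambda i: numbers[i])
    return best if labels is None else labels[best]


print(argmax([3, 9, 4]))                   # 1
print(argmax([3, 9, 4], ["a", "b", "c"]))  # b
```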
## Formatting
- **bytesize**(*string*) -> `string`: Return a number of bytes in human-readable format (KB, MB, GB, etc.).
- **escape_regex**(*string*) -> `string`: Escape a string so it can be used safely in a regular expression.
- **fmt**(*string*, *\*arguments*) -> `string`: Format a string by replacing "{}" occurrences by subsequent arguments.
Example: `fmt("Hello {} {}", name, surname)` will replace the first "{}" by the value of the name column, then the second one by the value of the surname column.
Can also be given a substitution map like so:
`fmt("Hello {name}", {name: "John"})`.
- **fmt**(*string*, *map*) -> `string`: Format a string by replacing named "{key}" occurrences by the matching values of a substitution map.
Example: `fmt("Hello {name}", {name: "John"})` will replace "{name}" by "John".
- **lower**(*string*) -> `string`: Lowercase string.
- **pad**(*string*, *width*, *char?*) -> `string`: Pad given string with spaces or given character so that it is at least the given width.
- **lpad**(*string*, *width*, *char?*) -> `string`: Left pad given string with spaces or given character so that it is at least the given width.
- **rpad**(*string*, *width*, *char?*) -> `string`: Right pad given string with spaces or given character so that it is at least the given width.
- **printf**(*format*, *\*arguments*) -> `string`: Apply printf formatting with given format and arguments. Arguments can also be provided as a list.
For instance: `split('John Landy') | printf('first: %s, last: %s', _)`
- **numfmt**(*number*, *thousands_sep=","*, *comma=false*, *significance=5*) -> `string`: Format a number with thousands separator and proper significance.
- **trim**(*string*, *chars?*) -> `string`: Trim string of leading & trailing whitespace or provided characters.
- **to_fixed**(*number*, *precision*) -> `string`: Format given number using fixed point notation with specified number of decimal places.
- **ltrim**(*string*, *chars?*) -> `string`: Trim string of leading whitespace or provided characters.
- **rtrim**(*string*, *chars?*) -> `string`: Trim string of trailing whitespace or provided characters.
- **upper**(*string*) -> `string`: Uppercase string.
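
For readers who know Python, `fmt` and the padding functions resemble `str.format`, `str.rjust` and `str.ljust`. The snippet below is an analogy using Python built-ins, not the tool's implementation, and edge cases may differ.

```python
name, surname = "John", "Landy"

# Positional "{}" placeholders, like fmt("Hello {} {}", name, surname)
positional = "Hello {} {}".format(name, surname)

# Named placeholders filled from a map, like fmt("Hello {name}", {name: "John"})
named = "Hello {name}".format(**{"name": "John"})

# Left padding adds characters before the string, right padding after it.
left_padded = "42".rjust(5, "0")   # comparable to lpad("42", 5, "0")
right_padded = "42".ljust(5, ".")  # comparable to rpad("42", 5, ".")

print(positional)    # Hello John Landy
print(named)         # Hello John
print(left_padded)   # 00042
print(right_padded)  # 42...
```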
## Strings
- **count**(*string*, *substring*) -> `int`: Count the number of times substring appears in string. Can also count the number of times a regex pattern matched the string. Note that only non-overlapping matches will be counted in both cases. Remember a regex pattern must be written with slashes e.g. `/france|french/i`.
- **count**(*string*, *regex*) -> `int`: Count the number of times substring appears in string. Can also count the number of times a regex pattern matched the string. Note that only non-overlapping matches will be counted in both cases. Remember a regex pattern must be written with slashes e.g. `/france|french/i`.
- **endswith**(*string*, *substring*) -> `bool`: Test if string ends with substring.
- **match**(*string*, *regex*, *group*) -> `string`: Return a regex pattern match on the string. Remember a regex pattern must be written with slashes e.g. `/france|french/i`.
- **replace**(*string*, *substring*, *replacement*) -> `string`: Replace all non-overlapping occurrences of substring in given string with provided replacement. Can also replace regex pattern matches. Remember a regex pattern must be written with slashes e.g. `/france|french/i`.
See regex replacement string syntax documentation here:
https://docs.rs/regex/latest/regex/struct.Regex.html#replacement-string-syntax
- **replace**(*string*, *regex*, *replacement*) -> `string`: Replace all non-overlapping occurrences of substring in given string with provided replacement. Can also replace regex pattern matches. Remember a regex pattern must be written with slashes e.g. `/france|french/i`.
See regex replacement string syntax documentation here:
https://docs.rs/regex/latest/regex/struct.Regex.html#replacement-string-syntax
- **split**(*string*, *substring*, *max?*) -> `list`: Split a string by a given separator substring. Can also split using a regex pattern. Remember a regex pattern must be written with slashes e.g. `/france|french/i`.
- **split**(*string*, *regex*, *max?*) -> `list`: Split a string by a given separator substring. Can also split using a regex pattern. Remember a regex pattern must be written with slashes e.g. `/france|french/i`.
- **startswith**(*string*, *substring*) -> `bool`: Test if string starts with substring.
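
"Non-overlapping" matters when a pattern can match at adjacent positions. The Python sketch below shows the same rule with `str.count`, `re.findall` and `str.replace`; it illustrates the concept only and does not claim the tool shares Python's regex engine or syntax.

```python
import re

text = "aaaa"

# "aa" fits at positions 0, 1 and 2, but only two matches do not overlap.
substring_count = text.count("aa")
regex_count = len(re.findall(r"aa", text))

# Case-insensitive alternation, similar in spirit to /france|french/i
matches = re.findall(r"france|french", "France and French fries", flags=re.IGNORECASE)

# Replacement also proceeds over non-overlapping occurrences.
replaced = "banana".replace("ana", "X")

print(substring_count, regex_count)  # 2 2
print(matches)                       # ['France', 'French']
print(replaced)                      # bXna
```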
## Strings, lists and maps
- **concat**(*string*, *\*strings*) -> `string`: Concatenate given strings into a single one.
- **contains**(*string*, *substring*) -> `bool`: Return whether substring can be found in the string.
- **contains**(*string*, *regex*) -> `bool`: Return whether the given regular expression matched the string.
- **contains**(*list*, *item*) -> `bool`: Return whether the given item was found in the list.
- **contains**(*map*, *key*) -> `bool`: Return whether the given key was found in the map.
- **first**(*seq*) -> `T`: Get first char of string or first item of list.
- **last**(*seq*) -> `T`: Get last char of string or last item of list.
- **len**(*seq*) -> `int`: Get number of chars in string or number of items in list.
- **get**(*string*, *index*, *default?*) -> `any`: If target is a string, return the nth unicode char. If target is a list, return the nth item. Indices are zero-based and can be negative to access items in reverse. If target is a map, return the value associated with given key. All variants can also take a default value when desired item is not found.
- **get**(*list*, *index*, *default?*) -> `any`: If target is a string, return the nth unicode char. If target is a list, return the nth item. Indices are zero-based and can be negative to access items in reverse. If target is a map, return the value associated with given key. All variants can also take a default value when desired item is not found.
- **get**(*map*, *key*, *default?*) -> `any`: If target is a string, return the nth unicode char. If target is a list, return the nth item. Indices are zero-based and can be negative to access items in reverse. If target is a map, return the value associated with given key. All variants can also take a default value when desired item is not found.
- **slice**(*seq*, *start*, *end?*) -> `seq`: Return slice of string or list.
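
The three `get` variants share one idea: look something up and fall back to a default when it is missing. A minimal Python model, assuming a missing item yields the default (or nothing when no default is given), could look like this. The helper below is hypothetical.

```python
def get(target, key, default=None):
    """Look up an index (string, list) or a key (map), falling back to default."""
    try:
        return target[key]
    except (IndexError, KeyError):
        return default


print(get("abc", 0))                # a
print(get("abc", -1))               # c (negative indices count from the end)
print(get([1, 2, 3], 5, 0))         # 0 (index out of range, default used)
print(get({"a": 1}, "a"))           # 1
print(get({"a": 1}, "b", "n/a"))    # n/a
```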
## Lists
- **all**(*list*, *lambda*) -> `bool`: Returns whether the given lambda returned true for all elements of the list.
For instance: `all(names, name => name.startswith('A'))`
- **any**(*list*, *lambda*) -> `bool`: Returns whether the given lambda returned true for any element of the list.
For instance: `any(names, name => name.startswith('A'))`
- **compact**(*list*) -> `list`: Drop all falsey values from given list.
- **filter**(*list*, *lambda*) -> `list`: Return a list containing only elements for which given lambda returned true.
For instance: `filter(names, name => name.startswith('A'))`
- **find**(*list*, *lambda*) -> `any?`: Return the first item of a list for which given lambda returned true.
For instance: `find(names, name => name.startswith('A'))`
- **find_index**(*list*, *lambda*) -> `int?`: Return the index of the first item of a list for which given lambda returned true.
For instance: `find_index(names, name => name.startswith('A'))`
- **index_by**(*list*, *key*) -> `map`: Take a list of maps and a key name and return an indexed map from selected keys to the original maps.
- **join**(*list*, *sep*) -> `string`: Join list by separator.
- **map**(*list*, *lambda*) -> `list`: Return a list with elements transformed by given lambda.
For instance: `map(numbers, n => n + 3)`
- **mean**(*numbers*) -> `number?`: Return the mean of the given numbers.
- **sum**(*numbers*) -> `number?`: Return the sum of the given numbers, or nothing if the sum overflowed.
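
The list functions above map closely onto Python comprehensions and generators. The sketch below models a few of them under stated assumptions: `index_by` is a hypothetical reimplementation, and letting the last map win on duplicate keys is an assumption rather than documented behavior.

```python
def index_by(maps, key):
    """Build a map from each item's value at `key` to the item itself."""
    return {m[key]: m for m in maps}


people = [{"id": "a1", "name": "Ada"}, {"id": "b2", "name": "Bob"}]
indexed = index_by(people, "id")

names = ["Bob", "Ada", "Alan"]
filtered = [n for n in names if n.startswith("A")]             # filter
found = next((n for n in names if n.startswith("A")), None)    # find
found_index = next(
    (i for i, n in enumerate(names) if n.startswith("A")), None
)                                                              # find_index
mapped = [n + 3 for n in [1, 2, 3]]                            # map
compacted = [v for v in [0, "a", "", None, 2] if v]            # compact

print(indexed["b2"]["name"])  # Bob
print(filtered)               # ['Ada', 'Alan']
print(found, found_index)     # Ada 1
print(mapped)                 # [4, 5, 6]
print(compacted)              # ['a', 2]
```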
## Maps
- **keys**(*map*) -> `[string]`: Return a list of the map's keys.
- **values**(*map*) -> `[T]`: Return a list of the map's values.
## Dates & time
- **datetime**(*string*, *format=?*, *timezone=?*) -> `datetime`: Parse a string as a datetime according to format and timezone. If no format is provided, string is parsed as ISO 8601 date format. Default timezone is the system timezone.
See format conversion specifications documentation here:
https://docs.rs/jiff/latest/jiff/fmt/strtime/index.html#conversion-specifications
- **earliest**(*datetime1*, *datetime2*, *\*datetimen*) -> `datetime`: Return the earliest datetime.
- **earliest**(*list_of_datetimes*) -> `datetime`: Return the earliest datetime.
- **latest**(*datetime1*, *datetime2*, *\*datetimen*) -> `datetime`: Return the latest datetime.
- **latest**(*list_of_datetimes*) -> `datetime`: Return the latest datetime.
- **strftime**(*target*, *format*) -> `string`: Format target (a time in ISO 8601 format, or the result of datetime() function) according to format.
- **timestamp**(*number*) -> `datetime`: Parse a number as a POSIX timestamp in seconds (number of seconds since 1970-01-01 00:00:00 UTC), and convert it to a datetime in local time.
- **timestamp_ms**(*number*) -> `datetime`: Parse a number as a POSIX timestamp in milliseconds (number of milliseconds since 1970-01-01 00:00:00 UTC), and convert it to a datetime in local time.
- **to_timezone**(*target*, *timezone_in*, *timezone_out*) -> `datetime`: Parse target (a time in ISO 8601 format, or the result of datetime() function) in timezone_in, and convert it to timezone_out.
- **to_local_timezone**(*target*) -> `datetime`: Parse target (a time in ISO 8601 format, or the result of datetime() function) and convert it to the system's local timezone.
- **year_month_day**(*target*) -> `string` (aliases: **ymd**): Extract the year, month and day of a datetime. If the input is a string, first parse it into datetime, and then extract the year, month and day.
Equivalent to `strftime(string, format="%Y-%m-%d")`.
- **month_day**(*target*) -> `string`: Extract the month and day of a datetime. If the input is a string, first parse it into datetime, and then extract the month and day.
Equivalent to `strftime(string, format="%m-%d")`.
- **month**(*target*) -> `string`: Extract the month of a datetime. If the input is a string, first parse it into datetime, and then extract the month.
Equivalent to `strftime(string, format="%m")`.
- **year**(*target*) -> `string`: Extract the year of a datetime. If the input is a string, first parse it into datetime, and then extract the year.
Equivalent to `strftime(string, format="%Y")`.
- **year_month**(*target*) -> `string` (aliases: **ym**): Extract the year and month of a datetime. If the input is a string, first parse it into datetime, and then extract the year and month.
Equivalent to `strftime(string, format="%Y-%m")`.
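
The relationship between `timestamp`, `timestamp_ms` and the `strftime` shorthands can be sketched with Python's `datetime` module. Note one deliberate difference: the functions above convert to local time, whereas this sketch pins everything to UTC so that its output is reproducible. The function names mirror the ones above but are Python stand-ins.

```python
from datetime import datetime, timezone


def timestamp(seconds):
    """Seconds since 1970-01-01 00:00:00 UTC, as a datetime (UTC here)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def timestamp_ms(milliseconds):
    """Milliseconds since 1970-01-01 00:00:00 UTC, as a datetime (UTC here)."""
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)


day = timestamp(86400)  # exactly one day after the epoch

print(day.strftime("%Y-%m-%d"))          # 1970-01-02 (year_month_day)
print(day.strftime("%Y-%m"))             # 1970-01 (year_month)
print(day.strftime("%m-%d"))             # 01-02 (month_day)
print(timestamp_ms(86_400_000) == day)   # True
```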
## Urls & web-related
- **html_unescape**(*string*) -> `string`: Unescape given HTML string by converting HTML entities back to normal text.
- **lru**(*string*) -> `string`: Convert the given URL to LRU format.
For more info, read this: https://github.com/medialab/ural#about-lrus
- **mime_ext**(*string*) -> `string`: Return the extension related to given mime type.
- **parse_dataurl**(*string*) -> `[string, bytes]`: Parse the given data URL and return its mime type and decoded binary data.
- **urljoin**(*string*, *string*) -> `string`: Join a URL with the given addendum.
## Fuzzy matching & information retrieval
- **fingerprint**(*string*) -> `string`: Fingerprint a string by normalizing characters, re-ordering and deduplicating its word tokens before re-joining them by spaces.
- **carry_stemmer**(*string*) -> `string`: Apply the "Carry" stemmer targeting the French language.
- **s_stemmer**(*string*) -> `string`: Apply a very simple stemmer removing common plural inflexions in some languages.
- **unidecode**(*string*) -> `string`: Convert string to ascii as well as possible.
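
To give an intuition of what fingerprinting does, here is a rough Python sketch following the steps named above (normalize, tokenize, deduplicate, re-order, re-join). The exact normalization and tokenization rules used by `fingerprint` are not specified here, so the choices below (accent stripping via NFKD, lowercasing, alphanumeric tokens, alphabetical order) are assumptions.

```python
import re
import unicodedata


def fingerprint(text):
    """Approximate fingerprint: normalized, deduplicated, sorted word tokens."""
    # Decompose accented characters, then drop what is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Lowercase and keep runs of letters and digits as tokens.
    tokens = re.findall(r"[a-z0-9]+", ascii_only.lower())
    # Deduplicate, sort and join by single spaces.
    return " ".join(sorted(set(tokens)))


print(fingerprint("Café  the THE cafe!"))  # cafe the
print(fingerprint("Doe, John") == fingerprint("john DOE"))  # True
```

Two strings sharing a fingerprint are good candidates for being variants of the same value, which is what makes this useful for fuzzy matching.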
## Utils
- **col**(*name_or_pos*, *nth?*) -> `bytes`: Return value of cell for given column, by name, by position or by name & nth, in case of duplicate header names.
- **col?**(*name_or_pos*, *nth?*) -> `bytes`: Return value of cell for given column, by name, by position or by name & nth, in case of duplicate header names. Allow selecting nonexistent columns, in which case it will return null.
- **header**(*name_or_pos*, *nth?*) -> `bytes`: Return header name for given column, by name, by position or by name & nth, in case of duplicate header names.
- **header?**(*name_or_pos*, *nth?*) -> `bytes`: Return header name for given column, by name, by position or by name & nth, in case of duplicate header names. Allow selecting nonexistent columns, in which case it will return null.
- **col_index**(*name_or_pos*, *nth?*) -> `bytes`: Return zero-based index of given column, by name, by position or by name & nth, in case of duplicate header names.
- **col_index?**(*name_or_pos*, *nth?*) -> `bytes`: Return zero-based index of given column, by name, by position or by name & nth, in case of duplicate header names. Allow selecting nonexistent columns, in which case it will return null.
- **cols**(*from_name_or_pos?*, *to_name_or_pos?*) -> `list[bytes]`: Return list of cell values from the given column by name or position to another given column by name or position, inclusive. Can also be called with a single argument to take a slice from the given column to the end, or no argument at all to take all columns.
- **err**(*msg*) -> `error`: Make the expression return a custom error.
- **headers**(*from_name_or_pos?*, *to_name_or_pos?*) -> `list[string]`: Return list of header names from the given column by name or position to another given column by name or position, inclusive. Can also be called with a single argument to take a slice from the given column to the end, or no argument at all to return all headers.
- **index**() -> `int?`: Return the row's index, if applicable.
- **regex**(*string*) -> `regex`: Parse given string as regex. Useful when your patterns are dynamic, e.g. built from a CSV cell. Otherwise, prefer using regex literals, e.g. `/test/`.
- **typeof**(*value*) -> `string`: Return type of value.
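
Column selection by name & nth is easiest to grasp on a file with duplicate header names. The Python model below is hypothetical: it assumes positions and `nth` are both zero-based, and uses `col_or_null` as a stand-in for `col?`.

```python
def col(headers, row, name_or_pos, nth=0):
    """Return a cell by position, or by header name and nth occurrence."""
    if isinstance(name_or_pos, int):
        return row[name_or_pos]
    positions = [i for i, h in enumerate(headers) if h == name_or_pos]
    return row[positions[nth]]


def col_or_null(headers, row, name_or_pos, nth=0):
    """Like col, but return None instead of failing on a missing column."""
    try:
        return col(headers, row, name_or_pos, nth)
    except IndexError:
        return None


headers = ["name", "score", "score"]
row = ["Ada", "12", "15"]

print(col(headers, row, "name"))         # Ada
print(col(headers, row, 1))              # 12
print(col(headers, row, "score", 1))     # 15 (second "score" column)
print(col_or_null(headers, row, "age"))  # None
```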
## IO & path wrangling
- **abspath**(*string*) -> `string`: Return absolute & canonicalized path.
- **cmd**(*string*, *list[string]*) -> `bytes`: Run a command using the provided list of arguments as a subprocess and return the resulting bytes trimmed of trailing whitespace.
- **copy**(*source_path*, *target_path*) -> `string`: Copy a source to target path. Will create necessary directories on the way. Returns target path as a convenience.
- **ext**(*path*) -> `string?`: Return the path's extension, if any.
- **filesize**(*string*) -> `int`: Return the size of given file in bytes.
- **isfile**(*string*) -> `bool`: Return whether the given path is an existing file on disk.
- **move**(*source_path*, *target_path*) -> `string`: Move a source to target path. Will create necessary directories on the way. Returns target path as a convenience.
- **parse_json**(*string*) -> `any`: Parse the given string as JSON.
- **pathjoin**(*string*, *\*strings*) -> `string` (aliases: **pjoin**): Join multiple paths correctly.
- **read**(*path*, *encoding="utf-8"*, *errors="replace"*) -> `string`: Read file at path. Default encoding is "utf-8". Default error handling policy is "replace", and can be one of "replace", "ignore" or "strict".
- **read_csv**(*path*) -> `list[map]`: Read and parse CSV file at path, returning its rows as a list of maps with headers as keys.
- **read_json**(*path*) -> `any`: Read and parse JSON file at path.
- **shell**(*string*) -> `bytes`: Convenience function running the given string through `$SHELL -c` on unix-like systems and `cmd /C` on Windows.
- **shlex_split**(*string*) -> `list[string]`: Split a string of command line arguments into a proper list that can be given to e.g. the `cmd` function.
- **write**(*string*, *path*) -> `string`: Write string to path as utf-8 text. Will create necessary directories recursively before actually writing the file. Return the path that was written.
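
The pairing of `shlex_split` with `cmd` mirrors a familiar Python idiom: split a command line into arguments, then run it as a subprocess without going through a shell. The sketch below uses Python's `shlex` and `subprocess` modules as an analogy; the `cmd` function here is a hypothetical stand-in and quoting rules may differ from the tool's.

```python
import shlex
import subprocess
import sys


def cmd(program, arguments):
    """Run a subprocess and return its output, trimmed of trailing whitespace."""
    result = subprocess.run([program, *arguments], capture_output=True, check=True)
    return result.stdout.rstrip()


# Quoted segments stay together as a single argument.
args = shlex.split("grep -i 'hello world' notes.txt")
print(args)  # ['grep', '-i', 'hello world', 'notes.txt']

# Run the current Python interpreter so the example works on any platform.
print(cmd(sys.executable, ["-c", "print('hi')"]))  # b'hi'
```

Passing arguments as a list avoids shell interpolation surprises, which is the main reason to prefer `cmd` over `shell` when arguments come from CSV cells.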
## Randomness & hashing
- **md5**(*string*) -> `string`: Return the md5 hash of string in hexadecimal representation.
- **random**() -> `float`: Return a random float between 0 and 1.
- **uuid**() -> `string`: Return a UUID v4.
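
For comparison, the same three operations exist in Python's standard library. The snippet below is an analogy only; in particular, hashing the UTF-8 bytes of the string is an assumption about how `md5` treats its input.

```python
import hashlib
import random
import uuid

# Hexadecimal MD5 digest of a string (UTF-8 encoding assumed).
digest = hashlib.md5("hello".encode("utf-8")).hexdigest()
print(digest)  # 5d41402abc4b2a76b9719d911017c592

# Random float in the half-open interval [0, 1).
value = random.random()

# Random (version 4) UUID, rendered as a 36-character string.
identifier = str(uuid.uuid4())
```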