# The Wonderful KFG Format

The **KFG** format is the **Kung-Fig** file format, and it does wonders for all your config and data files.
It's like .cfg on steroids!
Once you start using it, **you won't use anything else anymore!**

**KFG** is primarily a **human-friendly format for describing data**
(i.e. a [data serialization language](https://en.wikipedia.org/wiki/Serialization)) but with an impressive list of features:

* Human-friendly data structure representation (similar to YAML, but potentially better)
* Comments support
* Multi-line strings support, with or without newline **folding**
* Sections
* Nice map and dictionary syntax
* Classes/Constructors (date, binary data, regular expression, and custom constructors!)
* Including files (.kfg, .json, .js, .txt, etc), featuring **globs** and **recursive parent search**
* Relational data representation support
* Meta-tags (headers)
* Tags (to build scripting languages on top of KFG)
* References (i.e. referencing a part of the document from elsewhere)
* Template strings and internationalization/localization
* Expressions (arithmetic, logic, maths, etc)
* Tree operations syntax (merge, combine, etc)
* *... and many more!*

Stop using JSON for configuration files, use KFG now!

**This documentation is still a work in progress.**

# Language References

### Table of Contents

* [A Bit of History](#history)
* [Getting Started](#getting-started)
* [Comments](#ref.comments)
* [Constants](#ref.constants)
* [Numbers](#ref.numbers)
* [Strings](#ref.strings)
	* [Implicit strings](#ref.strings.implicit)
	* [Quoted strings](#ref.strings.quoted)
	* [Introduced strings](#ref.strings.introduced)
	* [Multi-line strings](#ref.strings.multiline)
	* [Multi-line folded strings](#ref.strings.multiline-folded)
* [Hierarchical Data Representation - Containers](#ref.hierarchical)
* [Arrays](#ref.arrays)
* [Objects](#ref.objects)
* [Maps](#ref.maps)
* [Dictionaries](#ref.dictionnaries)
* [Sections](#ref.sections)
	* [Array's Element Sections](#ref.sections.array)
	* [Object's Key/Value Sections](#ref.sections.object)
* [Classes/Constructors](#ref.constructors)
* [Built-in constructors](#ref.builtin-constructors)
* [Tags](#ref.tags)
* [Meta Tags](#ref.meta-tags)
* [Includes](#ref.includes)
	* [Recursive Parent Search](#ref.includes.recursive-parent-search)
	* [Glob: including multiple files at once](#ref.includes.glob)
	* [Local reference: including a sub-tree of a document](#ref.includes.local-reference)
	* [Relational Data Representation](#ref.includes.relational)
* [Refs](#ref.refs)
* [Template Sentences](#ref.template-sentences)
* [Expressions](#ref.expressions)
	* [Built-in Expressions Operators](#ref.expressions.builtin-operators)
* [Operators](#ref.operators)

## A Bit of History

It all started back in 2009, when *Cédric Ronvel* was frustrated that JSON would be a great format for writing config files, if only it supported comments and were less nitpicky about commas.
He ended up writing a parser for a human-friendly format: something like JSON without braces, brackets and commas, with optional double-quotes, relying on indentation for hierarchical data representation, and with a simple syntax to perform operations.
The result is very close to YAML, although it is worth noting that it was designed *before* he was even aware of YAML's existence.
That very first KFG implementation was written for PHP and was not publicly released.

* In 2014, the KFG file format was resurrected: it was ported to Node.js and became part of some obscure vaporware projects.
* It underwent a fundamental redesign in 2015, then was publicly released for the first time.
* **Custom classes/constructors** were added in 2015.
* **Tags** were added in 2016 to support the creation of simple scripting languages.
* **Refs**, **templates** and **expressions** were added in 2016, also to support the creation of simple scripting languages.
* The **section** and **map/dictionary** syntaxes were added in 2018 to ease the creation of localization langpacks.

The philosophy of KFG focuses on a human-friendly, intuitive and natural syntax, on covering all kinds of data models, and on being line-based:
each line of KFG can be parsed as a stand-alone line, the only cross-line step being the reconstruction of the hierarchy.

## Getting started

If you have already used YAML, KFG will look familiar to you.

For example:

```
first-name: Joe
last-name: Doe
```

... will produce `{ "first-name": "Joe" , "last-name": "Doe" }`.

```
fruits:
	- banana
	- apple
	- pear
```

... will produce `{ "fruits": [ "banana" , "apple" , "pear" ] }`.

Since there are no braces to delimit blocks in KFG, it is the indentation that produces the hierarchy.
Here the array is the child of the *fruits* property of the top-level object.

Note that **tabs SHOULD be used** to indent in KFG.
This is the **recommended** way: one tab per depth-level.
If you really insist on using spaces, KFG only supports 4-space indentation.
But this is *not* recommended.

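As a rough illustration of the indentation rule above, the depth of a line could be computed as follows. This is only a sketch, not the Kung-Fig parser: the `indentDepth` name is hypothetical, and rejecting mixed indentation or space runs that are not a multiple of 4 is an assumption about how strictly the rule is enforced.

```javascript
// Hypothetical helper, not part of Kung-Fig: returns the depth level of a line.
// Assumptions: one tab per level, or exactly 4 spaces per level; mixing both is rejected.
function indentDepth( line ) {
	const indent = line.match( /^[ \t]*/ )[ 0 ] ;

	// Tabs only (or no indentation at all): one level per tab
	if ( ! indent.includes( ' ' ) ) { return indent.length ; }

	if ( indent.includes( '\t' ) ) { throw new SyntaxError( 'Mixed tabs and spaces' ) ; }
	if ( indent.length % 4 ) { throw new SyntaxError( 'Space indentation must be a multiple of 4' ) ; }

	return indent.length / 4 ;
}
```

With this sketch, `indentDepth( '\t\t- pear' )` returns `2` and `indentDepth( '    - pear' )` returns `1`.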
Note how objects and arrays are implicit in KFG.
A *node* is an object if it contains a key followed by a colon `:`.
A *node* is an array if it contains array elements introduced by a hyphen `-`.

Some of the supported scalar types are:

```
number: 123.456
true-boolean1: true
true-boolean2: yes
true-boolean3: on
false-boolean1: false
false-boolean2: no
false-boolean3: off
null-value: null
```

There are many ways to write a string: implicit string, quoted string, introduced string, multi-line string with or without newline folding:

```
string1: This is an implicit string.
string2: "This is a quoted string.\nThis is on a new line."
string3: > This is a literal string. \n <-- this 'backslash n' is literal and does not produce a newline.
string4:
	> This is a multi-line string.
	> This is on a new line.
	>
	> The previous line is blank.
string5:
	>> This is a multi-line string, with newline folding.
	>> This is on the first line, not on the second one.
	>>
	>> This is on a new line, but there is no blank line in between.
	>>
	>>
	>> This is on a new line, there is only one blank line in between.
```

An implicit string (i.e. a string without markup) should not be a constant or a number: in that case, use one of the explicit syntaxes to disambiguate it.

But KFG can do a lot more!

**Using a few built-in constructors, we can store dates or binary data:**

```
date: <date> Fri Jan 02 1970 11:17:36 GMT+0100 (CET)
bin: <bin16> af461e0a
```

... will produce an object with 2 properties: the *date* property will contain a JavaScript `Date` object,
and the *bin* property will contain a `Buffer` instance created from the hexadecimal string.
By the way, the *date* constructor accepts many input formats, like timestamp, ISO, ...

Using the map/dictionary syntax, localization files could be written like this:

```
<<: Hello World!
:>> Salut tout le monde !
<<: How are you?
:>> Comment vas-tu ?
```

The parser will produce a JavaScript `Map` instance, with the English strings as keys, and the French translations as values.

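For readers less familiar with `Map`, the result of the localization example above is roughly equivalent to the following plain JavaScript. The `langpack` and `translate` names are hypothetical, and falling back to the source string is just one possible design, not something KFG mandates.

```javascript
// Plain-JavaScript equivalent of the parsed localization document above.
// 'langpack' and 'translate' are illustrative names, not Kung-Fig API.
const langpack = new Map( [
	[ 'Hello World!' , 'Salut tout le monde !' ] ,
	[ 'How are you?' , 'Comment vas-tu ?' ]
] ) ;

// Look up a translation, falling back to the source string when none exists.
function translate( text ) {
	return langpack.has( text ) ? langpack.get( text ) : text ;
}

console.log( translate( 'How are you?' ) ) ;	// Comment vas-tu ?
```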
Any map could be produced:

```
<:
	first-name: Joe
	last-name: Doe
:>
	first-name: Jane
	last-name: Doe
```

This will produce a map with `{ "first-name": "Joe" , "last-name": "Doe" }` as the key and `{ "first-name": "Jane" , "last-name": "Doe" }` as the value.

For localization files, multi-line strings are supported, with or without newline folding:

```
<<: Hi Bob!
<<: How are you?
:>> Salut Bob !
:>> Comment vas-tu ?
<<<: Hi Alice!
<<<: How are you?
:>>> Salut Alice !
:>>> Comment vas-tu ?
```

This will produce a map with these entries:

* "Hi Bob!\nHow are you?" => "Salut Bob !\nComment vas-tu ?"
* "Hi Alice! How are you?" => "Salut Alice ! Comment vas-tu ?"

**What is wonderful about KFG is that it supports file inclusions:**

```
user: Joe Doe
items: @@items.kfg
```

... this would load the file `items.kfg` and put its content inside the `items` property.
The path is relative to the current file, so assuming `items.kfg` is in the same directory and contains this:

```
- pear
- pencil
- paper
```

... the previous document would be `{ "user": "Joe Doe" , "items": [ "pear" , "pencil" , "paper" ] }`.

**KFG also supports tags:**

```
[message]
	text: Hello world!
	color: blue
```

Parsing that will produce an instance of `TagContainer` that contains a single `Tag` instance, whose name is `message`,
and whose content is `{ "text": "Hello world!" , "color": "blue" }`.

Tags are useful to create scripting languages on top of KFG.
For example, [Spellcast](https://github.com/cronvel/spellcast) is a full-blown scripting language built on top of KFG.

Tags support attributes:

```
[message some:attribute]
	text: Hello world!
	color: blue
```

... but in order to work properly, a constructor should be provided for each tag.
By default, the attributes are a single unparsed and trimmed string, starting after the tag's name and ending before the closing bracket.

## Comments

KFG supports single-line comments, introduced by the hash sign `#`.
A comment **MUST** be on its own line: it cannot be placed after any content, or it would be parsed as part of that content.
A comment can be indented, and can even lie at a nonsensical depth.

So a comment is basically some optional indentation, followed by a hash sign `#`, followed by anything until the end of the line.
The whole line is ignored, so any chars are accepted, even non-printable/control chars (except, of course, the newline char).

Examples of valid and invalid comments:

```
# This is a valid comment
	# This is a valid comment

# If you need multiple lines,
# you should put a # at the
# beginning of each line.

users:
	-	first-name: Joe
		# This is a valid comment
		last-name: Doe
	# This is a valid comment, and it does *NOT* 'close' the current object
		job: developer # This is NOT a comment! It will be included in the string!
```

It will produce:

```
{
	"users": [
		{
			"first-name": "Joe" ,
			"last-name": "Doe" ,
			"job": "developer # This is NOT a comment! It will be included in the string!"
		}
	]
}
```

As you can see, the *job* property contains the hash and everything after it.

## Constants

Constants represent special values.
There are a few of them in KFG:

* `null`: represents the `null` value.
* `true`, `yes`, `on`: they all represent the boolean `true` value.
* `false`, `no`, `off`: they all represent the boolean `false` value.
* `NaN`: a number type whose value is *Not A Number* (e.g.: what we get when we divide zero by zero)
* `Infinity`: a number type whose value is `Infinity`
* `-Infinity`: a number type whose value is `-Infinity`

E.g.:

```
debug: on
```

... would produce `{ "debug": true }`.

## Numbers

Numbers are written down directly.
As anyone would expect, this KFG file will produce `{ "age": 42 }`:

```
age: 42
```

Scientific notation is also supported, like this: `value: 1.23e45`

## Strings

All KFG strings are encoded in **UTF-8**.

There are many string declaration syntaxes in KFG: use the one that best fits your use case.
### Implicit Strings

The most straightforward syntax is the *implicit string*.
For example, this KFG file will produce `{ "name": "Joe Doe" }`:

```
name: Joe Doe
```

Implicit strings are fine, however they should not collide with an existing [constant](#ref.constants),
should not be a valid number, and should not start with a symbol used by the KFG syntax, like:

- spaces and tabs (they are trimmed out)
- double-quote `"`
- less than `<` or greater than `>`
- colon `:`
- opening parenthesis `(`
- at sign `@`
- dollar `$`

Trailing spaces and tabs are trimmed out too.

Lastly, if your string is at top-level, it should not be confused with an object's property or an array's element,
thus it should not contain any colon `:` or start with a hyphen `-`.

Multi-line strings are not supported by the implicit syntax.

If you are in one of those cases, declare your string using one of the following syntaxes.

### Quoted Strings

*Quoted strings* are strings inside double-quotes.
This KFG file will produce `{ "name": "Joe Doe" }`:

```
name: "Joe Doe"
```

Inside a quoted string, all characters are available except three types that have special meanings:

- the double-quote `"` itself should be *escaped* with a backslash: `\"`, otherwise it would mean the end of the string
- the backslash `\` should be *escaped* with another backslash: `\\`, because it is used to start an escape sequence
- all control characters are illegal, they should be represented by a backslash escape sequence, see below

Backslash escape sequences:

- `\b` for the *backspace* control char
- `\f` for the *form feed* control char
- `\n` for the *new line* control char
- `\r` for the *carriage return* control char
- `\t` for the *tab* control char
- `\\` for a single *backslash* `\` char
- `\/` for a single *slash* `/` char (escaping slashes is **optional** and **not recommended**)
- `\"` for the *double-quote* `"` char
- `\uXXXX` for writing a char using its Unicode code point, where *XXXX* is the hexadecimal Unicode
code point; this is **optional**: KFG supports UTF-8 out of the box, so it should be used only if one wants to avoid unusual characters in the source file

Quoted strings do not support multi-line: they should start and end on the same line.
However, the **content** of the string can be multi-line: just insert as many `\n` as you need.

### Introduced Strings

*Introduced strings* are strings introduced by the *greater than* sign `>` followed by a space ` `.
This KFG file will produce `{ "name": "Joe Doe" }`:

```
name: > Joe Doe
```

Everything after the `> ` mark **until the end of the line** will be in the string, without being trimmed.
That means that trailing spaces will be part of the string, as well as extra spaces after the `> ` mark.

Introduced strings are great because they do not need escaping: any char except the new line can be used.
They are left untouched.

Conversely, since chars aren't interpreted, it can be hard to spot bad chars, especially control chars.
If you need to declare a string with control chars, it's best to use a [quoted string](#ref.strings.quoted) and backslash escape sequences.
For anything else, introduced strings are generally preferable to *quoted strings*.

If you need multi-line, use the [multi-line string syntax](#ref.strings.multiline), which is a variant of this syntax.

### Multi-line Strings

The *multi-line string* is a variant of the [introduced string](#ref.strings.introduced).

Just look at this example:

```
description:
	> The KFG format is the Kung-Fig file format.
	> It does wonders for all your config files.
	> It's like .cfg on steroids!
	> Once you start using it, you won't use anything else!
```

As you would expect, it produces an object with a single *description* property containing the whole paragraph,
of course without the initial indentation and the `> ` mark at the start of each line.

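To make the rule above concrete, here is a minimal sketch of how the lines of a multi-line string could be turned into the final value. It is not the actual parser: the `joinMultiline` name is hypothetical, and it assumes the lines have already been identified as belonging to the same string.

```javascript
// Hypothetical helper: builds the string value from the raw lines of a multi-line string.
// Each line is expected to be: indentation, then '>', then optionally a space and the content.
function joinMultiline( rawLines ) {
	return rawLines
		// Drop the indentation and the '> ' mark only: the rest of the line is left untouched
		.map( line => line.replace( /^[ \t]*>( |$)/ , '' ) )
		.join( '\n' ) ;
}

const value = joinMultiline( [
	'\t> The KFG format is the Kung-Fig file format.' ,
	'\t> It does wonders for all your config files.'
] ) ;
// value now holds the two sentences separated by a single '\n', without tabs or '> ' marks
```

Because nothing else is trimmed, extra spaces after the `> ` mark and trailing spaces end up in the string, as described for introduced strings.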
Note that the multi-line string does not start on the same line as the property key *description*, but on the next line,
one level of indentation deeper than its *container*.

All the other rules of [introduced strings](#ref.strings.introduced) apply.

*Multi-line strings* are really great.
Copy-paste any raw text paragraph into your KFG, then prefix each line with `> ` and indent it: *and it just works!*
Your text editor may even do that for you with a few keystrokes.
This works much like quoting in emails.

### Multi-line Folded Strings

The *multi-line folded string* is a variant of the [multi-line string](#ref.strings.multiline).
It uses a double `>>` sign instead of a single `>`.

Just look at this example:

```
description:
	>> The KFG format is the Kung-Fig file format.
	>> It does wonders for all your config files.
	>> It's like .cfg on steroids!
	>> Once you start using it, you won't use anything else!
```

The description property will not contain a 4-line string like it would for a regular multi-line string.
Instead, the lines are folded: only one big line is created.
Before all lines are merged into one, each line is *trimmed*: all consecutive white spaces are removed from the left and the right of the line.

If a true line break is needed, an *empty text line* is required, i.e. a line with a `>>` mark with nothing after it except white spaces.

Example:

```
description:
	>> The KFG format is the Kung-Fig file format.
	>> It does wonders for all your config files.
	>> It's like .cfg on steroids!
	>>
	>> Once you start using it, you won't use anything else!
```

The first 3 lines are merged, but not the last one.

So... if an empty line is needed, two consecutive *empty text lines* should be written, e.g.:

```
description:
	>> The KFG format is the Kung-Fig file format.
	>> It does wonders for all your config files.
	>> It's like .cfg on steroids!
	>>
	>>
	>> Once you start using it, you won't use anything else!
```

## Hierarchical Data Representation - Containers

Non-scalar values are called *containers*.

There are four *container* types in KFG:

* [Arrays](#ref.arrays): an ordered list of elements, it generates a JavaScript `Array` instance
* [Objects](#ref.objects): a kind of map of key/value pairs, where the key is a string, it generates a JavaScript `Object` instance
* [Maps](#ref.maps): a true map of key/value pairs, where the key can be of any type, it generates a `Map` instance
* [Tag Containers](#ref.tags): an ordered list of tags

The indentation is used to denote structure, to express the nested/embedded relationship:
any part that is indented belongs to the element on the closest line above having a smaller indentation level.

Here is a commented document that explains the parent relationship of each element:

```
# The following tag belongs to the root document, which is implicitly a Tag Container
[character Joe]
	# The following key/value pairs belong to the [character] tag above
	name: Joe Doe
	stats:
		# The following key/value pairs belong to the stats object above
		strength: 11
		dexterity: 14
		intelligence: 17
	# The following key/value pair belongs to the [character] tag
	status:
		# The following key/value pair belongs to the status object
		hp: 18
	# The following key/value pair belongs to the [character] tag
	friends:
		# The following elements belong to the friends array above
		- Rebecca
		- Anna
		- Siegfried
```

It is important to understand that **siblings should be of the same type**.
This is incorrect and would cause a parse error:

```
[mytag]
name: Joe Doe
- one
- two
- three
```

Is this document a tag container? An object? An array? This doesn't make any sense.

## Arrays

The array representation in KFG is simply a list where each item/element is introduced by a hyphen `-` followed by a space ` `.
One item/element per line.
The hyphen `-` is widely understood as a list item introducer in various formats.
For example, this would produce `[ "banana" , "apple" , "pear" ]`:

```
- banana
- apple
- pear
```

This is a really simple and easy to read syntax.

**Arrays are implicit**: a *node* is an array as soon as it contains an array element.
Thus an empty array cannot be declared implicitly, since it has no element!
It should be declared explicitly with the [constructor syntax](#ref.constructors).

E.g.:

```
empty: <Array>
```

... would produce `{ "empty": [] }`.

Defining an array of arrays would look like this:

```
-
	- one
	- two
	- three
-
	- four
	- five
	- six
-
	- seven
	- eight
	- nine
```

Indeed, each top-level element is a container, so the nested array should be one level deeper.
However, KFG supports this neat *compact syntax* inspired by YAML:

```
-	- one
	- two
	- three
-	- four
	- five
	- six
-	- seven
	- eight
	- nine
```

That's it: if an element/item of an array is a container (array/object/map), its first child can be put on the same line.
For that purpose, **a tab should be inserted right after the hyphen** `-`.
If you have insisted on using spaces instead of tabs for indentation (something that is **not** recommended),
you should insert exactly 3 spaces (not 4, for alignment reasons) right after the hyphen `-`.

The same syntax with objects inside the array:

```
-	first-name: Joe
	last-name: Doe
-	first-name: Bill
	last-name: Baroud
-	first-name: Jane
	last-name: Doe
```

This would produce:
`[ { "first-name": "Joe" , "last-name": "Doe" } , { "first-name": "Bill" , "last-name": "Baroud" } , { "first-name": "Jane" , "last-name": "Doe" } ]`.

## Element repetition

To repeat one element *n* times, put an integer immediately after the hyphen, followed by an `x` and a colon `:`, **WITHOUT ANY SPACES**:

```
-3x: Alice
-2x: Bob
```

This would produce: `[ "Alice" , "Alice" , "Alice" , "Bob" , "Bob" ]`.

If the element to be repeated is an object, all the repeated elements will reference the very same object.
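
The shared-reference behavior matters as soon as the parsed data is mutated. The sketch below mimics it in plain JavaScript; `repeatElement` is a hypothetical helper written for this illustration, not the Kung-Fig implementation.

```javascript
// Hypothetical helper mimicking '-3x: value': the SAME value is pushed n times, no copy is made.
function repeatElement( array , n , value ) {
	for ( let i = 0 ; i < n ; i ++ ) { array.push( value ) ; }
	return array ;
}

const team = repeatElement( [] , 2 , { name: 'Alice' } ) ;

// Both elements are one and the same object, so a change made through one is visible through the other.
team[ 0 ].name = 'Bob' ;
console.log( team[ 1 ].name ) ;	// Bob
console.log( team[ 0 ] === team[ 1 ] ) ;	// true
```

If independent copies are needed, each element has to be cloned after parsing, for example with `structuredClone()` on runtimes that provide it.
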
## Objects

The object representation in KFG is simply a list of keys, each followed by a colon `:` and then by its value.
There can be any number of spaces before and after the colon.
The syntax is similar to the array syntax, the hyphen being replaced by the property's key and the colon: one property per line.

For example, this would produce `{ "first-name": "Joe" , "last-name": "Doe" , "job": "developer" }`:

```
first-name: Joe
last-name: Doe
job: developer
```

Like arrays, **objects are implicit**: a *node* is an object as soon as it contains one object property.
Thus an empty object cannot be declared implicitly, since it has no property!
It should be declared explicitly with the [constructor syntax](#ref.constructors).

E.g.:

```