## Custom languages support

Support for custom languages can be added by placing definition files in the languages directory, found at:

* *Linux*: uses `XDG_CONFIG_HOME`, usually translates to `~/.config/ecode/languages`
* *macOS*: uses `Application Support` folder in `HOME`, usually translates to `~/Library/Application Support/ecode/languages`
* *Windows*: uses `APPDATA`, usually translates to `C:\Users\{username}\AppData\Roaming\ecode\languages`

ecode reads every file with the `json` extension located in that directory. Each file can contain one or several language definitions.

* **Single Language:** If the root element of the JSON file is an object, it defines a single language.
* **Multiple Languages / Sub-Grammars:** If the root element is an array, it can define multiple independent languages *or* a main language along with **sub-language definitions** used for nesting within the main language (see Nested Syntaxes below). Each object in the array must be a complete language definition with at least a unique `"name"`.

Language definitions can override any currently supported definition. ecode will prioritize user-defined definitions. Sub-language definitions used only for nesting might not need fields like `"files"` or `"headers"` if they aren't intended to be selectable top-level languages.

### Language definition format

```json
{
  "name": "language_name", // (Required) The display name of the language. Must be unique, especially if referenced by other definitions for nesting or includes.
  "files": [ // (Required if `visible` is `true`) An array of Lua patterns matching filenames for this language.
    "%.ext$", // Example: Matches files ending in .ext
    "^Makefile$" // Example: Matches the exact filename Makefile
  ],
  "comment": "//", // (Optional) Sets the single-line comment string used for auto-comment functionality.
  "patterns": [ // (Required) An array defining syntax highlighting rules. See "Pattern Rule Types" below for details.
    // ... pattern rules defined here ...
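    // Illustrative sketch only, not part of the format reference: three rules reused from
    // examples elsewhere in this section, placed together to suggest what a small "patterns"
    // array might look like. The selection and ordering of these rules is an assumption.
    { "pattern": "//.*", "type": "comment" },              // single-line comment
    { "pattern": ["\"", "\"", "\\\\"], "type": "string" }, // double-quoted string with an escape character
    { "pattern": "[%a_][%w_]*", "type": "symbol" }         // words are looked up in "symbols"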
  ],
  "repository": { // (Optional) A collection of named pattern sets for reuse within this language definition.
    // Keys are repository item names (e.g., "comments", "strings", "expressions").
    // Values are arrays of pattern rules, following the same format as the main "patterns" array.
    // These items can be referenced in any "patterns" array using an "include" rule (e.g., { "include": "#comments" }).
    "my_reusable_rules": [
      { "pattern": "foo", "type": "keyword" },
      { "pattern": ["bar_start", "bar_end"], "type": "string" }
    ]
  },
  "symbols": [ // (Optional) An array defining specific types for exact words, primarily used in conjunction with patterns having `type: "symbol"`.
    // Structure: An array where each element is an object containing exactly one key-value pair.
    // - The key is the literal word (symbol) to match.
    // - The value is the `type_name` to apply when that word is matched via a `type: "symbol"` pattern rule.
    // How it works:
    // When a pattern rule results in a `type: "symbol"` match (either for the whole pattern or a capture group),
    // the actual text matched by that pattern/group is looked up within this `symbols` array.
    // The editor iterates through the array, checking if the key of any object matches the text.
    // If a match is found (e.g., the text is "if" and an object `{ "if": "keyword" }` exists),
    // the corresponding value ("keyword") is used for highlighting.
    // If the matched text is not found as a key in any object within this array, the highlighting typically falls back to the "normal" type.
    // This mechanism is essential for highlighting keywords, built-in constants/literals, and other reserved words
    // that might otherwise be matched by more general patterns (like a pattern for all words).
    // Example:
    { "if": "keyword" }, // If a `type: "symbol"` pattern matches "if", it will be highlighted as "keyword".
    { "else": "keyword" },
    { "true": "literal" }, // If a `type: "symbol"` pattern matches "true", it will be highlighted as "literal".
    { "false": "literal" },
    { "MyClass": "type" }, // If a `type: "symbol"` pattern matches "MyClass", it will be highlighted as "type".
    { "begin": "keyword" },
    { "end": "keyword" }
    // ... add other specific words and their types as needed ...
  ],
  "headers": [ // (Optional) Array of Lua patterns to identify file type by reading the first few lines (header).
    "^#!.*[ /]bash" // Example: Identifies bash scripts like '#!/bin/bash'
  ],
  "visible": true, // (Optional) If true (default), language appears in main selection menus. Set to false for internal/helper languages.
  "case_insensitive": false, // (Optional) If true, pattern matching ignores case. Default is false (case-sensitive).
  "auto_close_xml_tag": false, // (Optional) If true, enables auto-closing of XML/HTML tags (e.g., typing `<div>` automatically adds `</div>`). Default is false.
  "extension_priority": false, // (Optional) If true, this language definition takes priority if multiple languages define the same file extension. Default is false.
  "lsp_name": "language_server_name", // (Optional) Specifies the name recognized by Language Servers (LSP). Defaults to the 'name' field in lowercase if omitted.
  "fold_range_type": "braces", // (Optional) Specifies the strategy used to detect foldable code regions. Default behavior if omitted might be no folding or a global default. Possible values:
  // - "braces": Folding is determined by matching pairs of characters (defined in `fold_braces`). Suitable for languages like C, C++, Java, JavaScript, JSON.
  // - "indentation": Folding is determined by changes in indentation level. Suitable for languages like Python, YAML, Nim.
  // - "tag": Folding is based on matching HTML/XML tags (e.g., `<div>...</div>`). Suitable for HTML, XML, SVG.
  // - "markdown": Folding is based on Markdown header levels (e.g., `## Section Title`). Suitable for Markdown.
  "fold_braces": [ // (Required *only if* `fold_range_type` is "braces") Defines the pairs of characters used for brace-based folding.
    // This is an array of objects, where each object specifies a starting and ending character pair.
    { "start": "{", "end": "}" }, // Example: Standard curly braces
    { "start": "[", "end": "]" }, // Example: Square brackets
    { "start": "(", "end": ")" } // Example: Parentheses
  ]
}
```

#### Pattern Rule Types

The `"patterns"` array is the core of syntax highlighting. It contains an ordered list of rules that ecode attempts to match against the text. Each rule is an object. Here are the types of rules you can define:

```json
"patterns": [
  // --- Simple Rules ---
  // Rule using Lua patterns:
  { "pattern": "lua_pattern", "type": "type_name" },
  // Rule using Lua patterns with capture groups mapping to different types:
  { "pattern": "no_capture(pattern_capture_1)(pattern_capture_2)", "type": [ "no_capture_type_name", "capture_1_type_name", "capture_2_type_name" ] },
  // Rule using Perl-compatible regular expressions (PCRE):
  { "regex": "perl_regex", "type": "type_name" },
  // Rule using PCRE with capture groups mapping to different types:
  { "regex": "no_capture(pattern_capture_1)(pattern_capture_2)", "type": [ "no_capture_type_name", "capture_1_type_name", "capture_2_type_name" ] },

  // --- Multi-line Block Rules (Lua Patterns or PCRE) ---
  // Defines a block spanning multiple lines using start/end patterns and an optional escape character.
  // These rules can use either "pattern": ["start", "end", "escape?"] for Lua patterns
  // or "regex": ["start_regex", "end_regex", "escape_char?"] for PCRE.
  // "type" styles the start delimiter and the optional "end_type" styles the end delimiter;
  // when "end_type" is omitted, "type" is used for both delimiters.
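  // Concrete illustration (a hedged sketch: the delimiters are assumptions chosen for a
  // C-like language and are not taken from this reference). A block comment running from
  // "/*" to "*/"; in Lua patterns "*" is a quantifier, so it is escaped with "%".
  { "pattern": ["/%*", "%*/"], "type": "comment" },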
  //
  // Basic usage (same type for start and end delimiters):
  { "pattern": ["lua_pattern_start", "lua_pattern_end", "escape_character"], "type": "type_name" },
  // Using different types for start and end delimiters:
  { "regex": ["regex_start", "regex_end"], "type": "start_type_name", "end_type": "end_type_name" },
  // Using capture groups with different types for start and end delimiters:
  { "pattern": ["start_nocap(scap1)", "end_nocap(ecap1)(ecap2)", "escape"], "type": ["start_nocap_type", "start_cap1_type"], "end_type": ["end_nocap_type", "end_cap1_type", "end_cap2_type"] },

  // --- Contextual Patterns within Blocks (Inner Patterns) ---
  // Multi-line block rules can define their own "patterns" array to apply specific rules
  // to the content *between* their start and end delimiters. This allows for more granular
  // highlighting within a block without needing to define a full sub-language via the "syntax" key.
  {
    "regex": ["<section>", "</section>"], // Defines the block
    "type": "keyword", // Type for "<section>" delimiter
    "end_type": "keyword", // Type for "</section>" delimiter
    "patterns": [ // (Optional) Inner patterns for content *inside* <section>...</section>
      { "regex": "highlight_this_inside", "type": "function" },
      { "include": "#common_section_rules" } // Can also include repository items
      // These inner patterns are matched only against the text between "<section>" and "</section>".
      // Inner patterns can themselves be block rules with their own inner patterns, allowing for nested contextual highlighting.
    ]
  },
  // Note: If a block rule includes both inner "patterns" and a "syntax" key (for nested languages),
  // the "syntax" key typically takes precedence, causing the content to be highlighted by the
  // specified sub-language. Inner "patterns" are primarily for applying rules from the
  // *current* language's context or ad-hoc rules specifically to the content of this block.

  // --- Custom Parser Rule ---
  // Rule using a custom parser implemented in native code:
  { "parser": "custom_parser_name", "type": "type_name" },

  // --- Symbol Lookup Rule ---
  // Rule assigning the "symbol" type, for lookup in the language's "symbols" definition (see "symbols" in the language definition format):
  { "pattern": "[%a_][%w_]*", "type": "symbol" },
  // Rule using "symbol" within capture groups:
  { "pattern": "(%s%-%a[%w_%-]*%s+)(%a[%a%-_:=]+)", "type": [ "normal", "function", "symbol" ] },

  // --- Include Rules ---
  // Allows reusing sets of patterns defined elsewhere in the grammar.
  // This helps in organizing complex grammars and avoiding repetition.
  {
    "include": "#repository_item_name" // Includes rules from the 'repository_item_name' entry
                                       // in the top-level "repository" object of this language definition.
                                       // The '#' prefix is mandatory for repository items.
  },
  {
    "include": "$self" // Includes all rules from the main top-level "patterns" array
                       // of the *current* language definition. This is useful for
                       // recursive definitions, such as nested expressions or blocks.
  },
  // Example using "include" with a "repository":
  // "repository": {
  //   "comments_and_strings": [
  //     { "pattern": "//.*", "type": "comment" },
  //     { "pattern": ["\"", "\"", "\\\\"], "type": "string" }
  //   ]
  // },
  // "patterns": [
  //   { "include": "#comments_and_strings" },
  //   // ... other rules ...
  // ]

  // --- NESTED SYNTAX RULE ---
  // Defines a multi-line block that switches to a DIFFERENT language syntax inside.
  // Uses the same multi-line pattern/regex format and delimiter styling options ("type", "end_type").
  {
    "pattern": ["lua_pattern_start", "lua_pattern_end", "escape_character"], // Can also use "regex"
    "syntax": "NestedLanguageName", // (Required for nesting) The 'name' of another language definition.
    "type": "start_delimiter_type", // (Optional) Type(s) for the START delimiter.
    "end_type": "end_delimiter_type" // (Optional) Type(s) for the END delimiter.
  }
  // This is distinct from "Contextual Patterns within Blocks (Inner Patterns)".
  // The "syntax" key switches highlighting to a completely different, pre-defined language
  // for the content within the delimiters. Inner "patterns", on the other hand, apply a
  // specific set of rules from the *current* language's context or ad-hoc rules to the block's content.
]
```

### Nested Syntaxes (Sub-Grammars)

ecode supports **nested syntaxes**, allowing a block of code within one language to be highlighted according to the rules of another language. This is crucial for accurately representing modern languages that often embed other languages or domain-specific languages.

**How it works:**

1. **Define Sub-Languages:** Define the syntax for the language to be embedded (e.g., "CSS", "JavaScript", "XML", "SQL") as a separate language definition. Often, these are defined within the *same JSON file* as the main language, using a JSON array as the root element (see [Custom languages support](#custom-languages-support)). The sub-language definition needs a unique `"name"`.
2. **Reference in Patterns:** In the main language's `"patterns"`, use a multi-line block rule (`pattern` or `regex` array). Add the `"syntax"` key to this rule, setting its value to the `"name"` of the sub-language definition you want to use inside the block.
3. **Highlighting:** When ecode encounters this block, it applies the highlighting rules from the specified sub-language to the content *between* the start and end delimiters. The delimiters themselves are styled according to the `type` (and `end_type`) specified in the *outer* rule.

**Contrast with Inner Patterns:** While the `syntax` key is used to embed an *entirely different language* within a block, multi-line block rules can also contain their own `patterns` array (see "Contextual Patterns within Blocks" under [Pattern Rule Types](#pattern-rule-types)). This inner `patterns` array allows for defining specific highlighting rules for the content *within* the block using rules from the current language's context or ad-hoc rules. This is useful when a full language switch isn't necessary but more granular control over the block's content highlighting is desired (e.g., highlighting specific keywords differently only within a certain type of block in the parent language).

**Example Use Cases for `syntax` key:**

* HTML files containing `