# GoEML

**Elementary analysis from one binary operator, in Go.**

GoEML is a Go library and command-line tool built around the EML operator `eml(x, y) = exp(x) - ln(y)`. Every expression is a **uniform binary tree** whose leaves are the constant `1` or indexed variables `x0`, `x1`, …; internal nodes are always `eml`. From that tiny grammar you can reach the usual elementary functions (exponential, logarithm, trig, hyperbolic inverses, arithmetic, and more) through fixed **canonical** compositions.

The mathematics follows Andrzej Odrzywolek, *“All elementary functions from a single binary operator”*, [arXiv:2603.21852](https://arxiv.org/abs/2603.21852) (Jagiellonian University, Institute of Theoretical Physics).

This repository is the **Go implementation** of [https://github.com/cool-japan/oxieml](https://github.com/cool-japan/oxieml) (the **OxiEML** Rust project): the same EML idea, parser/evaluator/CLI-style workflow, and many of the same capabilities, implemented in Go rather than as a Rust dependency or FFI wrapper. It is **not** a mechanical translation of the Rust sources and does **not** aim for full feature parity yet (for example, the upstream crate’s symbolic regression, SIMD batch paths, Rust codegen, and optional SMT backend are outside the scope of the current Go tree).

## License

GoEML is released under the **GNU General Public License v3.0** only. See the [`LICENSE`](LICENSE) file in the repository root for the full legal text. If you combine or distribute this software, comply with GPLv3 (including source-offer and license-conveyance rules where they apply).


## What you get today

| Area | Support |
|------|---------|
| Tree + grammar | `1`, `xN`, nested `eml` |
| Text formats | Compact `E(a,b)` and readable `eml(a, b)` |
| Evaluation | Real path with complex fallback; batch evaluation |
| Canonical math | Constructions in the style of the paper's tables (exp, ln, trig, hyperbolic, arithmetic, …) |
| Lowering | EML → conventional ops (`exp`, `ln`, `+`, …) + simplification helpers |
| CLI | Evaluate, generate (`-g`), list (`-l`), file/stdin input, variable bindings |

## Repository layout

- **`go/`**: Go module root (`go.mod` is here). Import path: `goeml/pkg/goeml`.
- **`go/cmd/goeml`**: `goeml` CLI.
- **`go/pkg/goeml`**: Library API.
- **`go/examples/minimal`**: Runnable `main` that imports the library.
- **`build.sh`**: Runs `task build`.
- **`test.sh`**: Builds the CLI, runs CLI checks, then **`go test ./...`** (including README-driven tests).
- **`Taskfile.yml`**: `task build`, `task test`, `task integration`, etc.

## Requirements

- [Go](https://go.dev/dl/) (see `go/go.mod` for the declared language version).
- [Task](https://taskfile.dev/) if you use the Taskfile or `build.sh` / `test.sh` as written.

## Build

```bash
./build.sh
# or
task build
```

The binary is written to **`bin/goeml`** (built from `go/cmd/goeml`).

## Tests (automated)

```bash
# Unit + integration tests (from repository root)
./test.sh
# or
task integration

# Go tests only (run inside the module root; the subshell keeps you in the repository root)
(cd go && go test ./...)

# README examples as code (subset of pkg tests)
(cd go && go test ./pkg/goeml -run TestReadme -v)
```

---

## CLI Tool

The `goeml` program evaluates `E(…)` / `eml(…)` expressions, **generates** canonical EML from familiar function names (`-g` / `--gen`), **lists** built-in names (`-l`), reads from **`--file` / `-f`** or **stdin**, and binds variables as `x0=1.5` on the command line. When the real part of a result matches a known mathematical constant closely enough, it prints a **MATCH** line (same spirit as the upstream CLI).

```bash
# Evaluate an EML expression
goeml "E(1, 1)"
# expect: MATCH: e … and Result ≈ 2.71828…

# Generate EML from a function/constant name
goeml -g pi
goeml -g e
goeml -g sin x0=0.5

# Evaluate with variables
goeml "E(x0, 1)" x0=2.0

# Alternative surface syntax
goeml "eml(1, 1)"

# Read from file
goeml --file expression.txt
goeml -f expression.txt x0=1.0

# List available functions and constants
goeml -l

# Help and version
goeml --help
goeml --version
```

If the first argument is **not** valid EML text but is a known generator name, the CLI behaves like **`goeml -g`** (e.g. `goeml pi`, `goeml sin`).

The checks above (and more) run in **`./test.sh`** after each `task build`.

---

## Quick Start (Library)

```go
package main

import (
	"fmt"
	"math"

	"goeml/pkg/goeml"
)

func main() {
	c := goeml.Canonical{}
	x := goeml.Var(0)
	expX := c.Exp(x)

	ctx := goeml.NewEvalCtx([]float64{1.0})
	result, err := expX.EvalReal(ctx)
	if err != nil {
		panic(err)
	}
	fmt.Printf("exp(1) ≈ e: %v\n", math.Abs(result-math.E) < 1e-9)

	e := c.Euler()
	fmt.Println(goeml.ToCompactString(e)) // E(1,1)

	y := goeml.Var(1)
	sum := c.Add(x, y)
	_ = sum // Add, Mul, Div, Pow, Sin, …: same Canonical API

	lowered := expX.Lower().Simplify()
	fmt.Println(lowered.Pretty())
	fmt.Println(lowered.Eval([]float64{1.0}))
}
```

These flows are covered by **`TestReadme_*`** in [`go/pkg/goeml/readme_doc_test.go`](go/pkg/goeml/readme_doc_test.go) (run `go test ./pkg/goeml -run TestReadme`).

---

## Parser

Parse compact **`E(a,b)`** or **`eml(a,b)`** text into a tree; **`goeml.ToCompactString`** prints canonical compact form for storage and round-trip.

```go
tree, err := goeml.Parse("E(E(1, 1), 1)") // depth 2
tree, err = goeml.Parse("eml(E(1, x0), 1)")

compact := goeml.ToCompactString(tree)
_, err = goeml.Parse(compact)
```

Parser behaviour and round-trip are covered by unit tests in [`go/pkg/goeml/parser_test.go`](go/pkg/goeml/parser_test.go) and **`TestReadme_parserRoundTripAndEml`**.

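
To make the grammar concrete, here is a minimal, self-contained recursive-descent sketch that accepts both surface syntaxes and prints the compact form. It does not use the library: `Node`, `Parse`, and `Compact` below are hypothetical names for illustration, and the real parser in `go/pkg/goeml/parser.go` may differ in error handling and whitespace rules.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Node is a hypothetical tree type for this sketch (not GoEML's Tree).
// Leaf: Left == nil; Var < 0 is the constant 1, otherwise variable xVar.
type Node struct {
	Var         int
	Left, Right *Node
}

type parser struct {
	src string // input with all whitespace removed
	pos int
}

// expect consumes the single byte c or reports where it was missing.
func (p *parser) expect(c byte) error {
	if p.pos >= len(p.src) || p.src[p.pos] != c {
		return fmt.Errorf("expected %q at offset %d", c, p.pos)
	}
	p.pos++
	return nil
}

// expr parses: "1" | "x" digits | ("E" | "eml") "(" expr "," expr ")".
func (p *parser) expr() (*Node, error) {
	rest := p.src[p.pos:]
	switch {
	case strings.HasPrefix(rest, "eml"):
		p.pos += 3
	case strings.HasPrefix(rest, "E"):
		p.pos++
	case strings.HasPrefix(rest, "1"):
		p.pos++
		return &Node{Var: -1}, nil
	case strings.HasPrefix(rest, "x"):
		digits := len(rest[1:]) - len(strings.TrimLeft(rest[1:], "0123456789"))
		idx, err := strconv.Atoi(rest[1 : 1+digits])
		if err != nil {
			return nil, fmt.Errorf("bad variable index at offset %d", p.pos)
		}
		p.pos += 1 + digits
		return &Node{Var: idx}, nil
	default:
		return nil, fmt.Errorf("unexpected input at offset %d", p.pos)
	}
	// Internal node: "(" expr "," expr ")".
	if err := p.expect('('); err != nil {
		return nil, err
	}
	var kids [2]*Node
	for i, closer := range []byte{',', ')'} {
		kid, err := p.expr()
		if err != nil {
			return nil, err
		}
		if err := p.expect(closer); err != nil {
			return nil, err
		}
		kids[i] = kid
	}
	return &Node{Left: kids[0], Right: kids[1]}, nil
}

// Parse strips whitespace up front (a simplification of this sketch),
// so error offsets refer to the stripped text.
func Parse(s string) (*Node, error) {
	p := &parser{src: strings.Join(strings.Fields(s), "")}
	n, err := p.expr()
	if err != nil {
		return nil, err
	}
	if p.pos != len(p.src) {
		return nil, fmt.Errorf("trailing input at offset %d", p.pos)
	}
	return n, nil
}

// Compact prints the tree in the compact E(a,b) form.
func Compact(n *Node) string {
	switch {
	case n.Left != nil:
		return "E(" + Compact(n.Left) + "," + Compact(n.Right) + ")"
	case n.Var < 0:
		return "1"
	}
	return "x" + strconv.Itoa(n.Var)
}

func main() {
	tree, err := Parse("eml(E(1, x0), 1)")
	if err != nil {
		panic(err)
	}
	fmt.Println(Compact(tree)) // E(E(1,x0),1)
}
```

Because the compact form is itself valid input, feeding the output of `Compact` back into `Parse` yields the same tree, which is the round-trip property the tests above check for the real parser.
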

---

## Symbolic Regression

The Rust **OxiEML** crate includes a gradient-based symbolic regression engine over EML topologies. **GoEML does not implement that yet**; evaluation and lowering are stable, but there is no `SymRegEngine` equivalent in this repository. If you add one, keeping the same separation between “search in EML space” and “execute lowered ops” is a good fit.

---

## SMT / Constraint Solving

Upstream can integrate an SMT stack behind a feature flag. **GoEML has no constraint solver or interval domain API** at this time.

---

## Canonical constructions (paper tables)

Constructions follow the paper’s phylogenetic viewpoint (see upstream README tables for full formulas). In Go you build them with **`goeml.Canonical`** (e.g. `Exp`, `Ln`, `Add`, `Sin`, …). Depth grows quickly for some identities; use **`Tree.Lower()`** when you want conventional `exp` / `ln` / arithmetic for inspection or faster scalar evaluation.

### Table 1: Basic operations

| Function | EML idea | Depth (typical) |
|----------|----------|-----------------|
| `exp(x)` | `eml(x, 1)` | 1 |
| `e` | `eml(1, 1)` | 1 |
| `ln(x)` | nested `eml` chain | 3 |
| `-x` | composition via `e` and `ln` | 6 |
| `0` | `ln(1)` | 3 |

### Table 2: Arithmetic

| Function | Built from |
|----------|------------|
| `x + y`, `x - y`, `x * y`, `x / y`, `x ^ y` | `Canonical.Add`, `Sub`, `Mul`, `Div`, `Pow` |
| `1/x` | `Reciprocal` |

### Tables 3–7: Transcendentals, hyperbolic, inverses, roots

Use **`Canonical`** methods (`Sin`, `Cos`, `Tan`, `Arcsin`, `Sinh`, `Sqrt`, `Nat`, …). Trigonometric constructions rely on the **complex evaluation path** internally; **`EvalReal`** returns an error when the imaginary part is not negligible.


---

## Architecture

```text
Parse / generate                                   Evaluate / explain
--------------------                               -----------------------
Text "E(...)"   ----->  Tree (EML grammar)  ----->  EvalReal / EvalComplex
or -g sin               S -> 1 | eml(S,S)           stack machine
Canonical{…}            (library & CLI generator)
                               |
                               v
                        Lower() -> conventional ops + Pretty()
                               |
                               v
                        LoweredOp.Eval / flat IR (scalar)
```

There is **no** separate “discovery phase” binary in Go yet (no symreg). The diagram emphasises **parse → tree → eval** and **lower → fast scalar path**, matching what is implemented today.

---

## Package layout (Go ↔ upstream concepts)

| Go (this repo) | Role (cf. Rust crate) |
|----------------|------------------------|
| [`go/pkg/goeml/tree.go`](go/pkg/goeml/tree.go) | Uniform binary trees (`tree`) |
| [`go/pkg/goeml/eval.go`](go/pkg/goeml/eval.go) | Stack-machine evaluation (`eval`) |
| [`go/pkg/goeml/canonical.go`](go/pkg/goeml/canonical.go) | Elementary constructions (`canonical`) |
| [`go/pkg/goeml/parser.go`](go/pkg/goeml/parser.go) | Text parse / compact string (`parser`) |
| [`go/pkg/goeml/lower.go`](go/pkg/goeml/lower.go) | Lowering + pretty print + flat ops (`lower`) |
| [`go/pkg/goeml/errors.go`](go/pkg/goeml/errors.go) | Errors (`error`) |
| (none) | `grad`, `symreg`, `simplify`, `compile`, `smt`, `simd_eval`: **not ported** |

---

## Upstream feature parity

| OxiEML (Rust) | GoEML |
|---------------|--------|
| `default` crate | Core tree, eval, parser, canonical, lower, CLI |
| `parallel` / `simd` features | Not available |
| `smt` feature | Not available |
| `symreg` | Not available |
| `compile_to_rust` | Not available |

---

## Performance

### `EvalBatch`: default (sequential) vs `-tags=parallel` (worker pool)

Same workload as upstream README-style batch eval: **`eml(x0, 1)`** (i.e. `exp(x0)`) on **10 000** rows, one `EvalBatch` call per benchmark iteration, shared flattened instruction list (mirrors OxiEML’s “flatten once, eval many rows” pattern).


**Correctness (like upstream “scalar vs parallel” checks):** with **`-tags=parallel`**, `TestEvalBatchParallelMatchesSequential10k` compares **`EvalBatch`** to **`evalBatchSequential`** on **10 000** rows and requires **`|Δ| ≤ 1e-12`** on every row.

This mirrors the **upstream README** idea: publish batch timings and prove parallel vs scalar agree; here “scalar” is **`evalBatchSequential`** and “parallel” is **`EvalBatch`** with **`-tags=parallel`**.

**Measured throughput** (representative developer machine; rerun on yours, since the ratio is what matters):

| Build / API | Benchmark | Median wall time (10 k rows / iter) | Allocs/op (approx.) |
|-------------|-----------|--------------------------------------|------------------------|
| **Default** (`!parallel`) | `BenchmarkEvalBatch_10k_DefaultBuild` | **~0.45 ms** | ~10 004 |
| **`-tags=parallel`** | `BenchmarkEvalBatch_10k_Sequential` (forced single-goroutine) | **~0.45 ms** | ~10 004 |
| **`-tags=parallel`** | `BenchmarkEvalBatch_10k_Parallel` (`EvalBatch`, ≥128 rows → pool) | **~0.17 ms** | ~10 048 |

Environment for the numbers above: **linux/amd64**, **Go 1.23** (see `go/go.mod`), CPU reported by `go test` as **Intel Core Ultra 7 265K** (`GOMAXPROCS` = 20 on that run).

**Speedup** of parallel `EvalBatch` vs single-threaded reference on the same binary: **~2.5–2.6×** for this tree and row count.

Reproduce:

```bash
cd go
go test -bench=BenchmarkEvalBatch_10k_DefaultBuild -benchmem -count=10 -benchtime=400ms ./pkg/goeml
go test -tags=parallel -bench='BenchmarkEvalBatch_10k_' -benchmem -count=10 -benchtime=400ms ./pkg/goeml
# or
task bench
```

**Concurrency (default build):** the CLI and default **`EvalBatch`** do not fan rows across workers.

**Parallel `EvalBatch` (optional):** build or test with **`-tags=parallel`**. For **`len(data) ≥ 128`**, `EvalBatch` uses a chunked goroutine pool (`GOMAXPROCS` workers); smaller batches stay sequential.


**`go test -tags=parallel -race ./...`** is in `Taskfile` (`test-parallel`) and in **`./test.sh`**. The **`goeml`** CLI binary from **`task build`** is still built **without** `parallel`; use **`go build -tags=parallel …`** if you need parallel `EvalBatch` inside your process.

---

## Design decisions

- **Value-based trees**: Go uses pointers inside `Tree`/`Node`; sharing is by structure, not `Arc` (no symreg yet, so deep sharing is less critical than in upstream).
- **Stack-machine evaluation**: Post-order flattening avoids deep recursion on huge canonical trees (same motivation as Rust).
- **`complex128` internally**: `EvalComplex` supports branch cuts; `EvalReal` succeeds only when the imaginary part is negligible.
- **Lowering for readability**: Pattern-matches frequent `eml` shapes into `exp`, `ln`, `e - x`, etc., with a fallback `exp(left) - ln(right)`.
- **CLI constant matching**: Heuristic comparison to a small table of named reals (similar UX to upstream).
- **Optional parallel `EvalBatch`**: Default build: sequential rows. With **`-tags=parallel`**, batches of **≥128** rows use a chunked goroutine pool (few OS threads, many goroutines scheduled by the runtime; this is Go’s analogue to lightweight parallel tasks). Errors from any worker cancel the rest via an atomic flag.

---

## Test coverage

| Layer | How it is verified |
|-------|---------------------|
| Library | **`go test ./...`** from `go/`: **29** tests in `pkg/goeml` by default (parser, eval, canonical, lower, **`TestReadme_*`**). With **`-tags=parallel`**, **+3** tests in `eval_batch_parallel_test.go` (10 000-row **sequential vs `EvalBatch`** to **1e-12**, below-threshold **127** rows, error propagation). Benchmarks: `eval_bench_test.go` / `eval_bench_parallel_test.go`. |
| CLI | **`./test.sh`**: builds `bin/goeml`, runs **grep-based** checks on real CLI output (`E(1,1)`, `-g pi/e/sin`, `eml(1,1)`, stdin, `--file`/`-f`, `-l`, `--help`, `--version`, bare `pi` / `sin`), then **`go test ./...`** again. |

```bash
# Subshells keep the working directory at the repository root between commands
(cd go && go test ./... -count=1)
(cd go && go test -tags=parallel -race ./... -count=1)
task bench   # optional: EvalBatch 10k sequential vs parallel timings
./test.sh
```

---

## Using GoEML from another Go module

The module root is the **`go/`** directory. Use a **`replace`** directive until a `v1.0.0` tag that matches the module path is published:

```go
require goeml v1.0.0

replace goeml => ../path/to/goeml/go
```

```go
package main

import (
	"fmt"
	"log"

	"goeml/pkg/goeml"
)

func main() {
	// Parse the compact form of eml(1, 1) and evaluate it on the real path.
	tree, err := goeml.Parse("E(1,1)")
	if err != nil {
		log.Fatal(err)
	}
	v, err := tree.EvalReal(goeml.NewEvalCtx(nil))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(v) // ≈ e
}
```

```bash
cd go && go run ./examples/minimal
```

More detail: [`go/pkg/goeml/doc.go`](go/pkg/goeml/doc.go).

---

## References

- Paper: [arXiv:2603.21852](https://arxiv.org/abs/2603.21852)
- Upstream Rust project (GoEML is its Go counterpart; different license and feature set): [https://github.com/cool-japan/oxieml](https://github.com/cool-japan/oxieml)