# SMILE — Data Transformation User Guide & Tutorial The `smile.data.transform` package provides a composable, serializable pipeline for preprocessing tabular data before model training. Every transform maps a `Tuple` (one row) or an entire `DataFrame` to a transformed counterpart, and the higher-level implementations in `smile.feature.transform` build the common statistical scalers on top of that foundation. --- ## Table of Contents 1. [Architecture overview](#1-architecture-overview) 2. [The Transform interface](#2-the-transform-interface) 3. [The InvertibleTransform interface](#3-the-invertibletransform-interface) 4. [ColumnTransform](#4-columntransform) 5. [InvertibleColumnTransform](#5-invertiblecolumntransform) 6. [Built-in scalers and standardisers](#6-built-in-scalers-and-standardisers) - [Scaler — min-max scaling](#61-scaler--min-max-scaling) - [Standardizer — z-score standardization](#62-standardizer--z-score-standardisation) - [RobustStandardizer — median / IQR](#63-robustandardizer--median--iqr) - [WinsorScaler — percentile clamping](#64-winsorscaler--percentile-clamping) - [MaxAbsScaler — maximum absolute value](#65-maxabsscaler--maximum-absolute-value) - [Normalizer — per-row unit norm](#66-normalizer--per-row-unit-norm) 7. [Composing transforms](#7-composing-transforms) - [andThen and compose](#71-andthen-and-compose) - [pipeline](#72-pipeline) - [fit — data-dependent pipelines](#73-fit--data-dependent-pipelines) 8. [Inverting transforms](#8-inverting-transforms) 9. [Nullable column handling](#9-nullable-column-handling) 10. [Serialization](#10-serialization) 11. [Writing a custom transform](#11-writing-a-custom-transform) 12. [End-to-end tutorial](#12-end-to-end-tutorial) 13. [Choosing the right transform](#13-choosing-the-right-transform) 14. [API quick reference](#14-api-quick-reference) --- ## 1. Architecture overview ``` Transform ────────────────────────────────────────────────────────────────── │ apply(Tuple) → Tuple (row-by-row) │ apply(DataFrame) → DataFrame (batch, column-parallel) │ andThen(Transform) → Transform (compose forward) │ compose(Transform) → Transform (compose backward) │ ├── ColumnTransform (map-of-lambdas, column-wise) │ └── InvertibleColumnTransform (adds invert() for both Tuple and DataFrame) │ InvertibleTransform extends Transform │ invert(Tuple) → Tuple │ invert(DataFrame) → DataFrame ``` **All built-in scalers** (`Scaler`, `Standardizer`, `RobustStandardizer`, `WinsorScaler`, `MaxAbsScaler`) return an `InvertibleColumnTransform`. `Normalizer` returns a plain `ColumnTransform`-compatible `Transform`. --- ## 2. The Transform interface `Transform` lives in `smile.data.transform` and extends `java.util.function.Function`. Every preprocessing step implements this single interface. ### Row-level application ```java Transform t = /* any transform */; Tuple outRow = t.apply(inRow); ``` ### Batch application The default `apply(DataFrame)` implementation streams all rows through the row-level `apply(Tuple)`. `ColumnTransform` overrides this with a column-parallel fast path that avoids per-row object allocation. ```java DataFrame out = t.apply(df); ``` ### Static factories ```java // Build a single-step pipeline from already-fitted transforms Transform chain = Transform.pipeline(step1, step2, step3); // Fit several data-dependent transforms in sequence. // Each subsequent trainer receives the data as transformed by all previous steps. Transform pipeline = Transform.fit(trainDf, Standardizer::fit, data -> WinsorScaler.fit(data, 0.01, 0.99)); ``` --- ## 3. The InvertibleTransform interface Extends `Transform` with two inverse operations: ```java public interface InvertibleTransform extends Transform { Tuple invert(Tuple x); DataFrame invert(DataFrame data); } ``` Inversion is useful whenever the original feature space must be recovered — for example, to interpret a model's output in the original units, or to reconstruct an input from a latent representation. ```java InvertibleColumnTransform scaler = Standardizer.fit(train); DataFrame scaled = scaler.apply(test); DataFrame recovered = scaler.invert(scaled); // back to original units ``` --- ## 4. ColumnTransform `ColumnTransform` is the concrete class that powers all column-wise transforms. It holds a `Map` — a mapping from column name to the `smile.util.function.Function` lambda to apply to that column's values. ### Constructing manually ```java import smile.data.transform.ColumnTransform; import smile.util.function.Function; import java.util.Map; // Log-transform "income", leave everything else unchanged Map transforms = Map.of("income", Math::log); ColumnTransform ct = new ColumnTransform("log-income", transforms); DataFrame out = ct.apply(df); ``` ### Behaviour - Columns **not** present in the map are passed through unchanged (same `ValueVector` object — zero copy). - Transformed columns always become `DoubleType` (or `NullableDoubleType` for nullable inputs). - The `apply(DataFrame)` implementation processes all columns **in parallel** using `IntStream.range(...).parallel()`. ### toString ```java System.out.println(ct); // log-income( // log(income) // ) ``` The output format is `(\n , ...\n)`. Each `Function` implementation's `toString()` should describe the transformation (the built-in scalers do this; lambdas print as the default JVM reference unless overridden). --- ## 5. InvertibleColumnTransform Extends `ColumnTransform` and additionally stores a second map of inverse lambdas, one per transformed column. ```java import smile.data.transform.InvertibleColumnTransform; Map transforms = Map.of("price", v -> Math.log1p(v)); Map inverses = Map.of("price", v -> Math.expm1(v)); InvertibleColumnTransform ict = new InvertibleColumnTransform("log1p-price", transforms, inverses); DataFrame logPrices = ict.apply(df); DataFrame original = ict.invert(logPrices); // ≈ df ``` ### Inversion behaviour - Columns in the **inverses map** are inverted using the supplied lambda. - Columns **not** in the inverses map are passed through unchanged. - Nullable columns retain their null mask on both the forward and inverse paths — nulls are never fabricated or lost. --- ## 6. Built-in scalers and standardisers All built-in transforms follow the same pattern: 1. Call `Xxx.fit(trainData)` to learn parameters from training data. 2. Call `.apply(anyData)` to transform training **and** test data alike. 3. Optionally call `.invert(transformedData)` to recover the original scale. > **Train/test discipline:** Always fit on training data only. > Apply the fitted transform to both train and test sets. ### 6.1 Scaler — min-max scaling Maps each numeric column to the range `[0, 1]`: ``` x_scaled = (x - min) / (max - min) ``` Values outside the training range are **clamped** to `[0, 1]`. ```java import smile.feature.transform.Scaler; // Fit on train, transform all numeric columns InvertibleColumnTransform scaler = Scaler.fit(train); DataFrame trainScaled = scaler.apply(train); DataFrame testScaled = scaler.apply(test); // Fit on selected columns only InvertibleColumnTransform partial = Scaler.fit(train, "age", "income"); ``` **When to use:** Algorithms that are sensitive to feature magnitude and when the data has no severe outliers (e.g. KNN, SVM with RBF kernel, neural networks). **When to avoid:** Data with large outliers — a single extreme value will compress all normal values into a tiny sub-range. Use `WinsorScaler` instead. ### 6.2 Standardizer — z-score standardization Scales each numeric column to zero mean and unit variance: ``` x_std = (x - μ) / σ ``` If the standard deviation is zero (constant column), the scale factor is `1.0` and the column is only centred. ```java import smile.feature.transform.Standardizer; InvertibleColumnTransform std = Standardizer.fit(train); DataFrame trainStd = std.apply(train); DataFrame testStd = std.apply(test); ``` **When to use:** When the algorithm assumes Gaussian-distributed features or uses distance / dot-product computations (linear regression, logistic regression, linear SVM, PCA). **When to avoid:** Heavy-tailed distributions or data with many outliers — a single large outlier inflates σ and under-compresses the bulk of the data. Use `RobustStandardizer` instead. ### 6.3 RobustStandardizer — median / IQR Scales by subtracting the median and dividing by the interquartile range (IQR): ``` x_robust = (x - median) / IQR ``` The IQR is the difference between the 75th and 25th percentiles. If the IQR is zero, the scale factor is `1.0`. ```java import smile.feature.transform.RobustStandardizer; InvertibleColumnTransform robust = RobustStandardizer.fit(train); DataFrame robustTrain = robust.apply(train); ``` **When to use:** Data with outliers or heavy-tailed distributions. Median and IQR are much less sensitive to extreme values than mean and standard deviation. ### 6.4 WinsorScaler — percentile clamping A two-step approach: 1. Clamp values outside the `[lower, upper]` percentile range of the training distribution. 2. Scale the result to `[0, 1]`. ```java import smile.feature.transform.WinsorScaler; // Default: 5th percentile lower, 95th percentile upper InvertibleColumnTransform ws = WinsorScaler.fit(train); // Custom percentiles — clamp at 1st and 99th percentile InvertibleColumnTransform ws2 = WinsorScaler.fit(train, 0.01, 0.99); // Transform specific columns only InvertibleColumnTransform ws3 = WinsorScaler.fit(train, 0.05, 0.95, "price", "quantity"); ``` **When to use:** When you know the training set has outliers that are meaningful in the training data but that you do not want to distort scaling for the bulk of the data. Common in financial and sensor data. ### 6.5 MaxAbsScaler — maximum absolute value Divides each feature by its maximum absolute value so that all values lie in `[-1, 1]`. This scaler does **not** centre the data (mean is not subtracted), making it suitable for sparse data where centering would destroy sparsity. ```java import smile.feature.transform.MaxAbsScaler; InvertibleColumnTransform mas = MaxAbsScaler.fit(train); DataFrame scaled = mas.apply(test); ``` **When to use:** Sparse data (e.g. TF-IDF vectors) or when the sign of values is meaningful and you want to preserve zero values exactly. ### 6.6 Normalizer — per-row unit norm `Normalizer` rescales each **row** (sample) so that its vector norm equals 1. Unlike the column-wise scalers above, `Normalizer` operates across columns within a single row. It supports three norms: | Norm | Formula | |------|---------| | `L1` | `Σ|xᵢ|` = 1 | | `L2` | `√(Σxᵢ²)` = 1 | | `L_INF` | `max(|xᵢ|)` = 1 | ```java import smile.feature.transform.Normalizer; import smile.feature.transform.Normalizer.Norm; // Normalize specific columns String[] features = {"f1", "f2", "f3"}; Normalizer l2 = new Normalizer(Norm.L2, features); DataFrame normed = l2.apply(df); // Apply row-by-row to a single Tuple Tuple row = df.get(0); Tuple normRow = l2.apply(row); ``` `Normalizer` is **not** invertible (it is not an `InvertibleTransform`). **When to use:** Text classification (TF-IDF), cosine-similarity-based algorithms, or whenever the magnitude of a row is irrelevant but the direction (ratio of features) is meaningful. --- ## 7. Composing transforms ### 7.1 andThen and compose `andThen` and `compose` mirror the semantics of `java.util.function.Function`: ```java // a.andThen(b) ≡ b(a(x)) — apply a first, then b Transform aFirst = a.andThen(b); // b.compose(a) ≡ b(a(x)) — same result, different spelling Transform alsoAFirst = b.compose(a); ``` Both return a new `Transform` (a lambda, not a named class). ```java InvertibleColumnTransform scaler = Scaler.fit(train); InvertibleColumnTransform std = Standardizer.fit(train); // Scale first, then standardise Transform both = scaler.andThen(std); DataFrame out = both.apply(test); ``` > **Note:** The composed result is a plain `Transform`, not an > `InvertibleTransform`, even if both inputs are invertible. If you need > inversion of the composed result, use `Transform.fit` (see §7.3) or > invert the steps individually in reverse order. ### 7.2 pipeline `Transform.pipeline(t1, t2, ..., tN)` is equivalent to chaining `andThen` calls and is more readable for three or more steps: ```java Transform pipeline = Transform.pipeline( imputer, // fill missing values scaler, // [0, 1] scale std // z-score ); DataFrame train_prepped = pipeline.apply(train); DataFrame test_prepped = pipeline.apply(test); ``` Throws `IllegalArgumentException` if called with no transforms. ### 7.3 fit — data-dependent pipelines `Transform.fit(data, trainer1, trainer2, ...)` builds a pipeline where each trainer is a `Function` — a function that observes the data **as transformed so far** and returns the next step: ```java Transform pipeline = Transform.fit(train, // Step 1: fit imputer on raw training data data -> SimpleImputer.fit(data), // Step 2: fit scaler on imputed training data data -> Scaler.fit(data), // Step 3: fit standardiser on scaled+imputed training data data -> Standardizer.fit(data) ); // Apply the whole fitted pipeline to test data DataFrame testOut = pipeline.apply(test); ``` This is the canonical way to build multi-step preprocessing pipelines where each step's parameters depend on the output of the previous steps. Throws `IllegalArgumentException` if called with no trainers. --- ## 8. Inverting transforms `InvertibleTransform.invert()` maps from the transformed space back to the original space. This is used for: - **Interpreting predictions:** A model trained on standardised targets must have its output de-standardised before presenting to users. - **Reconstruction errors:** Auto-encoders or PCA-based anomaly detectors can measure reconstruction fidelity in the original units. - **Debugging:** Verify that `invert(apply(x)) ≈ x`. ```java InvertibleColumnTransform std = Standardizer.fit(train); // Forward pass DataFrame trainStd = std.apply(train); double[] yStd = model.predict(trainStd); // Recover predictions in original units // Build a single-column DataFrame to pass through invert() DataFrame predDf = DataFrame.of(new DoubleVector("y", yStd)); DataFrame predOriginal = std.invert(predDf); ``` ### Row-level inversion ```java Tuple row = df.get(0); Tuple scaled = std.apply(row); Tuple backToNormal = std.invert(scaled); ``` ### Composing inverses When multiple invertible transforms are stacked, inverting the composed pipeline requires **reversing the order**: ```java InvertibleColumnTransform s1 = Scaler.fit(train); InvertibleColumnTransform s2 = Standardizer.fit(s1.apply(train)); // Forward: s1 then s2 DataFrame fwd = s2.apply(s1.apply(test)); // Inverse: s2⁻¹ then s1⁻¹ (reversed order) DataFrame inv = s1.invert(s2.invert(fwd)); ``` --- ## 9. Nullable column handling SMILE supports nullable columns backed by `NullableDoubleVector` (and equivalent types for other primitives). All transforms in this package preserve nullability: - A nullable input column always produces a nullable output column — the null bit-mask is copied intact. - The transform lambda is still called on the raw `double` value for a null cell (which is `Double.NaN` by convention), but the result is ignored and replaced by `NaN` / null in the output vector. - **Inversion** also preserves the null mask. ```java // Suppose "income" has some null entries InvertibleColumnTransform scaler = Scaler.fit(train); DataFrame scaled = scaler.apply(train); // scaled.column("income").isNullable() → true (same as input) // scaled.column("income").isNullAt(k) → same as train DataFrame restored = scaler.invert(scaled); // restored.column("income").isNullable() → true (null mask preserved) ``` --- ## 10. Serialization `Transform` extends `java.io.Serializable`. All built-in implementations and the closure-captured lambdas inside `InvertibleColumnTransform` are serializable, so a fitted transform can be saved and reloaded: ```java // Save try (var out = new ObjectOutputStream(new FileOutputStream("scaler.ser"))) { out.writeObject(scaler); } // Load InvertibleColumnTransform loaded; try (var in = new ObjectInputStream(new FileInputStream("scaler.ser"))) { loaded = (InvertibleColumnTransform) in.readObject(); } // Use the loaded transform on new data DataFrame newData = loaded.apply(incoming); ``` > **Caution with lambdas:** Only use method references or anonymous classes > when the enclosing class is itself `Serializable`. A non-serializable > lambda passed to `ColumnTransform` or `InvertibleColumnTransform` will > cause `NotSerializableException` at save time. --- ## 11. Writing a custom transform ### Option A — implement Transform directly Useful when the transformation is not column-wise (e.g. row normalization, dimension reduction): ```java import smile.data.transform.Transform; import smile.data.Tuple; public class ClipTransform implements Transform { private final double min, max; private final Set columns; public ClipTransform(double min, double max, String... columns) { this.min = min; this.max = max; this.columns = new HashSet<>(Arrays.asList(columns)); } @Override public Tuple apply(Tuple x) { StructType schema = x.schema(); return new smile.data.AbstractTuple(schema) { @Override public Object get(int i) { String name = schema.field(i).name(); if (columns.contains(name)) { double v = x.getDouble(i); return Math.max(min, Math.min(max, v)); } return x.get(i); } }; } } ``` ### Option B — use ColumnTransform with lambdas The simplest approach for column-wise numeric transformations: ```java // Clip all values to [-3, 3] Map clips = new HashMap<>(); for (String col : numericColumns) { clips.put(col, x -> Math.max(-3.0, Math.min(3.0, x))); } ColumnTransform clipper = new ColumnTransform("clip±3", clips); ``` ### Option C — use InvertibleColumnTransform with lambdas When you need both forward and inverse: ```java // Box-Cox power transform with λ = 0.5 (square-root transform) double lambda = 0.5; Map fwd = Map.of( "revenue", x -> (Math.pow(x, lambda) - 1.0) / lambda ); Map inv = Map.of( "revenue", y -> Math.pow(y * lambda + 1.0, 1.0 / lambda) ); InvertibleColumnTransform boxCox = new InvertibleColumnTransform("BoxCox(0.5)", fwd, inv); ``` ### Option D — data-dependent custom transform via fit ```java // Normalize each column by its own training median static InvertibleColumnTransform fitMedianScaler(DataFrame data, String... cols) { Map transforms = new HashMap<>(); Map inverses = new HashMap<>(); for (String col : cols) { double[] vals = data.column(col).toDoubleArray(); double median = MathEx.median(vals); double scale = MathEx.isZero(median) ? 1.0 : Math.abs(median); transforms.put(col, x -> x / scale); inverses.put(col, y -> y * scale); } return new InvertibleColumnTransform("MedianScaler", transforms, inverses); } ``` --- ## 12. End-to-end tutorial This tutorial preprocesses a lending dataset that has `loan_amount` (double), `annual_income` (double, nullable), `grade` (categorical), `term` (int) and `default` (int, the label). ### Step 1 — Split data ```java DataFrame data = Read.csv("loans.csv"); // 80/20 train/test split (indices) int n = data.size(); int trainSize = (int)(n * 0.8); DataFrame train = data.slice(0, trainSize); DataFrame test = data.slice(trainSize, n); ``` ### Step 2 — Impute missing values ```java import smile.feature.imputation.SimpleImputer; // Fit: learn per-column fill values (mean for numeric, mode for categorical) Transform imputer = SimpleImputer.fit(train); DataFrame trainImputed = imputer.apply(train); DataFrame testImputed = imputer.apply(test); ``` ### Step 3 — Scale numeric features ```java import smile.feature.transform.WinsorScaler; // Use Winsor to be robust to the income outliers InvertibleColumnTransform scaler = WinsorScaler.fit(trainImputed, 0.01, 0.99, "loan_amount", "annual_income"); DataFrame trainScaled = scaler.apply(trainImputed); DataFrame testScaled = scaler.apply(testImputed); ``` ### Step 4 — Compose into a single reusable pipeline ```java Transform fullPipeline = Transform.fit(train, SimpleImputer::fit, data -> WinsorScaler.fit(data, 0.01, 0.99, "loan_amount", "annual_income") ); // Apply in one call — same result as steps 2-3 above DataFrame trainPrepped = fullPipeline.apply(train); DataFrame testPrepped = fullPipeline.apply(test); ``` ### Step 5 — Train a model using a Formula ```java import smile.data.formula.Formula; Formula formula = Formula.of("default", Terms.$("loan_amount"), Terms.$("annual_income"), Terms.$("term"), Terms.$("grade")); // Extract design matrix and response var X = formula.matrix(trainPrepped); var y = formula.y(trainPrepped).toIntArray(); // … fit your model with X and y … ``` ### Step 6 — Invert predictions to interpret results ```java // Suppose we predicted a continuous score and want to understand it // in the original income units: double[] scores = model.predict(formula.matrix(testPrepped)); DataFrame scoreDf = DataFrame.of(new DoubleVector("annual_income", scores)); DataFrame inOriginalScale = scaler.invert(scoreDf); System.out.println(inOriginalScale); ``` ### Step 7 — Inspect and save the pipeline ```java // Inspect what each step learned System.out.println(fullPipeline); // Serialize for reuse in production try (var out = new ObjectOutputStream(new FileOutputStream("pipeline.ser"))) { out.writeObject(fullPipeline); } ``` --- ## 13. Choosing the right transform | Situation | Recommended transform | |---|---| | No strong outliers, range matters | `Scaler` | | Gaussian assumption, distance-based algo | `Standardizer` | | Outliers present, want robustness | `RobustStandardizer` | | Outliers present, want bounded output | `WinsorScaler` | | Sparse data, zero must stay zero | `MaxAbsScaler` | | Direction matters, not magnitude | `Normalizer` (L2) | | Text / count vectors | `Normalizer` (L1 or L2) | | Custom log / power transformation | `InvertibleColumnTransform` + lambdas | | Missing values before scaling | `SimpleImputer` first, then scaler | | Multiple sequential steps | `Transform.fit(...)` pipeline | ### Quick comparison of column-wise scalers Given a column with values `{1, 2, 3, 100}` (the `100` is an outlier): | Transform | 1 | 2 | 3 | 100 | |---|---|---|---|---| | `Scaler` | 0.000 | 0.010 | 0.020 | 1.000 | | `Standardizer` (μ≈26.5, σ≈48.0) | −0.53 | −0.51 | −0.49 | 1.53 | | `RobustStandardizer` (med=2.5, IQR≈1.5) | −1.00 | −0.33 | 0.33 | 65.0 | | `WinsorScaler` (5/95 pct) | 0.000 | 0.500 | 1.000 | 1.000 | | `MaxAbsScaler` (max=100) | 0.010 | 0.020 | 0.030 | 1.000 | `Scaler` compresses `{1,2,3}` to near zero because of the outlier. `WinsorScaler` clamps the outlier so `{1,2,3}` spread cleanly across `[0,1]`. `RobustStandardizer` is robust to the outlier but its output is unbounded. --- ## 14. API quick reference ### `Transform` (interface — `smile.data.transform`) | Member | Description | |--------|-------------| | `apply(Tuple)` | Transform one row (abstract). | | `apply(DataFrame)` | Transform all rows (default: row-stream; overridden by `ColumnTransform` for column-parallel batch). | | `andThen(Transform)` | Compose: `this`, then `after`. | | `compose(Transform)` | Compose: `before`, then `this`. | | `Transform.pipeline(Transform...)` | Chain multiple transforms left-to-right. Throws on empty input. | | `Transform.fit(DataFrame, Function...)` | Fit a data-dependent pipeline. Each trainer sees data as already transformed. Throws on empty input. | ### `InvertibleTransform` (interface — `smile.data.transform`) | Member | Description | |--------|-------------| | `invert(Tuple)` | Inverse-transform one row. | | `invert(DataFrame)` | Inverse-transform all rows. Preserves nullable columns. | ### `ColumnTransform` (class — `smile.data.transform`) | Member | Description | |--------|-------------| | `ColumnTransform(String name, Map transforms)` | Constructor. | | `apply(Tuple)` | Applies lambdas to matching column positions; others pass through. | | `apply(DataFrame)` | Column-parallel batch transform; preserves null masks. | | `toString()` | `"(\n , ...\n)"` | ### `InvertibleColumnTransform` (class — `smile.data.transform`) | Member | Description | |--------|-------------| | `InvertibleColumnTransform(String, Map, Map)` | Constructor (name, forward lambdas, inverse lambdas). | | `invert(Tuple)` | Applies inverse lambdas; unmatched columns pass through. | | `invert(DataFrame)` | Column-parallel inverse; preserves null masks. | ### Built-in scalers (all in `smile.feature.transform`) | Class | Factory method | Formula | Invertible | |-------|----------------|---------|------------| | `Scaler` | `Scaler.fit(data, cols...)` | `(x−min)/(max−min)` clamped to `[0,1]` | Yes | | `Standardizer` | `Standardizer.fit(data, cols...)` | `(x−μ)/σ` | Yes | | `RobustStandardizer` | `RobustStandardizer.fit(data, cols...)` | `(x−median)/IQR` | Yes | | `WinsorScaler` | `WinsorScaler.fit(data)` or `fit(data, lo, hi, cols...)` | clamp then `(x−pLo)/(pHi−pLo)` | Yes | | `MaxAbsScaler` | `MaxAbsScaler.fit(data, cols...)` | `x / max(|x|)` | Yes | | `Normalizer` | `new Normalizer(Norm, cols...)` | row-wise unit norm (L1/L2/L∞) | No | All column-wise `fit()` methods accept an optional varargs `String... columns` argument. When omitted, **all numeric columns** are transformed automatically. --- *SMILE — © 2010-2026 Haifeng Li. GNU GPL licensed.*