---
title: v0.4.0 - 2026.06.14
---

## 2026.06.14 Release v0.4.0

CubeSandbox 0.4.0 introduces **CubeEgress**, an OpenResty-based security proxy that brings credential injection, domain filtering, and access auditing to sandbox egress traffic. This release also delivers **container log forwarding** with a new `cubecli cubebox logs` command, a **node component version matrix** with cluster-wide visibility, **template replica compatibility checking**, a **daemonless template image build pipeline**, and significant **network performance improvements** (35% faster network P50). The builder base image has been downgraded to `ubuntu:20.04`, lowering the minimum glibc requirement from 2.34 to 2.31 for broader distribution compatibility.

58 commits from 15 contributors.

### 🎯 Major Features

#### CubeEgress: Security Proxy

CubeEgress is a new OpenResty-based egress gateway that sits in the sandbox outbound traffic path via TPROXY, enforcing L7 policy before requests leave the cluster. It consists of ~2,200 lines of Lua across 9 modules running on OpenResty/nginx, plus Go-side integration in CubeMaster (CA provisioning, policy push), network-agent (TPROXY iptables rules), and Cubelet (per-sandbox routing, protobuf egress rule model).

- **Credential injection** (#518): Per-sandbox secrets are attached to outbound requests at the proxy layer via `EgressRule.inject`; user code inside the sandbox never handles raw credentials. The `CubeNetworkConfig` protobuf message (formerly `CubeVSContext`) now carries L7 egress rules with match conditions (SNI, host, method, path, scheme) and actions (allow/deny, audit, inject). Credential material is redacted as `***REDACTED***` in CubeMaster safe-log output (#520).
- **Domain filtering** (#518): Policy-driven allow/deny lists gate which destinations a sandbox may reach, evaluated first-match-wins against the L7 request. DNS queries are permitted even when domain-based allow-out rules are set (38fe9977).
- **Access auditing** (#518): Structured JSON logs of every egress request with optional body redaction via a `redactor` Lua module, enabling downstream compliance review.
- **Kernel 5.4 compatibility** (38fe9977): The security proxy runs on kernel v5.4+, expanding deployment coverage.
- **CubeVS fast-path hardening** (#527): SYN-only packets are now rejected in the port-mapping BPF fast path, preventing guest-initiated connection attempts from bypassing egress policy.
- **TAP TX offload** (#505): TX checksum/TSO offload and `tx-tcp-mangleid-segmentation` are enabled on TAP devices so redirected packets skip GSO before reaching the guest.
- **CubeEgress version reporting** (9d76195e): CubeEgress participates in the node component version matrix with build-time version metadata injection, a `/admin/v1/health` endpoint extension, release manifest entries, and cubelet-side file-based collection.

New files: `CubeEgress/` (20 files: Lua modules, nginx config, Dockerfile, iptables scripts, systemd units, CA generation); `CubeMaster/pkg/service/httpservice/cube/ca_download.go`; `CubeMaster/pkg/templatecenter/cube_egress_ca/`; `CubeMaster/pkg/templatecenter/cube_egress_ca_bake.go`; DB migration `0005_cube_egress.sql`.

#### Container Log Forwarding

Container init-process stdout/stderr is now streamed from the agent to the shim via a dedicated vsock connection and appended to log files on the host. A new `cubecli cubebox logs` subcommand lets operators read these logs from outside the sandbox.

- **Log streaming** (#535): The shim injects a `cube.container.log_forwarding=true` annotation into the OCI spec, causing the agent to create stdout/stderr pipes (1 MiB buffer, `O_NONBLOCK`) for the init process. A dedicated vsock channel carries the log stream to the shim, which appends to `/data/log/template//stdout|stderr` during template builds and to `./stdout` / `./stderr` in the bundle directory for normal sandboxes.
  Log forwarding is cleanly cancelled before pause/snapshot/teardown, and pipe write fds are closed on process exit so readers receive EOF (#541). Exec I/O relay (FIFO-based) is kept separate from init log forwarding.
- **`cubecli cubebox logs`** (#528): New subcommand to read container stdout/stderr from `/data/cubelet/state/io.containerd.runtime.v2.task/default//stdout|stderr`. Supports `--tail N`, `--head N`, `--all`, and `--stderr` flags. Since log files live inside the cubelet mount namespace, the command re-execs itself via the existing C constructor in `pkg/cubemnt/nsenter.c` to safely enter the namespace before any Go code runs. Includes `openNoFollow()` path validation hardened against symlink-following attacks.

#### Node Component Version Matrix

A new version tracking infrastructure gives operators cluster-wide visibility of component versions across all nodes, with a dedicated Web UI page.

- **Version collection and matrix** (#500): Cubelet collects component versions (guest-image, cube-agent, kernel, plus control-plane components from the release manifest) and reports them to CubeMaster, which maintains a version matrix in the `node_component_version` table (DB migration `0004`). The matrix groups nodes by reported version for each component, surfaces version skew, and exposes summary and detail APIs through CubeAPI.
- **Standardized version injection** (#493): All Go and Rust binaries now receive version, commit, and build-time metadata via ldflags / `build.rs`. A machine-readable `release-manifest.json` is generated in one-click release bundles so every artifact is traceable to the same release. The `cubecli version` and `cubemastercli version` output formats are unified across components.
- **Web UI Versions page** (#500, #481): A new `Versions.tsx` page (762 lines) with i18n support (en/zh) shows per-component version distribution across nodes.
  The sidebar and Settings About section now display the actual release tag (injected at build time as `__APP_VERSION__`) instead of hardcoded versions.

New files: `CubeMaster/pkg/nodemeta/versionmatrix.go`; `web/src/pages/Versions.tsx`; `web/src/locales/en/versions.json`, `zh/versions.json`; DB migration `0004_node_component_version.sql`.

#### Template Replica Compatibility

Template replicas are now checked against node component versions, with stale/missing replicas surfaced in both the API and Web UI.

- **Compatibility matrix and version binding** (#510): The template compatibility system compares each template's bound component versions (guest-image, cube-agent, kernel) against what each node currently reports. Results are stored in `template_versions` (DB migration `0006`) and exposed via `/templates/compat` (summary) and `/templates/compat/{id}` (per-template detail). Version binding management lets operators pin a template to specific component versions at creation time.
- **Web UI** (#545): The template detail page now shows per-replica compatibility badges, version delta between bound and current component versions, and a stale-replica warning banner with a rebuild trigger. New components: `CompatBadge`, `CompatSection`, `CompatWarning`, `CompatNodeCard`, `VersionDeltaList`.

New files: `CubeMaster/pkg/templatecenter/compat.go`; `CubeMaster/pkg/service/httpservice/cube/template_compat.go`; DB migration `0006_template_replica_compat.sql`.

#### Template Image Build Pipeline Overhaul

The template image build pipeline has been rearchitected to support daemonless operation via skopeo/umoci, with a 72% reduction in peak disk usage and file-level content deduplication.

- **Daemonless export path** (#492, #506): When skopeo and umoci are available on the CubeMaster node, template images are pulled via `skopeo copy` into a local OCI layout and unpacked with `umoci unpack --rootless`, eliminating the Docker daemon requirement.
  Falls back to Docker for backward compatibility. The export strategy is chosen once at image resolution time so preparation and export stay consistent.
- **Artifact management** (#506): A new job runner orchestrates the full pipeline (image export → rootfs artifact build → distribution), with redo support that can resume from the last completed phase. File-level content fingerprints (SHA256) enable artifact deduplication across builds, and artifact cleanup is managed through a structured lifecycle. Redo operations now carry the correct template ID through working requests (#544).
- **Disk usage optimization** (#472): Peak disk usage during image-to-ext4 build is reduced from ~4.2× to ~1.2× image size through five complementary optimizations:
  1. **Pipe-streamed export**: Docker export stdout is connected directly to `tar -xf` stdin via a 1 MiB pipe (`F_SETPIPE_SZ`), eliminating the intermediate `rootfs.tar` file.
  2. **Early workDir cleanup**: The scratch workDir is removed immediately after the rootfs reaches the store directory, before ext4 creation begins.
  3. **Precise ext4 sizing**: Power-of-2 alignment is replaced with a triple-overhead model (fixed 256 MiB + 10% of data + 1 KiB per file), aligned to 256 MiB boundaries.
  4. **Direct-to-storeDir export**: On local fast filesystems (detected via statfs magic), the rootfs is exported directly into the store directory, skipping the workDir→storeDir relocate step. NFS/CIFS fall back to the relocate path to avoid cross-device copies.
  5. **Disk-space pre-check**: A fail-fast statfs check on the store directory parent ensures sufficient space before the build starts, with a configurable safety margin (`CUBEMASTER_DISK_SPACE_SAFETY_MARGIN`, default 1.5×).

  SHA256 computation uses a 4 MiB buffer to reduce read syscalls. A loop-mount streaming ext4 build phase (gated behind `CUBEMASTER_LOOP_MOUNT_EXT4_ENABLED`, default false) is also implemented with `CAP_SYS_ADMIN` detection.
- **SDK alignment** (#485): CubeAPI `POST /templates` and Python/Go SDKs now expose DNS, egress CIDRs, registry auth, command/args, network type, and node scope options, matching the full `cubemastercli template create-from-image` option set.

New files: `CubeMaster/pkg/templatecenter/image/` (export, ext4, disk, command, ref, source, types, paths, util); `CubeMaster/pkg/templatecenter/artifact_build.go`, `artifact_cleanup.go`, `distribution.go`, `fingerprint.go`, `image_job_runner.go`, `job_constants.go`, `job_dto.go`.

#### Network Performance

- **TAP fd acquisition optimization** (#487): A three-tier `GetTapFile` strategy replaces the old single-path approach:
  - **Fast path**: When `state.tap.File` is already cached, return it immediately (0 syscalls).
  - **Hot path**: For pooled taps with a closed fd, reopen with just 2 syscalls (`open` + `TUNSETIFF`), skipping the expensive `restoreTap` flow (netlink lookup, `LinkSetUp`, `SetMTU`, TC filter attach, ARP entry).
  - **Recovery path**: Fall back to full `restoreTap` only when there is no in-memory state or the tap is held externally.

  The fdserver JSON response now includes the ifindex, allowing cubelet to skip its own `netlink.LinkByName` call, eliminating a serialization point during concurrent sandbox creation. Cubelet falls back to `LinkByName` only when ifindex is 0 (backward-compatible with older agents).

  A TOCTOU race between `EnsureNetwork` and `ReleaseNetwork` is fixed by replacing singleflight-style dedup with a per-sandbox `creating` guard channel registered in the same critical section as the state check. Includes a pprof debug server (`--pprof-listen` flag) and 390 lines of concurrency tests (6 functions, 64-goroutine stress test clean under `-race`).

  Benchmarks (BMI5, Xeon Platinum 8255C, kernel 6.6.119): Network P50 35.3→23.1ms (**35% faster**), Network P99 86.6→51.2ms (**41% faster**), Total P50 106.1→92.0ms (**13% faster**), Throughput 194.8→209.8 sandboxes/s (**8% higher**).
- **BPF checksum optimization** (#469): `bpf_csum_diff()` is replaced with `bpf_{l3,l4}_csum_replace` helpers in both `from_world` and `from_cube` BPF programs. Combined with the TAP TX offload work (#505), this enables TSO/UFO/CSUM offloads to be re-enabled on virtio-net TAPs (reverting #110), and the `disableGRO()` requirement on host NICs is dropped.

### ✨ Enhancements

#### Scheduling

- **Configurable overcommit and Redis allocation bypass** (#525): Two new scheduler configuration knobs: `overcommit_ratio` (default CPU=3, Mem=2) with optional per-instance-type overrides via `overcommit_ratio_conf`, and `ignore_redis_allocation` (default false) to treat Redis-recorded allocations as zero. Applied consistently across filter and score plugins, with non-positive ratios clamped back to defaults. Physical load guards (CPU utilization ceiling, real-time free memory) are intentionally preserved.

#### Affinity

- **Custom node affinity selector** (#504, #467): The `com.nodeaffinity.selector` annotation now accepts arbitrary `NodeSelectorRequirements` (In, NotIn, Exists, DoesNotExist, Gt, Lt) as a JSON array of `{key, operator, values}`. Node labels from registration are carried through `Node.NodeLabels`, merged into `Labels()` with an `atomic.Pointer` cache and `InvalidateLabelsCache()` for mutation safety. DoS hardening: max annotation size 4 KB, 10 selectors per request, 50 values per In/NotIn. Configurable allowed keys default to zone, cluster-id, cpu-type, memory-size, cpu-cores, instance-type. 872 lines of tests covering 47 cases.

#### Template Management

- **tpl- prefix enforcement** (#474): Template IDs are now always auto-generated with a `tpl-` prefix across all creation paths (API, CLI, Web UI, sandbox commit). User-specified IDs are accepted for backward compatibility but silently ignored; the server always returns an auto-generated `tpl-` prefixed ID as the authoritative template identifier.
  Validation rejects bare `tpl-` / `snap-` prefixes and non-conforming annotation prefixes.
- **Builder image downgrade to ubuntu:20.04** (#468): The builder base image is changed from `ubuntu:22.04` to `ubuntu:20.04`, lowering the minimum glibc requirement from 2.34 to 2.31. Affects `Dockerfile.builder`, one-click installer preflight checks, CI workflows, and documentation.

#### Web UI

- **Template policy display** (#486): The template detail page now shows environment variables, network type, internet access, DNS servers, allow-out rules, and deny-out rules parsed from `createRequest`. A dedicated "Network Policy" section includes per-rule copy buttons. A `BoolBadge` component is extracted as a shared UI primitive.
- **CubeAPI container image** (#513): A container build for the cube-api service produces a self-contained runtime image suitable for one-click and orchestrated deployments, with a lean build context.

#### SDK

- **Python SDK v0.3.0** (#521): Bump to 0.3.0 with new APIs for security proxy configuration.

#### PVM

- **Kernel LOCALVERSION rename** (#511, #534): The PVM host and guest kernel `LOCALVERSION` is renamed to a clean descriptive scheme so the distribution base and host/guest role are obvious from `uname -r`. Deployment configs, user-facing guides, and blog references are updated to match.

### 🐛 Bug Fixes

These fixes address issues present in v0.3.1:

- **Virtiofs config skipped when shareDirs is empty** (#533): Cubelet no longer generates virtiofs configuration or annotations when no shared directories are specified, preventing broken config generation.
- **DNS server IP automatically added to AllowOut** (#526): When any DNS rule is configured, the DNS server IP is now added to `AllowOut` to ensure DNS resolution works through egress policy. Includes regression test coverage.
- **Cubelog nil trace panic** (#512): Background workers and detached job contexts that run without a request trace no longer panic on nil dereference; trace handling is now tolerant of a missing trace.
- **Storage symlink resolution in host-dir cleanup** (#530): `cleanupHostDirVolumes` now resolves base-path symlinks when walking sandbox directories, so bind mounts under paths like `/data → /mnt/ssd/data` are correctly identified and unmounted instead of leaking or having their backing directories wiped.
- **Network plugin bootstrap warnings** (#491): Cubelet startup no longer logs valid network configuration keys as "unknown TOML fields"; the existing config struct is now reused when reading bootstrap overrides.
- **DNS not auto-allowed when internet is disabled** (#490): When `AllowInternetAccess=false`, resolved DNS servers are no longer appended to `allow_out`, so the deny-all outbound policy consistently blocks DNS resolution. Fixes #408.
- **Ripgrep dependency removed from one-click runtime** (#496): The one-click install and startup path no longer requires or auto-installs `ripgrep`. Shell checks now use grep-based helpers.
- **Virtiofs migration_on_error set to GuestError** (#482): The native virtiofs server now uses `MigrationOnError::GuestError` instead of `Abort`. Per-inode failures during snapshot restore surface as guest FS errors (ENOENT/EIO) on the affected paths rather than tearing down the entire live migration.
- **VMM virtio-fs queue fault tolerance** (#464): `process_queue_serial()` no longer panics on malformed descriptors. Failures are recovered by writing an EIO FUSE error reply to the guest and continuing to serve the queue. A new `device_memory` view is added for device-backed memory regions (virtio-pmem, virtio-fs DAX, ivshmem/zshm BARs).
- **Cgroup v2 manager creation** (#488): The agent now uses the cgroup v2 creation path from `cgroups-rs` and attaches container processes through `cgroup.procs`, avoiding v1 controller name failures in unified cgroup mode. Process ID collection for cleanup and signals also reads from `cgroup.procs`.
- **Node health expiry on stale heartbeat** (#455): Node health is now derived from heartbeat freshness: stale heartbeats are correctly reported as unhealthy in nodemeta reads, localcache-backed reads, and scheduler prefilter. A shared helper centralizes the timeout rule across all three paths.
- **SELinux context restore after one-click install** (#471): File contexts under the install prefix are now restored before starting systemd services, fixing one-click installs on SELinux Enforcing hosts. Fixes #465.
- **Glibc preflight pipefail race** (#473): The `ldd --version` output is now fully captured before parsing, preventing strict-mode preflight checks from exiting on an expected SIGPIPE.
- **Python SDK streaming request body read** (377a99dc): Request bodies in `IPOverrideTransport` are now buffered before copying, so multipart uploads no longer fail with `RequestNotRead`.
- **CLI help text corrections** (#478): Fixed incorrect command names (e.g., `cuebcli` → `cubecli`), spelling mistakes, outdated deprecation hints, and truncated descriptions in both `cubecli` and `cubemastercli`.

### 📚 Documentation

- **DEB install instructions** (#532): Added apt (DEB) install instructions alongside existing yum (RPM) steps for Python SDK setup in the Quick Start guide.
- **Benchmark blog env var fixes** (#497): Fixed benchmark setup examples that mixed environment variables from different client stacks: E2B variables for `e2b_code_interpreter` examples, `CUBE_API_URL` + CubeProxy settings for CubeSandbox SDK examples.
- **CNCF Landscape badge** (#477): Added CNCF Landscape badge and footer note to README in both English and Chinese.
- **Template ID documentation cleanup** (#476): Removed all `--template-id` flags from `create-from-image` documentation and examples since template IDs are now auto-generated with `tpl-` prefix.
- **Install guide links in benchmark posts** (#475): Added installation guide callouts to the §2.1 Hardware section of all four benchmark blog posts (EN + ZH, bare-metal + PVM).
- **Troubleshooting links** (#466): Added GitHub issue #311 troubleshooting URL to XFS filesystem check error messages in `install.sh`, `online-install.sh`, and `check-deps.sh`. Updated install docs to use direct links to the Releases page.
- **CODEOWNERS** (#522): Added CubeEgress maintainer entry.

### ⚙️ Engineering Improvements

- **Build system reorganization** (#529): Per-target `.PHONY` declarations replace the single bulk list. A new `clean-rust-target-dirs` target removes `target/` under each top-level Rust workspace. The `all` target is driven from a shared `BINARIES` list.
- **Format check CI** (#524): `fmt` targets are added to all component Makefiles (Go and Rust), with a new `.github/workflows/fmt-check.yml` CI workflow that runs format checking on PRs. The agent's `fmt` target automatically generates required files (`version.rs`, protocol `.rs`) before formatting.
- **CI review-comment via stdin** (#494): PR review comments are now passed via stdin (`--body-file -`) instead of temp files, keeping review content out of the checkout directory.
- **CI auto-review comment reuse** (#489): Automated review comments now update the bot's existing marked comment on repeated PR synchronizations instead of creating new top-level comments each time.
- **Metric report jitter** (#479): The Cubelet CLS metric report loop now adds random jitter (each interval drawn uniformly from `[t, 1.5t]`) to prevent thundering herd issues when multiple agents start concurrently.