---
title: "γ-World: Generative Multi-Agent World Modeling Beyond Two Players"
source: newsletter
source_url: https://research.nvidia.com/labs/sil/projects/gamma-world/
ingested: 2026-06-01
feed_name: NVIDIA Research
source_published: 2026-05-28T04:23:24Z
type: article
sha256: 8491006f6a8b853d9991c413b087f94525198eacdd84415e60227173eb700126
tags: [world-model, multi-agent, nvidia, generative-ai, attention-mechanism]
---

URL: https://research.nvidia.com/labs/sil/projects/gamma-world/
Title: Generative Multi-Agent World Modeling Beyond Two Players
URL Source: http://research.nvidia.com/labs/sil/projects/gamma-world/
Published Time: Thu, 28 May 2026 04:23:24 GMT
Markdown Content:

_**TL;DR:** γ-World is a generative multi-agent world model that supports independently controllable, permutation-symmetric agents via **Simplex Rotary Agent Encoding** and **Sparse Hub Attention**, achieving real-time **24 FPS** rollouts and zero-shot generalization from two to four players._

γ-World interactively generates coherent future frames from multi-agent actions while preserving shared-world consistency, scaling from virtual games to real-world environments.

![Image 1: γ-World Teaser](http://research.nvidia.com/labs/sil/projects/gamma-world/assets/teaser.png)

## Gallery

* * *

### γ-World Overview

A comprehensive overview of γ-World: interactive multi-agent world generation across diverse scenes and configurations.

### Two-Agent Interaction

Qualitative results of two-agent interaction. Each agent is independently controllable while sharing the same evolving world.

![Image 2: Two Agent Visualization](http://research.nvidia.com/labs/sil/projects/gamma-world/figures/combined_2agent_v7.png)

### Four-Agent Generalization

Benefiting from the permutation-symmetric simplex agent encoding, γ-World generalizes from two to four players **without additional training**.

![Image 3: Four Agent Visualization](http://research.nvidia.com/labs/sil/projects/gamma-world/figures/4agent_visualization.png)

### Real-World Robotics Coordination

γ-World extends to real-world multi-robot coordination scenarios, demonstrating practical applicability beyond virtual environments.

![Image 4: Robotics Visualization](http://research.nvidia.com/labs/sil/projects/gamma-world/figures/robo-visualization.png)

## Abstract

* * *

World models for interactive video generation have largely focused on single-agent settings, where future observations are rolled out from a single action stream, user input, or controllable viewpoint. However, many simulated worlds are inherently populated: multiple players, robots, or embodied agents act simultaneously within a shared, evolving environment. Scaling world models to such settings requires a principled multi-agent design: agents should remain independently controllable, permutation-symmetric, and support efficient inference while maintaining consistency across time and perspectives. In this paper, we present **γ-World**, a generative multi-agent world model for interactive simulation. γ-World introduces _Simplex Rotary Agent Encoding_, a parameter-free extension of 3D RoPE that represents agents as vertices of a regular simplex in rotary angle space. This gives each agent a distinct phase while making all agents permutation-equivalent, enabling scalable agent identity without learned per-slot identities or a fixed agent ordering. To support efficient cross-agent interaction, we further propose _Sparse Hub Attention_, where learnable hub tokens mediate communication across agents, reducing cross-agent attention cost from quadratic to linear in the number of agents. Finally, we use a bidirectional multi-agent teacher to guide a block-causal student with distillation, after which the final causal model can use KV caching for streaming, achieving real-time action-responsive rollouts at **24 FPS**.
Experiments in multiplayer virtual environments show that γ-World improves video fidelity, action controllability, and inter-agent consistency over slot-based and dense-attention baselines, while generalizing from two to four players without additional training.

## Method

* * *

![Image 5: Method overview](http://research.nvidia.com/labs/sil/projects/gamma-world/figures/multiagent_method.png)

**Architecture overview.** γ-World takes per-agent action streams and produces a shared, multi-view rollout. Two key designs make it scalable to many agents:

#### Simplex Rotary Agent Encoding

A parameter-free extension of 3D RoPE that represents agents as vertices of a regular simplex in rotary angle space. Each agent receives a distinct phase while remaining _permutation-equivalent_, eliminating the need for learned per-slot identities or a fixed agent ordering.

#### Sparse Hub Attention

Learnable hub tokens mediate communication across agents, reducing cross-agent attention cost from _quadratic_ to _linear_ in the number of agents, enabling efficient scaling to four or more agents.

### Efficiency: Sparse Hub Attention

Sparse Hub Attention scales linearly with the number of agents, while dense attention scales quadratically.

![Image 6: Sparse Hub Attention Timing](http://research.nvidia.com/labs/sil/projects/gamma-world/figures/sparse_hub_timing_comparison.png)
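
To make the linear-versus-quadratic claim concrete, the sketch below counts query-key pairs under two attention patterns. It is only an illustration, not γ-World's implementation: the page does not spell out the exact connectivity, so the pattern here is an assumption (each agent's tokens attend within that agent and to a fixed set of hub tokens, and hub tokens attend to every agent token and to each other). The function names and the sizes `tokens_per_agent` and `num_hubs` are hypothetical.

```python
def dense_pairs(num_agents: int, tokens_per_agent: int) -> int:
    """Query-key pairs when every token attends to every token of every agent."""
    total_tokens = num_agents * tokens_per_agent
    return total_tokens * total_tokens


def hub_pairs(num_agents: int, tokens_per_agent: int, num_hubs: int) -> int:
    """Query-key pairs under an ASSUMED hub-mediated pattern (see lead-in)."""
    # Each agent's tokens attend only to tokens of the same agent.
    within_agent = num_agents * tokens_per_agent ** 2
    # Agent tokens read from the hubs, and the hubs read from agent tokens.
    agent_hub = 2 * num_agents * tokens_per_agent * num_hubs
    # Hub tokens attend to each other; this term does not depend on agent count.
    hub_hub = num_hubs ** 2
    return within_agent + agent_hub + hub_hub


if __name__ == "__main__":
    tokens_per_agent, num_hubs = 256, 16  # hypothetical sizes, for illustration only
    for num_agents in (2, 4, 8):
        dense = dense_pairs(num_agents, tokens_per_agent)
        hub = hub_pairs(num_agents, tokens_per_agent, num_hubs)
        print(f"agents={num_agents} dense={dense} hub={hub}")
```

With the per-agent token count and hub count held fixed, the dense count grows by a factor of four each time the number of agents doubles, while the hub-mediated count grows by a constant amount per added agent. Wall-clock timings, such as those in the figure above, also depend on kernel and hardware details that this count ignores.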