June 24, 2026 | Research RL Agents

Qwen-AgentWorld: Train the Agent in a Dream

Here's a clean idea from Qwen. Training agents in the real world is slow, fragile, and expensive: you need actual environments, actual websites, actual apps that break. So instead, build a model that simulates the environment, and let the agent practice inside the simulation. A dream world, basically, where you can spin up thousands of fake-but-realistic environments and run reinforcement learning cheaply.

They're calling these language world models, and they shipped two: a 35B-A3B and a big 397B-A17B. Trained on more than 10 million interaction trajectories across seven domains, through a three-stage pipeline that first injects general capability, then teaches it to predict the next state of an environment, then sharpens simulation fidelity with RL. Two ways to use it: as a standalone simulator to generate cheap training environments, or as a foundation model where the world-model training acts as a warmup that just makes the downstream agent better.
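To make the "standalone simulator" mode concrete, here is a minimal sketch of the loop it implies: the agent sends an action, and a world model, not a real environment, produces the next observation. Everything in it is an assumption for illustration. `toy_world_model`, `SimulatedEnv`, and `rollout` are hypothetical names, not part of the Qwen-AgentWorld code, and the hand-written rule-based function stands in for what would really be a call to a trained language world model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical interface: a world model maps (history, action) to the text
# of the next observation. A real one would be a learned language model.
History = List[Tuple[str, str]]
WorldModel = Callable[[History, str], str]


def toy_world_model(history: History, action: str) -> str:
    """Rule-based stand-in for a learned world model (NOT Qwen's API).

    It pretends to be a tiny shell: `touch NAME` creates a file and `ls`
    lists the files created so far, inferred purely from the history.
    """
    files = set()
    for past_action, _ in history:
        if past_action.startswith("touch "):
            files.add(past_action.split(" ", 1)[1])
    if action.startswith("touch "):
        return "ok"
    if action == "ls":
        return " ".join(sorted(files))
    return "error: unknown command"


@dataclass
class SimulatedEnv:
    """Environment whose dynamics come entirely from the world model."""
    world_model: WorldModel
    history: History = field(default_factory=list)

    def step(self, action: str) -> str:
        # The simulator predicts the next state from everything seen so far.
        observation = self.world_model(self.history, action)
        self.history.append((action, observation))
        return observation


def rollout(env: SimulatedEnv, actions: List[str]) -> History:
    """Run a fixed action sequence and return the (action, observation) pairs."""
    return [(action, env.step(action)) for action in actions]


if __name__ == "__main__":
    env = SimulatedEnv(toy_world_model)
    for action, observation in rollout(env, ["touch a.txt", "touch b.txt", "ls"]):
        print(f"{action!r} -> {observation!r}")
```

The point of the shape is that `SimulatedEnv` never touches a real system, so spinning up a thousand copies costs only model inference. An RL setup would replace the fixed action list with a policy and add a reward signal, both of which are left out here.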

The result that's worth flagging: on their AgentWorldBench, world-model warmup beat training in the real environment alone. Read that twice. Practicing in the dream produced a better agent than practicing in reality, because the dream gives you volume and control that reality can't.

This is the same trick that quietly powers a lot of robotics and game AI, now pointed squarely at general agents, and the code is open at https://github.com/QwenLM/Qwen-AgentWorld. If simulated environments keep closing the gap with real ones, the bottleneck on agent training stops being data collection and starts being how good your dream is.
