Google DeepMind Launches SIMA 2: A Breakthrough AI Agent That Thinks, Learns, and Acts in 3D Worlds

Google DeepMind has taken a major leap toward the future of interactive artificial intelligence with the reveal of SIMA 2, the newest generation of its Scalable Instructable Multiworld Agent. Designed to reason, collaborate, and learn autonomously inside rich 3D environments, SIMA 2 marks what researchers are calling “a milestone in creating general and helpful AI agents.”

Released as a limited research preview for select academics and game developers, SIMA 2 is a significant step toward AI systems that don’t just follow instructions but can also reason about them, explain their plans, and carry them out.

A Smarter Core: Gemini Inside an Action-Driven AI

At the heart of SIMA 2 lies Google’s powerful Gemini model, enabling the agent to interpret complex instructions, understand overarching goals, and clearly articulate how it plans to carry out its actions. DeepMind explains: “SIMA 2 can do more than just respond to instructions—it can think and reason about them.”

Its predecessor, SIMA 1, laid the foundation with over 600 learned skills across commercial games. SIMA 2 builds on this with advanced cognitive abilities, shifting the AI from a command-follower to a reasoning collaborator.

Training That Blends Human Insight with AI-Generated Intelligence

SIMA 2’s training strategy is part of what sets it apart. DeepMind combined human demonstrations with Gemini-generated labels, giving the agent two complementary sources of training data: human-provided examples and AI-generated annotations.

This hybrid approach allows SIMA 2 to:

  • Explain what it intends to do
  • Describe the steps needed to complete a task
  • Learn to collaborate naturally with users

Researchers say interactions now feel “less like giving commands and more like working with a companion who can reason about the task at hand.”

Impressive Generalisation Across New and Unknown Worlds

In testing, SIMA 2 demonstrated robust generalisation beyond its training environments. It successfully performed complex tasks in games it had never seen before, including the Viking survival game ASKA and MineDojo, a research implementation of Minecraft.

SIMA 2 also transferred skills learned in one game—such as mining or navigation—to entirely different virtual spaces. DeepMind reports that the agent has significantly narrowed the performance gap between AI and human players across evaluation tasks.

SIMA 2 + Genie 3: AI That Can Act Inside AI-Generated Worlds

In a notable experiment, DeepMind paired SIMA 2 with Genie 3, a model capable of generating new 3D worlds from a single image or text prompt.

The result?

SIMA 2 was able to orient itself, understand its surroundings, and follow user instructions inside worlds that didn’t exist until moments before.

This fusion hints at a future where AI can be dropped into any environment—real or synthetic—and instantly begin collaborating.

Self-Improvement: An AI That Trains Itself

One of the most innovative aspects of SIMA 2 is its self-directed learning loop. After initial human-led training, the agent can generate its own tasks, evaluate its performance using Gemini-based reward signals, and refine its capabilities without human intervention.

DeepMind highlights: “This process allows the agent to improve on previously failed tasks entirely independently of human demonstrations.”

The data collected through this autonomous self-play becomes training material for future versions—effectively enabling SIMA 2 to level itself up.
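
For readers who want a concrete picture, the loop described above (propose a task, attempt it, score the attempt, train on the result) can be sketched in a few lines of Python. This is a toy illustration only, not DeepMind’s implementation: `ToyAgent`, `propose_task`, `score`, and `self_improve` are hypothetical names, the task list is invented, and the scoring function is a simple placeholder for the Gemini-based reward signal the article mentions.

```python
import random
from dataclasses import dataclass, field

# Hypothetical task names, invented for this illustration.
TASKS = ["chop a tree", "find the campfire", "mine ore"]


@dataclass
class ToyAgent:
    # Hypothetical stand-in for the agent: one success probability per task.
    skill: dict = field(default_factory=lambda: {t: 0.2 for t in TASKS})

    def attempt(self, task, rng):
        # A toy attempt succeeds with probability equal to the current skill.
        return rng.random() < self.skill[task]

    def train_on(self, experience):
        # Placeholder for retraining: each rewarded attempt nudges skill upward.
        for task, reward in experience:
            self.skill[task] = min(1.0, self.skill[task] + 0.05 * reward)


def propose_task(rng):
    # Placeholder for model-generated tasks: pick one at random.
    return rng.choice(TASKS)


def score(succeeded):
    # Placeholder for a model-based reward signal: 1.0 on success, else 0.0.
    return 1.0 if succeeded else 0.0


def self_improve(agent, generations=5, episodes=50, seed=0):
    rng = random.Random(seed)
    buffer = []  # all self-generated experience, kept for later training
    for _ in range(generations):
        experience = []
        for _ in range(episodes):
            task = propose_task(rng)
            reward = score(agent.attempt(task, rng))
            experience.append((task, reward))
        agent.train_on(experience)  # the next "version" learns from this data
        buffer.extend(experience)
    return buffer


agent = ToyAgent()
data = self_improve(agent)
print(len(data), agent.skill)
```

No human demonstrations enter the loop after the agent is created: every training example comes from the agent’s own attempts and the automated score, which is the property the DeepMind quote emphasises.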

Challenges Ahead: Long Tasks, Precision, and 3D Complexity

Despite the progress, DeepMind acknowledges several limitations:

  • Difficulty with long, multi-step tasks
  • Short memory during interactive sessions
  • Precision issues when simulating keyboard/mouse controls
  • Need for improved visual understanding of complex 3D scenes

These constraints show where future iterations will need to improve as DeepMind continues refining the agent.

Responsible Release for Researchers and Developers

SIMA 2 is being rolled out carefully as a limited research preview. DeepMind emphasizes its commitment to responsible development with oversight from internal safety experts.

The long-term ambition?

To eventually apply these capabilities to robotics, where navigation, tool use, and human collaboration are essential.

Meanwhile, Fei-Fei Li’s World Labs Releases ‘Marble’ to the Public

In parallel with DeepMind’s announcement, World Labs, the company founded by AI researcher Fei-Fei Li, has made Marble, its generative world model, publicly available following a two-month beta.

Marble can:

  • Generate 3D worlds from text, images, videos, or 3D layouts
  • Enable users to interactively edit or expand those worlds

As both SIMA 2 and Marble enter the research ecosystem, they signal a new era where AI doesn’t just understand our worlds—it builds them, navigates them, and collaborates with us inside them.
