What is Gemini Robotics, Google DeepMind’s new AI for humanoids?

Google DeepMind is bringing Gemini 2.0’s intelligence to general-purpose robotic agents in the physical world.
Google has taken a major leap in robotics: Google DeepMind has launched Gemini Robotics, a suite of AI models meant to equip robots to perform complex physical tasks with unprecedented accuracy and dexterity.
Alongside Gemini Robotics, the AI research lab has also launched Gemini Robotics ER; together, the two models are intended to let robots carry out complex physical tasks, including ones for which they have no prior training.

Gemini Robotics Models

The AI suite comprises two models: Gemini Robotics and Gemini Robotics ER. Gemini Robotics is an advanced vision-language-action (VLA) model built on the Gemini 2.0 framework, adding physical actions as a new output modality. Reportedly, the model allows robots to process and respond to visual inputs, comprehend language commands, and execute complex physical tasks.
Meanwhile, Gemini Robotics ER is an AI model that gives robots enhanced spatial understanding and embodied reasoning capabilities, essentially letting roboticists run their own programs on top of it with improved performance. It can be adapted to different types of robots, from bi-arm platforms to humanoids like Apptronik's Apollo. In the demos shared by the company, both models displayed marked improvements over existing technologies: Gemini Robotics reported a 74.5 per cent success rate on in-distribution tasks, compared with 42.6 per cent for multi-task diffusion policies.