Nvidia trains a tiny AI model that controls humanoid robots better than specialists

Nvidia researchers have built a small neural network that controls humanoid robots more effectively than specialized systems, even though it uses far fewer resources. The system works with multiple input methods, from VR headsets to motion capture.

The new system, called HOVER, needs only 1.5 million parameters to handle complex robot movements. For context, typical large language models use hundreds of billions of parameters.
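For a rough sense of that scale, a plain fully connected policy with a couple of hidden layers already lands in the low millions of parameters. The layer widths in this sketch are illustrative guesses, not Nvidia's published architecture:

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases for a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical sizes: observation/command input, two hidden layers,
# one output per actuated joint.
print(mlp_param_count([256, 1024, 768, 25]))  # 1069593, i.e. ~1.1M
```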

The team trained HOVER in Nvidia's Isaac simulation environment, which runs robot motion roughly 10,000 times faster than real time. According to Nvidia researcher Jim Fan, a full year of training in the virtual world therefore takes just 50 minutes of actual computing time on a single GPU.
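The numbers line up: at a 10,000× real-time factor, one simulated year is about 53 minutes of wall-clock compute, a quick sanity check in Python:

```python
# One simulated year compressed by a 10,000x real-time factor:
SPEEDUP = 10_000
minutes_per_year = 365 * 24 * 60   # 525,600 minutes
print(minutes_per_year / SPEEDUP)  # 52.56 -> roughly the 50 minutes Fan cites
```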

Small and versatile

HOVER transfers zero-shot from simulation to physical robots without any fine-tuning, says Fan. The system accepts input from multiple sources: head and hand tracking from XR devices such as the Apple Vision Pro, full-body poses from motion capture or RGB cameras, joint angles from exoskeletons, and standard joystick controls.
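The article does not detail how one network accepts such different signals, but a common pattern, sketched below under assumed slot names and dimensions rather than HOVER's actual interface, is to pack every mode into one fixed command vector and mask out the slots a given device does not supply:

```python
import numpy as np

# Each input mode writes into a shared command vector; a mask tells
# the policy which entries are meaningful. Slot layout is hypothetical.
SLOTS = {"head_pose": 7, "hand_poses": 14, "body_keypoints": 51, "joint_angles": 25}
OFFSETS, TOTAL = {}, 0
for name, dim in SLOTS.items():
    OFFSETS[name] = (TOTAL, TOTAL + dim)
    TOTAL += dim

def build_command(**inputs):
    """Pack whatever the active device provides; mask the rest."""
    cmd = np.zeros(TOTAL, dtype=np.float32)
    mask = np.zeros(TOTAL, dtype=np.float32)
    for name, values in inputs.items():
        lo, hi = OFFSETS[name]
        cmd[lo:hi] = values
        mask[lo:hi] = 1.0
    return np.concatenate([cmd, mask])  # fed to the policy

# A VR headset supplies only head and hand poses:
obs = build_command(head_pose=np.zeros(7), hand_poses=np.zeros(14))
```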

The HOVER model allows a robot to be remotely controlled via a VR headset without any task-specific fine-tuning. | Video: Nvidia

For each input method, the system outperforms controllers built specifically for that single input type. Lead author Tairan He speculates that this is because the network acquires a broad understanding of physical concepts such as balance and precise limb control, and applies it across all control modes.

The system builds on the open-source H2O & OmniH2O project and works with any humanoid robot that can run in the Isaac simulator. Nvidia has posted examples and code on GitHub.
