The simulations ran on specialized AI chips from Nvidia rather than the general-purpose chips used in computers and servers. As a result, the researchers say, they were able to train the robots in less than one-hundredth the time that’s normally required.
Using the specialized chips also presented challenges. Nvidia’s chips excel at the calculations crucial for rendering graphics and running neural networks, but they’re not well suited to simulating physics, such as the contact and friction involved in climbing and sliding. So researchers had to come up with some clever software workarounds, says Rev Lebaredian, Nvidia’s vice president of simulation technology. “It has taken us a long time to get it right,” he says.
Simulation, AI, and specialized chips have the potential to advance robotic intelligence. Nvidia has developed software tools that make it easier to simulate and control industrial robots using its chips. The company has also established a robotics research lab in Seattle. And it sells chips and software for use in self-driving vehicles.
Unity Technologies, which makes software for building 3D video games, has also branched into making software suitable for roboticists to use. Danny Lange, the company’s senior vice president for artificial intelligence, says Unity noticed how many researchers were using the company’s software to run simulations, so they made it more realistic and compatible with other robotics software. Unity is now working with Algoryx, a Swedish company that is testing whether reinforcement learning and simulation can train forestry robots to pick up logs.
Reinforcement learning has been around for decades but has produced some notable AI milestones recently, thanks to advances in related technology. In 2015, reinforcement learning was used to train a computer to play Go, a subtle board game long thought to demand human intuition, with superhuman skill. It has more recently been put to practical uses, including automating aspects of chip design that require experience and judgment. The trouble is, learning this way requires a lot of time and data.
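The core idea of reinforcement learning can be seen in a toy sketch: an agent tries actions, observes rewards, and nudges its value estimates toward what it experienced. This is a minimal illustrative example (a two-lever bandit, not any system described in this article), and it hints at why the approach is data-hungry, since even this tiny problem takes many trials.

```python
import random

# Minimal reinforcement-learning sketch: the agent learns by trial and
# error which of two levers pays off. Lever 0 pays 0, lever 1 pays 1.
rewards = [0.0, 1.0]
q = [0.0, 0.0]            # estimated value of each action
alpha, epsilon = 0.1, 0.1  # learning rate, exploration rate

random.seed(0)
for step in range(1000):
    # Explore occasionally; otherwise exploit the best-known action.
    if random.random() < epsilon:
        action = random.randrange(2)
    else:
        action = max(range(2), key=lambda a: q[a])
    r = rewards[action]
    q[action] += alpha * (r - q[action])  # move estimate toward observed reward

print(q[1] > q[0])  # True: the agent learns lever 1 is better
```

Real robotic tasks replace the two levers with continuous joint motions and noisy sensor readings, which is why training can take days of simulated experience rather than a thousand loop iterations.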
For instance, it took the company OpenAI more than 14 days to train a robot hand to manipulate a Rubik’s Cube in crude ways with reinforcement learning, using numerous CPUs running together. Having to wait two weeks each time a robot was retrained might discourage companies from using the robot.
Early efforts at training robots with reinforcement learning split the process across several real-world robots. Improvements in the physics simulations have made it possible to accelerate learning in virtual environments.
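The speedup from virtual environments comes largely from running many simulations in lockstep, so one tick of the simulator advances thousands of training episodes at once. The sketch below is an assumption about the general idea, not Nvidia’s actual implementation; the environment here is trivially simple (constant-velocity motion) to keep the batching pattern visible.

```python
# Toy sketch of batched simulation: step many simple environments in
# lockstep, so a single "tick" advances all of them at once.
N = 1024                                        # parallel environments
positions = [0.0] * N
velocities = [float(i % 4) for i in range(N)]   # varied initial speeds
dt = 0.01                                       # timestep in seconds

def step_all(positions, velocities, dt):
    """Advance every environment by one timestep in a single pass."""
    return [p + v * dt for p, v in zip(positions, velocities)]

for _ in range(100):                            # 100 ticks of simulated time
    positions = step_all(positions, velocities, dt)

# Each environment advanced independently: position ≈ velocity * total time.
print(positions[1])  # ≈ 1.0 (velocity 1.0 over 1 second)
```

On a GPU, the list comprehension becomes one vectorized kernel over all environments, which is what makes simulated experience so much cheaper to gather than real-world trials.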
The new work is “extremely exciting for end users,” says Andrew Spielberg, a student at MIT who has used similar simulation methods to devise new physical designs for robots. He notes that a research group at Google has done related work, speeding up robot learning by splitting it up and running it on one of the company’s custom Tensor Processing Unit chips.
Tully Foote, who manages the widely used open source Robot Operating System at the Open Robotics Foundation, says simulation is increasingly important for commercial users. “Validating software in realistic scenarios before deploying to hardware saves a lot of time and money,” he says. “It can run faster than real time, never breaks the robot, and can be reset automatically and instantly if there’s an error.”
But Foote adds that transferring robot learning to the real world is a lot more challenging. “There’s a lot more uncertainty in the real world,” he says. “Dirt, lighting, weather, hardware non-uniformity, wear and tear, all need to be tracked.”
Lebaredian at Nvidia says the kind of simulation used to train the walking robots may eventually influence the design of the algorithms involved too. “Virtual worlds are valuable for just about everything,” he says. “But definitely one of the most important ones is constructing playgrounds or training grounds for the AIs we want to create.”
This story originally appeared on wired.com.