A preprint paper and blog post published this week show that Google researchers have developed an AI system that learns from animal movements to give robots greater agility.
The paper's co-authors believe their method could advance the development of robots that can handle real-world tasks requiring more agility, such as transporting materials between multi-story warehouses and fulfillment centers.
The team's framework takes motion-capture clips of an animal (in this case, a dog) and uses reinforcement learning, a training technique that rewards software agents for accomplishing goals, to train control policies.
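To make the "reward agents for accomplishing goals" idea concrete, here is a minimal, self-contained sketch of reinforcement learning on a toy multi-armed bandit. It is purely illustrative and unrelated to the paper's actual algorithm; all names and parameters are assumptions.

```python
import random

def train_bandit(arm_rewards, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy bandit: the agent acts, receives a
    noisy scalar reward, and updates its value estimates so that
    higher-reward actions are chosen more often."""
    rng = random.Random(seed)
    estimates = [0.0] * len(arm_rewards)
    counts = [0] * len(arm_rewards)
    for _ in range(steps):
        if rng.random() < epsilon:          # explore a random action
            a = rng.randrange(len(arm_rewards))
        else:                               # exploit the best estimate so far
            a = max(range(len(arm_rewards)), key=lambda i: estimates[i])
        r = arm_rewards[a] + rng.gauss(0, 0.1)  # noisy reward signal
        counts[a] += 1
        estimates[a] += (r - estimates[a]) / counts[a]  # running mean
    return estimates
```

After enough steps, the agent's estimates identify the highest-paying action, which is the same reward-driven learning signal that, at much larger scale, trains the robot's control policies.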
The researchers say that providing the system with different reference motions allowed them to "teach" a quadrupedal Unitree Laikago robot a range of behaviors, from fast walking (up to 2.6 miles per hour) to jumps and turns.
To validate their method, the researchers first compiled a dataset of real dogs performing various skills. (Training was carried out mostly in physics simulation so that the poses of the reference motions could be tracked closely.) Then, by using different motions in the reward function (which describes how the agent should behave), the researchers trained a simulated robot on about 200 million samples to imitate the motor skills.
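A common way to build such an imitation reward is to score how closely the simulated robot's pose tracks the reference motion at each time step. The sketch below is an illustrative formulation, not the paper's exact reward; the weight `scale` is an assumption.

```python
import numpy as np

def imitation_reward(robot_joint_angles, reference_joint_angles, scale=2.0):
    """Reward is highest when the simulated robot's joint pose matches the
    reference motion's pose at the same time step. The exponential maps the
    squared tracking error into (0, 1], with 1.0 meaning a perfect match."""
    error = np.sum((robot_joint_angles - reference_joint_angles) ** 2)
    return float(np.exp(-scale * error))
```

Swapping in a different reference clip changes the reward, and therefore the behavior the policy learns, without changing the learning algorithm itself.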
But simulators usually provide only a rough approximation of the real world. To address this, the researchers adopted an adaptation technique that randomizes the dynamics of the simulation, varying physical quantities such as the robot's mass and friction. An encoder maps these values to a numerical representation (an encoding), which is passed as input to the robot's control policy. When deploying the policy to a real robot, the researchers removed the encoder and searched directly for a set of values that allowed the robot to successfully execute its skills.
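The two pieces of that pipeline, randomized dynamics and an encoder that compresses them for the policy, can be sketched as follows. The parameter names, ranges, and the linear encoder are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dynamics():
    """Randomize simulation dynamics each training episode so the policy
    cannot overfit to one idealized simulator (ranges are illustrative)."""
    return {
        "mass_scale": rng.uniform(0.8, 1.2),   # +/-20% body mass
        "friction": rng.uniform(0.4, 1.25),    # ground friction coefficient
        "motor_gain": rng.uniform(0.9, 1.1),
    }

def encode(dynamics, W, b):
    """Toy encoder: map the sampled dynamics parameters to a latent vector
    that is fed to the control policy alongside the robot's observations."""
    x = np.array([dynamics["mass_scale"],
                  dynamics["friction"],
                  dynamics["motor_gain"]])
    return np.tanh(W @ x + b)
```

Because the policy only ever sees the encoding, at deployment time one can discard the encoder and optimize the latent vector directly against real-world performance.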
The team says it was able to adapt a policy to the real world using less than 8 minutes of real data collected over about 50 trials. They also demonstrated that the real robot learned to imitate a variety of dog motions, including pacing and trotting, as well as artist-animated keyframe motions such as a dynamic hop and turn.
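The trial-limited search over the latent encoding can be sketched as a simple hill-climbing loop; the paper's actual adaptation method is more sophisticated, so treat this, including `evaluate_on_robot` and every parameter, as an illustrative stand-in.

```python
import numpy as np

def adapt_latent(evaluate_on_robot, latent_dim=8, trials=50, step=0.3, seed=0):
    """Search directly over the latent dynamics encoding on the real robot:
    perturb the current best latent, keep the candidate only if the trial
    scores better. `evaluate_on_robot(z)` stands in for running one
    real-world trial and returning a score (e.g., seconds before falling)."""
    rng = np.random.default_rng(seed)
    best_z = np.zeros(latent_dim)
    best_score = evaluate_on_robot(best_z)
    for _ in range(trials - 1):
        candidate = best_z + step * rng.standard_normal(latent_dim)
        score = evaluate_on_robot(candidate)
        if score > best_score:              # keep only improvements
            best_z, best_score = candidate, score
    return best_z, best_score
```

With a trial budget of about 50, each real-world rollout is expensive, which is why the search only ever moves to strictly better latents.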
"We show that by using reference motion data, a learning-based approach can automatically synthesize controllers for a diverse range of legged-robot behaviors," the paper's co-authors wrote. "By incorporating sample-efficient domain adaptation techniques into the training process, our system learns adaptive policies in simulation that can then be quickly adapted for real-world deployment."
However, the control policies are not perfect. Owing to algorithmic and hardware limitations, they cannot learn highly dynamic behaviors such as large jumps and runs, and they are not as stable as the best manually designed controllers. (Fifteen trials were run with each method in each of five scenarios. On average, the real-world robot fell after 6 seconds while pacing, after 5 seconds while trotting backwards, and after 9 seconds while spinning.) The researchers say they will continue to improve the controller's robustness and to develop frameworks that can learn from other sources of motion data, such as video clips.