With industry giants like Google, Microsoft, Facebook, and Apple investing millions of dollars in artificial intelligence, the technology is making huge strides. In this context, a future in which robots and humans live side by side no longer seems like a crazy idea. Robots would assist humans with different tasks, learn from their experiences, and make our lives a lot easier. To turn what still looks like a distant dream into reality, leading researchers are pursuing the use of ‘reinforcement learning’ in robotics.
What is Reinforcement Learning?
Reinforcement learning is a machine learning approach well suited to unpredictable environments, goal-oriented learning, and robot learning. Just as machine learning in general uses data to let a computer learn without being explicitly programmed, reinforcement learning lets computers, robots, or other AI agents learn from their own experience. Humans learn through trial and error; in the same way, reinforcement learning is an attempt to let robots solve problems or complete tasks without being told how.
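To make the trial-and-error idea concrete, here is a minimal sketch in Python. It is an illustration, not something taken from the research discussed in this article: the agent repeatedly chooses between three options with hidden average payoffs, mostly picking the one that has worked best so far and occasionally trying another at random. The function name `run_bandit`, the payoff values, and the exploration rate `epsilon` are all assumptions chosen for the example.

```python
import random

def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Learn by trial and error which option pays best (hypothetical setup)."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)   # the agent's current guess per option
    counts = [0] * len(true_means)        # how often each option was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random option.
            action = rng.randrange(len(true_means))
        else:
            # Exploit: pick the option with the highest estimate so far.
            action = max(range(len(true_means)), key=lambda a: estimates[a])
        # The environment answers with a noisy reward the agent cannot predict.
        reward = rng.gauss(true_means[action], 0.5)
        counts[action] += 1
        # Move the estimate toward the observed reward (running average).
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Assumed hidden payoffs; the agent is never told these numbers.
estimates, counts = run_bandit([0.2, 0.5, 1.5])
print("estimated payoffs:", [round(e, 2) for e in estimates])
print("times each option was tried:", counts)
```

Nobody tells the agent which option is best. It works that out from the rewards it receives, which is the same principle a robot would follow on a far more complex task.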
Elements of Reinforcement Learning
There are four main elements in reinforcement learning tasks. Let’s understand them:
- The Control Policy: A policy defines how the learning agent behaves at a given time. It is a mapping from states of the environment to the actions to be taken in those states. The policy forms the core of reinforcement learning because it determines the AI agent’s behavior, and it corresponds to what psychology calls stimulus-response rules.
- The Reward Function: The reward function defines the goal of a reinforcement learning problem. It defines which events are good or bad for the learning agent during the learning process. The aim of the learning agent is to maximize the total reward it receives.
- The Value Function: The reward function indicates what is good in the immediate sense, whereas the value function indicates what is good in the long run. Action choices are based on value judgments. For example, consider a robot learning to walk through a maze. The state here is the position of its two legs, and the actions are what the robot can do in each state, i.e. walk to the left or to the right. When the robot takes an action in a state, it receives a reward (feedback from the environment). The value of a state is then the total reward the robot can expect to collect from that state onward.
- A Model of the Environment: A model is something that imitates the behavior of the environment. In the maze example above, given a state and an action, the robot can use the model to predict the next state and the next reward, which lets it evaluate different possible actions before actually taking them.
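The four elements can be seen working together in a small sketch. The code below is one possible illustration, assuming a much simplified version of the maze: a corridor of six cells in which the robot can only step left or right and the exit is the rightmost cell. It uses tabular Q-learning, a standard reinforcement learning algorithm chosen here for brevity; the layout, the reward values, and the settings `alpha`, `gamma`, and `epsilon` are assumptions made for the example.

```python
import random

N_STATES = 6             # corridor cells 0..5 (hypothetical layout)
ACTIONS = (-1, +1)       # walk to the left or walk to the right
GOAL = N_STATES - 1      # the exit is the rightmost cell

def model(state, action):
    """Model of the environment: predicts (next_state, reward)."""
    next_state = min(max(state + action, 0), GOAL)
    # Reward function: reaching the exit is good, every other step costs a little.
    reward = 1.0 if next_state == GOAL else -0.01
    return next_state, reward

def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn long-run values by trial and error."""
    rng = random.Random(seed)
    # Value estimates: q[state][i] is the long-run value of ACTIONS[i] in that state.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < epsilon:
                a = rng.randrange(2)                        # explore
            else:
                a = 0 if q[state][0] > q[state][1] else 1   # exploit
            next_state, reward = model(state, ACTIONS[a])
            # Immediate reward plus discounted value of the best next action.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Policy: map each non-exit state to the action with the highest value."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:GOAL]]

q = train()
print("policy (step per cell):", greedy_policy(q))
print("value of each cell:", [round(max(row), 3) for row in q[:GOAL]])
```

The reward only says that the exit is good right now. The learned values spread that information backward along the corridor, so cells closer to the exit end up with higher values, and the policy simply follows the higher value in each cell.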
Recent Developments in Reinforcement Learning
- Over the past several years, researchers such as Pieter Abbeel at UC Berkeley have been developing ways to teach robots new skills. These robots would be able to grip new objects by studying a database of 3D shapes. The aim of this project has been to enable industrial robot arms to act more like humans instead of just following preprogrammed instructions. “Learning more directly from human demonstrations and advice in all kinds of formats is intuitively the way to get a system to learn more quickly,” says Abbeel. “However, developing a system that is able to leverage a wide range of learning modalities is challenging.”
- DeepMind’s AI agent is teaching itself parkour. This research is carried out in a virtual world, with the aim of eventually helping to program real robots to, for example, climb up or down the stairs in your house. With the help of virtual sensors, the agent learns through trial and error how to jump, limbo, or leap, and how to find the best way from point A to point B. Researchers are exploring various simulated environments to teach the agent complex movements.
- Google’s AlphaGo AI, developed by DeepMind, showed that a learning system can outplay the best humans at a specific task by defeating one of the world’s top Go players. This marked the first time a machine had beaten a top professional at this highly complex game, which many experts didn’t expect would happen for another ten years.
It is clear that deep reinforcement learning could be a game changer in almost every field, not just robotics. As far as robotics is concerned, with newer projects and experiments underway, it may become possible for robots to take on human qualities of complex decision making and become more intelligent. There are some concerns too: what if robots actually overtake humans in the future and become immortal dictators? Prominent figures like Elon Musk and Stephen Hawking share this concern. For now, though, the potential of reinforcement learning is so promising that it outweighs such risks, which still seem far-fetched.