Guo Yipu, reporting from Aofei Temple


Qubit Report | WeChat official account QbitAI

Researchers at Berkeley recently used visual model-based reinforcement learning to train a robot that learns a variety of tasks by exploring on its own: organizing toys, folding laundry, putting away dishes…

What's more, the training process of this multi-functional robot is unsupervised and needs no manually fed data; everything is learned through the robot's own exploration.

In other words, the robot can take one look at your messy room and tidy it up on its own.

It can put misplaced apples back on the plate:

It can fold up your thermal underwear:

It can organize toys:

Whoops, one toy accidentally strays off course.

And all of this is done by one and the same algorithm.

The skills are impressive enough that Yann LeCun exclaimed: Awesome!


Explore the world like a child

As we said at the beginning, this robot doesn’t need to be fed data manually.

So where does the data come from? From the real world where it needs to work.

In a “room” containing a variety of objects, the robot is free to explore and touch everything around it. With no supervision at all, it simply plays by itself.

Besides hard things like cups and toys, it can also play with “soft” things like towels:
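Concretely, this self-supervised play boils down to recording raw interaction data. The sketch below is only an illustration, not the authors' code: the environment class, its interface, and the action dimension are all hypothetical stand-ins. The point is that the only "supervision" is (image, action, next image) triples gathered from random arm commands.

```python
import numpy as np

class ToyTabletopEnv:
    """Hypothetical stand-in for the real robot setup (not the authors' code)."""
    def reset(self):
        # Return the initial camera image.
        return np.zeros((64, 64, 3), dtype=np.float32)

    def step(self, action):
        # Pretend the scene changed in response to the arm command.
        return np.random.rand(64, 64, 3).astype(np.float32)

env = ToyTabletopEnv()
dataset = []  # (image, action, next_image) triples: no rewards, no labels

for episode in range(100):
    obs = env.reset()
    for t in range(30):
        action = np.random.uniform(-1.0, 1.0, size=4)  # random arm command
        next_obs = env.step(action)
        dataset.append((obs, action, next_obs))
        obs = next_obs
```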

During play, the robot learns to see: through its various sensors, it comes to know what is around it;

It learns to localize: it knows where things are and what it itself is doing;

It learns to move: it knows what its arm will do when given different commands;

And by learning to use its hand, it can predict how its own movements will affect the objects in its environment.

The whole exploration process has no score and no winning or losing. Driven purely by its own “curiosity,” the robot explores the objects in the room and forms its own “world view.”
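In code, that "world view" amounts to an action-conditioned frame predictor, sketched below as a toy PyTorch model. This is only an illustration of the idea, (image, action) → next image, learned with plain MSE on the exploration triples; the authors' actual model is a far more capable recurrent video-prediction network (see the paper linked below).

```python
import torch
import torch.nn as nn

class FramePredictor(nn.Module):
    """Toy action-conditioned frame predictor (not the paper's architecture)."""
    def __init__(self, action_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
        )
        self.action_fc = nn.Linear(action_dim, 16 * 16)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64 + 1, 32, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),       # 32 -> 64
            nn.Sigmoid(),
        )

    def forward(self, image, action):
        feat = self.encoder(image)                       # (B, 64, 16, 16)
        a = self.action_fc(action).view(-1, 1, 16, 16)   # broadcast action spatially
        return self.decoder(torch.cat([feat, a], dim=1)) # predicted next frame

model = FramePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train on the exploration triples; random tensors stand in for real data here.
for step in range(10):
    img = torch.rand(8, 3, 64, 64)
    act = torch.rand(8, 4)
    next_img = torch.rand(8, 3, 64, 64)
    loss = nn.functional.mse_loss(model(img, act), next_img)
    opt.zero_grad()
    loss.backward()
    opt.step()
```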

Just give it the task, don't tell it how

How do we put a robot to work when it already knows everything about the objects in its environment?

Use pixels.

The task goal is specified to the robot directly in pixels: a red dot marks the starting point and a green dot marks the goal. In other words, the robot is told to move something from the red point to the green point.


Robot, move the apple from the red dot to the green dot.

The robot thinks it over: probably just move the arm over, pick up the apple, and put it down at the goal, right?

▽ Inside the robot

Here we go. Time to see whether the plan is a mule or a horse: summon the arm and give it a try.

Bingo! Success.
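Under the hood, "thinking it over" is planning with the learned model, often called visual MPC. Below is a minimal sketch under stated assumptions: `predict_pixel_motion` is a hypothetical stand-in for the model's designated-pixel prediction. The robot samples candidate action sequences, forecasts where the red pixel would end up, and keeps the plan whose predicted endpoint lands closest to the green pixel. The paper refines the samples with CEM and replans after every step; this toy version does one round of random shooting.

```python
import numpy as np

def predict_pixel_motion(image, start_pixel, actions):
    """Hypothetical: use the learned predictor to forecast the designated
    pixel's final (row, col) after executing `actions`. A dummy linear
    drift stands in for the real model here."""
    return np.asarray(start_pixel, dtype=np.float64) + actions.sum(axis=0)[:2] * 5.0

def plan(image, start_pixel, goal_pixel, horizon=5, n_samples=200):
    best_cost, best_actions = np.inf, None
    for _ in range(n_samples):
        actions = np.random.uniform(-1.0, 1.0, size=(horizon, 4))  # candidate plan
        final = predict_pixel_motion(image, start_pixel, actions)
        cost = np.linalg.norm(final - np.asarray(goal_pixel))  # pixel distance to goal
        if cost < best_cost:
            best_cost, best_actions = cost, actions
    return best_actions

image = np.zeros((64, 64, 3))
plan_actions = plan(image, start_pixel=(40, 12), goal_pixel=(20, 50))
print(plan_actions.shape)  # (5, 4): the action sequence the robot will execute
```

In the real system the robot executes the beginning of the best plan, looks at the new image, and replans, so prediction errors do not accumulate.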

For our next task, let's fold a pair of denim hot pants.


Think it over: grab a corner and drag it across, right?

Try this plan:

Perfect success ~

How does this process work? Berkeley has a video that you can check out:

Portal

Finally, here are the links, as usual.

Paper:

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine

https://drive.google.com/file/d/1scfbONOHg8H2_pJ9naRkHfk4dGSNGNWO/view

Blog:

Visual Model-Based Reinforcement Learning as a Path towards Generalist Robots

https://bair.berkeley.edu/blog/2018/11/30/visual-rl/

As for the open-source code, the authors say it is “coming soon”; in time it should appear on this page: https://sites.google.com/view/visualforesight


- End -

Qubit

վ’ᴗ’ ի Tracking new developments in AI technology and products

Follow us on Zhihu:

Qubit: www.zhihu.com

And subscribe to our Zhihu column:

Qubit: zhuanlan.zhihu.com

We're hiring

Qubit is hiring editors and reporters, based in Zhongguancun, Beijing. We look forward to talented and enthusiastic students joining us!

For more details, reply “Wanted” in the QbitAI dialog interface.