Building machines that learn and think like humans

From: cf020031308.github.io/…

Translation of: Building machines that learn and think like people

```bibtex
@article{lake2017building,
  title={Building machines that learn and think like people},
  author={Lake, Brenden M and Ullman, Tomer D and Tenenbaum, Joshua B and Gershman, Samuel J},
  journal={Behavioral and Brain Sciences},
  volume={40},
  year={2017},
  publisher={Cambridge University Press}
}
```

Contents

Abstract
1. Introduction
  1.1 Scope Clarification
  1.2 Key Ideas
2. Cognitive and neural inspiration for AI
3. Challenges in building more human-like machines
  3.1 The Characters challenge
  3.2 The Frostbite challenge
4. The core elements of human intelligence
  4.1 Start-up software
    4.1.1 Intuitive physics
    4.1.2 Intuitive psychology
  4.2 Rapid model building
    4.2.1 Compositionality
    4.2.2 Causality
    4.2.3 Learning-to-learn (meta-learning)
  4.3 Thinking fast
    4.3.1 Approximate inference in structured generative models
    4.3.2 Model-based and model-free reinforcement learning
5. Responses to frequently asked questions
  5.1 Comparing the learning speed of humans and neural networks on specific tasks is not meaningful, because humans have extensive prior knowledge
  5.2 Grounding theories of intelligence in neural networks is more biologically plausible
  5.3 Why is language, so vital to human intelligence, rarely mentioned here?
6. Outlook
  6.1 Prospects of deep learning
  6.2 Future applications to practical AI problems
  6.3 Towards learning and thinking machines that are more like humans

Abstract

Machines that truly learn and think like humans must go beyond current engineering trends in both what they learn and how they learn it. They should be able to:

  1. Build causal models of the world that support explanation and understanding, rather than merely recognize patterns;
  2. Ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned;
  3. Harness compositionality and learning-to-learn to acquire knowledge rapidly and generalize it to new tasks and situations.

1. Introduction

On the theme of “machines that learn and think like humans”, this article:

  1. Reviews the current state of the art;
  2. Draws on theoretical and experimental work to describe the ingredients we consider essential;
  3. Notes that many of these ingredients have not yet been incorporated into modern deep learning models, so that machines solve problems differently from humans;
  4. Discusses the most feasible paths to implementing them.

In addition, we draw a broader distinction between statistical pattern-recognition methods and model-building methods. The former center on prediction, mapping inputs to labels or high-value states; the latter focus on building models of the real world, in order to understand and explain it, and to imagine and bring about future states of it.

We also discuss pattern recognition: it is not the core of the model-building approach, but it can still contribute to it.

1.1 Scope Clarification

There are two kinds of AI: systems that deliberately mimic human cognition and draw inspiration from it, and systems that do not. This article focuses on the former: we believe that reverse-engineering human intelligence can improve AI/ML, especially on tasks where humans still far outperform machines.

In addition, although this paper focuses on neural network methods, they are not the only direction for improving AI.

1.2 Key Ideas

The central goal of this paper is to propose a set of core ingredients for building machines that learn and think more like humans. They fall into three groups:

  1. “Start-up software”: early cognitive capabilities that support learning new tasks and speed up learning. These include:

    1. Intuitive physics;
    2. Intuitive psychology.
  2. Learning ability. We regard learning as model building: explaining observed data by constructing causal models of the world. Humans can build models quickly thanks to:

    1. Compositionality;
    2. Learning-to-learn (meta-learning).
  3. Thinking fast.

    1. Combining neural networks with structured models to speed up inference;
    2. Humans combine model-based and model-free learning algorithms; this direction has not yet been widely applied but is promising.

2. Cognitive and neural inspiration for AI

Turing suggested that human-like machines might be achieved by building a basic machine as blank as a child's mind and then training it with rewards and punishments, anticipating reinforcement learning.

At the time, most researchers believed that knowledge could be represented symbolically and manipulated with logical calculi. Meanwhile a “sub-symbolic”, neurally inspired approach emerged, leading to the proposal that the essence of cognition is parallel distributed processing (PDP). Neural networks, and later deep learning, both drew on the PDP model.

It is worth noting that the PDP concept is compatible with “modeling” as well as with “pattern recognition”.

Both neural network models and PDP seem to suggest that the human brain comes with little built-in structure, with only a few constraints or inductive biases to guide learning.

A common research strategy has therefore been to train a relatively general-purpose neural network to perform the task, adding other components only when necessary.

The question, then, is whether machines can truly learn and think like humans without these extra components, and if not, how close they can get.

3. Challenges in building more human-like machines

Since modern cognitive science has no settled account of the human brain or of intelligence, it is an extreme position to claim that intelligence is a set of general-purpose neural networks with few initial constraints.

An alternative view therefore emphasizes the importance of early inductive biases and of powerful learning algorithms that use prior knowledge to extract knowledge from small amounts of training data.

Here we take up two ML/AI challenges, learning simple visual concepts (handwritten characters) and playing the game Frostbite, and use them to illustrate the importance of the core cognitive ingredients.

3.1 The Characters challenge

Neural networks match human performance in handwritten character recognition and in large-scale image classification, but this does not mean humans and machines learn and think in the same way. There are two main differences:

  1. Humans can learn richer representations from fewer examples;
  2. Humans do more than recognize patterns: they generalize the concepts they learn and apply them to new tasks.

Harder versions of this challenge may require combining deep learning with probabilistic program induction.

3.2 The Frostbite challenge

DQN was a major advance in reinforcement learning, showing that a single algorithm can learn many complex tasks.

DQN combines a deep convolutional neural network for pattern recognition with Q-learning, a model-free reinforcement learning algorithm.
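To make this combination concrete, here is a minimal sketch of the DQN idea in PyTorch: a convolutional network maps raw frames to Q-values for each action, trained toward the standard model-free Q-learning target. The layer sizes, frame shape, and batch format are illustrative assumptions, not the published DQN configuration, and the replay buffer and target-network synchronization are omitted.

```python
# Minimal sketch of the DQN idea: a deep CNN estimates Q(s, a) from raw
# frames, trained with the model-free Q-learning target. Hypothetical
# shapes and hyperparameters; not the original DQN architecture.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.LazyLinear(n_actions)  # initialized on first forward pass

    def forward(self, frames):  # frames: (batch, 4, 84, 84) stacked grey frames
        return self.head(self.features(frames))

def q_learning_loss(q_net, target_net, batch, gamma=0.99):
    """One Q-learning step: regress Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    s, a, r, s2, done = batch  # tensors from a replay buffer (omitted here)
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # bootstrap target from a frozen copy of the network
        target = r + gamma * (1 - done) * target_net(s2).max(dim=1).values
    return nn.functional.mse_loss(q_sa, target)
```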

Beyond the assumptions about image structure built into its convolutional architecture, very little is built into the network, so for each new game it must learn its visual and conceptual systems from scratch. Newer studies, however, show that networks can share visual features across different games, be used to train on complex tasks, and help in learning new games.

Research on Frostbite shows that even deep networks that have played thousands of games and trained for thousands of hours only reach what a novice player achieves in a few minutes, and they remain tied to specific inputs and goals. Humans understand the game at a deeper level: where DQN relies on feedback from repeatedly completing small goals to eventually reach larger ones, humans readily grasp the larger goals at a higher level. Humans can also distill models from what they learn and flexibly apply them to new tasks and goals.

So what makes humans special? If the answer is only richer prior knowledge, how can prior knowledge be used to help machines learn quickly and solve problems?

4. The core elements of human intelligence

The ingredients mentioned in the introduction are not necessarily exhaustive, but they are important ones missing from most learning-based AI systems today.

By “core ingredients” we do not mean that an ingredient must be innate or pre-installed as an algorithm, and we will not discuss their origins here. These ingredients may even derive directly from prior experience; what we emphasize is that they play an important role in building more human-like machines, yet are largely absent from machine learning today.

4.1 Start-up software

4.1.1 Intuitive physics

Within the first year of life, infants progressively acquire physical laws, representations, and concepts. Recent research treats intuitive physics as inference over a mental physics-simulation engine: people reconstruct what they perceive in terms of approximate physical properties. This “commonsense physics engine”, though crude and approximate, suffices to judge the short-term motion of objects, and it adapts flexibly to many everyday situations, even without perceptual cues.
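The probabilistic-simulation view lends itself to a toy sketch: judge whether a stack of blocks will fall by running a handful of noisy simulations over uncertain percepts and reading off the fraction that topple. The stability rule, the noise level, and the block geometry below are invented for illustration; actual intuitive-physics-engine models run full rigid-body simulators.

```python
# Toy "intuitive physics engine": graded stability judgments from a few
# noisy simulations. All numbers are illustrative assumptions.
import random

def tower_fall_probability(centers, width=1.0, noise=0.1, n_sims=50):
    """centers: x-coordinates of stacked blocks, bottom to top."""
    falls = 0
    for _ in range(n_sims):
        # Perception is uncertain: jitter each block's perceived position.
        xs = [c + random.gauss(0, noise) for c in centers]
        # Crude rule: a block topples if the centre of mass of everything
        # above it overhangs the edge of its supporting block.
        for i in range(len(xs) - 1):
            above = xs[i + 1:]
            if abs(sum(above) / len(above) - xs[i]) > width / 2:
                falls += 1
                break
    return falls / n_sims

print(tower_fall_probability([0.0, 0.1, 0.2]))  # nearly aligned: low probability
print(tower_fall_probability([0.0, 0.3, 0.7]))  # leaning: high probability
```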

What are the prospects for integrating physics knowledge into deep learning systems?

Deep-CNN-based methods such as PhysNet perform well, but they require extensive training for a single task and are limited to a narrow range of scenes, lacking human flexibility.

Rather than making predictions with an explicit physics simulation, could neural networks be trained, like children, to simulate general-purpose physics? The challenge is whether a general-purpose network, trained to acquire implicit physical knowledge, can extend that knowledge beyond its training scenes the way humans extend explicit physical concepts.

Whether implicit physical knowledge is learned by a neural network or pre-loaded via a simulator, integrating commonsense physics with deep learning is difficult; but achieving it would bring large gains in learning speed and effectiveness, and could be an important step toward more human-like learning algorithms.

There is no research in this area yet, but we think there could be.

4.1.2 Intuitive psychology

Before they can speak, infants distinguish agents from objects through innate sensitivity to low-level cues, and they go on to distinguish agents' different social behaviors. Infants are generally thought to expect agents' actions to be goal-directed, efficient, and socially responsive.

One account treats intuitive psychology as the sum of simple inferences. Each inference is easy to compute, but scenarios can be complex, requiring many inferences, and changeable, requiring many kinds of inference; this drives a rapid combinatorial expansion of the representations built from such inferences.

Another account treats intuitive psychology as a generative model of action. Other agents are assumed to act approximately rationally: humans predict an agent's next steps from its goals by simulating its planning process, or infer its goals from its sequence of actions. Such simulation-based reasoning about other agents can be nested recursively to understand social interactions.
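A toy sketch of this "inverse planning" idea follows: assume the observed agent is approximately rational (softmax over how much each move shortens the distance to its goal), simulate its choice probabilities under each candidate goal, and apply Bayes' rule to the observed trajectory. The grid world, goal set, and rationality parameter are all hypothetical.

```python
# Toy goal inference by inverse planning: P(goal | actions) via Bayes' rule
# over a softmax-rational agent model. Setup is invented for illustration.
import math

GOALS = {"food": (4, 0), "exit": (0, 4)}
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def move_prob(pos, move, goal, beta=2.0):
    """Moves that reduce Manhattan distance to the goal are more likely."""
    def dist_after(m):
        return abs(pos[0] + m[0] - goal[0]) + abs(pos[1] + m[1] - goal[1])
    weights = {m: math.exp(-beta * dist_after(m)) for m in MOVES}
    return weights[move] / sum(weights.values())

def infer_goal(trajectory):
    posterior = {name: 1.0 / len(GOALS) for name in GOALS}  # uniform prior
    for pos, move in trajectory:
        for name, goal in GOALS.items():
            posterior[name] *= move_prob(pos, move, goal)   # likelihood
    z = sum(posterior.values())
    return {name: p / z for name, p in posterior.items()}

# Two steps heading east make the "food" goal far more probable.
print(infer_goal([((0, 0), (1, 0)), ((1, 0), (1, 0))]))
```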

It is certainly possible to learn visual cues, heuristics, and summary statistics of scenes involving agents. But if that were the entire basis of human psychological reasoning, data-driven deep learning would long since have succeeded in domains involving theory of mind and intuitive psychology. So, as with intuitive physics, whether generic deep networks can acquire psychological reasoning depends in part on how humans actually represent intuitive psychology. Just as generalization is hard without commonsense physics, reasoning is hard without commonsense psychology.

Any full representation of intuitive psychological reasoning, however, includes concepts such as agent, goal, efficiency, and reciprocity. It is not clear that deep networks trained purely on prediction can learn these concepts: even infants with little training display such reasoning, and whatever a heavily trained network does learn may not be the abstractions humans actually use to learn, understand, and apply intuitive psychology.

Beyond pre-installation, intuitive psychology could be combined with modern deep learning systems in various ways: for example, a small set of inductive biases might bootstrap the inference of more abstract agent concepts, and a large repertoire of goal-directed and socially directed behaviors might be decomposed into combinations of simple primitives.

The origins of intuitive psychology are still debated, but its importance to human learning and thinking is not.

4.2 Rapid model building

Since their inception, neural network models have emphasized the importance of learning. Whatever the algorithm, learning is treated as the gradual adjustment of the strengths of connections between units.

Recently, backpropagation-trained neural networks and large datasets have cracked difficult pattern recognition problems, a great success for machine learning. But even though machines now equal humans on a few problems, they still lag far behind on most others.

Deep neural networks need far more data than humans do.

When learning the vocabulary of their native language, children make meaningful generalizations from very sparse data, and they are better than adults at picking up new concepts. Compared with such efficient human learning, neural networks, because of their generality, are always hungry for more data; they clearly use information less efficiently.

To be fair, machines outperform humans in many domains, and humans can be slow to learn difficult concepts; but humans excel at most cognitively natural concepts (as when children learn words). This kind of learning is the focus of this section: it underlies successful human learning, it is the most amenable to deconstruction and reverse engineering, and it could be integrated into the next generation of ML/AI algorithms, where it promises progress in concept learning.

Even from a few examples, humans can learn usable conceptual models and apply them flexibly.

This usefulness and flexibility suggest that learning as model building describes human learning better than learning as pattern recognition.

For other kinds of tasks, machines cannot similarly learn general knowledge from small datasets. Moreover, the kinds of knowledge representations that would generalize readily to new tasks are themselves hard for neural networks to learn.

By contrast, our model for the Characters challenge, based on Bayesian program learning (BPL), represents a concept as a structured combination of reusable probabilistic primitives, which can be composed in different ways to form different concepts. Given a single example of a new concept, the model can generate further new examples (keeping the structure while resampling the primitives).

BPL is a learning model that performs at human level on a challenging one-shot classification task, outperforms deep learning models such as CNNs, and generalizes creatively thanks to the three ingredients discussed next: compositionality, causality, and learning-to-learn.
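A toy sketch of the compositional generation idea behind BPL (emphatically not the actual model): a concept is a fixed arrangement of reusable primitives, and new exemplars keep that structure while perturbing the parts, which is how a single example can license many new ones.

```python
# Toy compositional generator in the spirit of BPL: concepts are structured
# combinations of reusable primitives; exemplars vary the parts' placement.
# Primitive names and noise levels are invented for illustration.
import random

PRIMITIVES = ["arc", "line", "hook", "dot"]

def sample_concept(n_parts=3):
    """A 'concept' fixes which primitive occupies each structural slot."""
    return [random.choice(PRIMITIVES) for _ in range(n_parts)]

def sample_exemplar(concept, jitter=0.2):
    """An exemplar keeps the structure but perturbs each part's placement."""
    return [(part, round(i + random.gauss(0, jitter), 2))
            for i, part in enumerate(concept)]

concept = sample_concept()
print(concept)                   # shared structure, e.g. ['arc', 'dot', 'line']
print(sample_exemplar(concept))  # two distinct tokens of the same concept
print(sample_exemplar(concept))
```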

4.2.1 Compositionality

Compositionality is a classic idea: new concepts can be expressed as combinations of primitive elements. It has broad implications in both AI and cognitive science. Compositionality and learning-to-learn fit together naturally, though there are also forms of compositionality that depend less on prior learning. With compositionality, the Characters-challenge learning model becomes tractable, and learning in the Frostbite challenge becomes more efficient.

Deep networks exhibit at least a limited form of compositionality. Parts, objects, and scenes all carry compositional structure, and compositionality also matters for decomposing objects. We would like deep networks to have richer compositional representations, so that they can learn faster and more flexibly.

Such models also make clear that objects, agents, and their relations must cohere, and what binds them together is causality.

4.2.2 Causality

In concept learning and scene understanding, a causal model is a hypothesis about the true generative process; in control and reinforcement learning, it is a representation of the environment's structure. Causal conceptual and visual models are usually generative, but not all generative models are causal. Causality is highly influential in theories of perception, yet a causal model need not be a strict inverse of the true generative mechanism.

Causal knowledge shapes how people learn new concepts. Causality matters for learning because it strengthens the connections between a concept's core features and weakens incidental ones.

In scene understanding, humans likewise build causal models, integrating physical, psychological, and compositional knowledge to explain and describe what they see. Lacking all three, and the causality that binds them, leads to errors, as image caption generation systems illustrate.

Deep-network methods still have a long way to go in learning causal models, and recent improvements have problems of their own; but since those improvements can be read as approximations to true causal structure, there is real scope for strengthening deep learning models on causality.

In more complex causal models, deep networks might be used in two ways: as bottom layers of structured generative models, making probabilistic inference easier, or as causal generative models in their own right once the right ingredients are in place.

4.2.3 Learning-to-learn (meta-learning)

Differences in prior knowledge must produce differences in how humans and machines reason from data. Meta-learning is one way prior knowledge is acquired; closely related to transfer learning, multi-task learning, and representation learning, it lets learning on one task accelerate learning on others.

Such mechanisms have already been implemented, but they are still not as fast or flexible as human learning; perhaps they first need more compositional and causal representations.

Meta-learning has the potential to help address both of the challenges outlined in this article.

Current neural network methods remain far from solving the Characters challenge, even with far more pre-training than humans receive. We are not sure how humans acquire their prior knowledge in these domains, but the process may resemble BPL: meta-learning operates at many levels of the BPL generative process, and well-chosen structural priors can be given to BPL just as humans come to possess them. In addition, learning by analogy figures in much of how humans learn new models, and we believe the compositionality, hierarchy, and causality discussed above would benefit deep networks in the same way.

In video-game challenges like Frostbite, the effectiveness of meta-learning is inseparable from how knowledge is represented. Human knowledge transfer happens at every level, from perception to strategy: players quickly recognize the objects in a game scene, and they also grasp the game's goals and its positive and negative feedback, drawing on real-world knowledge and prior gaming experience.

Deep reinforcement learning has had some success at transfer learning, but it still cannot learn new games as quickly as humans.

In short, the interaction between experience and representation may be the key to building machines that learn as quickly as humans do.

4.3 Thinking fast

The previous section focused on the human-like ability to learn rich models from sparse data; what is most striking about this ability is its speed.

Rich models combined with efficient inference is an approach with uses in psychology and neuroscience, and it may also offer a new way to build successful deep learning systems.

This section discusses two possible ways to reconcile rapid inference with structured representations.

4.3.1 Approximate inference in structured generative models

Hierarchical Bayesian models can capture the theory-like structure and causal representations of reality, but the computation they require makes exact inference impractical. A complete account of learning and inference must explain how the brain accomplishes so much with limited computational resources.

Popular approximate-inference algorithms from probabilistic machine learning have been proposed as psychological models; the most prominent proposal is that humans approximate Bayesian inference with Monte Carlo methods.
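As a deliberately simple illustration of that proposal, the sketch below approximates a Bayesian posterior by importance sampling: draw hypotheses from the prior, weight each by the likelihood of the observed data, and average. The coin-bias setup is invented for illustration.

```python
# Monte Carlo approximation of Bayesian inference: posterior mean of a
# coin's heads-probability under a uniform prior. Illustrative example.
import random

def posterior_mean_bias(flips, n_samples=20000):
    """flips: observed outcomes, 1 = heads, 0 = tails."""
    total_weight = weighted_theta = 0.0
    for _ in range(n_samples):
        theta = random.random()                     # hypothesis from the prior
        likelihood = 1.0
        for flip in flips:                          # weight by the data
            likelihood *= theta if flip else (1 - theta)
        total_weight += likelihood
        weighted_theta += likelihood * theta
    return weighted_theta / total_weight

print(posterior_mean_bias([1, 1, 1, 0]))  # ~0.667, the exact Beta(4, 2) mean
```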

Although Monte Carlo methods are powerful and enjoy asymptotic guarantees, handling complex problems such as program induction and theory learning remains challenging: finding a good model usually means searching an enormous hypothesis space. Humans seem different. Rather than searching gradually toward the best hypothesis, they learn in discrete leaps, quickly assembling pieces of knowledge into a causal understanding, more like a guided process than Monte Carlo's undirected exploration.

In rapid learning, human inductive biases are likely used not only to evaluate hypotheses but also to guide which hypotheses are considered. How can such a fast mapping from problems to plausible answers be learned?

A recent advance is to amortize conceptual inference into an efficient feed-forward mapping. Amortization also implies that solutions to different problems become correlated, since they share the amortized computation, and there is evidence that human inferences are correlated in just this way. This suggests a way for deep learning to integrate with probabilistic models and probabilistic programs: train neural networks to perform probabilistic inference in a generative model or probabilistic program.
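A minimal sketch of amortization in this spirit, reusing the coin-bias example from the sampling sketch above: a small network is trained once to map observed counts directly to the posterior mean, so each new inference afterwards is a single forward pass instead of a fresh sampling run. The architecture and training regime are assumptions for illustration.

```python
# Amortized inference sketch: learn a feed-forward mapping from data
# (head/tail counts) to the posterior mean of a coin's bias.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for _ in range(2000):
    heads = torch.randint(0, 20, (64, 1)).float()
    tails = torch.randint(0, 20, (64, 1)).float()
    x = torch.cat([heads, tails], dim=1)
    # Supervise with the exact posterior mean under a uniform prior: Beta(h+1, t+1).
    target = (heads + 1) / (heads + tails + 2)
    loss = nn.functional.mse_loss(net(x), target)
    opt.zero_grad(); loss.backward(); opt.step()

# One forward pass now answers a query the sampler solved by brute force.
print(net(torch.tensor([[3.0, 1.0]])))  # ~0.667 for 3 heads, 1 tail
```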

Another approach is differentiable programming based on gradient descent.

4.3.2 Model-based and model-free reinforcement learning

There is substantial evidence that in simple associative or discriminative learning tasks, humans use model-free learning algorithms similar to DQN's, allowing actions to be selected quickly.

At the same time, there is ample evidence that humans also have a model-based learning system, which builds "cognitive maps" of the environment and uses them to plan actions for complex tasks. This is what lets human intelligence adapt flexibly to new tasks and goals.

At the same time, however, the way everyday human activities become habitual suggests that model-based planning can hand control over to fast model-free estimates, reflecting a trade-off between flexibility and speed in the learning system.

The human reinforcement learning system appears to let the model-based system supply simulated training data to the model-free system, amortizing and caching the computations of action planning in advance; this can even happen offline (in humans, perhaps during sleep or idle daydreaming). Flexibility and efficiency are thereby balanced.
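This arrangement echoes Sutton's Dyna architecture, and a toy sketch makes it concrete: a model-free Q-table is updated both from real transitions and from transitions replayed out of a learned model, the latter standing in for the "offline" simulation described above. The chain environment and all constants are invented.

```python
# Dyna-style sketch: model-based simulation feeds a model-free learner.
# Tiny chain world, illustrative constants only.
import random

N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # model-free values Q[state][action]
model = {}                                   # learned model: (s, a) -> (r, s')

def env_step(s, a):                          # real environment: move left/right
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return (1.0 if s2 == GOAL else 0.0), s2

def q_update(s, a, r, s2, alpha=0.5, gamma=0.9):
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])

s = 0
for _ in range(200):
    a = random.randrange(2)
    r, s2 = env_step(s, a)
    q_update(s, a, r, s2)                    # learn from real experience
    model[(s, a)] = (r, s2)                  # remember what happened
    for _ in range(10):                      # "offline" replay from the model
        ms, ma = random.choice(list(model))
        mr, ms2 = model[(ms, ma)]
        q_update(ms, ma, mr, ms2)
    s = 0 if s2 == GOAL else s2

print(Q)  # values grow toward the goal end of the chain
```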

Intrinsic motivation is also important to human learning and behavior, and deep reinforcement learning is just beginning to explore this direction.

5. Responses to frequently asked questions

5.1 Comparing the learning speed of humans and neural networks on specific tasks is not meaningful, because humans have extensive prior knowledge

Humans not only have experience of related and unrelated tasks but have also undergone long evolution; this long history of "pre-training" is indeed a basis for multi-task and transfer learning.

These experiences form the “start-up software” and other modules mentioned earlier. Our focus, however, is not on how they formed, but on how, once present, they serve as building blocks that make rapid learning from sparse data possible.

Meta-learning across many tasks is a plausible way to form these modules, but simply training CNNs on masses of related tasks is not enough: meta-learning also depends on models with the right representational structure.

Some researchers still believe that, given large enough training data, deep learning can acquire prior knowledge comparable to a human's. We think this is unrealistic: building human-like knowledge representations from scratch may require exploring fundamental changes in network architecture, and that kind of exploration and creativity is currently supplied by human researchers themselves; it is not something gradient-based learning in weight space can do.

Another direction is to build infant-like core knowledge representations and inductive biases into learning-based AI systems.

Whichever direction AI developers choose, it is compatible with the cognitive ingredients presented in this article.

5.2 Grounding theories of intelligence in neural networks is more biologically plausible

Rather than appealing to neuroscience as the inspiration for deep networks, we focus on how cognitive science can guide and inform the engineering of human-like AI. After all, human intelligence is to be understood chiefly at the level of "software" rather than "hardware".

Neuroscience does provide valuable inspiration for both cognitive modelers and AI researchers, but many widely accepted ideas are of doubtful biological plausibility, and this is not enough to undermine our approach. For example, backpropagation, which plays a central role in today's best pattern recognition systems, is not considered biologically plausible, and the cognitive significance attributed to Hebbian learning likewise lacks a firm biological basis.

So while we fully expect neuroscience to provide more grounding for theories of intelligence in the future, for now cognitive plausibility matters more than biological plausibility.

5.3 Why is language, so vital to human intelligence, rarely mentioned here?

Humans are far better than machines at using natural language to communicate and to think, and it is widely recognized that neural networks remain far from human language ability. How can machines be made more capable with language? We believe the ingredients proposed in this paper will help explain the role of language in intelligence, since in human development they precede the mastery of language and semantics.

What else does it take to learn a language? We do not know, and our list of ingredients is not necessarily complete. Whatever additional ingredients language acquisition requires should be added to the list, because language really is essential to human intelligence, and machines that learn and think like humans must eventually acquire it.

6. Outlook

ML/AI has made great progress in recent years, but machines still fall short of human learning and thinking.

Humans can learn from less data and put what they learn to rich, flexible use; we attribute this to the causal and compositional nature of human knowledge representations.

Let’s end by talking about the future of deep learning.

6.1 Prospects of deep learning

A recent trend is to build psychologically inspired components into deep networks, notably selective attention, augmented working memory, and experience replay. Combining these approaches is an important step toward building human-like AI.

Another example of combining pattern recognition with model-based search comes from Go. The challenge Go poses for AI is not only to reach world-class play, but to understand and learn the game from the same kinds and amounts of data, explicit rules, and social learning opportunities available to humans. We believe Go can also serve as a test bed for the ingredients discussed in this article.

6.2 Future applications to practical AI problems

  1. Scene understanding. Deep learning has moved beyond object recognition toward understanding scenes; compositionality, causality, and physical and psychological knowledge are indispensable here.
  2. Autonomous agents and intelligent devices. Learning new concepts from small samples requires causal generative processes and learning-to-learn.
  3. Autonomous driving. A truly driverless car needs intuitive psychology to understand the intentions and likely movements of other drivers and pedestrians.
  4. Creative design. Compositionality and learning-to-learn supply new combinations of ideas, while causality keeps designs coherent.

6.3 Towards learning and thinking machines that are more like humans

Hopefully, the ingredients presented in this article will help machines achieve the following goals:

  1. See objects and agents, not just features;
  2. Build causal models, not just recognize patterns;
  3. Recombine knowledge representations, instead of retraining from scratch;
  4. Learn to learn, rather than starting over each time.
