1. What is transfer learning?

Transfer learning is a machine learning method that takes a model developed for task A as the starting point and reuses it in developing a model for task B. It improves learning on a new task by transferring knowledge from related tasks that have already been learned. Although most machine learning algorithms are designed to solve a single task, developing algorithms that facilitate transfer learning is a topic of ongoing interest in the machine learning community. Transfer learning is natural for humans: learning to recognize apples may help us recognize pears, and learning to play the electronic organ may help us learn the piano.

The task of transfer learning is to find the similarity between an old problem and a new one, and to use that similarity to apply a model learned in the old domain to the new domain.

2. Why is transfer learning necessary?

  1. The conflict between big data and scarce labels: although data is abundant, it is often unlabeled, which makes it impossible to train supervised models, and labeling data by hand is too time-consuming.
  2. The conflict between big data and weak computation: ordinary users do not have access to massive data and computing resources, so they need to transfer knowledge through pretrained models.
  3. The conflict between universal models and personalized needs: even on the same task, a single model often fails to meet every user's personalized needs, such as specific privacy settings. This requires adapting models between different users.
  4. The requirements of specific applications, such as cold start (for example, a recommender system that has no interaction data yet for new users).

3. What are the basic problems of transfer learning?

There are three basic problems:

  • How to transfer: how should the transfer method be designed? (Method design)
  • What to transfer: given a target domain, how do we find a corresponding source domain and transfer from it? (Source domain selection)
  • When to transfer: in which situations should we transfer, and in which should we not? (Avoiding negative transfer)

4. What are the common concepts of transfer learning?

  • Basic definitions

    • Domain: consists of the data's feature space and its feature distribution; it is the subject of learning, often written as D = {X, P(X)}
      • Source domain: the domain where knowledge already exists
      • Target domain: the domain to be learned
    • Task: consists of the label space and the objective prediction function, often written as T = {Y, f(·)}; it is the outcome of learning
  • Classification by feature space

    • Homogeneous transfer learning (Homogeneous TL): the source domain and target domain have the same feature space.
    • Heterogeneous transfer learning (Heterogeneous TL): the source domain and target domain have different feature spaces.
  • Classification by transfer scenario

    • Inductive transfer learning (Inductive TL): the learning tasks in the source and target domains are different
    • Transductive transfer learning (Transductive TL): the source and target domains are different, but the learning tasks are the same
    • Unsupervised transfer learning (Unsupervised TL): neither the source domain nor the target domain has labels
  • Classification by transfer method

    • Instance-based TL: transfer by reweighting and reusing samples from the source and target domains

      Instance-based transfer learning assigns weights to data samples according to certain rules and reuses the weighted samples for transfer. For example, suppose the source domain contains several kinds of animals, such as dogs, birds, and cats, while the target domain contains only dogs. During transfer, we can increase the weights of the samples belonging to the dog category in the source domain so that the source becomes as similar as possible to the target domain (a minimal code sketch of this and the next method appears after this list).

    • Feature-based TL: transform the features of the source and target domains into the same space

      Feature-based transfer learning reduces the gap between the source and target domains through feature transformations: either by mapping one domain's features toward the other's, or by transforming the features of both domains into a unified feature space and then applying traditional machine learning methods for classification and recognition. Depending on whether the feature spaces are the same, it can be further divided into homogeneous and heterogeneous feature-based transfer learning.

    • Parameter/Model-based TL: share model parameters between the source and target domains

      Parameter/model-based transfer learning finds parameter information shared between the source and target domains and transfers through it. This approach assumes that models for source-domain data and target-domain data can share some of their parameters; a typical example is finetuning a pretrained network, discussed in section 9.

    • Relation-based TL: transfer using logical relationships between entities in the source domain

      Relation-based transfer learning takes a completely different approach from the three methods above: it focuses on the relationships between samples in the source and target domains, on the premise that analogous relationship structures recur across domains.
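To make the first two categories concrete, here is a minimal, self-contained sketch (the toy data and hand-set weights are our own illustrative assumptions, not any specific published algorithm): instance-based transfer is illustrated by upweighting the source samples that match the target domain, and feature-based transfer by a CORAL-style alignment of feature statistics.

```python
import numpy as np
from scipy import linalg
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- Instance-based transfer: reweight source samples --------------------
# Toy setup: the source domain has 3 animal classes (0 = dog, 1 = cat,
# 2 = bird), but the target domain contains only dogs, so dog samples get
# larger weights. Real methods (e.g. TrAdaBoost, kernel mean matching)
# learn such weights instead of setting them by hand.
X_src = rng.normal(size=(300, 5))
y_src = rng.integers(0, 3, size=300)
weights = np.where(y_src == 0, 5.0, 1.0)
clf = LogisticRegression(max_iter=1000).fit(X_src, y_src, sample_weight=weights)

# --- Feature-based transfer: CORAL-style alignment -----------------------
def coral(Xs, Xt, eps=1e-6):
    """Align source features to the target domain's second-order statistics."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    # Whiten the source features, then re-color them with the target covariance.
    Xs_white = Xs @ linalg.fractional_matrix_power(Cs, -0.5)
    return np.real(Xs_white @ linalg.fractional_matrix_power(Ct, 0.5))

X_tgt = rng.normal(loc=0.5, scale=2.0, size=(300, 5))
X_src_aligned = coral(X_src, X_tgt)  # source features now resemble the target's
```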

5. What are the differences between transfer learning and traditional machine learning?

| Comparison item | Transfer learning | Traditional machine learning |
| --- | --- | --- |
| Data distribution | Training and test data need not be identically distributed | Training and test data are identically distributed |
| Data labels | Sufficient labels are not required | Sufficient labels are required |
| Modeling | Previous models can be reused | Each task is modeled from scratch |

6. What are the core and metrics of transfer learning?

The general idea of transfer learning can be summarized as: developing algorithms that make maximal use of knowledge in a labeled source domain to assist knowledge acquisition and learning in the target domain.

The core of transfer learning is to find the similarity between the source domain and the target domain and to exploit it sensibly. Such similarity is everywhere: different people's bodies are similar; riding a bicycle is similar to riding a motorcycle; chess is similar to Chinese chess; badminton is played in a way similar to tennis. This similarity can also be understood as an invariant: by responding to change with what stays unchanged, we can remain on solid ground.

Given this similarity, the next step is to measure and exploit it. Measurement serves two goals: first, to quantify the similarity between the two domains, telling us not only qualitatively whether they are similar but also how similar they are; second, to use the measure as a criterion for increasing the similarity between the two domains, thereby accomplishing the transfer (see the sketch below).

In a word: similarity is the core, and measurement is an important means.
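One widely used such measure is the maximum mean discrepancy (MMD), which compares the means of two samples in a kernel-induced feature space. Below is a minimal NumPy sketch (the function name, the biased estimator, and the toy data are our own simplifying choices):

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y with an RBF kernel.
    Small values suggest the two domains have similar distributions."""
    def k(A, B):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
        return np.exp(-gamma * d)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

# Toy usage: two domains drawn from slightly shifted Gaussians.
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 3))  # "source" domain
Xt = rng.normal(0.5, 1.0, size=(100, 3))  # "target" domain, shifted
print(mmd_rbf(Xs, Xt))                    # grows as the domains drift apart
```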

7. What is the difference between transfer learning and other concepts?

  1. The relationship between transfer learning and multi-task learning:
    • Multi-task learning: multiple related tasks are learned jointly;
    • Transfer learning: emphasizes the reuse of information, transferring knowledge from one domain to another.
  2. Transfer learning and domain adaptation: domain adaptation is a kind of transfer learning that aligns two domains whose feature distributions are inconsistent.
  3. Transfer learning and covariate shift: covariate shift is the situation where the marginal distribution of the inputs differs between training and test data while the conditional distribution P(y|x) stays the same.

8. When can transfer learning be used?

Transfer learning is most useful when you are trying to optimize the performance of a task B that has relatively little data. In radiology, for example, it is hard to collect enough X-ray scans to build a well-performing diagnosis system. In that case you can find a related but different task, such as general image recognition, for which you may have a million training images, learn many low-level features from it, and use them to help the network do a better job on the radiology task even though that task has much less data.

If the two domains are very different, transfer learning cannot be applied directly, because the results will be poor. In this case, the recommended approach is to transfer step by step through intermediate domains with low pairwise differences (crossing the river by feeling for the stones).

9. What is finetune?

Finetune, also known as fine-tuning, is perhaps the simplest deep-network transfer method and an important concept in deep learning. In short, finetuning takes a network that someone else has trained and adapts it to your own task. In this sense, finetune is naturally a part of transfer learning.

Why do you need a trained network?

In practice, we don’t usually train a neural network from scratch for a new task. This operation is obviously very time consuming. In particular, our training data cannot be as large as ImageNet to train a deep neural network with sufficient generalization ability. Even with so much training data, we can’t afford to start from scratch.

Why finetune?

Because a model trained by someone else may not fully fit our own task: their training data and ours may not follow the same distribution; their network may do more than our task requires; or their network may be more complex than our simpler task needs.
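As a concrete illustration, here is a minimal PyTorch sketch of the usual finetune recipe (it assumes a recent torchvision with downloadable ImageNet weights; the 5-class head is an arbitrary example): load pretrained weights, freeze the feature extractor, and replace the task-specific head.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a network trained on ImageNet instead of training from scratch.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the pretrained layers: their general low-level features (edges,
# textures) transfer well even when our own dataset is small.
for p in model.parameters():
    p.requires_grad = False

# Replace the final fully connected layer to match our own task, e.g. 5 classes.
model.fc = nn.Linear(model.fc.in_features, 5)

# Only the new head is trainable, so only its parameters go to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```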

10. What is deep network adaptation?

Finetuning a deep network saves training time and improves learning accuracy. But finetune has an inherent weakness: it cannot handle the situation where training data and test data follow different distributions, and this situation is everywhere in practice. Finetune's basic assumption is that training and test data obey the same distribution, which does not hold in transfer learning. Therefore, we need to go further and develop better methods for deep networks to accomplish transfer learning.

Taking the data-distribution adaptation methods introduced earlier as a reference, many deep learning methods add an adaptation layer that performs the adaptation between source-domain and target-domain data. Adaptation brings the data distributions of the source and target domains closer, which makes the network perform better (a minimal sketch follows).
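The sketch below shows this idea in PyTorch under simplified assumptions (a tiny stand-in backbone, a linear-kernel MMD instead of the multi-kernel losses used by methods such as DDC/DAN, and random toy batches): the classification loss on labeled source data is combined with a distribution-distance loss between source and target features taken at the adaptation layer.

```python
import torch
import torch.nn as nn

def mmd_linear(f_s, f_t):
    # Linear-kernel MMD: squared distance between the two domains' mean features.
    return ((f_s.mean(0) - f_t.mean(0)) ** 2).sum()

backbone = nn.Sequential(nn.Linear(10, 32), nn.ReLU())  # stand-in for a deep net
adapt = nn.Linear(32, 16)                               # the adaptation layer
classifier = nn.Linear(16, 3)

x_s, y_s = torch.randn(64, 10), torch.randint(0, 3, (64,))  # labeled source batch
x_t = torch.randn(64, 10) + 0.5                             # unlabeled target batch

f_s, f_t = adapt(backbone(x_s)), adapt(backbone(x_t))
loss = nn.functional.cross_entropy(classifier(f_s), y_s) + 0.5 * mmd_linear(f_s, f_t)
loss.backward()  # gradients pull the two domains' features closer together
```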

11. Application of GAN in transfer learning

Generative Adversarial Nets (GANs) were proposed inspired by the idea of the two-player zero-sum game in game theory. A GAN consists of two parts:

  • A generative network, which generates samples that look as authentic as possible; this part is called the Generator.
  • A discriminative network, which judges whether a sample is real or produced by the generator; this part is called the Discriminator. The generator and the discriminator play a game against each other, which constitutes the adversarial training.

The goal of a GAN is clear: to generate training samples. This seems at odds with the larger goal of transfer learning. However, transfer learning naturally has a source domain and a target domain, so we can skip the sample-generation step and directly treat the data of one domain (usually the target domain) as the "generated" samples. The generator's role then changes: instead of generating new samples, it acts as a feature extractor, continually learning features of the domain data until the discriminator can no longer tell the two domains apart. The original generator can therefore be called a Feature Extractor.
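A minimal sketch of this adversarial scheme (the gradient reversal layer popularized by DANN; the toy extractor, discriminator, and batches below are our own illustrative choices): the layer is the identity in the forward pass but negates gradients in the backward pass, so while the discriminator learns to separate the domains, the feature extractor learns to confuse it.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity forward; multiplies the gradient by -lamb on the way back."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

features = nn.Sequential(nn.Linear(10, 32), nn.ReLU())  # feature extractor
domain_head = nn.Linear(32, 2)                          # domain discriminator

x = torch.cat([torch.randn(32, 10), torch.randn(32, 10) + 1.0])  # source + target
d = torch.cat([torch.zeros(32, dtype=torch.long),
               torch.ones(32, dtype=torch.long)])                # domain labels

logits = domain_head(GradReverse.apply(features(x), 1.0))
loss = nn.functional.cross_entropy(logits, d)
loss.backward()  # discriminator gradients are reversed before reaching `features`
```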

12. Code implementation

Transfer Learning example

Dataset and model downloads:

  • Inception-V3 model
  • flower_photos dataset

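Since the original example code is not reproduced here, the following is a minimal TensorFlow 2 sketch in the spirit of the Inception-V3 / flower_photos example referenced above (the dataset URL is the one used by the official TensorFlow tutorials and may change; the epoch count and batch size are arbitrary):

```python
import tensorflow as tf

# Download and extract the 5-class flower_photos dataset.
data_dir = tf.keras.utils.get_file(
    "flower_photos",
    "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz",
    untar=True)

train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, image_size=(299, 299), batch_size=32)

# Reuse Inception-V3's ImageNet features; train only a new classification head.
base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, pooling="avg")
base.trainable = False

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # Inception expects [-1, 1]
    base,
    tf.keras.layers.Dense(5, activation="softmax"),     # 5 flower classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=3)
```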



Author: @mantchs

GitHub: github.com/NLP-LOVE/ML…

Welcome to join the discussion and work together to improve this project! Group number: [541954936]