
This is the third part of the Prompt series. In the last issue, I shared AutoPrompt, a classic piece of prompt engineering that designs prompts automatically and achieves good results without painstaking manual effort from researchers. This issue introduces the Null Prompts paper, which goes a step further and drops the prompt template entirely, finding that the results are still quite reasonable. Along the way, the paper also explores an interesting angle on tuning only part of a model's parameters. It is a fun piece of work that I want to share with you.

Note: the ordering of this column does not imply that one technique updates or supersedes another. AutoPrompt and Null Prompts are simply two lines of thinking within prompt methods. There is a large body of prompt-related work, and I have only picked a few papers that interest me, so they are not necessarily the most representative ones. Please keep that in mind.

The paper was uploaded to arXiv in late June 2021; the lead author, Robert L. Logan IV, is from UC Irvine. Paper: Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models.

Motivation

A good manual prompt relies on human intuition, and designing one is time-consuming and hard to reproduce. There has been plenty of prior work on automating prompt design, but that shifts the complexity into the design procedure itself, for example by requiring large generative models.

The authors instead try the simplest possible thing: concatenating the input directly with a [MASK] token, i.e., null prompts.

Method

Existing work shows that different manually designed prompts reach similar accuracy after fine-tuning. Inspired by this, the authors go further and show that manual design can be avoided entirely. Their approach is extremely simple: just append [MASK] to the end of the input and use that as the prompt template.
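To make the idea concrete, here is a minimal sketch of a null prompt at inference time with a masked LM. The backbone (roberta-base), the example sentence, and the label words "great"/"terrible" are my illustrative assumptions, not the paper's exact setup:

```python
# Null prompt: the input followed directly by the mask token,
# with no hand-written template in between.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

text = "The movie was a complete waste of time."
prompt = text + " " + tokenizer.mask_token  # "<input> <mask>"

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Score only the verbalizer tokens (here: " great" vs " terrible").
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
label_ids = [tokenizer(" great", add_special_tokens=False).input_ids[0],
             tokenizer(" terrible", add_special_tokens=False).input_ids[0]]
scores = logits[0, mask_pos, label_ids]
print("prediction:", ["positive", "negative"][scores.argmax().item()])
```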

  • In-context learning: this is the well-known frozen-LM setting. The LM performs the task by conditioning on the prompt rather than updating its own parameters, and it has been particularly successful with large LMs.
  • Prompt-based fine-tuning: fine-tuning all of the LM's parameters on top of a prompt. Its main advantage over in-context learning is higher accuracy, especially when the LM is relatively small; its main disadvantage is that the same model cannot be reused across different tasks.
  • Null prompts (this paper): simpler than prompt-based fine-tuning, and reusable.

Recommended reading (on Frozen): "After learning from text, it can understand images directly!" – Xi Xiaoyao's Cute House

Experiments

Prompt + Finetuning

The authors compare six prompt variants, all with fine-tuning, across nine datasets.

Null prompts and the null verbalizer both outperform traditional [CLS]-head fine-tuning, demonstrating that the prompt format itself is doing useful work.

A carefully crafted manual prompt is still the most effective, but the authors recommend null prompts because they are simpler while remaining competitive.
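For reference, here is a minimal sketch of what a prompt-based fine-tuning step looks like under a null prompt: a cross-entropy loss over the MLM logits of the verbalizer tokens at the mask position. The backbone and label words are illustrative assumptions:

```python
# Prompt-based fine-tuning sketch: all LM parameters are trainable,
# and the loss is cross-entropy over the verbalizer-token logits.
import torch
import torch.nn.functional as F
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

label_ids = torch.tensor([
    tokenizer(" great", add_special_tokens=False).input_ids[0],    # label 0
    tokenizer(" terrible", add_special_tokens=False).input_ids[0], # label 1
])

def train_step(text, label):
    inputs = tokenizer(text + " " + tokenizer.mask_token, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    logits = model(**inputs).logits[0, mask_pos, label_ids]  # verbalizer scores
    loss = F.cross_entropy(logits.unsqueeze(0), torch.tensor([label]))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```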

Prompt-Only Tuning

Tuning only the prompt (with the LM frozen), the authors compared four ways of obtaining the prompt, including the AutoPrompt method introduced in the previous issue (part 2) as well as null prompts.
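As a sketch of what prompt-only tuning looks like in the continuous ("soft prompt") variant: the LM stays frozen and only a few prepended prompt embeddings are trained. This is a generic illustration of the idea under my own assumptions (roberta-base, 5 prompt vectors), not the paper's exact configuration:

```python
# Prompt-only tuning sketch: freeze the LM, learn only soft prompt vectors.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
for p in model.parameters():      # freeze every LM parameter
    p.requires_grad = False

n_prompt, dim = 5, model.config.hidden_size
soft_prompt = torch.nn.Parameter(torch.randn(n_prompt, dim) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)  # trains only the prompt

def forward_with_prompt(text):
    ids = tokenizer(text + " " + tokenizer.mask_token,
                    return_tensors="pt").input_ids
    tok_emb = model.get_input_embeddings()(ids)          # (1, seq, dim)
    # Prepend the trainable prompt vectors to the token embeddings.
    emb = torch.cat([soft_prompt.unsqueeze(0), tok_emb], dim=1)
    return model(inputs_embeds=emb).logits
```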

Prompt-only tuning performs far worse than prompt + fine-tuning, indicating that fine-tuning is necessary.

Memory-Efficiency

The advantage of the frozen setting is memory efficiency: the LM does not need a new set of task-specific parameters, but it is highly sensitive to prompt changes. Prompt + fine-tuning works better, but its memory efficiency is low. The authors aim for a balance between the two by restricting which regions of the model are allowed to change, updating only some parameters during learning. They select four methods, each tuning only part of the model (see the sketch after this list):

  • Adapters: small neural-network layers inserted into the Transformer FFNs;
  • BitFit: only the Transformer bias terms;
  • LM head tuning: the output-layer embeddings of the verbalizer tokens;
  • Calibration: an affine transform on the logits of the verbalizer tokens.
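Here is a minimal sketch of BitFit-style tuning: freeze everything except the bias terms. Selecting parameters by the name suffix "bias" is an assumption that holds for standard Hugging Face BERT/RoBERTa implementations:

```python
# BitFit sketch: only bias parameters receive gradient updates.
import torch
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("roberta-base")

trainable = []
for name, param in model.named_parameters():
    if name.endswith("bias"):       # BitFit: tune only bias terms
        param.requires_grad = True
        trainable.append(param)
    else:
        param.requires_grad = False

print(f"trainable params: {sum(p.numel() for p in trainable):,}")
optimizer = torch.optim.Adam(trainable, lr=1e-4)
```

Because only a tiny fraction of parameters change per task, a single frozen backbone can be shared across tasks, with each task storing just its own bias values.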

Among these, BitFit achieves the best accuracy/efficiency trade-off.

Summary

The authors demonstrate the advantages of prompt + fine-tuning and the simplicity and effectiveness of null prompts. They further show that prompt-based fine-tuning can be made memory-efficient by updating only a small fraction of parameters, with BitFit (tuning only the Transformer biases) working especially well. In sum, the recipe they recommend is simple, accurate, and memory-efficient: fine-tune with a null prompt using BitFit.


Previous posts in this series:

(1) Introduction: a new paradigm for NLP? (2) AutoPrompt: no blind design, no painstaking effort – Juejin (juejin.cn)