
This is the second installment of the Prompt series. In the previous installment, I covered some Prompt basics, including a brief introduction to how prompts work. Among these topics, how to design prompts, i.e., Prompt Engineering, is a subject worth studying further and has attracted broad attention from researchers worldwide. This installment shares a classic work on Prompt Engineering: AutoPrompt.

The paper was uploaded to arXiv in late 2020 by a team of researchers from the University of California, Irvine and the University of California, Berkeley; the first author is Taylor Shin. Its title is AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts.

Motivation

Designing prompts manually is a complex and time-consuming task, and it is not clear whether the same wording works for every model, or what criteria determine which wording best elicits the desired information. The authors therefore propose a method for creating prompts automatically.

Method

Taking sentiment analysis as an example, a template λ maps the input into a prompt. For each input sequence, the template fixes the position of the original input $x_{\text{inp}}$ and of additional tokens in the prompt, as well as the position of the special slot to be filled in by the MLM; to distinguish it from the [MASK] token used during masked language modeling, this prediction slot is denoted [P]. The template also contains trigger tokens $x_{\text{trig}}$, denoted [T], which are shared across all inputs.
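To make the template concrete, here is a minimal Python sketch of how such a template might be instantiated for sentiment analysis. The helper name build_prompt and the exact layout are my own illustrative assumptions, not the authors' code; only the [T]/[P] roles follow the paper's notation.

```python
# Minimal sketch of an AutoPrompt-style template for sentiment analysis.
# [T] positions hold the shared trigger tokens; the [P] slot is realized as
# the MLM's mask token so the model can fill it in at inference time.

def build_prompt(x_inp: str, trigger_tokens: list[str], mask_token: str = "[MASK]") -> str:
    """Instantiate the template lambda(x_inp): '{sentence} [T] ... [T] [P].'"""
    triggers = " ".join(trigger_tokens)
    return f"{x_inp} {triggers} {mask_token}."

# At initialization every trigger slot is itself the mask token; the search
# described in the next section gradually replaces them with real words.
num_triggers = 3
trigger_tokens = ["[MASK]"] * num_triggers
print(build_prompt("a real joy.", trigger_tokens))
# -> "a real joy. [MASK] [MASK] [MASK] [MASK]."
```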

Gradient-Based Prompt Search

The authors adopt a gradient-based prompt search, which is relatively involved and works roughly as follows: all [T] positions are initialized with [MASK] and then updated iteratively. At each step, one trigger token is considered for replacement; a first-order (gradient) approximation of how each vocabulary word would change the label log-likelihood is used to form a small candidate set. Each candidate prompt is then evaluated, and the one giving the highest label probability is retained. (To be honest, I did not fully understand the details, so I will not risk a muddled explanation... It feels a bit like a genetic algorithm?)
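The sketch below illustrates only the candidate-selection step under the assumption of a PyTorch MLM. The names candidate_replacements, trigger_grad, and embedding_matrix are hypothetical, and the outer evaluation loop is only described in comments; this is a sketch of the first-order idea, not the authors' implementation.

```python
# Sketch of one gradient-based trigger search step.
# `embedding_matrix` is the model's input embedding table (V x d) and
# `trigger_grad` is the gradient of the label log-likelihood with respect to
# the embedding at the trigger position currently being updated.
import torch

def candidate_replacements(trigger_grad: torch.Tensor,
                           embedding_matrix: torch.Tensor,
                           k: int = 10) -> torch.Tensor:
    """Return the ids of the k vocabulary tokens whose substitution is
    predicted (to first order) to increase the label log-likelihood most."""
    # score(w) = e_w . grad  -- a larger dot product means a larger predicted gain
    scores = embedding_matrix @ trigger_grad   # (V,)
    return torch.topk(scores, k).indices       # (k,)

# Toy demo with random tensors just to show the shapes (V=100, d=16).
V, d = 100, 16
demo_grad, demo_emb = torch.randn(d), torch.randn(V, d)
print(candidate_replacements(demo_grad, demo_emb, k=5))

# The outer loop (not shown) plugs each candidate into the prompt, re-evaluates
# the true-label probability on a batch of training examples, keeps whichever
# prompt scores highest, and then moves on to the next trigger position.
```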

Automating Label Token Selection

The authors also provide a method for automatically selecting the set of label tokens. In the first step, a logistic classifier is trained to predict the class label, using the contextual embedding of the [MASK] token as input. In the second step, the MLM's output word embeddings are fed through this classifier in place of the [MASK] embedding, and the resulting predicted probabilities serve as scores. The label token set for each class is then built from the k highest-scoring words. (My personal understanding: a high predicted probability means the classifier can give a confident result from that word's embedding, which indirectly suggests the word is a reasonable, rather than meaningless, stand-in for the class.)
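The two steps can be sketched as follows, assuming the [MASK] contextual embeddings H with gold labels y and the MLM's output word embeddings W_out have already been collected. All variable names are illustrative assumptions; this mirrors the paper's description, not the authors' code.

```python
# Sketch of automatic label-token selection.
# H: (n, d) [MASK] contextual embeddings for n training examples
# y: (n,)   gold class labels
# W_out: (V, d) output word embeddings of the MLM
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_label_tokens(H: np.ndarray, y: np.ndarray,
                        W_out: np.ndarray, k: int = 3) -> dict[int, np.ndarray]:
    """Step 1: fit a logistic classifier on the [MASK] embeddings.
    Step 2: score every vocabulary word by passing its output embedding
    through the learned class weights; keep the k highest-scoring words
    per class as that class's label tokens."""
    clf = LogisticRegression(max_iter=1000).fit(H, y)
    scores = W_out @ clf.coef_.T                  # (V, num_classes)
    return {c: np.argsort(-scores[:, c])[:k]      # top-k token ids per class
            for c in range(scores.shape[1])}

# Toy demo with random data (vocab V=50, hidden size d=8, 3 classes).
rng = np.random.default_rng(0)
H, y = rng.normal(size=(60, 8)), rng.integers(0, 3, size=60)
W_out = rng.normal(size=(50, 8))
print(select_label_tokens(H, y, W_out))
```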

Experiment

In short, the results are good. With few training samples, AutoPrompt outperforms manually designed prompts; its accuracy is close to that of fine-tuning, and it even reaches SOTA in some cases.

Summary

The authors design a method for automatically constructing prompts, and the results are good. Compared with fine-tuning, prompting can perform downstream tasks without introducing additional parameters or updating model weights, and may even achieve better results. AutoPrompt also extracts more factual knowledge from MLMs than manually designed prompts.


Previous recommendations:

(1) Introduction: A new paradigm for NLP? Prompt, igniting pre-training anew – Nuggets (juejin.cn)