

[2006.03535] CoCon: A Self-Supervised Approach for Controlled Text Generation (arxiv.org)

This post is mainly a writing exercise to improve my note-taking. It is neither an in-depth reading of the paper nor a quick skim, so it may have little reference value for readers looking for either. It is better suited to people learning how to write up papers.

Abstract

Transformer-based pre-trained language models show excellent natural language generation capability.

While existing text generation work attempts to control high-level attributes of text (sentiment, topic, etc.), precise control of content at the word and phrase level is still lacking.

CoCon is proposed to control the output of a pre-trained language model at a fine-grained level through a content input.

In the paper's self-supervised approach, CoCon helps the language model complete text sequences by conditioning on the content input. Experiments show that CoCon can naturally incorporate the target content into the generated text and control high-level text attributes in a zero-shot manner.

Introduction

Transformer-based pre-trained language models lead the current trend in natural language processing; they can be used to extract contextual word embeddings or serve directly as text generators.

Since large pre-trained models have been trained on massive text corpora and can generate very fluent text, several papers have begun to explore how to control the output text.

  • arXiv:1909.05858, 2019.
  • arXiv:1912.02164, 2019.

Retraining pre-trained models from scratch is too expensive. (arXiv: 1909.05858, 2019)

Fine-tuning for a particular attribute hurts the model's generalization. (arXiv: 1909.08593, 2019)

Without changing the pre-trained language model, one can instead try to control the generated text through an attribute module.

While some success has been achieved in controlling high-level text attributes (sentiment, topic), the same target attribute can yield text with vastly different content at the word and phrase level. Fine-grained control of text generated by pre-trained language models therefore remains an open gap.

CoCon consists of two parts:

  • Pre-trained language models
  • CoCon layer

The CoCon layer merges the representation of the content input into the intermediate encoding of the text, which is then passed on to the rest of the pre-trained language model.

Training is self-supervised: the training data are text samples generated by the pre-trained model itself.

CoCon's advantages:

  • Fine-grained control over the generated content can also steer high-level text attributes such as sentiment and topic
  • Multiple content inputs can be handled together, and the strength of each input's influence can be adjusted
  • It is modular and can be freely combined with different Transformer-based pre-trained language models

The pre-trained language model used in this paper is GPT-2.

Related work

There is a large body of work on generating text with desired attributes using neural networks.

One line of work uses conditional generation models: the network is trained on text data labeled with the target attributes, and it can also be trained with reinforcement learning or generative adversarial networks.

  • Controlling output length in neural encoder-decoders.
  • Controlling linguistic style aspects in neural language generation.
  • Fine-tuning language models from human preferences.
  • Sequence generative adversarial nets with policy gradient.

The requirement for predetermined attributes in these methods limits the possible types of text that can be generated.

Another approach generates controlled text using control codes. Because its architecture is similar to GPT-2, it produces high-quality text, but the control codes are fixed at training time.

  • A conditional transformer language model for controllable generation.

The closest work to this paper is the Plug and Play Language Model (PPLM), which controls generation on top of a pre-trained language model without fine-tuning, via a relatively small "pluggable" attribute model. However, it only targets high-level text attributes, and training the attribute model requires labeled data.

  • Plug and play language models: a simple approach to controlled text generation.

The paper also compares against other related approaches, such as INSET, though these are less closely related.

Weighted decoding increases the weight of target words in the decoder to steer the output text, but it tends to produce incoherent text.

Conditional language generation methods used for question generation condition on contextual text, such as subject-verb-object structures.

Small adapter modules for translation rely on annotated parallel sentences in different languages.

Text style transfer, which converts text from one style to another, is also briefly mentioned.

Autoencoders are used to disentangle stylistic features from non-stylistic latent representations of the text, which makes it possible to change the style while preserving most of the original content.

Another approach identifies attribute markers associated with a particular style in the corpus and changes the style by substituting them. This operates more directly on the text and requires predefined styles.