
Introduction:

This post comes from the need to learn, understand, and use Coreference Resolution. The learning resource is the classic Stanford CS224n (2021) course; the video link is: www.bilibili.com/video/BV18Y…

This post continues the line of thought from the previous article and is mainly divided into the following parts:

  • Different reference types
  • Rule-based Coreference Resolution: the Hobbs algorithm
  • Mention Pair model

Different reference types

Before diving deeper into Coreference Resolution models, we first need some linguistic background. In particular, we need to know the difference between Coreference and Anaphora.

  • When two mentions refer to the same entity in the world, they are coreferent. For example: [Barack Obama] traveled to New York, and [Obama] enjoyed the trip.
  • Anaphora is when a term (the anaphor) refers back to another term (the antecedent), and the interpretation of the anaphor depends to some extent on the interpretation of the antecedent. For example: [Barack Obama] said [he] would sign the bill.

Their differences can be shown in the following figure:

  • Not all noun phrases have a reference

    • Every dancer twisted her knee
    • No dancer twisted her knee

    • Each sentence has three NPs; because the first one (“every dancer” / “no dancer”) is quantified and does not refer to a specific entity in the world, the other two (“her knee”) are not referential either.

  • Not all anaphoric relations are coreferential, as shown in the figure below. The relationship between “concert” and “the tickets” is a linguistic phenomenon called bridging anaphora: the tickets are the tickets of the concert, but the two expressions do not refer to the same entity.

The antecedent does not always come first; sometimes the referring expression appears before the noun it refers to. This phenomenon is called cataphora. Generally speaking, though, we do not distinguish cataphora from anaphora and refer to both collectively as anaphora.

Language is usually interpreted in context, and we’ve seen many examples of this before. Such as

  • I took money out of the bank
  • The boat disembarked from the bank

The word “bank” has a different meaning in each sentence: the first refers to a financial institution, and the second to a river bank. Coreference typically spans long stretches of text or whole paragraphs. Coreference and Anaphora are therefore key to understanding natural language text.

Four different kinds of Coreference models

After understanding the above basic knowledge, four different types of Coreference models are introduced here, mainly including:

  • Rule-based models
  • Mention Pair
  • Mention Ranking
  • Clustering

The traditional method for pronoun anaphora: the naive Hobbs algorithm

The Hobbs algorithm was proposed by Hobbs in 1978 and is one of the earliest coreference resolution algorithms. It is a purely rule-based algorithm, and its general process is as follows: first, the text is syntactically parsed to build a parse tree; then, starting from the pronoun (anaphor) node, the algorithm repeatedly walks up the parse tree and breadth-first searches it according to a series of rules until an antecedent is found.
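To make the traversal idea concrete, here is a toy sketch over an nltk constituency tree. It deliberately omits most of Hobbs's nine steps and constraints (the function name and the tiny example tree are my own, purely for illustration); it only shows the idea of climbing from the pronoun toward the root and breadth-first searching the material to its left for candidate NPs.

```python
# Toy sketch of a Hobbs-style search over a constituency parse tree.
# NOT the full 1978 algorithm; it only illustrates the climb-and-BFS idea.
from collections import deque
from nltk import Tree

def candidate_antecedents(root: Tree, pronoun_leaf_index: int):
    path = root.leaf_treeposition(pronoun_leaf_index)  # tree position of the pronoun
    candidates = []
    # Walk up the tree from the pronoun toward the root.
    for depth in range(len(path) - 1, -1, -1):
        node = root[path[:depth]]
        if node.label() not in ("NP", "S"):
            continue
        came_from = path[depth]  # index of the child we climbed up from
        # Breadth-first search everything to the left of the upward path.
        queue = deque(node[i] for i in range(came_from))
        while queue:
            sub = queue.popleft()
            if isinstance(sub, Tree):
                if sub.label() == "NP":
                    candidates.append(" ".join(sub.leaves()))
                queue.extend(sub)
    return candidates

tree = Tree.fromstring("(S (NP (NNP John)) (VP (VBZ sees) (NP (PRP him))))")
print(candidate_antecedents(tree, 2))  # pronoun "him" is leaf 2 -> ['John']
```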

Specific rules of the naive Hobbs algorithm are shown in the figure below:

Here, Manning gives a simple example. The algorithm starts from the pronoun “him” and, following the rules, ultimately determines that it refers to the antecedent noun phrase “Niall Ferguson”.

At the same time, such purely syntactic rules can run into problems because they do not take semantic information into account. Here are a few examples.

  • She poured water from the pitcher into the cup until it was full.
  • She poured water from the pitcher into the cup until it was empty.

Here, the two sentences have the same grammatical structure, but our knowledge of the world tells us that after pouring, it is the cup that becomes full (the first “it” refers to the cup) and the pitcher that becomes empty (the second “it” refers to the pitcher).

  • The city council refused the women a permit because they feared violence.
  • The city council refused the women a permit because they advocated violence.

Here, “they” refers to the city council in the first sentence (it is the council that feared violence) and to the women in the second (it is the women who advocated violence), even though the two sentences are syntactically identical.

Despite these problems, the Hobbs algorithm remained a strong baseline until around 2015.

The Mention Pair coreference model

The simplest way to use machine learning for Coreference Resolution is to make a binary decision for every pair of mentions: do they refer to the same entity or not?

For example, we can train a binary classifier that, for any two mentions, predicts the probability $p(m_i, m_j)$ that they corefer, so that the predicted probability is close to 1 for positive pairs and close to 0 for negative pairs.
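As a purely illustrative sketch, such a classifier could be a small feed-forward network over the concatenation of two mention representations. The class name, dimensions, and architecture below are assumptions of mine, not the exact model from the lecture:

```python
# Hypothetical mention pair scorer: a small feed-forward network that maps
# two mention vectors to p(m_i, m_j), the probability that they corefer.
import torch
import torch.nn as nn

class MentionPairScorer(nn.Module):
    def __init__(self, mention_dim: int = 256, hidden_dim: int = 128):
        super().__init__()
        self.ffnn = nn.Sequential(
            nn.Linear(2 * mention_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, m_i: torch.Tensor, m_j: torch.Tensor) -> torch.Tensor:
        # Concatenate the two mention vectors and squash the score into (0, 1).
        pair = torch.cat([m_i, m_j], dim=-1)
        return torch.sigmoid(self.ffnn(pair)).squeeze(-1)

scorer = MentionPairScorer()
m_i, m_j = torch.randn(256), torch.randn(256)
print(scorer(m_i, m_j))  # predicted probability that m_i and m_j corefer
```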

For such a simple binary classifier, the loss function during training is defined as follows:
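The slide with the exact formula is not reproduced here; a standard binary cross-entropy form of this objective, written in the notation above (the slide may state it slightly differently), is:

$$\mathcal{J} = -\sum_{i=2}^{N}\sum_{j=1}^{i-1}\Big[\,y_{ij}\log p(m_i, m_j) + (1 - y_{ij})\log\big(1 - p(m_i, m_j)\big)\Big]$$

where $N$ is the number of mentions in the document and $y_{ij} = 1$ if $m_i$ and $m_j$ corefer, and $0$ otherwise.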

At test time, we cluster all mentions that are predicted to corefer into the same cluster.

Since we have already predicted the coreference probability for every pair of mentions, we only need to set a threshold (say 0.5) and add a link between every pair whose probability exceeds it.

At the same time, we take the transitive closure of these links: for example, after adding the links she → I and she → my, we must also add the link my → I.
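A minimal sketch of this test-time linking-plus-closure step is shown below; the mention list and pairwise probabilities are made up for illustration.

```python
# Cluster mentions by thresholding pairwise coreference probabilities and
# taking the transitive closure with union-find.
from itertools import combinations

def cluster_mentions(mentions, pair_prob, threshold=0.5):
    parent = list(range(len(mentions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Link every pair above the threshold; union-find then gives the closure,
    # e.g. linking she->I and she->my automatically puts my and I together.
    for i, j in combinations(range(len(mentions)), 2):
        if pair_prob.get((i, j), 0.0) > threshold:
            union(i, j)

    clusters = {}
    for i in range(len(mentions)):
        clusters.setdefault(find(i), []).append(mentions[i])
    return list(clusters.values())

mentions = ["I", "my", "she"]
pair_prob = {(0, 2): 0.8, (1, 2): 0.9, (0, 1): 0.3}  # hypothetical scores
print(cluster_mentions(mentions, pair_prob))  # [['I', 'my', 'she']]
```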

However, a single wrong link can corrupt an entire cluster. For example, if a spurious link my → he is added, all of these pronoun mentions end up clustered together as one entity.

Also, consider a long document with many mentions of the same entity, as in the example below. Each mention typically has only one clearly related antecedent, yet the mention pair model asks us to predict coreference links for all pairs, which is very difficult. It is more linguistically sensible to train a model that predicts just one antecedent for each mention.

Conclusion

This post first introduced the different types of reference in English linguistics, then explained the traditional rule-based coreference resolution method, the naive Hobbs algorithm, and the machine-learning-based mention pair model. Finally, the advantages and disadvantages of the mention pair model were analyzed. In the next post, we will improve on the mention pair model.