Abstract: With the widespread use of administrative documents for delivering and recording business information, there is an urgent need for methods that can robustly and efficiently extract and understand content from these documents automatically. This paper proposes a graph neural network (GNN) to solve named entity recognition (NER) and relation extraction in semi-structured documents.

This article is shared from the Huawei Cloud Community post "Application of Graph Neural Network to Named Entity Recognition and Relation Extraction in Semi-structured Documents", originally written by Chg, a little rookie.

Abstract:

With the widespread use of administrative documents for delivering and recording business information, there is an urgent need for methods that can robustly and efficiently extract and understand content from these documents automatically. In addition, graph-based representations adapt flexibly to changes across document templates, which makes them a good fit for the semi-structured nature of administrative documents. Because a GNN can learn the relationships between data elements in a document, this paper proposes using a GNN to solve named entity recognition (NER) and relation extraction in semi-structured documents. According to the experiments, the method reaches SOTA results on three tasks: word grouping, entity classification, and relation prediction. Results on two entirely different datasets, FUNSD (form understanding) and IEHHR (handwritten marriage record understanding), further support the generalization ability of the proposed method.

1. Method

GNNs are widely used in NER, table extraction, and other tasks. Building on this, the paper proposes applying a GNN to key-value pair extraction: the model not only classifies the entities in document images but also predicts the relationships between them.

Given an input document, the model needs to perform three tasks: (a) word grouping: detecting document entities by grouping words with the same semantics; (b) entity classification: classifying the detected entities into preset categories; (c) relation prediction: finding pairing relationships between entities.

(1) Structure of the graph

In this paper, two graphs are constructed to represent a document, and three different models are trained to solve the corresponding tasks: word grouping $F_1$, entity classification $F_2$, and relation prediction $F_3$. As shown in Fig. 1, the document is represented as a graph $G_1 = (V_1, E_1)$ built from the OCR result, where $V_1$ is the node set made up of each word in the OCR result. The edge set $E_1$ is generated by connecting each word to its $K$ nearest neighbors ($K = 10$), with distance measured between the top-left corners of the word text boxes. A score $S = F_1(G_1)$ is computed for each edge, and a threshold $\tau$ (0.65 for FUNSD, 0.9 for IEHHR) is applied to obtain the word grouping result.
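The graph construction and thresholding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the function names `knn_edges` and `group_words` are hypothetical, the edge scores are hand-written stand-ins for the output of the word-grouping model, and turning the thresholded edges into groups via connected components is an assumption, since the text only says that a threshold is applied.

```python
import math


def knn_edges(corners, k=10):
    """Connect each word to its k nearest neighbours.

    corners: list of (x, y) top-left corners of the word text boxes.
    Returns a list of directed edges (i, j).
    """
    edges = []
    for i, (xi, yi) in enumerate(corners):
        # Distance from word i to every other word, paired with its index.
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(corners)
            if j != i
        ]
        dists.sort()
        edges.extend((i, j) for _, j in dists[:k])
    return edges


def group_words(num_words, edges, scores, tau):
    """Merge words joined by an edge whose score exceeds tau.

    Assumption: groups are the connected components of the kept edges.
    """
    parent = list(range(num_words))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), score in zip(edges, scores):
        if score > tau:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(num_words):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four words on one line: two close together, two further away.
corners = [(0, 0), (1, 0), (10, 0), (11, 0)]
edges = knn_edges(corners, k=1)
# Hypothetical edge scores standing in for the model output.
scores = [0.9, 0.9, 0.3, 0.3]
print(group_words(len(corners), edges, scores, tau=0.65))  # [[0, 1], [2], [3]]
```

With the FUNSD threshold of 0.65, only the edges between the first two words survive, so they form one entity and the remaining words stay on their own.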

Fig. 1 Schematic diagram of graph structure construction of word grouping

Fig. 2 Schematic diagram of graph structure of entity classification and relationship prediction

As shown in Fig. 2, once the entities (i.e., word groups) have been obtained from $G_1$, a second graph $G_2 = (V_2, E_2)$ is constructed over them, where $V_2$ is the set of entities derived from $G_1$ and $E_2$ is the edge set obtained by fully connecting the entity nodes. The entity classification result is given by $C = F_2(G_2)$, and the relation prediction result by $S = F_3(G_2)$.
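As a rough sketch of how the entity graph could be assembled from the word groups, the snippet below merges each group into one entity node and fully connects the nodes. The function name `build_entity_graph`, the node fields, and the use of undirected pairs are all assumptions made for illustration; the text does not specify node features or edge direction.

```python
from itertools import combinations


def build_entity_graph(words, groups):
    """Build entity nodes and a fully connected edge set.

    words:  list of (text, (x0, y0, x1, y1)) OCR words.
    groups: list of lists of word indices, one list per entity.
    """
    nodes = []
    for group in groups:
        boxes = [words[i][1] for i in group]
        nodes.append({
            # Entity text is the words of the group joined in order.
            "text": " ".join(words[i][0] for i in group),
            # Entity box is the union of the word boxes.
            "box": (
                min(b[0] for b in boxes),
                min(b[1] for b in boxes),
                max(b[2] for b in boxes),
                max(b[3] for b in boxes),
            ),
        })
    # Full connection: every unordered pair of entities gets an edge.
    edges = list(combinations(range(len(nodes)), 2))
    return nodes, edges


words = [
    ("Name", (0, 0, 4, 1)),
    (":", (4, 0, 5, 1)),
    ("Alice", (10, 0, 15, 1)),
]
nodes, edges = build_entity_graph(words, [[0, 1], [2]])
print(nodes[0]["text"], nodes[0]["box"], edges)
```

A fully connected graph grows quadratically with the number of entities, which stays manageable because a form typically holds far fewer entities than words.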

(2) Graph computation

2. Experimental results

According to the FUNSD results, the proposed method still has room for optimization compared with LayoutLM, which may be due to the small size of the FUNSD dataset. The IEHHR results show that the method is also effective in another area of form recognition, namely the understanding of handwritten records, which reflects its generalization ability.
