A quick word about EmbedKGQA

Paper: Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings (aclanthology.org/2020.acl-ma…). Source code: github.com/malllabiisc…

Answering questions by searching a constructed subgraph

Suppose the KG is incomplete and the edge has_genre(Gangster No.1, Crime) is missing. During reasoning, the answer Crime then sits 4 hops away from the question's head entity. If the subgraph search is limited to 3 hops, the constructed subgraph does not contain the answer.
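The effect of the hop limit can be reproduced on a toy graph. The sketch below is only an illustration: apart from Gangster No.1, Crime and has_genre, every entity and relation name (head, actor_x, movie_y, wrote, starred, starred_in) is a hypothetical placeholder, and edges are treated as undirected when the neighbourhood is expanded.

```python
from collections import deque

def hop_distances(edges, start, max_hops):
    """Breadth-first search: return {entity: hops} for entities within max_hops of start."""
    neighbours = {}
    for head, _relation, tail in edges:
        # Assumption: subgraph construction ignores edge direction.
        neighbours.setdefault(head, set()).add(tail)
        neighbours.setdefault(tail, set()).add(head)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

# Hypothetical toy KG; only the has_genre(Gangster No.1, Crime) edge comes from the text.
complete_kg = [
    ("head", "wrote", "Gangster No.1"),
    ("Gangster No.1", "has_genre", "Crime"),
    ("Gangster No.1", "starred", "actor_x"),
    ("actor_x", "starred_in", "movie_y"),
    ("movie_y", "has_genre", "Crime"),
]
# Incomplete KG: the direct has_genre edge is missing.
incomplete_kg = [t for t in complete_kg
                 if t != ("Gangster No.1", "has_genre", "Crime")]

within_3_complete = hop_distances(complete_kg, "head", 3)
within_3_incomplete = hop_distances(incomplete_kg, "head", 3)
within_4_incomplete = hop_distances(incomplete_kg, "head", 4)

print("Crime" in within_3_complete)    # reachable via the direct edge
print("Crime" in within_3_incomplete)  # outside the 3-hop subgraph
print(within_4_incomplete.get("Crime"))
```

With the direct edge removed, Crime is only reachable through the detour, so a 3-hop subgraph never contains it, no matter how good the ranking inside the subgraph is.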

EmbedKGQA method

  • KG Embedding Module: learns embeddings for all entities in the input KG. The actual code uses ComplEx embeddings, but other KGE methods are also implemented and can be used instead

  • Question Embedding Module: learns the embedding of the question, that is, converts the semantic information extracted from the question sentence into a vector

    1. In the paper, RoBERTa is followed by a 4-layer MLP with ReLU activations, which maps the question to a 768-dimensional vector
    2. In the actual code, RoBERTa is followed by only a single MLP layer, with no ReLU activation, which maps the question to a 768-dimensional vector
  • Answer Selection Module: selects the final answer based on the similarity score between the question and the relations

    • In both cases, whether the KG is large or small, the answer is selected from the candidate entities produced by answer scoring, which scores every entity in the KG and keeps the top K as candidates

    • For a small KG, all entities in the KG are candidates, so the entity with the highest answer score (top 1) is selected

    • For a large KG, pruning improves performance, so the code applies a pruning strategy that keeps the 200 highest-scoring entities

      1. First, a score between q (the question) and every relation is computed. Each relation whose score is greater than 0.5 is added to the candidate relation set R_a

      2. Then, for each entity in the candidate entity set, the shortest path from the head entity to that entity is computed, and the relations along the path are collected into the relation set R_a’ (this set is specific to one candidate entity)

      3. The intersection of the two relation sets is taken

      4. If the intersection is not empty, the candidate entity is taken as the answer
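The answer selection steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: entities are scored with the ComplEx function, using the question embedding in place of a relation embedding; the question-relation score is assumed to be the sigmoid of a dot product; paths ignore edge direction; and all function names, entity names (h, m, a, b, c) and embedding values are hypothetical toy data.

```python
import math
from collections import deque

def complex_score(head, relation, tail):
    """ComplEx score: real part of sum_i head_i * relation_i * conj(tail_i)."""
    return sum((h * r * t.conjugate()).real for h, r, t in zip(head, relation, tail))

def top_k_entities(head_emb, question_emb, entity_embs, k):
    """Answer scoring: score every entity and keep the k best as candidates."""
    scores = {name: complex_score(head_emb, question_emb, emb)
              for name, emb in entity_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def candidate_relations(question_vec, relation_vecs, threshold=0.5):
    """Step 1: R_a holds the relations whose score with the question exceeds the threshold."""
    def score(vec):
        dot = sum(q * v for q, v in zip(question_vec, vec))
        return 1.0 / (1.0 + math.exp(-dot))  # assumed scoring: sigmoid of dot product
    return {name for name, vec in relation_vecs.items() if score(vec) > threshold}

def shortest_path_relations(edges, head, target):
    """Step 2: relations on one shortest path from head to target (BFS, undirected)."""
    adjacency = {}
    for h, r, t in edges:
        adjacency.setdefault(h, []).append((r, t))
        adjacency.setdefault(t, []).append((r, h))
    paths = {head: set()}
    queue = deque([head])
    while queue:
        node = queue.popleft()
        if node == target:
            return paths[node]
        for rel, nxt in adjacency.get(node, []):
            if nxt not in paths:
                paths[nxt] = paths[node] | {rel}
                queue.append(nxt)
    return set()  # target not reachable from head

def select_answers(candidates, relations_a, edges, head):
    """Steps 3 and 4: keep candidates whose path relations intersect R_a."""
    return [c for c in candidates
            if relations_a & shortest_path_relations(edges, head, c)]

# Hypothetical toy data (2-dimensional complex embeddings; head entity excluded from candidates).
head_emb = [1 + 0j, 0 + 1j]
question_emb = [1 + 0j, 1 + 0j]
entity_embs = {"m": [0j, 0j], "a": [1 + 0j, 0 + 1j], "b": [0 + 1j, 1 + 0j], "c": [1 + 0j, 0j]}
question_vec = [1.0, -1.0]
relation_vecs = {"written_by": [1.0, 0.0], "has_genre": [0.5, -0.5], "starred": [-1.0, 1.0]}
edges = [("h", "written_by", "m"), ("m", "has_genre", "a"), ("h", "starred", "c")]

candidates = top_k_entities(head_emb, question_emb, entity_embs, k=2)
relations_a = candidate_relations(question_vec, relation_vecs)
answers = select_answers(candidates, relations_a, edges, "h")
print(candidates, relations_a, answers)
```

In this toy run, entity c survives answer scoring but is dropped by relation matching, because the only relation on its path (starred) is not in R_a. For a large KG, k would be 200 as described above; for a small KG the top-scoring entity is taken directly.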