
Translator: Waitingalone

This article is translated from NEURAL TENSOR NETWORK: EXPLORING RELATIONS AMONG TEXT ENTITIES, published by Gaurav Bhatt at Deeplearn-AI.com. The copyright of the images, code, and other material belongs to the author. The translation has been lightly edited for localization.

In this article, I will introduce neural tensor networks (NTN), as described in Reasoning With Neural Tensor Networks for Knowledge Base Completion. My NTN implementation uses Python 2.7, Keras 2.0, and Theano 0.9.

Jump directly to the GitHub repository with the code.

What is knowledge base completion?

In knowledge base completion, the task is to determine the relationship between a pair of entities. For example, consider the two entity pairs <cat, tail> and <supervised learning, machine learning>. If we are asked to determine the unknown relation R in <cat, R, tail> and <supervised learning, R, machine learning>, the first relation can best be summed up as "has", and the second as "instance of". So we can complete these two pairs as <cat, has, tail> and <supervised learning, instance of, machine learning>. Neural tensor networks (NTN) are trained on databases of entity-relation triples and used to discover additional relationships between entities. This is done by representing each entity in the database (that is, each object or individual) as a vector. These vectors can capture facts about an entity and how likely it is to be part of a given relationship. Each relation is defined by the parameters of a new neural tensor network that can explicitly relate two entity vectors.

Predict new relational triples using NTN.
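To make the task concrete, here is a toy knowledge base as it might look in code. The triples are purely illustrative and not taken from the datasets used later:

# A toy knowledge base of (entity1, relation, entity2) triples.
# Knowledge base completion asks: given a new pair such as
# ('dog', ?, 'tail'), which relation R makes the triple most plausible?
triples = [
    ('cat', 'has', 'tail'),
    ('supervised learning', 'instance of', 'machine learning'),
    ('Bengal tiger', 'has part', 'tail'),
]

entities = sorted({e for (e1, r, e2) in triples for e in (e1, e2)})
relations = sorted({r for (_, r, _) in triples})
print(entities, relations)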

Neural models of relational reasoning

Being able to recognize that certain facts hold purely as a consequence of other existing relations is the goal of learning models for common-sense reasoning. NTN aims to discover the relation between two entities <e1, e2>, that is, to predict the relation R — for example, whether (e1, R, e2) = (Bengal tiger, has part, tail) is true, and with what certainty. The neural tensor network (NTN) replaces a standard linear neural network layer with a bilinear tensor layer, which directly relates the two entity vectors across multiple dimensions. The model computes a score for how likely it is that two entities stand in a particular relationship by the following NTN-based function:

$$ g(e_1, R, e_2) = u_R^T \, f\!\left( e_1^T W_R^{[1:k]} e_2 + V_R \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} + b_R \right) $$

where $f = \tanh$ is a standard nonlinearity applied element-wise, $W_R^{[1:k]} \in \mathbb{R}^{d \times d \times k}$ is a tensor, and the bilinear tensor product $e_1^T W_R^{[1:k]} e_2$ produces a vector $h \in \mathbb{R}^k$ in which each entry is computed by one slice $i = 1, \ldots, k$ of the tensor: $h_i = e_1^T W_R^{[i]} e_2$. The other parameters for relation $R$ are the standard form of a neural network: $V_R \in \mathbb{R}^{k \times 2d}$, $b_R \in \mathbb{R}^k$, and $u_R \in \mathbb{R}^k$.
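As a sanity check on the formula, here is a minimal NumPy sketch of the score for a single relation; the dimensions and random parameters are arbitrary, chosen just to show the shapes:

import numpy as np

d, k = 5, 3                          # entity dimension d, number of tensor slices k
rng = np.random.RandomState(0)

e1, e2 = rng.randn(d), rng.randn(d)  # entity vectors
W = rng.randn(d, d, k)               # tensor W_R^[1:k]
V = rng.randn(k, 2 * d)              # standard-layer weights V_R
b = rng.randn(k)                     # bias b_R
u = rng.randn(k)                     # output weights u_R

# Bilinear tensor product: h_i = e1^T W^[i] e2, one entry per slice
h = np.array([e1.dot(W[:, :, i]).dot(e2) for i in range(k)])

score = u.dot(np.tanh(h + V.dot(np.concatenate([e1, e2])) + b))
print(score)  # scalar plausibility score g(e1, R, e2)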

Visualize the tensor layer

NTN models the relationship between two entities through multiplicative interactions with the tensor variable $W_R^{[1:k]}$. As shown above, NTN is an extension of a simple neural layer, with these tensor variables added. So if we remove $W_R^{[1:k]}$ from the figure above, we are left with a standard single-layer neural network,

$$ g(e_1, R, e_2) = u_R^T \, f\!\left( V_R \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} + b_R \right), $$

which is just a simple concatenation of the entity vectors plus a bias term.

Training objectives

NTN is trained with a contrastive max-margin objective function. Given a correct triplet $T^{(i)} = (e_1^{(i)}, R^{(i)}, e_2^{(i)})$ from the training set, a negative sample $T_c^{(i)} = (e_1^{(i)}, R^{(i)}, e_j)$ is created by randomly replacing the second entity with $e_j$, where $j$ is a random index. Finally, the objective function is defined as

$$ J(\Omega) = \sum_{i=1}^{N} \sum_{c=1}^{C} \max\!\left( 0,\; 1 - g\!\left(T^{(i)}\right) + g\!\left(T_c^{(i)}\right) \right) + \lambda \lVert \Omega \rVert_2^2, $$

where $\lambda$ is the regularization parameter.
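A minimal NumPy sketch of this objective for a single training triplet; the function name, the assumption of precomputed scalar scores, and the default value of λ are mine, not from the paper:

import numpy as np

def max_margin_loss(pos_score, neg_scores, params, lam=1e-4):
    # Hinge term: every corrupted triplet should score at least a
    # margin of 1 below the correct triplet.
    hinge = sum(max(0.0, 1.0 - pos_score + neg) for neg in neg_scores)
    # L2 regularization over all parameters Omega = (u, W, V, b, E)
    reg = lam * sum(np.sum(p ** 2) for p in params)
    return hinge + reg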

Implementation details

Now that we have seen how NTN works, it is time to implement it. The important thing to keep in mind here is that each given relation has its own set of tensor parameters. Let me briefly describe what we need to do with the help of Keras.

Each relation gets its own separate Keras model, which also holds its tensor parameters. For now, assume that the tensor layer is added between model initialization and compilation; I will explain the construction of the tensor layer later in this article. From the figure above, it is easy to see that we need to process the training data so that it can be passed to all the individual models at the same time. All we want is to update the tensor parameters corresponding to a particular relation, but Keras does not let us update one model while leaving the rest untouched. So we need to split the data by relation: each training batch will contain one instance of every relation, that is, a pair of entities for each relation, as sketched below.
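A minimal sketch of that partitioning step, assuming triples come as (e1, relation, e2) index tuples; the helper name is mine, not from the repository:

from collections import defaultdict

def split_by_relation(indexed_triples):
    # Map relation index -> list of (e1, e2) pairs, so that each
    # per-relation Keras model receives only its own entity pairs.
    buckets = defaultdict(list)
    for e1, r, e2 in indexed_triples:
        buckets[r].append((e1, e2))
    return buckets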

The implementation of NTN layer

Let’s start by implementing the tensor layer. The prerequisite for this section is being able to write custom layers in Keras. If you’re not sure what this means, check out the Keras documentation on writing your own Keras layers.

We first initialize the ntn_layer class with the parameters inp_size, out_size, and activation. inp_size is the shape of the input variables, in our case the entities; out_size is the number of tensor parameters (k); and activation is the activation function to be used (tanh by default).

from ntn_input import *
from keras import activations
from keras import backend as K
from keras.engine.topology import Layer

class ntn_layer(Layer):
    def __init__(self, inp_size, out_size, activation='tanh', **kwargs):
        super(ntn_layer, self).__init__(**kwargs)
        self.k = out_size    # number of tensor slices per relation
        self.d = inp_size    # dimensionality of the entity vectors
        self.activation = activations.get(activation)
        self.test_out = 0


The naming of the dimensions stays the same, that is, k corresponds to the number of tensor parameters per relation and d is the dimensionality of the entity vectors.

Now we need to initialize the tensor layer parameters. To better understand what we are doing here, take a look at the diagram below of a tensor network.

We initialize the four tensor parameters, namely W, V, b and U, as follows:

def build(self, input_shape):
    self.W = self.add_weight(name='w', shape=(self.d, self.d, self.k),
                             initializer='glorot_uniform', trainable=True)
    self.V = self.add_weight(name='v', shape=(self.k, self.d * 2),
                             initializer='glorot_uniform', trainable=True)
    self.b = self.add_weight(name='b', shape=(self.k,),
                             initializer='zeros', trainable=True)
    self.U = self.add_weight(name='u', shape=(self.k,),
                             initializer='glorot_uniform', trainable=True)
    super(ntn_layer, self).build(input_shape)

Here, we initialize the parameters with glorot_uniform sampling, which in practice performs better than other initializations. The other argument to the add_weight function is trainable, which can be set to False if we do not want to update particular parameters. For example, we could make the W parameter non-trainable and, as discussed earlier, the NTN model would then behave like a simple neural network.

Once the parameters have been initialized, it is time to implement the scoring function from earlier:

$$ g(e_1, R, e_2) = u_R^T \, f\!\left( e_1^T W_R^{[1:k]} e_2 + V_R \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} + b_R \right) $$

This equation gives the score for each entity pair. As you can see, we have to iterate over the k tensor slices, computing the intermediate product for each slice and finally summing these products. The code snippet below does exactly that. Do not change the names of the functions, as they must match the Keras API.

def call(self, x, mask=None):
    e1 = x[0]  # entity 1
    e2 = x[1]  # entity 2
    batch_size = K.shape(e1)[0]
    mid_pro = []
    for i in range(self.k):  # compute the intermediate product for each slice
        V_out = K.dot(self.V[i], K.concatenate([e1, e2]).T)  # standard-layer term V[e1; e2]
        temp = K.dot(e1, self.W[:, :, i])
        h = K.sum(temp * e2, axis=1)  # bilinear term e1^T W^[i] e2
        mid_pro.append(V_out + h + self.b[i])

    tensor_bi_product = K.concatenate(mid_pro, axis=0)
    tensor_bi_product = self.U * self.activation(K.reshape(tensor_bi_product, (self.k, batch_size))).T

    self.test_out = K.shape(tensor_bi_product)
    return tensor_bi_product


Finally, to complete the implementation of the NTN layer, we have to add the following function. It has nothing to do with NTN itself; Keras uses it for internal shape inference.

def compute_output_shape(self, input_shape):
    return (input_shape[0][0], self.k)


We have now built an NTN layer that can be called just like any other neural layer in Keras.
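As a quick smoke test, the layer can be wrapped in a throwaway model and run on random inputs; this sketch and its dimensions are mine, not part of the original repository:

import numpy as np
from keras.layers import Input
from keras.models import Model

d, k = 10, 4
e1_in, e2_in = Input(shape=(d,)), Input(shape=(d,))
out = ntn_layer(inp_size=d, out_size=k)([e1_in, e2_in])
test_model = Model([e1_in, e2_in], out)

e1 = np.random.randn(32, d)
e2 = np.random.randn(32, d)
print(test_model.predict([e1, e2]).shape)  # expected: (32, k)

Next, let’s see how to use the NTN layer on a real dataset.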

The data set

I’ll use the WordNet and Freebase datasets mentioned in the paper. I have already prepared the datasets (part of the preprocessing is taken from the GitHub repository), so loading them boils down to the following.

import ntn_input

data_name = 'wordbase'  # 'wordbase' or 'freebase'
data_path = 'data' + data_name

raw_training_data = ntn_input.load_training_data(ntn_input.data_path)
raw_dev_data = ntn_input.load_dev_data(ntn_input.data_path)

entities_list = ntn_input.load_entities(ntn_input.data_path)
relations_list = ntn_input.load_relations(ntn_input.data_path)

indexed_training_data = data_to_indexed(raw_training_data, entities_list, relations_list)
indexed_dev_data = data_to_indexed(raw_dev_data, entities_list, relations_list)

(init_word_embeds, entity_to_wordvec) = ntn_input.load_init_embeds(ntn_input.data_path)

num_entities = len(entities_list)
num_relations = len(relations_list)


At this point you can print and inspect the entities and their corresponding relations. Now we need to partition the dataset by relation so that all the Keras models can be updated together. I’ve included a preprocessing function that performs this step. Negative samples are also added at this step: they are passed to the prepare_data function as corrupted samples. If corrupt_samples = 1, one negative sample is added for each training sample, which means the whole training dataset is doubled.

import ntn_input

e1, e2, labels_train, t1, t2, labels_dev, num_relations = prepare_data(corrupt_samples)
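Under the hood, corrupting a sample amounts to swapping in a random second entity. A minimal sketch of what that could look like; corrupt_triple is a hypothetical helper, since prepare_data handles this internally:

import random

def corrupt_triple(triple, entity_indices):
    # Replace the second entity with a random one to form a
    # (presumably false) negative triple for the same relation.
    e1, r, e2 = triple
    e_c = random.choice(entity_indices)
    while e_c == e2:  # avoid accidentally reproducing the true triple
        e_c = random.choice(entity_indices)
    return (e1, r, e_c)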

The definition of the NTN layer lives in a separate file named ntn, so it is easy to import and use.

Build a model

To train the model, we need to define the contrastive max-margin loss function.

def contrastive_loss(y_true, y_pred):
    margin = 1
    return K.mean(y_true * K.square(y_pred) +
                  (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))


We can then pass this custom loss function to the Keras compile function.

from keras.layers import Input, Dense
from keras.models import Model
from ntn import *

def build_model(num_relations):
    Input_x, Input_y = [], []
    for i in range(num_relations):
        Input_x.append(Input(shape=(dimx,)))  # dimx, dimy: entity vector sizes, defined with the ntn module
        Input_y.append(Input(shape=(dimy,)))

    ntn, score = [], []  # holds the separate tensor layer and score for each relation
    for i in range(num_relations):  # one NTN layer, with its own tensor parameters, per relation
        ntn.append(ntn_layer(inp_size=dimx, out_size=4)([Input_x[i], Input_y[i]]))
        score.append(Dense(1, activation='sigmoid')(ntn[i]))

    all_inputs = [Input_x[i] for i in range(num_relations)]
    all_inputs.extend([Input_y[i] for i in range(num_relations)])  # aggregate the inputs of all models

    model = Model(all_inputs, score)
    model.compile(loss=contrastive_loss, optimizer='adam')
    return model


Finally, we need to aggregate the data and train the model:

e, t, labels_train, labels_dev = aggregate(e1, e2, labels_train, t1, t2, labels_dev, num_relations)
model.fit(e, labels_train, epochs=10, batch_size=100, verbose=2)


At this point, you can watch the model train, with the loss of each individual model gradually decreasing. Furthermore, to compute the accuracy of NTN on a knowledge base dataset, we need to score every candidate relation for a given entity pair and select the relation with the maximum score, as sketched below. As reported in the paper, the accuracy achieved is close to 88% (on average).
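A sketch of that evaluation loop, assuming a scoring function over (e1, relation, e2) triples; the names here are illustrative:

import numpy as np

def predict_relation(e1, e2, score_fn, num_relations):
    # Score the pair under every relation and pick the best one.
    scores = [score_fn(e1, r, e2) for r in range(num_relations)]
    return int(np.argmax(scores))

def accuracy(dev_triples, score_fn, num_relations):
    correct = sum(1 for e1, r, e2 in dev_triples
                  if predict_relation(e1, e2, score_fn, num_relations) == r)
    return float(correct) / len(dev_triples)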

What’s next?

In this article, we looked at neural tensor networks for knowledge base completion. In the next article, we’ll see how NTN can be used to solve other NLP problems, such as non-factoid question answering.

Original link: http://deeplearn-ai.com/2017/11/21/neural-tensor-network-exploring-relations-among-text-entities/

Gaurav Bhatt


This article was published by the yunjia community with the author’s authorization. Please credit the original source when reproducing.