Abstract: This article walks you through applying a quantum neural network to natural language processing, starting from scratch.

This article is shared from “Experiencing Quantum Neural Networks in Natural Language Processing” by Jeffding, Huawei Cloud Community.


I. Operating environment

CPU: Intel(R) Core(TM) i7-4712MQ CPU @ 2.30GHz

Memory: 4 GB

Operating system: Ubuntu 20.10

MindSpore version: 1.2

Install MindSpore

See the installation guide on the official website: https://www.mindspore.cn/inst…

To install MindQuantum, see the reference documentation: https://gitee.com/mindspore/m…

Check the installed version with mindspore.__version__.

II. Experiencing quantum neural networks in natural language processing

1. Environment preparation

import numpy as np
import time
from projectq.ops import QubitOperator
import mindspore.ops as ops
import mindspore.dataset as ds
from mindspore import nn
from mindspore.train.callback import LossMonitor
from mindspore import Model
from mindquantum.nn import MindQuantumLayer
from mindquantum import Hamiltonian, Circuit, RX, RY, X, H, UN

# data preprocessing
def GenerateWordDictAndSample(corpus, window=2):
    all_words = corpus.split()
    word_set = list(set(all_words))
    word_set.sort()
    word_dict = {w: i for i, w in enumerate(word_set)}
    sampling = []
    for index, word in enumerate(all_words[window:-window]):
        around = []
        # collect the `window` words on each side of the center word
        for i in range(index, index + 2*window + 1):
            if i != index + window:
                around.append(all_words[i])
        sampling.append([around, all_words[index + window]])
    return word_dict, sampling

word_dict, sample = GenerateWordDictAndSample("I love natural language processing")
print(word_dict)
print('word dict size: ', len(word_dict))
print('samples: ', sample)
print('number of samples: ', len(sample))

Running results:

The current emulator thread is 1. If your emulation is slow, set OMP_NUM_THREADS to an appropriate number depending on your model.
{'I': 0, 'language': 1, 'love': 2, 'natural': 3, 'processing': 4}
word dict size:  5
samples:  [[['I', 'love', 'language', 'processing'], 'natural']]
number of samples:  1

From the output above, the dictionary built from this sentence contains 5 words, and the sentence yields exactly one sample.
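To see how the window size controls the number of samples, here is a minimal pure-Python sketch of the same sliding-window idea, without MindSpore. The helper name cbow_samples and the longer example sentence are illustrative assumptions, not part of the original tutorial.

```python
def cbow_samples(corpus, window=2):
    """Return (context_words, center_word) pairs for a CBOW-style model."""
    words = corpus.split()
    samples = []
    # Only positions with `window` neighbours on both sides can be centers.
    for center in range(window, len(words) - window):
        context = words[center - window:center] + words[center + 1:center + window + 1]
        samples.append((context, words[center]))
    return samples


samples = cbow_samples("I love natural language processing very much", window=2)
for context, center in samples:
    print(context, "->", center)
print("number of samples:", len(samples))
```

A sentence of n words yields n - 2*window samples (for n larger than 2*window), which is why the five-word sentence above gives exactly one.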

2. Encoding circuit

def GenerateEncoderCircuit(n_qubits, prefix=''):
    if len(prefix) != 0 and prefix[-1] != '_':
        prefix += '_'
    circ = Circuit()
    # one parameterized RX rotation per qubit
    for i in range(n_qubits):
        circ += RX(prefix + str(i)).on(i)
    return circ

GenerateEncoderCircuit(3, prefix='e')

Running results:

RX(e_0|0)

RX(e_1|1)

RX(e_2|2)

We usually use $|0\rangle$ and $|1\rangle$ to label the two states of a two-level qubit. By the superposition principle, a qubit can also be in a superposition of the two:

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$

An $n$-qubit quantum state lives in a $2^n$-dimensional Hilbert space. For the 5-word dictionary above, we need only $\lceil \log_2 5 \rceil = 3$ qubits to complete the encoding, which also illustrates an advantage of quantum computing.
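The logarithmic qubit count can be checked with a few lines of plain Python. This is only a sketch: the function name qubits_needed is an assumption made for illustration, and it forces at least one qubit even for a one-word vocabulary.

```python
def qubits_needed(vocab_size):
    """Smallest number of qubits whose 2**n basis states cover vocab_size labels."""
    if vocab_size < 1:
        raise ValueError("vocab_size must be positive")
    # (vocab_size - 1).bit_length() is an exact integer form of ceil(log2(vocab_size));
    # max(1, ...) keeps at least one qubit when vocab_size is 1.
    return max(1, (vocab_size - 1).bit_length())


for size in (5, 8, 9, 1000):
    print(size, "words ->", qubits_needed(size), "qubits")
```

Using integer bit lengths instead of floating-point log2 avoids rounding surprises at exact powers of two.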

For example, for "love" in the dictionary above, whose label is 2 with binary representation 010, we simply set e_0, e_1, and e_2 in the encoding circuit to 0, π, and 0, respectively.
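Why an angle of π encodes a 1 can be sketched without a simulator. Since RX(θ) acting on $|0\rangle$ gives $|1\rangle$ with probability $\sin^2(\theta/2)$, and the encoder acts on each qubit separately, the measurement probabilities factor qubit by qubit. The helper names below are hypothetical, and the sketch assumes qubit 0 is the least significant bit of the basis-state index.

```python
import math


def label_to_angles(label, n_qubits):
    """RX angles for a label: bit j of the label (least significant first) goes to qubit j."""
    return [math.pi * ((label >> j) & 1) for j in range(n_qubits)]


def basis_probabilities(angles):
    """Measurement probabilities of the product state RX(angles[j])|0> on each qubit j."""
    n = len(angles)
    probs = []
    for index in range(2 ** n):
        p = 1.0
        for j, theta in enumerate(angles):
            p_one = math.sin(theta / 2) ** 2  # probability that qubit j is |1>
            p *= p_one if (index >> j) & 1 else 1.0 - p_one
        probs.append(p)
    return probs


angles = label_to_angles(2, 3)
probs = basis_probabilities(angles)
print([round(a, 5) for a in angles])
print([round(p, 3) for p in probs])
print("most likely basis state:", probs.index(max(probs)))
```

For label 2 all of the probability lands on basis state 2, so reading out the most likely state recovers the label.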

# verify the encoding through state evolution
from mindquantum.nn import generate_evolution_operator
from mindspore import context
from mindspore import Tensor

n_qubits = 3  # number of qubits of this quantum circuit
label = 2  # label to encode
label_bin = bin(label)[-1:1:-1].ljust(n_qubits, '0')  # binary form of label, lowest bit first
label_array = np.array([int(i)*np.pi for i in label_bin]).astype(np.float32)  # parameter values of encoder
encoder = GenerateEncoderCircuit(n_qubits, prefix='e')  # encoder circuit
encoder_para_names = encoder.parameter_resolver().para_name  # parameter names of encoder

print("Label is: ", label)
print("Binary label is: ", label_bin)
print("Parameters of encoder is: \n", np.round(label_array, 5))
print("Encoder circuit is: \n", encoder)
print("Encoder parameter names are: \n", encoder_para_names)

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
# quantum state evolution operator
evol = generate_evolution_operator(param_names=encoder_para_names, circuit=encoder)
state = evol(Tensor(label_array))
state = state.asnumpy()
quantum_state = state[:, 0] + 1j * state[:, 1]  # real and imaginary parts
amp = np.round(np.abs(quantum_state)**2, 3)

print("Amplitude of quantum state is: \n", amp)
print("Label in quantum state is: ", np.argmax(amp))

Running results:

Label is:  2
Binary label is:  010
Parameters of encoder is:
 [0. 3.14159 0.]
Encoder circuit is:
 RX(e_0|0)
RX(e_1|1)
RX(e_2|2)
Encoder parameter names are:
 ['e_0', 'e_1', 'e_2']
Amplitude of quantum state is:
 [0. 0. 1. 0. 0. 0. 0. 0.]
Label in quantum state is:  2

This verification shows that, for the data labeled 2, the largest amplitude of the resulting quantum state is also at position 2, so the resulting state is exactly the encoding of the input label. We wrap the process of generating parameter values from the data into the following function.

def GenerateTrainData(sample, word_dict):
    n_qubits = int(np.ceil(np.log2(1 + max(word_dict.values()))))  # np.int is removed in recent NumPy
    data_x = []
    data_y = []
    for around, center in sample:
        data_x.append([])
        for word in around:
            label = word_dict[word]
            label_bin = bin(label)[-1:1:-1].ljust(n_qubits,'0')
            label_array = [int(i)*np.pi for i in label_bin]
            data_x[-1].extend(label_array)
        data_y.append(word_dict[center])
    return np.array(data_x).astype(np.float32), np.array(data_y).astype(np.int32)
GenerateTrainData(sample, word_dict)

Running results:

(array([[0.       , 0.       , 0.       , 0.       , 3.1415927, 0.       ,
         3.1415927, 0.       , 0.       , 0.       , 0.       , 3.1415927]],
       dtype=float32),
 array([3], dtype=int32))

As the result shows, the encodings of the four input words are concatenated into one longer vector, which is convenient for the neural network to consume later.

3. Ansatz circuit

def GenerateAnsatzCircuit(n_qubits, layers, prefix=''):
    if len(prefix) != 0 and prefix[-1] != '_':
        prefix += '_'
    circ = Circuit()
    for l in range(layers):
        # one trainable RY rotation per qubit
        for i in range(n_qubits):
            circ += RY(prefix + str(l) + '_' + str(i)).on(i)
        # CNOTs on neighbouring pairs, offset by one on odd layers
        for i in range(l % 2, n_qubits, 2):
            if i < n_qubits and i + 1 < n_qubits:
                circ += X.on(i + 1, i)
    return circ

GenerateAnsatzCircuit(5, 2, 'a')

Running results:

RY(a_0_0|0)

RY(a_0_1|1)

RY(a_0_2|2)

RY(a_0_3|3)

RY(a_0_4|4)

X(1 <-: 0)

X(3 <-: 2)

RY(a_1_0|0)

RY(a_1_1|1)

RY(a_1_2|2)

RY(a_1_3|3)

RY(a_1_4|4)

X(2 <-: 1)

X(4 <-: 3)
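The alternating entanglement pattern in this output can be reproduced without MindQuantum. The sketch below only lists gate descriptions as strings; the function name ansatz_layout is an assumption made for illustration, and the string format simply mimics the printout above.

```python
def ansatz_layout(n_qubits, layers, prefix="a"):
    """List the gates of the layered ansatz as plain strings, in application order."""
    gates = []
    for layer in range(layers):
        # One trainable RY rotation per qubit.
        for q in range(n_qubits):
            gates.append(f"RY({prefix}_{layer}_{q}|{q})")
        # CNOTs on neighbouring pairs; the starting qubit alternates between layers.
        for control in range(layer % 2, n_qubits - 1, 2):
            gates.append(f"X({control + 1} <-: {control})")
    return gates


layout = ansatz_layout(5, 2)
print("\n".join(layout))
```

Even layers couple qubits (0, 1) and (2, 3), odd layers couple (1, 2) and (3, 4), so after two layers every qubit has been linked to a neighbour.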

4. Measurement

def GenerateEmbeddingHamiltonian(dims, n_qubits):
    hams = []
    for i in range(dims):
        s = ''
        # the set bits of i + 1 select which qubits get a Z operator
        for j, k in enumerate(bin(i + 1)[-1:1:-1]):
            if k == '1':
                s = s + 'Z' + str(j) + ' '
        hams.append(Hamiltonian(QubitOperator(s)))
    return hams

GenerateEmbeddingHamiltonian(5, 5)

Running results:

[1.0 Z0, 1.0 Z1, 1.0 Z0 Z1, 1.0 Z2, 1.0 Z0 Z2]
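Each embedding component is read out as the expectation value of one of these Z products. As a rough illustration, the sketch below rebuilds the operator strings in plain Python and evaluates them on a computational basis state, where every Z contributes +1 for a 0 bit and -1 for a 1 bit. Both helper names are hypothetical, and the sketch assumes qubit 0 is the lowest bit of the state index.

```python
def embedding_pauli_strings(dims):
    """Pauli-Z string for each embedding dimension: the set bits of i + 1 pick the qubits."""
    strings = []
    for i in range(dims):
        code = i + 1  # skip 0, which would give an empty (identity) operator
        qubits = [j for j in range(code.bit_length()) if (code >> j) & 1]
        strings.append(" ".join(f"Z{j}" for j in qubits))
    return strings


def z_string_value(pauli_string, basis_state):
    """Expectation of a Z string on a computational basis state (qubit 0 = lowest bit)."""
    value = 1
    for term in pauli_string.split():
        qubit = int(term[1:])
        value *= -1 if (basis_state >> qubit) & 1 else 1
    return value


strings = embedding_pauli_strings(5)
print(strings)
print([z_string_value(s, 0b010) for s in strings])
```

On a general superposition the expectation values fall anywhere between -1 and 1, which is what lets the measured vector act as a continuous embedding.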

5. Quantum version of word vector embedding layer

Before running the code, execute export OMP_NUM_THREADS=4 in the terminal.

def QEmbedding(num_embedding, embedding_dim, window, layers, n_threads):
    n_qubits = int(np.ceil(np.log2(num_embedding)))
    hams = GenerateEmbeddingHamiltonian(embedding_dim, n_qubits)
    circ = Circuit()
    circ = UN(H, n_qubits)
    encoder_param_name = []
    ansatz_param_name = []
    # one encoder + ansatz block per context word
    for w in range(2 * window):
        encoder = GenerateEncoderCircuit(n_qubits, 'Encoder_' + str(w))
        ansatz = GenerateAnsatzCircuit(n_qubits, layers, 'Ansatz_' + str(w))
        encoder.no_grad()
        circ += encoder
        circ += ansatz
        encoder_param_name.extend(list(encoder.parameter_resolver()))
        ansatz_param_name.extend(list(ansatz.parameter_resolver()))
    net = MindQuantumLayer(encoder_param_name,
                           ansatz_param_name,
                           circ,
                           hams,
                           n_threads=n_threads)
    return net


class CBOW(nn.Cell):
    def __init__(self, num_embedding, embedding_dim, window, layers, n_threads, hidden_dim):
        super(CBOW, self).__init__()
        self.embedding = QEmbedding(num_embedding, embedding_dim, window, layers, n_threads)
        self.dense1 = nn.Dense(embedding_dim, hidden_dim)
        self.dense2 = nn.Dense(hidden_dim, num_embedding)
        self.relu = ops.ReLU()

    def construct(self, x):
        embed = self.embedding(x)
        out = self.dense1(embed)
        out = self.relu(out)
        out = self.dense2(out)
        return out


class LossMonitorWithCollection(LossMonitor):
    def __init__(self, per_print_times=1):
        super(LossMonitorWithCollection, self).__init__(per_print_times)
        self.loss = []

    def begin(self, run_context):
        self.begin_time = time.time()

    def end(self, run_context):
        self.end_time = time.time()
        print('Total time used: {}'.format(self.end_time - self.begin_time))

    def epoch_begin(self, run_context):
        self.epoch_begin_time = time.time()

    def epoch_end(self, run_context):
        cb_params = run_context.original_args()
        self.epoch_end_time = time.time()
        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:
            print('')

    def step_end(self, run_context):
        cb_params = run_context.original_args()
        loss = cb_params.net_outputs

        if isinstance(loss, (tuple, list)):
            if isinstance(loss[0], Tensor) and isinstance(loss[0].asnumpy(), np.ndarray):
                loss = loss[0]

        if isinstance(loss, Tensor) and isinstance(loss.asnumpy(), np.ndarray):
            loss = np.mean(loss.asnumpy())

        cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1

        if isinstance(loss, float) and (np.isnan(loss) or np.isinf(loss)):
            raise ValueError("epoch: {} step: {}. Invalid loss, terminating training.".format(
                cb_params.cur_epoch_num, cur_step_in_epoch))
        self.loss.append(loss)  # keep every step's loss for plotting later
        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:
            print("\repoch: %+3s step: %+3s time: %5.5s, loss is %5.5s" % (
                cb_params.cur_epoch_num, cur_step_in_epoch,
                time.time() - self.epoch_begin_time, loss), flush=True, end='')


import mindspore as ms
from mindspore import context
from mindspore import Tensor

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
corpus = """We are about to study the idea of a computational process.
Computational processes are abstract beings that inhabit computers.
As they evolve, processes manipulate other abstract things called data.
The evolution of a process is directed by a pattern of rules called a program.
People create programs to direct processes.
In effect, we conjure the spirits of the computer with our spells."""

ms.set_seed(42)
window_size = 2
embedding_dim = 10
hidden_dim = 128
word_dict, sample = GenerateWordDictAndSample(corpus, window=window_size)
train_x, train_y = GenerateTrainData(sample, word_dict)

train_loader = ds.NumpySlicesDataset({
    "around": train_x,
    "center": train_y
}, shuffle=False).batch(3)
net = CBOW(len(word_dict), embedding_dim, window_size, 3, 4, hidden_dim)
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)
loss_monitor = LossMonitorWithCollection(500)
model = Model(net, net_loss, net_opt)
model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)

Running results:

epoch:  25 step:  20 time: 36.14, loss is 2.945
epoch:  75 step:  20 time: 36.71, loss is 0.226
epoch: 100 step:  20 time: 36.56, loss is 0.016
Total time used: 3668.7517251968384
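To get a feel for the size of the quantum embedding layer, the parameter counts can be derived from the circuit structure alone: every context word gets its own encoder (one RX angle per qubit, fed with data) and its own ansatz (one RY angle per qubit per layer, trained). The sketch below is plain Python with a hypothetical function name; it assumes the window of 2 and the 3 ansatz layers used in the training call above.

```python
def qembedding_param_counts(vocab_size, window, layers):
    """Count encoder and ansatz parameters in a QEmbedding-style circuit."""
    n_qubits = max(1, (vocab_size - 1).bit_length())  # integer ceil(log2(vocab_size))
    blocks = 2 * window                          # one encoder + ansatz block per context word
    encoder_params = blocks * n_qubits           # RX angles: carry data, not trained
    ansatz_params = blocks * layers * n_qubits   # RY angles: trainable weights
    return n_qubits, encoder_params, ansatz_params


corpus = """We are about to study the idea of a computational process.
Computational processes are abstract beings that inhabit computers.
As they evolve, processes manipulate other abstract things called data.
The evolution of a process is directed by a pattern of rules called a program.
People create programs to direct processes.
In effect, we conjure the spirits of the computer with our spells."""

vocab_size = len(set(corpus.split()))
n_qubits, n_encoder, n_ansatz = qembedding_param_counts(vocab_size, window=2, layers=3)
print("qubits:", n_qubits)
print("encoder parameters (inputs):", n_encoder)
print("ansatz parameters (weights):", n_ansatz)
```

The encoder count should match the length of each row of train_x, since that vector supplies exactly one angle per encoder gate.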

Plot the value of the loss function during convergence:

import matplotlib.pyplot as plt

plt.plot(loss_monitor.loss, '.')
plt.xlabel('Steps')
plt.ylabel('Loss')
plt.show()

Print the parameters of the quantum circuit in the quantum embedding layer:

net.embedding.weight.asnumpy()

array([ 6.4384632e-02,  1.2658586e-01,  1.0083634e-01,  1.3011757e-01,
        1.4005195e-03, -1.9296107e-04, -7.9315618e-02, -2.9339856e-01,
        7.6259784e-02,  2.9878360e-01, -1.3091319e-04,  6.8271365e-03,
       -8.5563213e-02, -2.4168481e-01, -8.2548901e-02,  3.0743122e-01,
       -7.8157615e-04, -3.2907310e-03, -1.4412615e-01, -1.9241245e-01,
       -7.5561814e-02, -3.1189525e-03, -3.8330450e-03, -1.4486053e-04,
       -4.8195502e-01,  5.3657538e-01,  3.8986996e-02,  1.7286544e-01,
       -3.4090234e-03, -9.5573599e-03, -4.8208281e-01,  5.9604627e-01,
       -9.7009525e-02,  1.8312852e-01,  9.5267012e-04, -1.2261710e-03,
        3.4219343e-02,  8.0031365e-02, -4.5349425e-01,  3.7360430e-01,
        8.9665735e-03,  2.1575980e-03, -2.3871836e-01, -2.4819574e-01,
       -6.2781256e-01,  4.3640310e-01, -9.7688911e-03, -3.9542126e-03,
       -2.4010721e-01,  4.8120108e-02, -5.6876510e-01,  4.3773583e-01,
        4.7241263e-03,  1.4138421e-02, -1.2472854e-03,  1.1096644e-01,
        7.1980711e-03,  7.3047012e-02,  2.0803964e-02,  1.1490706e-02,
       -1.2472854e-03,  1.1096644e-01,  7.1980711e-03,  7.3047012e-02,
        1.1490706e-02,  8.6638138e-02,  2.0503466e-01,  4.7177267e-03,
       -1.8399477e-02,  1.1631225e-02,  2.0587114e-03,  7.6739892e-02,
       -6.3548386e-02,  1.7298019e-01, -1.9143591e-02,  4.1606693e-04,
       -9.2881303e-03], dtype=float32)

6. Classical version of word vector embedding layer

class CBOWClassical(nn.Cell):
    def __init__(self, num_embedding, embedding_dim, window, hidden_dim):
        super(CBOWClassical, self).__init__()
        self.dim = 2 * window * embedding_dim
        self.embedding = nn.Embedding(num_embedding, embedding_dim, True)
        self.dense1 = nn.Dense(self.dim, hidden_dim)
        self.dense2 = nn.Dense(hidden_dim, num_embedding)
        self.relu = ops.ReLU()
        self.reshape = ops.Reshape()

    def construct(self, x):
        embed = self.embedding(x)
        # flatten the embeddings of all context words into one vector
        embed = self.reshape(embed, (-1, self.dim))
        out = self.dense1(embed)
        out = self.relu(out)
        out = self.dense2(out)
        return out


# the classical model takes word indices directly instead of rotation angles
train_x = []
train_y = []
for i in sample:
    around, center = i
    train_y.append(word_dict[center])
    train_x.append([])
    for j in around:
        train_x[-1].append(word_dict[j])
train_x = np.array(train_x).astype(np.int32)
train_y = np.array(train_y).astype(np.int32)
print("train_x shape: ", train_x.shape)
print("train_y shape: ", train_y.shape)

train_loader = ds.NumpySlicesDataset({
    "around": train_x,
    "center": train_y
}, shuffle=False).batch(3)
net = CBOWClassical(len(word_dict), embedding_dim, window_size, hidden_dim)
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)
loss_monitor = LossMonitorWithCollection(500)
model = Model(net, net_loss, net_opt)
model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)

Running results:

train_x shape:  (58, 4)
train_y shape:  (58,)
epoch:  25 step:  20 time: 0.077, loss is 3.156
epoch:  50 step:  20 time: 0.095, loss is 3.025
epoch:  75 step:  20 time: 0.115, loss is 2.996
epoch: 100 step:  20 time: 0.088, loss is 1.773
epoch: 125 step:  20 time: 0.083, loss is 0.172
epoch: 150 step:  20 time: 0.110, loss is 0.008
epoch: 175 step:  20 time: 0.086, loss is 0.003
epoch: 200 step:  20 time: 0.081, loss is 0.001
epoch: 225 step:  20 time: 0.081, loss is 0.000
epoch: 250 step:  20 time: 0.078, loss is 0.000
epoch: 275 step:  20 time: 0.079, loss is 0.000
epoch: 300 step:  20 time: 0.080, loss is 0.000
epoch: 325 step:  20 time: 0.078, loss is 0.000
epoch: 350 step:  20 time: 0.081, loss is 0.000
Total time used: 30.569124698638916

Figure of convergence:

As shown above, the quantum word embedding model obtained through quantum simulation also completes the embedding task well. When data sets grow too large for classical computing power to handle, quantum computers are expected to be able to handle this kind of problem with ease.
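One way to make the comparison concrete is to count the trainable parameters of each embedding layer as the vocabulary grows. The sketch below is only an estimate under stated assumptions: the classical table holds one embedding_dim-long vector per word, the quantum layer holds one RY angle per qubit, per layer, and per context position, and the vocabulary sizes are hypothetical. The function names are illustrative.

```python
def classical_embedding_params(vocab_size, embedding_dim):
    """Entries in a lookup table with one embedding_dim-long row per word."""
    return vocab_size * embedding_dim


def quantum_embedding_params(vocab_size, window, layers):
    """RY angles in the ansatz: one per qubit, per layer, per context position."""
    n_qubits = max(1, (vocab_size - 1).bit_length())  # integer ceil(log2(vocab_size))
    return 2 * window * layers * n_qubits


for vocab_size in (50, 1000, 100000):  # hypothetical vocabulary sizes
    print(vocab_size,
          classical_embedding_params(vocab_size, embedding_dim=10),
          quantum_embedding_params(vocab_size, window=2, layers=3))
```

The quantum count grows only logarithmically with the vocabulary. A smaller parameter count does not translate into faster training on a classical simulator, though: the simulated quantum run above took about 3669 seconds, against about 31 seconds for the classical model.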
