This is the fourth day of my November challenge


Introduction to kernel methods

You can begin to understand kernel methods with the concept of functional bases.

The kernel method is widely used in data analysis. Its idea is to map a vector in the space $\mathcal{R}^n$ to a vector in a feature space. As shown in the figure below, red and blue dots that are difficult to separate in $\mathcal{R}^n$ may become easier to separate once they are mapped to a higher-dimensional feature space.
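As a minimal numeric sketch of this idea (the points and the feature map $\phi(x)=(x,x^2)$ are my own toy choices, not taken from the post): points that cannot be separated by a threshold in one dimension become linearly separable after mapping to two dimensions.

```python
# Toy illustration: two "red" points surround a "blue" point on the line,
# so no single threshold on x separates the classes in 1-D.
def phi(x):
    """Map a point from R^1 into a 2-D feature space."""
    return (x, x * x)

red = [-1.0, 1.0]
blue = [0.0]

# In the feature space the second coordinate x^2 separates the classes:
red_mapped = [phi(x) for x in red]    # second coordinate is 1.0
blue_mapped = [phi(x) for x in blue]  # second coordinate is 0.0

separable = all(p[1] > 0.5 for p in red_mapped) and all(p[1] < 0.5 for p in blue_mapped)
print(separable)  # True
```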

Eigendecomposition

For a symmetric matrix $\mathbf{A}$ ($\mathbf{A}^T=\mathbf{A}$), there exist a real number $\lambda$ and a vector $\mathbf{x}$ such that:


$$\mathbf{A}\mathbf{x}=\lambda\mathbf{x}$$

$\lambda$ is an eigenvalue of $\mathbf{A}$, and $\mathbf{x}$ is the corresponding eigenvector. If $\mathbf{A}$ has two eigenvalues $\lambda_1,\lambda_2$ with eigenvectors $\mathbf{x}_1,\mathbf{x}_2$, it follows that:


$$\lambda_1\mathbf{x}_1^T\mathbf{x}_2=\mathbf{x}_1^T\mathbf{A}^T\mathbf{x}_2=\mathbf{x}_1^T\mathbf{A}\mathbf{x}_2=\lambda_2\mathbf{x}_1^T\mathbf{x}_2$$

If $\lambda_1 \neq \lambda_2$, then $\mathbf{x}_1^T\mathbf{x}_2=0$; therefore $\mathbf{x}_1$ and $\mathbf{x}_2$ are orthogonal.

For $\mathbf{A} \in \mathcal{R}^{n \times n}$, $n$ eigenvalues and their corresponding eigenvectors can be found. Thus, $\mathbf{A}$ can be expressed as:


$$\mathbf{A}=\mathbf{Q}\mathbf{D}\mathbf{Q}^T$$

Here $\mathbf{Q}=(\mathbf{q}_1,\ldots,\mathbf{q}_n)$ is an orthogonal matrix (i.e. $\mathbf{Q}\mathbf{Q}^T=\mathbf{E}$), and $\mathbf{D}=diag(\lambda_1,\ldots,\lambda_n)$ is a diagonal matrix. The formula above can be expanded as:

$$\mathbf{A}=\sum^n_{i=1}\lambda_i\mathbf{q}_i\mathbf{q}_i^T$$

$\{\mathbf{q}_i\}^n_{i=1}$ is a set of orthogonal bases of the space $\mathcal{R}^n$.
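This decomposition is easy to verify numerically; a small sketch with NumPy (the example matrix is an arbitrary choice of mine):

```python
import numpy as np

# A small symmetric matrix; np.linalg.eigh is designed for symmetric input.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, Q = np.linalg.eigh(A)   # columns of Q are the eigenvectors q_i
D = np.diag(eigvals)

# Q is orthogonal: Q Q^T = E
assert np.allclose(Q @ Q.T, np.eye(2))

# A = Q D Q^T
assert np.allclose(Q @ D @ Q.T, A)

# Equivalently, A = sum_i lambda_i q_i q_i^T
A_rebuilt = sum(lam * np.outer(q, q) for lam, q in zip(eigvals, Q.T))
assert np.allclose(A_rebuilt, A)
```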

Kernel function

A function $f(x)$ can be thought of as an infinite vector, and a function $K(x,y)$ with two independent variables can be thought of as an infinite matrix. If $K(x,y)=K(y,x)$, and:


$$\int \int f(x)K(x,y)f(y)dxdy \geq 0$$

holds for any function $f$, then $K(x,y)$ is symmetric and positive definite, and $K(x,y)$ is a kernel function.

As an analogy: let $A\in \mathcal{R}^{n \times n}$; if $A=A^T$ and $\mathbf{x}^TA\mathbf{x}>0$ for any $0 \neq \mathbf{x} \in \mathcal{R}^n$, then $A$ is called a symmetric positive definite matrix.
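A quick numerical check of this matrix-side definition (the matrix is an arbitrary example of mine; positive definiteness is tested via the eigenvalues, which is equivalent to the quadratic-form condition):

```python
import numpy as np

# A candidate matrix: check symmetry, then check that all eigenvalues
# are strictly positive (equivalent to x^T A x > 0 for every nonzero x).
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

is_symmetric = np.allclose(A, A.T)
is_positive_definite = is_symmetric and bool(np.all(np.linalg.eigvalsh(A) > 0))
print(is_positive_definite)  # True (eigenvalues are 1 and 3)
```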

The eigenvalue $\lambda$ and the eigenfunction $\psi(x)$ of a kernel are such that:


$$\int K(x,y)\psi(x)dx=\lambda \psi(y)$$

For two different eigenvalues $\lambda_1$ and $\lambda_2$ with corresponding eigenfunctions $\psi_1(x)$ and $\psi_2(x)$, it is easy to obtain:

$$\lambda_1\int\psi_1(x)\psi_2(x)dx=\int\int K(x,y)\psi_1(x)\psi_2(y)dxdy=\lambda_2\int\psi_1(x)\psi_2(x)dx$$

Therefore, it can be concluded that:


$$\langle\psi_1,\psi_2\rangle=\int \psi_1(x) \psi_2(x)dx = 0$$

That is, the eigenfunctions are orthogonal. Here $\psi$ denotes the function (an infinite vector) itself.

For a kernel function, if there are infinitely many eigenvalues $\{\lambda_i\}^\infty_{i=1}$ and infinitely many eigenfunctions $\{\psi_i\}^\infty_{i=1}$, then, as in the matrix case:


$$K(x,y)=\sum^\infty_{i=1}\lambda_i\psi_i(x)\psi_i(y)$$

This is also known as Mercer's theorem: any positive semidefinite symmetric function is a kernel function. Here $\{\psi_i\}^\infty_{i=1}$ constitutes a set of orthogonal bases of a function space.
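A finite-sample analogue of Mercer's condition can be checked numerically: the Gram matrix $K_{ij}=K(x_i,x_j)$ built from any set of points with a valid kernel is symmetric positive semidefinite. A sketch using the Gaussian kernel (the sample points and $\gamma$ are arbitrary choices of mine):

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

# Sample a few points and build the Gram matrix K_ij = K(x_i, x_j).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
K = np.array([[rbf_kernel(a, b) for b in X] for a in X])

# The Gram matrix is symmetric ...
assert np.allclose(K, K.T)
# ... and positive semidefinite: all eigenvalues >= 0 (up to round-off).
eigvals = np.linalg.eigvalsh(K)
assert np.all(eigvals >= -1e-10)
```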

Common kernel functions are:

  • Polynomial kernel: $K(x,y)=(\gamma x^Ty+C)^d$, where $d=1,2,\ldots,N$.
  • Gaussian radial basis function (RBF) kernel: $K(x,y)=\exp(-\gamma ||x-y||^2)$.
  • Sigmoid kernel: $K(x,y)=\tanh(\gamma x^Ty+C)$, where $\tanh$ is the hyperbolic tangent function.
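Sketch implementations of these three kernels (the parameter defaults below are placeholders, not recommended values):

```python
import numpy as np

def polynomial_kernel(x, y, gamma=1.0, C=1.0, d=2):
    """K(x, y) = (gamma * x^T y + C)^d"""
    return (gamma * np.dot(x, y) + C) ** d

def gaussian_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def sigmoid_kernel(x, y, gamma=1.0, C=0.0):
    """K(x, y) = tanh(gamma * x^T y + C)"""
    return np.tanh(gamma * np.dot(x, y) + C)

x = np.array([1.0, 2.0])
y = np.array([0.5, -1.0])
print(polynomial_kernel(x, y))  # (1*(-1.5) + 1)^2 = 0.25
```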

Reproducing Kernel Hilbert Space

Take $\{\sqrt{\lambda_i}\psi_i\}^\infty_{i=1}$ as a set of orthogonal bases to construct a Hilbert space $\mathcal{H}$. Any function or vector in this space can be represented as a linear combination of these bases.

Suppose:


$$f=\sum^\infty_{i=1}f_i\sqrt{\lambda_i}\psi_i$$

Then $f$ can be represented as an infinite vector in $\mathcal{H}$:


$$f=(f_1,f_2,\ldots)^T_{\mathcal{H}}$$

For another function $g=(g_1,g_2,\ldots)^T_{\mathcal{H}}$:


$$\langle f,g\rangle_{\mathcal{H}}=\sum^\infty_{i=1}f_ig_i$$

For a kernel function $K$, use $K(x,y)$ to denote the evaluation of $K$ at the point $(x,y)$, which is a scalar; use $K(\cdot,\cdot)$ to denote the function (infinite matrix) itself; and use $K(x,\cdot)$ to denote the row of the matrix indexed by $x$. Fixing one argument of the kernel at $x$, we can regard it as a function of one variable (an infinite vector), obtaining:


$$K(x,\cdot)=\sum^\infty_{i=1}\lambda_i\psi_i(x)\psi_i$$

In space H\mathcal{H}H we can define:


$$K(x,\cdot)=(\sqrt{\lambda_1}\psi_1(x),\sqrt{\lambda_2}\psi_2(x),\ldots)^T_{\mathcal{H}}$$

Therefore, it can be obtained:


$$\langle K(x,\cdot),K(y,\cdot)\rangle_{\mathcal{H}}=\sum^\infty_{i=1}\lambda_i\psi_i(x)\psi_i(y)=K(x,y)$$

This is the reproducing property: the kernel function reproduces the inner product of two functions. It allows us to compute only the kernel function instead of the inner product in the high-dimensional feature space, which greatly reduces computation. Therefore, $\mathcal{H}$ is called a reproducing kernel Hilbert space (RKHS).
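For a finite-dimensional feature map this can be verified directly. A sketch (my own worked example, not from the post) using the homogeneous degree-2 polynomial kernel $K(x,y)=(x^Ty)^2$ on $\mathcal{R}^2$, whose explicit feature map is $\Phi(x)=(x_1^2,\sqrt{2}x_1x_2,x_2^2)^T$:

```python
import numpy as np

# The kernel computes the feature-space inner product without ever
# forming the feature vectors Phi(x) and Phi(y).
def K(x, y):
    return np.dot(x, y) ** 2

def Phi(x):
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

assert np.isclose(K(x, y), np.dot(Phi(x), Phi(y)))  # both equal 1.0 here
```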

Back to the original question: how do you map points to the feature space using kernels?

Define a mapping:


$$\Phi(x)=K(x,\cdot)=(\sqrt{\lambda_1}\psi_1(x),\sqrt{\lambda_2}\psi_2(x),\ldots)^T$$

This maps the point $x$ to $\mathcal{H}$, where $\Phi$ does not denote an explicit function but a vector (or function) in the feature space $\mathcal{H}$. Then we get:


$$\langle\Phi(x),\Phi(y)\rangle_{\mathcal{H}}=\langle K(x,\cdot),K(y,\cdot)\rangle_{\mathcal{H}}=K(x,y)$$

So you don’t need to know what the mapping is, what the feature space is, or what the basis of the feature space is: for a symmetric positive definite function $K$, there must exist a mapping $\Phi$ and a feature space $\mathcal{H}$ such that:


$$\langle\Phi(x),\Phi(y)\rangle=K(x,y)$$

This is the kernel trick.

A simple case

Define the kernel function:


$$K(x,y)=(x_1,x_2,x_1x_2)(y_1,y_2,y_1y_2)^T=x_1y_1+x_2y_2+x_1x_2y_1y_2$$

Define $\mathbf{x}=(x_1,x_2)^T$, $\mathbf{y}=(y_1,y_2)^T$. Take $\lambda_1=\lambda_2=\lambda_3=1$, $\psi_1(\mathbf{x})=x_1$, $\psi_2(\mathbf{x})=x_2$, $\psi_3(\mathbf{x})=x_1x_2$. The mapping can then be defined as:

$$\Phi(\mathbf{x})=(x_1,x_2,x_1x_2)^T$$
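A quick numerical check that this kernel equals the inner product of the mapped vectors $\Phi(\mathbf{x})=(x_1,x_2,x_1x_2)^T$ (the sample points are arbitrary choices of mine):

```python
import numpy as np

# The kernel from the simple case above ...
def K(x, y):
    return x[0]*y[0] + x[1]*y[1] + x[0]*x[1]*y[0]*y[1]

# ... and the corresponding feature map Phi(x) = (x1, x2, x1*x2).
def Phi(x):
    return np.array([x[0], x[1], x[0] * x[1]])

x = np.array([2.0, 3.0])
y = np.array([-1.0, 0.5])

assert np.isclose(K(x, y), np.dot(Phi(x), Phi(y)))  # both equal -3.5 here
```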