Author: CHEONG

AI Machine Learning and Knowledge Graph

Research interests: Natural language processing and knowledge mapping

Before reading this article, two things should be noted:

1. Machine learning series of articles often contain a large number of formula derivation and proof. In order to better understand, important conclusions of this paper will be given at the beginning of the article to facilitate the understanding of the core of this paper as soon as possible. You can look further down the line if you want to know more about the derivation

2. There are a lot of formulas in this paper. If readers need to obtain the Word document containing the original formula, they can follow the official account [AI Machine Learning and Knowledge Graph] and reply: Probability graph model. Original is not easy, reprint please inform and indicate the source!

This paper mainly introduces conditional independence and factorization in undirected graphs. See the previous article for conditional independence and factorization in digraphs


This article conclusion


Undirected graph conditional independence is reflected in three aspects: global markov, local Markov and paired Markov. And the three aspects exist mutually, that is, the establishment of one of them can lead to the establishment of the other two.

The conditional independence of probability graph and factor decomposition of probability graph are not independent of each other. The factor decomposition formula can deduce the conditional independence hypothesis of probability graph.

The factorization of undirected graphs is more complex than that of directed graphs. Directed graphs can write factorization formula directly according to the topological structure, while undirected graphs introduce the improvement of maximum clips for factorization.


Body content


1. Undirected graph conditional independence

The conditional independence of undirected graph model is reflected in the following three aspects:

1. Global Markov property

As shown in the figure below, the conditional independence of undirected graphs is very simple. If sets XAX_AXA and XCX_CXC are independent of each other given set XBX_BXB, the following conditions must be met: Node A belongs to set XAX_AXA, and node C belongs to set XCX_CXC. If there is a connection between node B and nodes A and C, node B must exist in set XBX_BXB.

2. Local Markov property:

As shown in the figure below, local Markov property means that node A is independent from non-neighbor nodes in the case of given neighbor nodes B, C and D.

3. Paired Markov property:

Note: Global, local and pairwise Markov properties are interdependent, i.e. if one is true, the other two are also true.


Undirected graph factorization

The topological structure of the root graph is easy to write down the factorization formula, but the factorization of undirected graph is more complicated, whether directed graph or undirected graph, the factorization must reflect the conditional independence of graph, factorization and the conditional independence of graph are equivalent.

In order to solve undirected graph factorization, the concepts of clique and maximal clique are introduced first. Group refers to the collection of nodes in an undirected graph, and all nodes are connected. Maximum group means that all nodes are not connected when one more node is added to the group, namely the concept of maximum group in an undirected graph.

The formula of undirected graph factorization based on maximum clique is directly given below:

Where xcix{c_i} xCI is the set of nodes in the graph corresponding to the value group CIC_ICI, ϕ\phiϕ function is a potential function, ϕ (xci) = exp (-) E (xci) \ phi {c_i} (x) = exp ({E} {c_i} (x)) ϕ (xci) = exp ((xci) – E), must be positive, E (xci) E {c_i} (x) E (xci) energy function, and z is the normalized factor, The calculation method is as follows:

Hammesley-clifford theorem can be used to prove that the above factorization method can deduce the conditional independence of undirected graph, and the rationality of undirected graph decomposition based on maximum cliques can be explained from the side.

Conclusion: Factorization of digraph, conditional independence; Factorization of undirected graphs, conditional independence.

Machine learning 【 Whiteboard derivation series 】 author: Shuhuai008

Pattern Recognition and Machine Learning