Example 2.2 Conjunctions of Boolean literals

Consider learning the concept class $C_n$ of conjunctions of at most $n$ Boolean literals $x_1, \ldots, x_n$. A Boolean literal is either a variable $x_i$, $i \in [1, n]$, or its negation $\overline{x}_i$. For $n = 4$, an example concept is the conjunction $x_1 \wedge \overline{x}_2 \wedge x_4$, where $\overline{x}_2$ denotes the negation of the Boolean literal $x_2$. $(1, 0, 0, 1)$ is a positive example for this concept, while $(1, 0, 0, 0)$ is a negative example.

Observe that for $n = 4$, the positive example $(1, 0, 1, 0)$ implies that the target concept cannot contain the literals $\overline{x}_1$ and $\overline{x}_3$, nor can it contain the literals $x_2$ and $x_4$. In contrast, a negative example is less informative, since it is not known which of its $n$ bits are incorrect. A simple algorithm for finding a consistent hypothesis is therefore based on the positive examples and consists of the following: for each positive example $(b_1, \ldots, b_n)$ and $i \in [1, n]$, if $b_i = 1$ then $\overline{x}_i$ is ruled out as a possible literal in the concept, and if $b_i = 0$ then $x_i$ is ruled out. The conjunction of all literals not ruled out is then a hypothesis consistent with the target. Figure 2.4 shows an example training sample and a consistent hypothesis for the case $n = 6$; a code sketch of this procedure is given below.
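The following is a minimal sketch of this consistent-hypothesis procedure (not from the original text); the function name, the use of `~` for negation, and the tuple input format are illustrative assumptions. Each positive example rules out literals exactly as described above, and the conjunction of the surviving literals is returned.

```python
def consistent_conjunction(positive_examples, n):
    """Return a conjunction of literals consistent with the positive examples.

    Literal i is tracked by two flags: can_pos[i] (x_i may still appear) and
    can_neg[i] (the negation of x_i may still appear). A positive example with
    b_i = 1 rules out the negated literal; b_i = 0 rules out x_i.
    """
    can_pos = [True] * n
    can_neg = [True] * n
    for example in positive_examples:
        for i, b in enumerate(example):
            if b == 1:
                can_neg[i] = False
            else:
                can_pos[i] = False
    # The conjunction of all literals not ruled out.
    literals = []
    for i in range(n):
        if can_pos[i]:
            literals.append(f"x{i + 1}")
        if can_neg[i]:
            literals.append(f"~x{i + 1}")
    return " ∧ ".join(literals)


# Hypothetical usage for n = 4: the positive examples (1, 0, 1, 0) and
# (1, 0, 1, 1) rule out ~x1, x2, ~x3, x4 and then ~x4, leaving x1 ∧ ~x2 ∧ x3.
print(consistent_conjunction([(1, 0, 1, 0), (1, 0, 1, 1)], 4))
```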

We have $|H| = |C_n| = 3^n$, since each literal can be included positively, included negated, or not included at all. Plugging this into the sample complexity bound for consistent hypotheses yields the following sample complexity bound for any $\epsilon > 0$ and $\delta > 0$:


$$m \ge \frac{1}{\epsilon}\Big((\log 3)\, n + \log\frac{1}{\delta}\Big). \qquad (2.10)$$

Therefore, the class of conjunctions of at most $n$ Boolean literals is PAC-learnable. Note that the computational complexity is also polynomial, since the training cost per example is in $O(n)$. For $\delta = 0.02$, $\epsilon = 0.1$ and $n = 10$, the bound becomes $m \ge 149$. Thus, for a labeled sample of at least $149$ examples, the bound guarantees at least $90\%$ accuracy with at least $98\%$ confidence.
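As a quick check of the numerical claim above, the bound (2.10) can be evaluated directly; this short snippet (an illustration, not part of the original text) reproduces the value $m \ge 149$:

```python
import math

def sample_bound(epsilon, delta, log_h):
    """Sample complexity bound m >= (1/epsilon) * (log|H| + log(1/delta))."""
    return math.ceil((log_h + math.log(1.0 / delta)) / epsilon)

# For conjunctions of at most n Boolean literals, |H| = 3^n, so log|H| = n*log(3).
n, epsilon, delta = 10, 0.1, 0.02
print(sample_bound(epsilon, delta, n * math.log(3)))  # -> 149
```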



Figure 2.4  Each of the first six rows of the table represents a training example, with its label, $+$ or $-$, indicated in the last column. If the $i$-th entry of all the positive examples is $0$ (respectively $1$), then the last row contains $0$ (respectively $1$) in column $i \in [1, 6]$. If both $0$ and $1$ appear as the $i$-th entry of some positive examples, then that column contains "$?$". Thus, for this training sample, the consistent algorithm described in the text returns the hypothesis $x_1 \wedge x_2 \wedge x_5 \wedge x_6$.

Example 2.3 Universal concept class

Consider the set $X = \{0, 1\}^n$ of all Boolean vectors with $n$ components, and let $U_n$ be the concept class formed by all subsets of $X$. Is this concept class PAC-learnable? To guarantee a consistent hypothesis, the hypothesis set must include the concept class, therefore $|H| \ge |U_n| = 2^{(2^n)}$. Theorem 2.1 gives the following sample complexity bound:


$$m \ge \frac{1}{\epsilon}\Big((\log 2)\, 2^n + \log\frac{1}{\delta}\Big). \qquad (2.11)$$

Here, the number of training samples required is exponential in $n$, which is the cost of representing a point in $X$. Thus, PAC-learning is not guaranteed by the theorem. In fact, it is not hard to show that this universal concept class is not PAC-learnable.
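To illustrate how quickly the bound (2.11) blows up, the snippet below (an illustration, not part of the original text) evaluates it for a few values of $n$, reusing the same $\epsilon = 0.1$ and $\delta = 0.02$ as in Example 2.2:

```python
import math

def universal_bound(n, epsilon=0.1, delta=0.02):
    """Bound (2.11): m >= (1/epsilon) * ((log 2) * 2^n + log(1/delta))."""
    return math.ceil((math.log(2) * 2 ** n + math.log(1.0 / delta)) / epsilon)

for n in (4, 10, 20):
    print(n, universal_bound(n))
# n = 4  -> about 1.5 * 10^2 examples
# n = 10 -> about 7.1 * 10^3 examples
# n = 20 -> about 7.3 * 10^6 examples
```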