

Introduction

Machine learning algorithms are like an armory packed with powerful weapons, but you first have to learn how to wield them. Regression, for example, is a powerful tool for analyzing data, but it struggles with highly complex data. Support Vector Machines (SVM) are like a sharp knife: they are especially effective for building models on smaller data sets.

More than 550 people signed up for this skill test, which was designed specifically around SVM and its applications. Take it to find out how much you know about SVM and where your gaps are.


Helpful Resources

1. Ten Commonly used Machine learning Algorithms (with Python and R code)

2. Principle and code of SVM

 

Skill test Questions and Answers

Suppose you use a linear SVM classifier to solve a 2-class classification problem, as shown in the figure below. The points circled in red are the support vectors. Answer questions 1 and 2 accordingly:



1. If the circled points are removed, will the decision boundary (i.e., the separating hyperplane) change?

A. Yes B. No

Answer: A

Tips: These three points are positioned such that removing any one of them introduces slack into the constraints, so the decision boundary changes.

 

2. If all data except the three circled points are removed from the data, will the decision boundary change?

A. True B. False

Answer: B

Tips: Decision boundaries are only affected by support vectors, not other points.
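
As a rough illustration of questions 1 and 2 (not part of the original quiz), the following sketch uses scikit-learn's SVC with a linear kernel on made-up toy data: refitting on only the support vectors reproduces essentially the same hyperplane, which is why dropping the non-support points leaves the boundary unchanged.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy 2-class data, a stand-in for the figure referenced in the quiz
X, y = make_blobs(n_samples=40, centers=2, cluster_std=1.0, random_state=0)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("support vector indices:", clf.support_)

# Refit using only the support vectors: the hyperplane (w, b) stays the same
# (up to numerical tolerance), illustrating question 2.
sv_only = SVC(kernel="linear", C=1.0).fit(X[clf.support_], y[clf.support_])
print("original w, b:", clf.coef_, clf.intercept_)
print("sv-only  w, b:", sv_only.coef_, sv_only.intercept_)
```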

 

3. Which of the following is correct about the generalization error of SVM?

A. How far the hyperplane is from the support vectors

B. The SVM's ability to predict unseen data

C. The error threshold of the SVM

Answer: B

Tips: In statistics, generalization error refers to a model's ability to predict previously unseen data accurately.
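
In practice the generalization error is estimated on data the model has not seen, for example with cross-validation. A minimal sketch, assuming synthetic data generated with scikit-learn:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic data; in each CV fold the model is scored on points it was not trained on.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=5)

# 1 minus the mean held-out accuracy is a rough estimate of the generalization error.
print("estimated generalization accuracy: %.3f" % scores.mean())
```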

 

4. If the penalty parameter C goes to infinity, which of the following statements is true?

A. If an optimal separating hyperplane exists, the data can be separated completely

B. A soft margin classifier can still separate the data

C. None of the above

 

Answer: A

Tips: When the misclassification penalty is that high, a soft margin solution does not necessarily exist, because there is no longer any room for error.

 

5. Which of the following is true of a hard margin?

A. The SVM allows only very small classification errors

B. The SVM allows a large number of classification errors

C. None of the above

 

Answer: A

Tips: A hard margin means the SVM is very strict about classification and tries to perform as well as possible on the training set, which may lead to overfitting.

 

6. The minimum time complexity of SVM training is O(n²), so which of the following datasets is not suitable for SVM?

A. Large datasets B. Small datasets C. Medium-sized datasets D. Independent of dataset size

 

Answer: A

Tips: Because training scales at least as O(n²), large datasets become impractical; datasets with a clear classification boundary work best with SVM.

 

7. The effectiveness of an SVM depends on:

A. Kernel selection B. Kernel parameters C. The soft margin parameter C D. All of the above

 

Answer: D

Tips: The effectiveness of an SVM depends on all three of these choices; tuning them well improves accuracy and reduces error and overfitting.
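
Those three choices are exactly what gets tuned in practice. A hedged sketch of a joint grid search over the kernel, its parameter gamma, and the soft margin parameter C (the parameter ranges below are illustrative, not taken from the article):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Search jointly over the kernel, its parameter (gamma), and the soft margin parameter C.
param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "gamma": ["scale", 0.01, 0.1, 1.0],
    "C": [0.1, 1, 10, 100],
}
search = GridSearchCV(SVC(), param_grid, cv=5).fit(X, y)
print("best parameters:", search.best_params_)
print("best CV accuracy: %.3f" % search.best_score_)
```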

 

8. Support vectors are the data points closest to the decision boundary.

A. True B. False

 

Answer: A
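
A small sketch (toy data, not from the quiz) that checks this numerically: for a linear kernel, the distance of a point to the hyperplane is |f(x)| / ||w||, and the support vectors are exactly the points with the smallest such distances.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=50, centers=2, cluster_std=1.2, random_state=1)
clf = SVC(kernel="linear", C=10.0).fit(X, y)

# Geometric distance of every point to the separating hyperplane
dist = np.abs(clf.decision_function(X)) / np.linalg.norm(clf.coef_)

# The points nearest the boundary are precisely the support vectors.
closest = np.sort(np.argsort(dist)[: len(clf.support_)])
print("closest points :", closest)
print("support vectors:", np.sort(clf.support_))
```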

 

9. SVM performs poorly in the following situations:

A. Linearly separable data B. Clean data C. Noisy data with overlapping points

 

Answer: C

Tips: When the data is noisy and the classes overlap, it is hard to draw a clean hyperplane that classifies every point correctly.

 

10. Suppose you use an RBF kernel with a large gamma value. This means:

A. The model will also take points far from the hyperplane into account

B. The model will use only points close to the hyperplane

C. The model is not affected by a point's distance from the hyperplane

D. None of the above is true

 

Answer: B

Tips: The γ parameter controls how much influence points near and far from the decision boundary have. With a small γ, every training point has far-reaching influence, so the model is too constrained and does not really capture the shape of the data. With a large γ, only nearby points matter, and the model captures the shape of the data well (at the risk of overfitting).
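
A quick sketch of this effect on scikit-learn's two-moons toy data (the gamma values are chosen only for illustration): a small gamma gives a smooth, almost linear boundary, while a very large gamma lets only the nearest points matter and tends to overfit.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Small gamma: far-away points still influence the boundary (smoother fit).
# Large gamma: only very close points matter (wiggly boundary, risk of overfitting).
for gamma in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)
    print(f"gamma={gamma:<6} train accuracy={clf.score(X, y):.3f} "
          f"#support vectors={len(clf.support_)}")
```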

 

11. The cost parameter in SVM stands for:

A. The number of cross-validations to be performed

B. The kernel to be used

C. Balance between misclassification and model complexity

D. None of the above

 

Answer: C

Tips: The cost parameter determines how closely the SVM fits the training data. A low cost gives a smoother decision surface; a high cost aims to classify more training points correctly. It can simply be understood as the cost of misclassification.

 

Suppose you use an SVM to learn from a dataset X in which some points are mislabeled. You use a quadratic kernel (a polynomial of degree 2) with the slack penalty C as one of the hyperparameters. Answer questions 12 and 13 accordingly.

12. If you use a very large C (C tends to infinity), then:

A. The data can still be classified correctly

B. The data cannot be classified correctly

C. Can't say

D. None of the above is correct

 

Answer: A

Tips: With a larger C, the penalty for misclassified points is greater, so the decision boundary will classify the data as perfectly as possible.

 

13. If a very small C is used (C approaches 0), then:

A. Some points will be misclassified

B. The data will be classified correctly

C. Can't say

D. None of the above is correct

 

Answer: A

Tips: The classifier maximizes the margin over most of the points, and a few points end up misclassified because the penalty for doing so is so small.
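
The trade-off described in questions 11-13 can be seen directly by sweeping C on noisy toy data with a degree-2 polynomial kernel (a rough sketch; the data and the values of C are made up, so the exact scores will vary):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# flip_y injects label noise, mimicking a dataset with some erroneous points
X, y = make_classification(n_samples=300, n_features=5, flip_y=0.1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Large C: misclassification is expensive, so the fit follows the training data closely (Q12).
# Small C: misclassification is cheap, so a wider margin is preferred and some points are missed (Q13).
for C in (1e-3, 1.0, 1e3):
    clf = SVC(kernel="poly", degree=2, C=C).fit(X_tr, y_tr)
    print(f"C={C:<8} train={clf.score(X_tr, y_tr):.3f} test={clf.score(X_te, y_te):.3f}")
```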

 

14. If you use all the features of the dataset and achieve 100% accuracy on the training set, but only about 70% accuracy on the test set, this means:

A. Underfitting B. A great model C. Overfitting

 

Answer: C

Tips: If the model can easily achieve 100% accuracy on the training set, check whether overfitting has occurred.
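
Comparing training and test accuracy is the standard check. A minimal sketch with an intentionally over-flexible SVM on synthetic data (the exact numbers will differ):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, flip_y=0.15, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

# A very flexible SVM (large C, large gamma) can effectively memorize the training set.
clf = SVC(kernel="rbf", C=1e4, gamma=10.0).fit(X_tr, y_tr)
print(f"train accuracy={clf.score(X_tr, y_tr):.2f}  test accuracy={clf.score(X_te, y_te):.2f}")
# A large gap between the two (e.g. near 1.00 vs. around 0.70) is the overfitting pattern from Q14.
```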

 

15. Which of the following is an application of SVM?

A. Text and hypertext classification

B. Image classification

C. Clustering of news articles

D. All of the above

 

Answer: D

Tips: SVM is widely used in practical problems, including regression, clustering, handwritten digit recognition, etc.

 

Suppose that after training an SVM you obtain a linear decision boundary and you conclude that the model is underfitting. Questions 16 to 18 are based on this.

16. In the next iteration of training the model, you should consider:

A. Adding more training data

B. Reducing the training data

C. Computing more features

D. Reducing the number of features

 

Answer: C

Tips: Since it is underfitting, the best option is to create more features to bring into the model training.

 

17. Assuming you made the correct choice in the previous question, which of the following would happen:

1. Reduce bias

2. Reduce variance

3. Increase bias

4. Increase variance

 

A. 1 and 2

B. 2 and 3

C. 1 and 4

D. 2 and 4

 

Answer: C

Tips: A better-fitting (more flexible) model reduces bias and increases variance.

 

18. If instead you want to achieve the same effect by modifying one of the SVM's hyperparameters, what should you do?

A. Increase parameter C

B. Reduce parameter C

C. Changing C doesn’t work

D. None of the above is correct

 

Answer: A

Tips: Increasing C reduces the amount of regularization, so the model fits the training data more closely and no longer underfits.
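
A rough sketch of both remedies on a toy dataset that a linear SVM underfits: question 16's fix (computing extra, here degree-2 polynomial, features) and question 18's fix (increasing C so the model is less heavily regularized). The data and parameter values are made up.

```python
from sklearn.datasets import make_circles
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVC

# Data that no linear boundary can separate -> a linear SVM underfits it.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.4, random_state=0)

underfit = SVC(kernel="linear", C=0.01).fit(X, y)
print("linear, small C        :", underfit.score(X, y))

# Remedy from Q16: compute more features (degree-2 polynomial terms).
# Remedy from Q18: increase C so the model is less heavily regularized.
better = make_pipeline(PolynomialFeatures(degree=2), SVC(kernel="linear", C=10.0)).fit(X, y)
print("poly features, larger C:", better.score(X, y))
```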

 

19. Feature normalization is usually carried out before a Gaussian kernel is used in SVM. Which of the following statements about feature normalization are correct?

1. The new features obtained through feature normalization are better than the old ones

2. Feature normalization cannot handle categorical variables

3. Feature normalization is always useful when a Gaussian kernel is used in SVM

A. 1 B. 1 and 2 C. 1 and 3 D. 2 and 3

Answer: B
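
A sketch of why normalization matters with the Gaussian (RBF) kernel: if one feature has a much larger scale, it dominates the distances inside the kernel. The toy data and the inflated column below are made up; the column whose scale is blown up is an uninformative one, so the effect is pronounced.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# shuffle=False keeps the informative columns first; the last column is pure noise.
X, y = make_classification(n_samples=300, n_features=6, shuffle=False, random_state=3)
X[:, -1] *= 1000  # inflate the scale of the noise feature on purpose

raw = SVC(kernel="rbf")
scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("without scaling: %.3f" % cross_val_score(raw, X, y, cv=5).mean())
print("with scaling   : %.3f" % cross_val_score(scaled, X, y, cv=5).mean())
```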

 

Suppose you are using SVM on a 4-class classification problem with a one-vs-all strategy. Answer questions 20 to 22 accordingly.

20. How many times should THE SVM model be trained in this case?

A.1

B.2

C. 3

D. 4

 

Answer: D

Tips: With the one-vs-all strategy, you train four times, each time treating one class as positive and the rest as negative. This yields four models, and a new point is assigned to the class whose model gives the largest decision-function value.
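
A minimal sketch of the one-vs-all setup for a 4-class problem using scikit-learn's OneVsRestClassifier wrapper on synthetic data; the wrapper makes the "four binary models" explicit.

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# 4-class problem handled with a one-vs-all (one-vs-rest) strategy
X, y = make_classification(n_samples=400, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)

ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)
# One binary SVM per class -> 4 trained models, as in question 20.
print("number of binary SVMs trained:", len(ovr.estimators_))
```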

 

21. Given that it takes 10 seconds to train one SVM in the one-vs-all scheme, how many seconds does the whole training take?

A. 20

B. 40

C. 60

D. 80

 

Answer: B

Tips: Training 4 models at 10 seconds each takes 40 seconds.

 

22. Assuming there are only two classes, how many times does SVM need to be trained in this case?

A. 1

B. 2

C. 3

D. 4

Answer: A

Tips: With only two classes, a single training run is enough.

 

Suppose you have trained an SVM with a degree-2 polynomial kernel and achieved 100% accuracy on both the training set and the test set. Answer questions 23 and 24 accordingly.

23. What happens if you increase the model complexity or the polynomial order of the kernel function?

A. Lead to overfitting

B. Result in under-fitting

C. No impact, because the model is 100% accurate

D. None of the above is correct

 

Answer: A

Tips: Increasing model complexity can lead to overfitting

 

24. If you find that the accuracy on the training set is still 100% after increasing the model complexity, what might be the reason?

1. Since the data is fixed and more polynomial terms or parameters are being fitted, the algorithm starts to memorize everything in the data

2. Since the data is fixed, the SVM does not need to search a much larger hypothesis space to find a separating hyperplane

 

A. 1

B. 2

C. 1 and 2

D. None of the above is correct

 

Answer: C
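
A rough sketch of the effect discussed in questions 23 and 24: sweeping the polynomial degree on noisy synthetic data, training accuracy stays high while test accuracy tends to fall off as the hypothesis space grows (the data and parameter values are made up, so exact numbers will vary).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=8, flip_y=0.1, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Raising the polynomial degree enlarges the hypothesis space: training accuracy
# stays high while test accuracy tends to drop -> overfitting.
for degree in (2, 5, 10):
    clf = SVC(kernel="poly", degree=degree, coef0=1.0, C=100.0).fit(X_tr, y_tr)
    print(f"degree={degree:<3} train={clf.score(X_tr, y_tr):.3f} test={clf.score(X_te, y_te):.3f}")
```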

 

25. Which of the following statements about the SVM kernel function are correct?

1. The kernel function maps low-dimensional data into a higher-dimensional space

2. It is a similarity function.

 

A. 1

B. 2

C. 1 and 2

D. None of the above is correct

 

Answer: C
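
A tiny numeric illustration of statement 2: the RBF kernel value k(x, z) = exp(-gamma * ||x - z||^2) behaves like a similarity score, close to 1 for nearby points and close to 0 for distant ones, while implicitly corresponding to an inner product in a high-dimensional feature space (the points and gamma below are arbitrary).

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[1.0, 2.0]])
b = np.array([[1.1, 2.1]])   # close to a
c = np.array([[5.0, -3.0]])  # far from a

# Kernel value as a similarity score: near 1 for similar points, near 0 for dissimilar ones.
print("k(a, b) =", rbf_kernel(a, b, gamma=0.5)[0, 0])
print("k(a, c) =", rbf_kernel(a, c, gamma=0.5)[0, 0])
```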


Overall Distribution

More than 350 people have taken the test so far, and the scores are distributed as follows:



The author's blog also has self-test quizzes on other topics such as dimensionality reduction, machine learning, and SQL; interested readers can find them there.



About the Author




Ankit Gupta is a freelance data scientist who has solved complex data mining problems in many fields and is keen to learn more about data science and machine learning.

GitHub: github.com/anki1909

LinkedIn: www.linkedin.com/in/ankit-gu…


This article was translated by the Ali Yunqi Community organization.

25 Questions to Test a Data Scientist on Support Vector Machines

This is an abridged translation. For more details, please refer to the original article.