1. A supermarket studies sales record data and finds that people who buy beer are highly likely to also buy diapers. What kind of data mining problem is this?


A. Association rule discovery

B. Clustering

C. Classification

D. Natural language processing


2. Which two evaluation metrics for classification algorithms correspond to the following two descriptions?

(a) When police catch thieves: the proportion of those arrested who are actually thieves.

(b) The proportion of all thieves who are caught by the police.


A. Precision, Recall

B. Recall, Precision

C. Precision, ROC

D. Recall, ROC
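
Both metrics named in the options can be computed from confusion-matrix counts. A minimal Python sketch, using made-up counts for the police scenario (the numbers are illustrative assumptions, not data from the quiz):

```python
def precision(true_pos, false_pos):
    """Share of predicted positives that are truly positive."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos, false_neg):
    """Share of actual positives that were found."""
    return true_pos / (true_pos + false_neg)


# Hypothetical counts: 10 people arrested, 8 of them thieves,
# while 12 other thieves were never caught.
true_pos, false_pos, false_neg = 8, 2, 12

print("precision:", precision(true_pos, false_pos))
print("recall:", recall(true_pos, false_neg))
```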


3. In which of the following steps are raw data integrated, transformed, and reduced in dimensionality and numerosity?


A. Frequent pattern mining

B. Classification and prediction

C. Data preprocessing

D. Data stream mining


4. When the labels of the data are unknown, which technique can be used to separate data that share a label from data with other labels?


A. Classification

B. Clustering

C. Association analysis

D. Hidden Markov chains


5. What is KDD?


A. Data mining and knowledge discovery

B. Domain knowledge discovery

C. Document knowledge discovery

D. Dynamic knowledge discovery


6. Exploring data with interactive and visual techniques belongs to which category of data mining task?


A. Exploratory data analysis

B. Descriptive modeling

C. Predictive modeling

D. Discovering patterns and rules


7. Modeling the overall distribution of the data and partitioning a multidimensional space into groups belong to which category of data mining task?


A. Exploratory data analysis

B. Descriptive modeling

C. Predictive modeling

D. Discovering patterns and rules


8. Building a model that predicts the values of other variables from known variable values belongs to which category of data mining task?


A. Search by content

B. Descriptive modeling

C. Predictive modeling

D. Discovering patterns and rules


9. A user has a pattern of interest and wants to find similar patterns in the data set. Which category of data mining task is this?


A. Search by content

B. Descriptive modeling

C. Predictive modeling

D. Discovering patterns and rules


10. Which of the following is a scalable clustering algorithm?


A. CURE

B. DENCLUE

C. CLIQUE

D. OPOSSUM


11. Which of the following is not a data preprocessing method?


A. Variable transformation

B. Discretization

C. Aggregation

D. Estimating missing values


12. Suppose 12 sales price records have been sorted as follows: 5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215. They are to be divided into four bins using each of the following methods. With equal-frequency (equal-depth) binning, which bin contains 15?


A. First

B. Second

C. Third

D. Fourth
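
Equal-frequency (equal-depth) binning puts the same number of values in each bin. A minimal Python sketch, assuming the list is already sorted and its length divides evenly by the number of bins (the function name is illustrative):

```python
def equal_frequency_bins(sorted_values, n_bins):
    """Split already-sorted values into n_bins bins of equal size."""
    depth = len(sorted_values) // n_bins  # assumes an even split
    return [sorted_values[i * depth:(i + 1) * depth] for i in range(n_bins)]


prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
bins = equal_frequency_bins(prices, 4)

# Print each bin with its 1-based number.
for number, contents in enumerate(bins, start=1):
    print(number, contents)
```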


13. In the previous question, with equal-width binning (width 50), which bin contains 15?


A. First

B. Second

C. Third

D. Fourth
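
Equal-width binning assigns a value to a bin by its distance from the lower bound. A minimal Python sketch, assuming the first bin starts at the smallest value (the quiz does not say where the bins start). Note that a fixed width of 50 needs more than four bins to cover values up to 215, so the sketch does not cap the bin count:

```python
def equal_width_bin(value, low, width):
    """Return the 1-based index of the fixed-width bin that holds value."""
    return (value - low) // width + 1


prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
low = min(prices)  # assumption: bins are anchored at the minimum value

for price in prices:
    print(price, "-> bin", equal_width_bin(price, low, 50))
```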


14. Which of the following is not a data attribute type?


A. Nominal

B. Ordinal

C. Interval

D. Dissimilarity


15. Among the options in the previous question, which attribute type is quantitative?


A. Nominal

B. Ordinal

C. Interval

D. Dissimilarity


16. Binary attributes for which only non-zero values are important are called:


A. Count attributes

B. Discrete attributes

C. Asymmetric binary attributes

D. Symmetric attributes


17. Which of the following is not a standard approach to feature selection?


A. Embedded

B. Filter

C. Wrapper

D. Sampling


18. Which of the following methods is not related to creating new attributes?


A. Feature extraction

B. Feature modification

C. Mapping data to a new space

D. Feature construction


19. For the set of values {1, 2, 3, 4, 5, 90}, the trimmed mean (p = 20%) is:


A. 2

B. 3

C. 3.5

D. 5
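
A trimmed mean drops a fraction p of the values from each end of the sorted data before averaging. A minimal Python sketch, assuming the number of values dropped per end is rounded down (conventions for fractional counts vary):

```python
def trimmed_mean(values, p):
    """Mean after dropping the lowest and highest fraction p of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * p)  # values dropped from each end, rounded down
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


data = [1, 2, 3, 4, 5, 90]
print("plain mean:", sum(data) / len(data))
print("trimmed mean:", trimmed_mean(data, 0.20))
```

Comparing the two printed values shows how much a single outlier pulls the plain mean.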


20. Which of the following is a way of mapping data to a new space?


A. Fourier transform

B. Feature weighting

C. Progressive sampling

D. Dimensionality reduction


21. Entropy is the amount of information needed to eliminate uncertainty. The entropy of rolling a fair six-sided die is:


A. 1 bit

B. 2.6 bits

C. 3.2 bits

D. 3.8 bits
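
For a discrete distribution, Shannon entropy in bits is the negative sum of p * log2(p) over all outcomes. A minimal Python sketch that can be used to check the options (the function name is illustrative):

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


fair_die = [1 / 6] * 6  # six equally likely faces
print(round(entropy_bits(fair_die), 2))
```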


22. Suppose the minimum and maximum values of the attribute income are 12,000 yuan and 98,000 yuan respectively. Min-max normalization maps the attribute's values to the range 0 to 1. An income of 73,600 yuan is converted to:


A. 0.821

B. 1.224

C. 1.458

D. 0.716
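
Min-max normalization rescales a value linearly from its original range to a target range. A minimal Python sketch, with the target range defaulting to 0 to 1 (parameter names are illustrative):

```python
def min_max_normalize(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    scaled = (value - old_min) / (old_max - old_min)
    return scaled * (new_max - new_min) + new_min


print(round(min_max_normalize(73600, 12000, 98000), 3))
```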


23. Suppose the data to be analyzed contain the attribute age. The age values in the data tuples are as follows (in ascending order): 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 30, 33, 33, 33, 35, 35, 36, 40, 45, 46, 52, 70. The data are smoothed by bin means with a bin depth of 3. What is the value of the second bin?


A. 18.3

B. 22.6

C. 26.8

D. 27.9
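
Smoothing by bin means replaces every value in a bin with that bin's average. A minimal Python sketch that computes the mean of each equal-depth bin for the ages above (the function name is illustrative):

```python
def bin_means(sorted_values, depth):
    """Mean of each consecutive group of `depth` sorted values."""
    means = []
    for start in range(0, len(sorted_values), depth):
        chunk = sorted_values[start:start + depth]
        means.append(sum(chunk) / len(chunk))
    return means


ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25,
        30, 33, 33, 33, 35, 35, 36, 40, 45, 46, 52, 70]

# Print each bin number with its rounded mean.
for number, mean in enumerate(bin_means(ages, 3), start=1):
    print(number, round(mean, 1))
```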


24. For the set of values {12, 24, 33, 24, 55, 68, 26}, the interquartile range is:


A. 31

B. 24

C. 55

D. 3
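
The interquartile range is the third quartile minus the first. Quartile conventions differ between textbooks and libraries, so results can vary. This Python sketch assumes the "exclusive" method of the standard library's `statistics.quantiles`, which for these seven values picks the same quartiles as taking the medians of the lower and upper halves:

```python
import statistics

values = [12, 24, 33, 24, 55, 68, 26]

# quantiles() sorts the data internally and returns the three cut points.
q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")

print("Q1:", q1, "Q3:", q3, "IQR:", q3 - q1)
```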


25. A university has 200 students in the first year, 160 in the second year, 130 in the third year, and 110 in the fourth year. The mode of the year attribute is:


A. First year

B. Second year

C. Third year

D. Fourth year
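
The mode is the most frequent value of an attribute. A short Python sketch with `collections.Counter`, using the enrollment figures from the question (the year labels are illustrative names):

```python
from collections import Counter

# Each key is an attribute value; each count is how many students have it.
enrollment = Counter({
    "first year": 200,
    "second year": 160,
    "third year": 130,
    "fourth year": 110,
})

mode_value, mode_count = enrollment.most_common(1)[0]
print(mode_value, mode_count)
```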


26. Which of the following is not a technique dedicated to visualizing spatio-temporal data?


A. Contour plot

B. Pie chart

C. Surface plot

D. Vector field plot


27. When it is difficult to determine an appropriate sample size, which sampling method can be used?


A. Simple random sampling with replacement

B. Simple random sampling without replacement

C. Stratified sampling

D. Progressive sampling


28. A data warehouse changes over time. Which of the following descriptions is incorrect?


A. The data warehouse continually adds new data content over time;

B. Newly captured data overwrites the original snapshot;

C. The data warehouse continually deletes old data content as events change;

D. The data warehouse contains a large amount of aggregated data, which is continually re-aggregated over time.


29. Metadata about basic data refers to:


A. Basic metadata is information about the structure of data sources, data warehouses, data marts, and applications;

B. Basic metadata includes enterprise-related management data and information;

C. Basic metadata includes log files and scheduling information for execution processing;

D. Basic metadata includes information about load and update processing, analysis processing, and management.


30. Which of the following descriptions of data granularity is incorrect?


A. Granularity refers to the degree of detail and the level of the small data units in the data warehouse;

B. The more detailed the data, the smaller the granularity and the higher the level;

C. The higher the degree of data aggregation, the larger the granularity and the higher the level;

D. The specific choice of granularity directly affects the amount of data and the query quality in the data warehouse.


31. Which description of the characteristics of data warehouse development is incorrect?


A. Data warehouse development should start from the data;

B. The requirements for using the data warehouse should be clear at the start of development;

C. Data warehouse development is a continuous, cyclical, heuristic process;

D. The data warehouse environment lacks the fixed, fairly precise processing flows found in operational environments; data analysis and processing in the data warehouse are more flexible and follow no fixed pattern.


32. Regarding data warehouse testing, which of the following statements is incorrect?


A. While the data warehouse is being implemented, various tests must be carried out on it. Testing should include unit testing and system testing.

B. When each individual component of the data warehouse is completed, it needs to be unit tested.

C. Integration testing of the system requires extensive functional testing and regression testing of all components of the data warehouse.

D. There is no need to make a detailed test plan before testing.


33. The core of OLAP technology is:


A. Being online;

B. Fast response to users;

C. Interoperability;

D. Multidimensional analysis.


34. Which of the following are features of OLAP?

(1) speed (2) analyzability (3) multidimensionality (4) information (5) sharing


A. (1)(2)(3)

B. (2)(3)(4)

C. (1)(2)(3)(4)

D. (1)(2)(3)(4)(5)


35. Which description of the difference between OLAP and OLTP is incorrect?


A. OLAP is mainly concerned with making sense of large amounts of diverse data gathered together, which distinguishes it from OLTP applications.

B. Unlike OLAP applications, OLTP applications contain a large number of relatively simple transactions.

C. OLAP is characterized by large transaction volume, simple transaction content, and a high repetition rate.

D. OLAP is based on a data warehouse, but its ultimate data source, like that of OLTP, is the underlying database system, and both serve the same users.


36. OLAM is commonly referred to as “online analytical mining”. Which of the following statements is correct?


A. Both OLAP and OLAM are based on the client/server model, but only the latter is interactive with the user;

B. OLAM's cube is essentially different from the cube used for OLAP.

C. Web-based OLAM is a combination of Web technology and OLAM technology.

D. The OLAM server receives the user's analysis instructions through the graphical user interface and performs operations on the hypercube under the guidance of metadata.


37. Regarding OLAP and OLTP, which of the following statements is incorrect?


A. OLAP has a large volume of transactions, but the transaction content is relatively simple and the repetition rate is high.

B. OLAP's ultimate data source is different from OLTP's.

C. OLTP is for decision makers and senior management.

D. OLTP is application-centered and application-driven.


38. Let X = {1, 2, 3} be a frequent itemset. Then ____ association rules can be generated from X.


A. 4

B. 5

C. 6

D. 7
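
Candidate rules from a frequent itemset can be enumerated by splitting it into an antecedent and a consequent. A minimal Python sketch, assuming the usual convention that both sides are non-empty and together use every item of X (the function name is illustrative):

```python
from itertools import combinations


def candidate_rules(itemset):
    """All (antecedent, consequent) splits with both sides non-empty."""
    items = sorted(itemset)
    rules = []
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            rules.append((antecedent, consequent))
    return rules


rules = candidate_rules({1, 2, 3})
for antecedent, consequent in rules:
    print(antecedent, "->", consequent)
print("count:", len(rules))
```

Under this convention an itemset of k items yields 2^k - 2 candidate rules; whether each rule is kept still depends on its confidence.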


39. The relationship among frequent itemsets, frequent closed itemsets, and maximal frequent itemsets is:


A. Frequent itemsets ⊇ frequent closed itemsets = maximal frequent itemsets

B. Frequent itemsets = frequent closed itemsets ⊇ maximal frequent itemsets

C. Frequent itemsets ⊇ frequent closed itemsets ⊇ maximal frequent itemsets

D. Frequent itemsets = frequent closed itemsets = maximal frequent itemsets
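
How the three families of itemsets relate can be checked on a toy example. The Python sketch below brute-forces all itemsets over a tiny, made-up transaction list (the transactions and the minimum support of 2 are assumptions for illustration) and prints the size of each family:

```python
from itertools import combinations

# Hypothetical transactions, for illustration only.
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a"}]
min_support = 2


def support(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


items = sorted(set().union(*transactions))

# Frequent: support reaches the threshold.
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        candidate = frozenset(combo)
        count = support(candidate)
        if count >= min_support:
            frequent[candidate] = count

# Closed: no proper superset has the same support.
closed = {s for s in frequent
          if not any(s < t and frequent[t] == frequent[s] for t in frequent)}

# Maximal: no proper superset is frequent at all.
maximal = {s for s in frequent if not any(s < t for t in frequent)}

print(len(frequent), len(closed), len(maximal))
```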


40. A concept hierarchy graph is a ____ graph.


A. Undirected acyclic

B. Directed acyclic

C. Directed cyclic

D. Undirected cyclic


Answers:

1-10: AACBA, ABCAA

11-20: DBADC, CDBCA

21-30: BDAAA, BDCDC

31-40: ADDDC, DACCB


