1. A fact is said to be partially additive if ___________.
Correct : B. additive over at least one but not all of the dimensions.
2. A fact is said to be non-additive if ___________.
Correct : C. not additive over any dimension.
3. Non-additive measures can often be combined with additive measures to create new _________.
Correct : A. additive measures.
4. A fact representing cumulative sales units over a day at a store for a product is a _________.
Correct : B. fully additive fact.
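
The additivity ideas in questions 1 to 4 can be illustrated with a minimal sketch. The rows, field names and values below are invented for illustration only. Units sold can be summed over any dimension, a unit price cannot be meaningfully summed, and multiplying the two gives a revenue measure that is additive again.

```python
# Hypothetical fact rows: one row per store, day and product.
rows = [
    {"store": "S1", "day": "Mon", "units": 10, "unit_price": 2.0},
    {"store": "S1", "day": "Tue", "units": 5, "unit_price": 2.0},
    {"store": "S2", "day": "Mon", "units": 4, "unit_price": 3.0},
]

# Fully additive: units can be summed across stores and days.
total_units = sum(r["units"] for r in rows)

# Non-additive: this sum runs, but the number has no business meaning.
summed_prices = sum(r["unit_price"] for r in rows)

# Combining the non-additive price with the additive units gives
# revenue per row, which is additive over every dimension.
total_revenue = sum(r["units"] * r["unit_price"] for r in rows)

print(total_units, summed_prices, total_revenue)
```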
5. ____________ of data means that the attributes within a given entity are fully dependent on the entire
primary key of the entity.
Correct : C. functional dependency.
6. Which of the following is another name for data mining?
Correct : D. all of the above.
7. Which of the following is a predictive model?
Correct : B. regression.
8. Which of the following is a descriptive model?
Correct : C. sequence discovery.
9. A ___________ model identifies patterns or relationships.
Correct : A. descriptive.
10. A predictive model makes use of ________.
Correct : B. historical data.
11. ____________ maps data into predefined groups.
Correct : D. classification.
12. __________ is used to map a data item to a real valued prediction variable.
Correct : B. time series analysis.
13. In ____________, the value of an attribute is examined as it varies over time.
Correct : B. time series analysis.
14. In ________ the groups are not predefined.
Correct : C. clustering.
15. Link analysis is otherwise called ___________.
Correct : C. both a & b.
16. _________ is the input to KDD.
Correct : A. data.
17. The output of KDD is __________.
Correct : D. useful information.
18. The KDD process consists of ________ steps.
Correct : C. five.
19. Treating incorrect or missing data is called ___________.
Correct : B. preprocessing.
20. Converting data from different sources into a common format for processing is called ________.
Correct : C. transformation.
21. Various visualization techniques are used in ___________ step of KDD.
Correct : D. interpretation.
22. Extreme values that occur infrequently are called _________.
Correct : A. outliers.
23. Box plot and scatter diagram techniques are _______.
Correct : B. geometric.
24. __________ is used to proceed from very specific knowledge to more general information.
Correct : A. induction.
25. Describing some characteristics of a set of data by a general model is viewed as ____________.
Correct : B. compression.
26. _____________ helps to uncover hidden information about the data.
Correct : C. approximation.
27. _______ are needed to identify training data and desired results.
Correct : C. users.
28. Overfitting occurs when a model _________.
Correct : B. does not fit in future states.
29. The curse of dimensionality involves ___________.
Correct : D. all of the above.
30. Incorrect or invalid data is known as _________.
Correct : B. noisy data.
31. ROI is an acronym of ________.
Correct : A. return on investment.
32. The ____________ of data could result in the disclosure of information that is deemed to be
confidential.
Correct : B. unauthorized use.
33. ___________ data are noisy and have many missing attribute values.
Correct : C. real-world.
34. The rise of DBMS occurred in early ___________.
Correct : C. 1970s.
35. SQL stands for _________.
Correct : B. structured query language.
36. Which of the following is not a data mining metric?
Correct : D. all of the above.
37. Reducing the number of attributes to solve the high dimensionality problem is called ________.
Correct : B. dimensionality reduction.
38. Data that are not of interest to the data mining task are called ______.
Correct : C. irrelevant data.
39. ______ are effective tools to attack the scalability problem.
Correct : C. both a & b.
40. Market-basket problem was formulated by __________.
Correct : A. agrawal et al.
41. Data mining helps in __________.
Correct : D. all of the above.
42. The proportion of transactions supporting X in T is called _________.
Correct : B. support.
43. The absolute number of transactions supporting X in T is called ___________.
Correct : C. support count.
44. The value that says that transactions in D that support X also support Y is called ______________.
Correct : A. confidence.
45. If T consists of 500000 transactions, 20000 transactions contain bread, 30000 transactions contain jam,
and 10000 transactions contain both bread and jam, then the support of bread and jam is _______.
Correct : A. 2%
46. If T consists of 500000 transactions, 20000 transactions contain bread, 30000 transactions contain jam,
and 10000 transactions contain both bread and jam, then the confidence of buying bread with jam is _______.
Correct : D. 50%
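
The arithmetic behind questions 45 and 46 can be checked with a short sketch. The function and variable names are illustrative, and the 50% answer is reproduced by assuming that "buying bread with jam" means the rule bread -> jam, so the count of bread transactions is the denominator. The opposite rule, jam -> bread, would give a different value.

```python
def support(count_itemset, total_transactions):
    """Proportion of all transactions that contain the itemset."""
    return count_itemset / total_transactions


def confidence(count_both, count_antecedent):
    """Proportion of antecedent transactions that also contain the consequent."""
    return count_both / count_antecedent


total = 500_000
bread = 20_000
jam = 30_000
both = 10_000

support_bread_jam = support(both, total)       # 10000 / 500000
conf_bread_to_jam = confidence(both, bread)    # 10000 / 20000
conf_jam_to_bread = confidence(both, jam)      # 10000 / 30000

print(f"support(bread, jam)      = {support_bread_jam:.0%}")
print(f"confidence(bread -> jam) = {conf_bread_to_jam:.0%}")
print(f"confidence(jam -> bread) = {conf_jam_to_bread:.1%}")
```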
47. The left hand side of an association rule is called __________.
Correct : C. antecedent.
48. The right hand side of an association rule is called _____.
Correct : A. consequent.
49. Which of the following is not a desirable feature of any efficient algorithm?
Correct : D. to have maximal code length.
50. All sets of items whose support is greater than the user-specified minimum support are called
_____________.
Correct : B. frequent set.
51. If a set is a frequent set and no superset of this set is a frequent set, then it is called ________.
Correct : A. maximal frequent set.
52. Any subset of a frequent set is a frequent set. This is ___________.
Correct : B. downward closure property.
53. Any superset of an infrequent set is an infrequent set. This is _______.
Correct : C. upward closure property.
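
The definitions in questions 50 to 53 can be tried out on a toy database. This is a brute-force sketch, not an efficient algorithm. The transactions and the 0.5 threshold are made up, and the code treats an itemset as frequent when its support is greater than or equal to the threshold, which is an assumption (question 50 says "greater than").

```python
from itertools import combinations

# Hypothetical transaction database.
transactions = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "milk"},
    {"jam"},
]
min_support = 0.5


def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


items = sorted(set().union(*transactions))

# Enumerate every non-empty itemset and keep the frequent ones.
frequent = {
    frozenset(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
    if support(frozenset(c)) >= min_support
}

# Maximal frequent sets: frequent sets with no frequent proper superset.
maximal = {s for s in frequent if not any(s < other for other in frequent)}

# Downward closure: every non-empty proper subset of a frequent set is frequent.
downward_closed = all(
    frozenset(sub) in frequent
    for s in frequent
    for k in range(1, len(s))
    for sub in combinations(s, k)
)

print(sorted(map(sorted, frequent)))
print(sorted(map(sorted, maximal)))
print(downward_closed)
```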
54. If an itemset is not a frequent set and no superset of this is a frequent set, then it is _______.
Correct : B. border set.
55. The A Priori algorithm is otherwise called the __________.
Correct : B. level-wise algorithm.
56. The A Priori algorithm is a ___________.
Correct : D. bottom-up search.
57. The first phase of the A Priori algorithm is _______.
Correct : A. candidate generation.
58. The second phase of the A Priori algorithm is ____________.
Correct : C. pruning.
59. The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from
being considered for counting support.
Correct : B. pruning.
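
The two phases named in questions 57 to 59 can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact procedure from any particular textbook: the function name `generate_candidates` is hypothetical, itemsets are represented as frozensets, and the join simply unions every pair of frequent (k-1)-itemsets.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets."""
    frequent_prev = set(frequent_prev)

    # Candidate generation: union pairs of (k-1)-itemsets that give k items.
    candidates = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) == k:
                candidates.add(union)

    # Pruning: drop any candidate that has an infrequent (k-1)-subset,
    # so its support never has to be counted.
    return {
        c
        for c in candidates
        if all(frozenset(sub) in frequent_prev for sub in combinations(c, k - 1))
    }


# Hypothetical frequent 2-itemsets.
frequent_2 = {frozenset("ab"), frozenset("ac"), frozenset("bc"), frozenset("bd")}
candidates_3 = generate_candidates(frequent_2, 3)
print(sorted(map(sorted, candidates_3)))
```

Here {a, b, d} and {b, c, d} are generated by the join but pruned, because {a, d} and {c, d} are not in the frequent 2-itemsets. Only {a, b, c} goes on to support counting.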
60. The a priori frequent itemset discovery algorithm moves _______ in the lattice.
Correct : A. upward.
61. After the pruning step of the a priori algorithm, _______ will remain.
Correct : B. no candidate set.
62. The number of iterations in a priori ___________.
Correct : A. increases with the size of the maximum frequent set.
63. MFCS is the acronym of _____.
Correct : C. maximal frequent candidate set.
64. Dynamic Itemset Counting Algorithm was proposed by ____.
Correct : A. brin et al.
65. Itemsets in the ______ category of structures have a counter and the stop number with them.
Correct : A. dashed.
66. The itemsets in the _______ category of structures are not subjected to any counting.
Correct : C. solid.
67. Certain itemsets in the dashed circle whose support counts reach the support value during an iteration
move into the ______.
Correct : A. dashed box.
68. Certain itemsets enter afresh into the system and get into the _______, which are essentially the
supersets of the itemsets that move from the dashed circle to the dashed box.
Correct : D. dashed circle.
69. The itemsets that have completed one full pass move from dashed circle to ________.
Correct : B. solid circle.
70. The FP-growth algorithm has ________ phases.
Correct : B. two.
71. A frequent pattern tree is a tree structure consisting of ________.
Correct : D. both a & b.
72. The non-root node of item-prefix-tree consists of ________ fields.
Correct : B. three.
73. The frequent-item-header-table consists of __________ fields.
Correct : B. two.
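
The field counts in questions 72 and 73 can be pictured with a small data-structure sketch. The field names below (item name, count, node link, and head of node links) follow a common description of the FP-tree and are assumptions here, since the quiz gives only the counts. A working implementation would normally also keep parent and child pointers, which are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FPNode:
    """Non-root node of the item-prefix tree: three fields (assumed names)."""
    item_name: str
    count: int = 1
    node_link: Optional["FPNode"] = None  # next node carrying the same item


@dataclass
class HeaderEntry:
    """Row of the frequent-item header table: two fields (assumed names)."""
    item_name: str
    head: Optional[FPNode] = None  # first node in the node-link chain


# Two nodes for item 'a' on different branches, chained by node links.
second_a = FPNode("a", count=2)
first_a = FPNode("a", count=3, node_link=second_a)
entry = HeaderEntry("a", head=first_a)

# Following the chain from the header table gives the total count of 'a'.
total_a = 0
node = entry.head
while node is not None:
    total_a += node.count
    node = node.node_link

print(total_a)
```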
74. The paths from root node to the nodes labelled 'a' are called __________.
Correct : D. prefix subpath.
75. The transformed prefix paths of a node 'a' form a truncated database of pattern which co-occur with a
is called _______.
Correct : C. conditional pattern base.
76. The goal of _____ is to discover both the dense and sparse regions of a data set.
Correct : C. clustering.
77. Which of the following is a clustering algorithm?
Correct : B. clara.
78. _______ clustering techniques start with as many clusters as there are records, with each cluster having
only one record.
Correct : A. agglomerative.
79. __________ clustering techniques start with all records in one cluster and then try to split that cluster
into small pieces.
Correct : B. divisive.
80. Which of the following is a data set in the popular UCI machine-learning repository?
Correct : D. mushroom.
81. In ________ algorithm each cluster is represented by the center of gravity of the cluster.
Correct : B. k-means.
82. In ___________ each cluster is represented by one of the objects of the cluster located near the
center.
Correct : A. k-medoid.
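
The difference between the two cluster representatives in questions 81 and 82 can be shown with a minimal sketch. The points are invented, Euclidean distance is assumed, and only the representative of a single cluster is computed, not a full k-means or k-medoid run.

```python
import math


def centroid(points):
    """Center of gravity: the coordinate-wise mean (k-means representative)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def medoid(points):
    """Actual cluster member with the smallest total distance to the others."""
    def total_distance(candidate):
        return sum(math.dist(candidate, p) for p in points)

    return min(points, key=total_distance)


# Hypothetical cluster with one far-away point.
cluster = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (20.0, 20.0)]

print(centroid(cluster))  # pulled toward the far-away point, not a member
print(medoid(cluster))    # always one of the objects in the cluster
```

The centroid need not coincide with any record, while the medoid is always one of the records and is less affected by the extreme point.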
83. Pick out a k-medoid algorithm.
Correct : C. pam.
84. Pick out a hierarchical clustering algorithm.
Correct : B. birch.
85. CLARANS stands for _______.
Correct : C. clustering large applications based on randomized search.
86. BIRCH is a ________.
Correct : C. hierarchical-agglomerative algorithm.
87. The cluster features of different subclusters are maintained in a tree called ___________.
Correct : A. cf tree.
88. The ________ algorithm is based on the observation that the frequent sets are normally very few in
number compared to the set of all itemsets.
Correct : D. partition.
89. The partition algorithm uses _______ scans of the databases to discover all frequent sets.
Correct : A. two.
90. The basic idea of the apriori algorithm is to generate ________ item sets of a particular size and scan
the database.
Correct : A. candidate.
91. An algorithm called ________ is used to generate the candidate item sets for each pass after the first.
Correct : B. apriori-gen.
92. The basic partition algorithm reduces the number of database scans to ________ & divides it into
partitions.
Correct : B. two.
93. ___________ and prediction may be viewed as types of classification.
Correct : C. estimation.
94. ___________ can be thought of as classifying an attribute value into one of a set of possible classes.
Correct : B. prediction.
95. Prediction can be viewed as forecasting a _________ value.
Correct : C. continuous.
96. _________ data consists of sample input data as well as the classification assignment for the data.
Correct : D. training.
97. Rule-based classification algorithms generate ______ rules to perform the classification.
Correct : A. if-then.
98. ____________ are a different paradigm for computing which draws its inspiration from neuroscience.
Correct : B. neural networks.
99. The human brain consists of a network of ___________.
Correct : A. neurons.
100. Each neuron is made up of a number of nerve fibres called _____________.