Quiznetik

Data Mining and Data Warehouse | Set 2

1. A fact is said to be partially additive if ___________.

Correct : B. additive over at least one but not all of the dimensions.

2. A fact is said to be non-additive if ___________.

Correct : C. not additive over any dimension.

3. Non-additive measures can often be combined with additive measures to create new _________.

Correct : A. additive measures.

4. A fact representing cumulative sales units over a day at a store for a product is a _________.

Correct : B. fully additive fact.
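
The additivity questions above can be illustrated with a small sketch. The fact rows, the column layout, and the helper names (FACTS, total_by, units, revenue) are hypothetical and chosen only for this example. Units sold can be summed over any dimension, a unit price cannot be summed meaningfully, and multiplying the two gives a revenue measure that is additive again.

```python
from collections import defaultdict

# Hypothetical fact rows: (day, store, product, units_sold, unit_price)
FACTS = [
    ("mon", "s1", "pen", 10, 2.0),
    ("mon", "s2", "pen", 5, 2.0),
    ("tue", "s1", "pen", 20, 3.0),
]


def total_by(rows, dim_index, measure):
    """Sum measure(row) grouped by the dimension stored at dim_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dim_index]] += measure(row)
    return dict(totals)


def units(row):
    """Additive measure: units can be summed over day, store, or product."""
    return row[3]


def revenue(row):
    """Derived additive measure: additive units times non-additive price."""
    return row[3] * row[4]


units_by_day = total_by(FACTS, 0, units)
units_by_store = total_by(FACTS, 1, units)
revenue_by_store = total_by(FACTS, 1, revenue)
print(units_by_day, units_by_store, revenue_by_store)
```

Summing the unit_price column itself would give a number with no business meaning, which is why it is treated as non-additive here.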

5. ____________ of data means that the attributes within a given entity are fully dependent on the entire primary key of the entity.

Correct : C. functional dependency.

6. Which of the following is the other name of Data mining?

Correct : D. all of the above.

7. Which of the following is a predictive model?

Correct : B. regression.

8. Which of the following is a descriptive model?

Correct : C. sequence discovery.

9. A ___________ model identifies patterns or relationships.

Correct : A. descriptive.

10. A predictive model makes use of ________.

Correct : B. historical data.

11. ____________ maps data into predefined groups.

Correct : D. classification.

12. __________ is used to map a data item to a real valued prediction variable.

Correct : B. time series analysis.

13. In ____________, the value of an attribute is examined as it varies over time.

Correct : B. time series analysis.

14. In ________ the groups are not predefined.

Correct : C. clustering.

15. Link analysis is otherwise called ___________.

Correct : C. both a & b.

16. _________ is the input to KDD.

Correct : A. data.

17. The output of KDD is __________.

Correct : D. useful information.

18. The KDD process consists of ________ steps.

Correct : C. five.

19. Treating incorrect or missing data is called ___________.

Correct : B. preprocessing.

20. Converting data from different sources into a common format for processing is called ________.

Correct : C. transformation.

21. Various visualization techniques are used in ___________ step of KDD.

Correct : D. interpretation.

22. Extreme values that occur infrequently are called _________.

Correct : A. outliers.

23. Box plot and scatter diagram techniques are _______.

Correct : B. geometric.
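
As a rough illustration of how a box plot flags outliers, the sketch below applies the common 1.5 x IQR whisker rule. The rule, the sample READINGS, and the function name boxplot_outliers are assumptions made for this example. The quiz itself does not prescribe a particular outlier test.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return values lying outside the box plot whiskers (k * IQR rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical readings with one extreme value.
READINGS = [10, 12, 11, 13, 12, 11, 95]
OUTLIERS = boxplot_outliers(READINGS)
print(OUTLIERS)
```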

24. __________ is used to proceed from very specific knowledge to more general information.

Correct : A. induction.

25. Describing some characteristics of a set of data by a general model is viewed as ____________.

Correct : B. compression.

26. _____________ helps to uncover hidden information about the data.

Correct : C. approximation.

27. _______ are needed to identify training data and desired results.

Correct : C. users.

28. Overfitting occurs when a model _________.

Correct : B. does not fit in future states.

29. The curse of dimensionality involves ___________.

Correct : D. all of the above.

30. Incorrect or invalid data is known as _________.

Correct : B. noisy data.

31. ROI is an acronym of ________.

Correct : A. return on investment.

32. The ____________ of data could result in the disclosure of information that is deemed to be confidential.

Correct : B. unauthorized use.

33. ___________ data are noisy and have many missing attribute values.

Correct : C. real-world.

34. The rise of DBMS occurred in early ___________.

Correct : C. 1970s.

35. SQL stands for _________.

Correct : B. structured query language.

36. Which of the following is not a data mining metric?

Correct : D. all of the above.

37. Reducing the number of attributes to solve the high dimensionality problem is called ________.

Correct : B. dimensionality reduction.

38. Data that are not of interest to the data mining task are called ______.

Correct : C. irrelevant data.

39. ______ are effective tools to attack the scalability problem.

Correct : C. both a & b.

40. Market-basket problem was formulated by __________.

Correct : A. agrawal et al.

41. Data mining helps in __________.

Correct : D. all of the above.

42. The proportion of transaction supporting X in T is called _________.

Correct : B. support.

43. The absolute number of transactions supporting X in T is called ___________.

Correct : C. support count.

44. The value that says that transactions in D that support X also support Y is called ______________.

Correct : A. confidence.

45. If T consists of 500000 transactions, 20000 transactions contain bread, 30000 transactions contain jam, and 10000 transactions contain both bread and jam, then the support of bread and jam is _______.

Correct : A. 2%

46. If T consists of 500000 transactions, 20000 transactions contain bread, 30000 transactions contain jam, and 10000 transactions contain both bread and jam, then the confidence of buying bread with jam is _______.

Correct : D. 50%
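
The arithmetic behind the two bread and jam answers can be checked with a short sketch. The function and constant names are illustrative only. Note that the 50% figure matches the rule bread -> jam (bread as antecedent). Reading the rule the other way, jam -> bread, would give roughly 33%.

```python
def support(count_itemset, total_transactions):
    """Proportion of transactions that contain the itemset."""
    return count_itemset / total_transactions


def confidence(count_both, count_antecedent):
    """Proportion of antecedent transactions that also contain the consequent."""
    return count_both / count_antecedent


TOTAL = 500_000   # transactions in T
BREAD = 20_000    # transactions containing bread
JAM = 30_000      # transactions containing jam
BOTH = 10_000     # transactions containing bread and jam

support_bread_jam = support(BOTH, TOTAL)        # 10000 / 500000
conf_bread_to_jam = confidence(BOTH, BREAD)     # 10000 / 20000
conf_jam_to_bread = confidence(BOTH, JAM)       # 10000 / 30000

print(f"support(bread, jam) = {support_bread_jam:.0%}")
print(f"confidence(bread -> jam) = {conf_bread_to_jam:.0%}")
print(f"confidence(jam -> bread) = {conf_jam_to_bread:.1%}")
```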

47. The left hand side of an association rule is called __________.

Correct : C. antecedent.

48. The right hand side of an association rule is called _____.

Correct : A. consequent.

49. Which of the following is not a desirable feature of any efficient algorithm?

Correct : D. to have maximal code length.

50. All sets of items whose support is greater than the user-specified minimum support are called _____________.

Correct : B. frequent set.

51. If a set is a frequent set and no superset of this set is a frequent set, then it is called ________.

Correct : A. maximal frequent set.

52. Any subset of a frequent set is a frequent set. This is ___________.

Correct : B. downward closure property.

53. Any superset of an infrequent set is an infrequent set. This is _______.

Correct : C. upward closure property.
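
The frequent set, maximal frequent set, and closure questions above can be checked on a toy database. The sketch below is a brute-force enumeration, not an efficient miner. The transactions, the minimum support count of 2, and the use of "at least" rather than "strictly greater than" the minimum are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database.
TRANSACTIONS = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "milk"},
    {"jam"},
]
MIN_COUNT = 2  # assumed minimum support count


def support_count(itemset, transactions):
    """Absolute number of transactions containing every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_count):
    """Enumerate every itemset and keep those meeting the minimum count."""
    items = sorted(set().union(*transactions))
    found = set()
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if support_count(candidate, transactions) >= min_count:
                found.add(candidate)
    return found


def maximal(frequent):
    """Frequent sets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


def is_downward_closed(frequent):
    """Check that every non-empty proper subset of a frequent set is frequent."""
    return all(
        frozenset(sub) in frequent
        for s in frequent
        for size in range(1, len(s))
        for sub in combinations(s, size)
    )


FREQUENT = frequent_itemsets(TRANSACTIONS, MIN_COUNT)
MAXIMAL = maximal(FREQUENT)
print(sorted(map(sorted, FREQUENT)))
print(sorted(map(sorted, MAXIMAL)))
print(is_downward_closed(FREQUENT))
```

On this data {jam, milk} is infrequent, so its superset {bread, jam, milk} is infrequent too, which is the upward closure property in action.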

54. If an itemset is not a frequent set and no superset of this is a frequent set, then it is _______.

Correct : B. border set.

55. The Apriori algorithm is otherwise called __________.

Correct : B. level-wise algorithm.

56. The Apriori algorithm is a ___________.

Correct : D. bottom-up search.

57. The first phase of the Apriori algorithm is _______.

Correct : A. candidate generation.

58. The second phase of the Apriori algorithm is ____________.

Correct : C. pruning.

59. The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support.

Correct : B. pruning.

60. The Apriori frequent itemset discovery algorithm moves _______ in the lattice.

Correct : A. upward.

61. After the pruning step of the Apriori algorithm, _______ will remain.

Correct : B. no candidate set.

62. The number of iterations in Apriori ___________.

Correct : A. increases with the size of the maximum frequent set.
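
The level-wise, bottom-up behaviour described in the questions above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation. The join step simply unions pairs of frequent k-itemsets instead of using the prefix-based join of the published algorithm, and the toy transactions and minimum count are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return ({frequent itemset: support count}, number of database passes)."""
    items = sorted(set().union(*transactions))
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    passes = 0
    k = 1
    while candidates:
        passes += 1
        # Count the support of every candidate in one pass over the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Candidate generation: simplified join of frequent k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Pruning: drop candidates that have an infrequent k-subset.
        candidates = [
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k))
        ]
        k += 1
    return frequent, passes


# Hypothetical toy database.
TRANSACTIONS = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "milk"},
    {"jam"},
]
FREQUENT, PASSES = apriori(TRANSACTIONS, min_count=2)
print(PASSES, sorted(map(sorted, FREQUENT)))
```

Here the candidate {bread, jam, milk} is pruned before any counting because its subset {jam, milk} is infrequent, so the search stops after two passes.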

63. MFCS is the acronym of _____.

Correct : C. maximal frequent candidate set.

64. The Dynamic Itemset Counting algorithm was proposed by ____.

Correct : A. brin et al.

65. Itemsets in the ______ category of structures have a counter and the stop number with them.

Correct : A. dashed.

66. The itemsets in the _______ category of structures are not subjected to any counting.

Correct : C. solid.

67. Certain itemsets in the dashed circle whose support count reaches the support value during an iteration move into the ______.

Correct : A. dashed box.

68. Certain itemsets enter afresh into the system and get into the _______, which are essentially the supersets of the itemsets that move from the dashed circle to the dashed box.

Correct : D. dashed circle.

69. The itemsets that have completed one full pass move from the dashed circle to the ________.

Correct : B. solid circle.

70. The FP-growth algorithm has ________ phases.

Correct : B. two.

71. A frequent pattern tree is a tree structure consisting of ________.

Correct : D. both a & b.

72. The non-root node of item-prefix-tree consists of ________ fields.

Correct : B. three.

73. The frequent-item-header-table consists of __________ fields.

Correct : B. two.

74. The paths from root node to the nodes labelled 'a' are called __________.

Correct : D. prefix subpath.

75. The transformed prefix paths of a node 'a' form a truncated database of patterns that co-occur with 'a'. This database is called the _______.

Correct : C. conditional pattern base.
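
To make the idea of a conditional pattern base concrete, the sketch below derives one directly from the transactions instead of building the FP-tree first. This shortcut is an assumption made for brevity, as are the toy data, the tie-breaking by item name, and the function name conditional_pattern_base. Each transaction is reduced to its frequent items, ordered by descending frequency, and the items that precede the target item form one prefix path.

```python
from collections import Counter


def conditional_pattern_base(transactions, item, min_count):
    """Map each prefix path that leads to item onto its occurrence count."""
    freq = Counter(i for t in transactions for i in t)
    keep = {i for i, n in freq.items() if n >= min_count}

    def order(i):
        # Descending frequency, ties broken alphabetically (an assumption).
        return (-freq[i], i)

    base = Counter()
    for t in transactions:
        ordered = sorted((i for i in t if i in keep), key=order)
        if item in ordered:
            prefix = tuple(ordered[: ordered.index(item)])
            if prefix:  # an empty prefix contributes no pattern
                base[prefix] += 1
    return dict(base)


# Hypothetical toy database.
TRANSACTIONS = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "milk"},
    {"jam"},
]
MILK_BASE = conditional_pattern_base(TRANSACTIONS, "milk", min_count=2)
JAM_BASE = conditional_pattern_base(TRANSACTIONS, "jam", min_count=2)
print(MILK_BASE, JAM_BASE)
```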

76. The goal of _____ is to discover both the dense and sparse regions of a data set.

Correct : C. clustering.

77. Which of the following is a clustering algorithm?

Correct : B. clara.

78. _______ clustering techniques start with as many clusters as there are records, with each cluster having only one record.

Correct : A. agglomerative.

79. __________ clustering techniques start with all records in one cluster and then try to split that cluster into small pieces.

Correct : B. divisive.

80. Which of the following is a data set in the popular UCI machine-learning repository?

Correct : D. mushroom.

81. In ________ algorithm each cluster is represented by the center of gravity of the cluster.

Correct : B. k-means.

82. In ___________ each cluster is represented by one of the objects of the cluster located near the center.

Correct : A. k-medoid.
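
The difference between the two cluster representatives can be seen on a small example. The points and helper names below are hypothetical. The centroid is the coordinate-wise mean and need not be an actual record, while the medoid is the member with the smallest total distance to the other members.

```python
import math


def centroid(points):
    """Center of gravity: the coordinate-wise mean of the points."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def medoid(points):
    """The member point with the smallest total distance to all members."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


# Hypothetical cluster with one far-away record.
CLUSTER = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (15.0, 15.0)]
CENTROID = centroid(CLUSTER)
MEDOID = medoid(CLUSTER)
print(CENTROID, MEDOID)
```

The far-away record pulls the centroid to a location that is not in the data, whereas the medoid stays on an existing record near the bulk of the cluster.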

83. Pick out a k-medoid algorithm.

Correct : C. pam.

84. Pick out a hierarchical clustering algorithm.

Correct : B. birch.

85. CLARANS stands for _______.

Correct : C. clustering large applications based on randomized search.

86. BIRCH is a ________.

Correct : C. hierarchical-agglomerative algorithm.

87. The cluster features of different subclusters are maintained in a tree called ___________.

Correct : A. cf tree.

88. The ________ algorithm is based on the observation that the frequent sets are normally very few in number compared to the set of all itemsets.

Correct : D. partition.

89. The partition algorithm uses _______ scans of the databases to discover all frequent sets.

Correct : A. two.
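
The two-scan idea behind the partition algorithm can be sketched as below. This is an illustration under assumptions: the local mining step is brute force, the minimum support is given as a fraction, and the toy data and function names are hypothetical. The first scan finds itemsets that are frequent inside at least one partition, and the second scan counts only those candidates over the whole database.

```python
from itertools import combinations


def local_frequent(partition, min_fraction):
    """Brute-force frequent itemsets of one partition (toy-sized data only)."""
    items = sorted(set().union(*partition))
    threshold = min_fraction * len(partition)
    found = set()
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if sum(1 for t in partition if candidate <= t) >= threshold:
                found.add(candidate)
    return found


def partition_algorithm(transactions, min_fraction, n_parts):
    """Return {frequent itemset: global support count} using two scans."""
    size = -(-len(transactions) // n_parts)  # ceiling division
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    # Scan 1: any globally frequent set is locally frequent somewhere.
    candidates = set().union(*(local_frequent(p, min_fraction) for p in parts))
    # Scan 2: count the candidates against the whole database.
    threshold = min_fraction * len(transactions)
    counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    return {c: n for c, n in counts.items() if n >= threshold}


# Hypothetical toy database.
TRANSACTIONS = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "milk"},
    {"jam"},
]
RESULT = partition_algorithm(TRANSACTIONS, min_fraction=0.5, n_parts=2)
print(sorted(map(sorted, RESULT)))
```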

90. The basic idea of the Apriori algorithm is to generate ________ item sets of a particular size and then scan the database.

Correct : A. candidate.

91. An algorithm called ________ is used to generate the candidate item sets for each pass after the first.

Correct : B. apriori-gen.

92. The basic partition algorithm reduces the number of database scans to ________ and divides the database into partitions.

Correct : B. two.

93. ___________ and prediction may be viewed as types of classification.

Correct : C. estimation.

94. ___________ can be thought of as classifying an attribute value into one of a set of possible classes.

Correct : B. prediction.

95. Prediction can be viewed as forecasting a _________ value.

Correct : C. continuous.

96. _________ data consists of sample input data as well as the classification assignment for the data.

Correct : D. training.

97. Rule-based classification algorithms generate ______ rules to perform the classification.

Correct : A. if-then.

98. ____________ are a different paradigm for computing which draws its inspiration from neuroscience.

Correct : B. neural networks.

99. The human brain consists of a network of ___________.

Correct : A. neurons.

100. Each neuron is made up of a number of nerve fibres called _____________.

Correct : D. dendrites.