Quiznetik

Data Mining and Business Intelligence | Set 2

1. ________of data means that the attributes within a given entity are fully dependent on the entire primary key of the entity.

Correct : C. Functional dependency

2. A fact is said to be fully additive if_________.

Correct : A. It is additive over every dimension of its dimensionality

3. A fact is said to be partially additive if_______.

Correct : B. Additive over at least one but not all of the dimensions

4. A fact is said to be non-additive if_______.

Correct : C. Not additive over any dimension

5. Non-additive measures can often be combined with additive measures to create new_________.

Correct : A. Additive measures

6. A fact representing cumulative sales units over a day at a store for a product is a_________.

Correct : B. Fully additive fact
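
Note on questions 2 to 6: the additivity distinction can be sketched with a tiny made-up fact table. The rows, column order, and values below are illustrative assumptions, not part of the quiz.

```python
# Hypothetical fact rows: (day, store, units_sold, unit_price, stock_on_hand)
rows = [
    ("Mon", "S1", 5, 2.0, 100),
    ("Mon", "S2", 3, 2.5, 80),
    ("Tue", "S1", 4, 2.0, 96),
    ("Tue", "S2", 6, 2.5, 74),
]

# units_sold is fully additive: summing over days and stores is meaningful.
total_units = sum(r[2] for r in rows)

# stock_on_hand is partially additive: it can be summed over stores for one
# day, but summing one store's stock over several days has no meaning.
stock_monday = sum(r[4] for r in rows if r[0] == "Mon")

# unit_price is non-additive, but combined with the additive units_sold it
# yields revenue, which is additive again.
total_revenue = sum(r[2] * r[3] for r in rows)
```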

7. Which of the following is the other name of Data mining?

Correct : D. All of the above

8. Which of the following is a predictive model?

Correct : B. Regression

9. Which of the following is a descriptive model?

Correct : C. Sequence discovery

10. A_________model identifies patterns or relationships.

Correct : A. Descriptive

11. A predictive model makes use of______.

Correct : B. Historical data.

12. ______ maps data into predefined groups.

Correct : D. Classification

13. _____ is used to map a data item to a real valued prediction variable.

Correct : B. Time series analysis

14. In _____ , the value of an attribute is examined as it varies over time.

Correct : B. Time series analysis

15. In ______ the groups are not predefined.

Correct : C. Clustering

16. Link analysis is otherwise known as ____.

Correct : C. Both A & B

17. ______ is the input to KDD.

Correct : A. Data

18. The output of KDD is ______.

Correct : D. Useful information

19. The KDD process consists of ____steps.

Correct : C. Five

20. Treating incorrect or missing data is called________.

Correct : B. Preprocessing

21. Converting data from different sources into a common format for processing is called____.

Correct : C. Transformation

22. Various visualization techniques are used in_________step of KDD.

Correct : D. Interpretation

23. Extreme values that occur infrequently are called___________.

Correct : A. Outliers

24. Box plot and scatter diagram techniques are_________.

Correct : B. Geometric
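
Note on questions 23 and 24: a box plot flags outliers visually. One common convention (an assumption here, the quiz does not name a rule) marks values more than 1.5 times the interquartile range beyond the quartiles. A minimal sketch with made-up data:

```python
import statistics

# Made-up sample with one extreme value.
data = [10, 12, 11, 13, 12, 11, 100]

# Quartiles as drawn by a box plot (default "exclusive" method).
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Whisker limits under the 1.5 * IQR convention (assumed, not from the quiz).
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low or x > high]
```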

25. _____ is used to proceed from very specific knowledge to more general information.

Correct : A. Induction

26. Describing some characteristics of a set of data by a general model is viewed as___________.

Correct : B. Compression

27. ______ helps to uncover hidden information about the data.

Correct : C. Approximation

28. ______ are needed to identify training data and desired results.

Correct : C. Users

29. Overfitting occurs when a model_________.

Correct : B. Does not fit in future states

30. The curse of dimensionality problem involves___________.

Correct : D. All of the above

31. Incorrect or invalid data is known as _______.

Correct : B. Noisy data

32. ROI is an acronym of _______.

Correct : A. Return on Investment

33. The ______of data could result in the disclosure of information that is deemed to be confidential.

Correct : B. Unauthorized use

34. _________data are noisy and have many missing attribute values.

Correct : D. D Tr

35. The rise of DBMS occurred in early _______.

Correct : C. 1970's

36. SQL stands for_________.

Correct : B. Structured Query Language

37. Which of the following is not a data mining metric?

Correct : D. All of the above

38. Reducing the number of attributes to solve the high dimensionality problem is called_____________.

Correct : B. Dimensionality reduction

39. Data that are not of interest to the data mining task are called _____.

Correct : C. Irrelevant data

40. _________are effective tools to attack the scalability problem.

Correct : C. Both A & B

41. Market-basket problem was formulated by____________.

Correct : A. Agrawal et al

42. Data mining helps in________.

Correct : D. All of the above

43. The proportion of transactions supporting X in T is called_____________.

Correct : B. Support

44. The absolute number of transactions supporting X in T is called _______.

Correct : C. Support count

45. The value that says that transactions in D that support X also support Y is called__________.

Correct : A. Confidence

46. If T consists of 500,000 transactions, of which 20,000 contain bread, 30,000 contain jam, and 10,000 contain both bread and jam, then the support of bread and jam is_________.

Correct : A. 2%

47. If T consists of 500,000 transactions, of which 20,000 contain bread, 30,000 contain jam, and 10,000 contain both bread and jam, then the confidence of buying bread with jam is____________.

Correct : D. 50%
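
Note on questions 46 and 47: the arithmetic behind both answers, worked through with the figures from the question. The variable names are illustrative. The 50% answer corresponds to reading the rule as bread => jam, so the divisor is the number of bread transactions.

```python
total = 500_000   # transactions in T
bread = 20_000    # transactions containing bread
jam = 30_000      # transactions containing jam
both = 10_000     # transactions containing bread and jam

# Support of {bread, jam}: share of all transactions that contain both.
support_both = both / total

# Confidence of bread => jam: share of bread transactions that also contain jam.
confidence_bread_to_jam = both / bread

# For comparison, the reverse rule jam => bread divides by the jam count.
confidence_jam_to_bread = both / jam
```

Here support_both is 2% and confidence_bread_to_jam is 50%, matching the answer keys. The reverse rule gives about 33%, so the direction of the rule matters.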

48. The left hand side of an association rule is called________.

Correct : C. Antecedent

49. The right hand side of an association rule is called__________.

Correct : A. Consequent

50. Which of the following is not a desirable feature of any efficient algorithm?

Correct : D. To have maximal code length

51. All sets of items whose support is greater than the user-specified minimum support are called_____________

Correct : B. Frequent set

52. If a set is a frequent set and no superset of this set is a frequent set, then it is called____________

Correct : A. Maximal frequent set

53. Any subset of a frequent set is a frequent set. This is_________

Correct : B. Downward closure property

54. Any superset of an infrequent set is an infrequent set. This is ___________

Correct : C. Upward closure property
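
Note on questions 51 to 54: the closure properties can be checked by brute force on a toy database. The transactions and threshold are made up, and "frequent" is taken here as a support count of at least min_count, which is an assumption (some texts use a strict inequality).

```python
from itertools import combinations

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
min_count = 3  # assumed minimum support count

def support_count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def is_frequent(itemset):
    return support_count(itemset) >= min_count

# Downward closure: every non-empty subset of a frequent set is frequent.
frequent_set = ("a", "b")
subsets_frequent = all(
    is_frequent(s)
    for r in range(1, len(frequent_set) + 1)
    for s in combinations(frequent_set, r)
)

# {a, b, c} is infrequent here, so by upward closure any superset of it
# would be infrequent as well.
abc_frequent = is_frequent(("a", "b", "c"))
```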

55. If an itemset is not a frequent set but all of its proper subsets are frequent sets, then it is a________

Correct : B. Border set

56. The Apriori algorithm is otherwise called the_________

Correct : B. Level-wise algorithm

57. The Apriori algorithm is a____________

Correct : D. Bottom-up search

58. The first phase of the Apriori algorithm is___________

Correct : A. Candidate generation

59. The second phase of the Apriori algorithm is____________

Correct : C. Pruning

60. The ________ step eliminates the extensions of (k-1)-itemsets that are not found to be frequent from being considered for support counting.

Correct : B. Pruning

61. The Apriori frequent itemset discovery algorithm moves ________ in the lattice

Correct : A. Upward

62. After the pruning step of the Apriori algorithm, __________ will remain

Correct : B. No candidate set

63. The number of iterations in Apriori ________

Correct : A. Increases with the size of the maximum frequent set
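
Note on questions 56 to 63: a minimal level-wise sketch of the candidate generation, pruning, and counting cycle. It is a simplified illustration under stated assumptions (toy data, a support-count threshold, a naive join), not a reference implementation of Apriori.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return (frequent itemsets with their counts, number of database passes)."""
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent, passes, k = {}, 0, 1
    while candidates:
        passes += 1
        # Counting: one scan of the database per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Candidate generation: join frequent k-itemsets into (k+1)-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Pruning: drop any candidate that has an infrequent k-subset.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k))]
        k += 1
    return frequent, passes

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent, passes = apriori(transactions, min_count=3)
```

The search moves upward from 1-itemsets, and the number of passes grows with the size of the largest frequent set, since each level needs its own scan.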

64. MFCS is the acronym of____________

Correct : C. Maximal Frequent Candidate Set

65. Dynamic Itemset Counting Algorithm was proposed by

Correct : A. Brin et al

66. Itemsets in the ________ category of structures have a counter and a stop number with them

Correct : A. Dashed

67. The itemsets in the_________category structures are not subjected to any counting

Correct : C. Solid

68. Certain itemsets in the dashed circle whose support count reach support value during an iteration move into the______________

Correct : A. Dashed box

69. Certain itemsets enter afresh into the system and get into the ________, which are essentially the supersets of the itemsets that move from the dashed circle to the dashed box

Correct : D. Dashed circle

70. The itemsets that have completed one full pass move from the dashed circle to the________

Correct : B. Solid circle

71. The FP-growth algorithm has ________ phases

Correct : B. Two

72. A frequent pattern tree is a tree structure consisting of ________

Correct : D. Both A & B

73. The non-root node of the item-prefix-tree consists of ________ fields

Correct : B. Three

74. The frequent-item-header-table consists of ________ fields

Correct : B. Two.
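
Note on questions 72 to 74: a sketch of the two record types. The three node fields (item name, count, node-link) and the two header fields (item name, head of node-link) follow the usual FP-tree description. The children map and the insert helper are bookkeeping added here for illustration and are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class FPNode:
    item_name: str                        # field 1
    count: int = 0                        # field 2
    node_link: Optional["FPNode"] = None  # field 3: next node with the same item
    children: Dict[str, "FPNode"] = field(default_factory=dict)  # traversal helper

@dataclass
class HeaderEntry:
    item_name: str                        # field 1
    head: Optional[FPNode] = None         # field 2: head of the node-link chain

def insert(root, header, items):
    """Insert one (already ordered) transaction into the tree."""
    node = root
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = FPNode(item)
            node.children[item] = child
            entry = header.setdefault(item, HeaderEntry(item))
            child.node_link = entry.head  # push onto the item's chain
            entry.head = child
        child.count += 1
        node = child

root, header = FPNode("null"), {}
insert(root, header, ["a", "b"])
insert(root, header, ["a", "c"])
```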

75. The paths from root node to the nodes labelled 'a' are called_________

Correct : D. Prefix subpath

76. The transformed prefix paths of a node 'a' form a truncated database of patterns that co-occur with 'a'; this is called the________

Correct : C. Conditional pattern base

77. The goal of________is to discover both the dense and sparse regions of a data set

Correct : C. Clustering

78. Which of the following is a clustering algorithm?

Correct : B. CLARA

79. ________ clustering techniques start with as many clusters as there are records, with each cluster having only one record

Correct : A. Agglomerative

80. ________ clustering techniques start with all records in one cluster and then try to split it

Correct : B. Divisive.
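
Note on questions 79 and 80: a minimal sketch of the agglomerative direction on one-dimensional points. The single-link distance and the stopping rule (stop at k clusters) are assumptions chosen for brevity.

```python
def agglomerative(points, k):
    """Merge the two closest clusters until only k clusters remain."""
    clusters = [[p] for p in points]  # start: one record per cluster
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single link: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

result = agglomerative([1, 2, 9, 10, 25], k=2)
```

A divisive method would run the other way: start from one cluster holding all five points and split repeatedly.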

81. Which of the following is a data set in the popular UCI machine-learning repository?

Correct : D. MUSHROOM

82. In the ________ algorithm, each cluster is represented by the center of gravity of the cluster

Correct : B. K-means

83. In ________, each cluster is represented by one of the objects of the cluster located near the center

Correct : A. K-medoid
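
Note on questions 82 and 83: the difference between the two representatives, shown for a single made-up cluster. Euclidean distance is an assumption here.

```python
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.0), (10.0, 10.0)]

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

# K-means style representative: the center of gravity, which need not be
# one of the actual objects.
centroid = (
    sum(p[0] for p in points) / len(points),
    sum(p[1] for p in points) / len(points),
)

# K-medoid style representative: the actual object with the smallest total
# distance to all other objects in the cluster.
medoid = min(points, key=lambda c: sum(dist(c, p) for p in points))
```

The far-away point (10, 10) pulls the centroid toward it, while the medoid stays on a real object near the dense part of the cluster.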

84. Pick out a k-medoid algorithm

Correct : C. PAM

85. Pick out a hierarchical clustering algorithm

Correct : D. BIRCH

86. CLARANS stands for

Correct : C. Clustering Large Applications based on Randomized Search

87. BIRCH is a________

Correct : C. Hierarchical-agglomerative algorithm

88. The cluster features of different subclusters are maintained in a tree called_________

Correct : A. CF tree
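
Note on question 88: a cluster feature in BIRCH is commonly described as the triple (N, LS, SS): the number of points, their linear sum, and their squared sum. The sketch below, with made-up points and helper names, shows why such summaries suit a tree: two subclusters can be merged by adding their features, without revisiting the points.

```python
def cluster_feature(points):
    """Return (N, LS, SS) for a non-empty list of equal-length tuples."""
    n = len(points)
    ls = tuple(sum(p[d] for p in points) for d in range(len(points[0])))
    ss = sum(x * x for p in points for x in p)
    return n, ls, ss

def merge(cf_a, cf_b):
    """Cluster features are additive: merge by component-wise addition."""
    return (
        cf_a[0] + cf_b[0],
        tuple(x + y for x, y in zip(cf_a[1], cf_b[1])),
        cf_a[2] + cf_b[2],
    )

sub_a = [(1, 2), (3, 4)]
sub_b = [(5, 6)]
merged = merge(cluster_feature(sub_a), cluster_feature(sub_b))
```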

89. The_______algorithm is based on the observation that the frequent sets are normally very few in number compared to the set of all itemsets

Correct : D. Partition

90. The partition algorithm uses ________ scans of the database to discover all frequent sets

Correct : A. Two
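
Note on questions 89 and 90: a sketch of the two-scan idea. Scan 1 finds the itemsets that are frequent inside each partition, and scan 2 counts their union against the whole database. The brute-force local mining, the toy data, and the relative threshold are assumptions made to keep the example short.

```python
from itertools import combinations

def local_frequent(part, min_frac):
    """All itemsets frequent within one partition (brute force)."""
    items = sorted({i for t in part for i in t})
    found = set()
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            s = frozenset(combo)
            if sum(1 for t in part if s <= t) / len(part) >= min_frac:
                found.add(s)
    return found

def partition_mine(transactions, min_frac, n_parts):
    size = -(-len(transactions) // n_parts)  # ceiling division
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    # Scan 1: locally frequent sets form the global candidate set.
    candidates = set().union(*(local_frequent(p, min_frac) for p in parts))
    # Scan 2: count every candidate once against the full database.
    return {c for c in candidates
            if sum(1 for t in transactions if c <= t) / len(transactions) >= min_frac}

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = partition_mine(transactions, min_frac=0.5, n_parts=2)
```

In this run {a, b, c} is frequent inside the second partition but fails the global count in scan 2, which is why the second scan is needed.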

91. The basic idea of the Apriori algorithm is to generate_____item sets of a particular size and then scan the database

Correct : A. Candidate

92. ________ is the most well-known association rule algorithm and is used in most commercial products

Correct : A. Apriori algorithm

93. An algorithm called________is used to generate the candidate item sets for each pass after the first

Correct : B. Apriori-gen

94. The basic partition algorithm reduces the number of database scans to __________ and divides the database into partitions

Correct : B. Two

95. ________ and prediction may be viewed as types of classification

Correct : C. Estimation.

96. ________ can be thought of as classifying an attribute value into one of a set of possible classes

Correct : B. Prediction.

97. Prediction can be viewed as forecasting a ________ value

Correct : C. Continuous.

98. ________ data consists of sample input data as well as the classification assignment for the data

Correct : B. Measuring.

99. Rule based classification algorithms generate_________rule to perform the classification

Correct : A. If-then
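
Note on question 99: a minimal sketch of how a list of if-then rules classifies a record. The attributes, thresholds, and class labels are hypothetical and hand-written here. A real rule-based algorithm would induce the rules from training data.

```python
# Each rule is (condition, class label); the first matching rule wins.
rules = [
    (lambda r: r["income"] >= 50_000 and r["debt"] < 10_000, "approve"),
    (lambda r: r["income"] < 20_000, "reject"),
]

def classify(record, default="review"):
    """IF a rule's condition holds THEN return its label, else the default."""
    for condition, label in rules:
        if condition(record):
            return label
    return default

first = classify({"income": 60_000, "debt": 5_000})
second = classify({"income": 15_000, "debt": 0})
third = classify({"income": 30_000, "debt": 20_000})
```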

100. ________ are a different paradigm for computing which draws its inspiration from neuroscience

Correct : B. Neural networks