Quiznetik

Data Mining and Business Intelligence | Set 2

1. ________of data means that the attributes within a given entity are fully dependent on the entire primary key of the entity.

A. Additivity

B. Granularity

C. Functional dependency

D. Dimensionality

Correct : C. Functional dependency

2. A fact is said to be fully additive if_________.

A. It is additive over every dimension of its dimensionality

B. Additive over at least one but not all of the dimensions

C. Not additive over any dimension

D. None of the above

Correct : A. It is additive over every dimension of its dimensionality

3. A fact is said to be partially additive if_______.

A. It is additive over every dimension of its dimensionality

B. Additive over at least one but not all of the dimensions

C. Not additive over any dimension

D. None of the above

Correct : B. Additive over at least one but not all of the dimensions

4. A fact is said to be non-additive if_______.

A. It is additive over every dimension of its dimensionality

B. Additive over at least one but not all of the dimensions

C. Not additive over any dimension

D. None of the above

Correct : C. Not additive over any dimension
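The three kinds of additivity in questions 2 to 4 can be illustrated with a small sketch. The fact table, its columns (units sold, stock on hand, unit price) and all values below are assumptions made up for illustration, not data from the quiz source.

```python
# Toy fact table: (day, store, units_sold, stock_on_hand, unit_price).
# All rows are hypothetical illustration data.
facts = [
    ("mon", "s1", 10, 100, 2.0),
    ("mon", "s2", 5, 50, 2.5),
    ("tue", "s1", 8, 92, 2.0),
    ("tue", "s2", 7, 43, 2.5),
]


def total(rows, column):
    """Sum one numeric column (given by index) over the rows."""
    return sum(row[column] for row in rows)


# Fully additive: units sold can be summed over days and over stores.
units_all = total(facts, 2)

# Partially additive: stock on hand can be summed across stores for one day,
# but summing it across days would count the same stock more than once.
stock_monday = total([r for r in facts if r[0] == "mon"], 3)

# Non-additive: unit prices cannot be meaningfully summed over any dimension.
# Combining price with the additive units measure gives revenue, which is additive.
revenue = sum(r[2] * r[4] for r in facts)
avg_price = revenue / units_all

print(units_all, stock_monday, revenue, avg_price)
```

Revenue here is also an instance of question 5: a non-additive measure (price) combined with an additive one (units) yields a new additive measure.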

5. Non-additive measures can often be combined with additive measures to create new_________.

A. Additive measures

B. Non-additive measures

C. Partially additive

D. All of the above

Correct : A. Additive measures

6. A fact representing cumulative sales units over a day at a store for a product is a_________.

A. Additive fact

B. Fully additive fact

C. Partially additive fact

D. Non-additive fact

Correct : B. Fully additive fact

7. Which of the following is the other name of Data mining?

A. Exploratory data analysis

B. Data driven discovery

C. Deductive learning

D. All of the above

Correct : D. All of the above

8. Which of the following is a predictive model?

A. Clustering

B. Regression

C. Summarization

D. Association rules

Correct : B. Regression

9. Which of the following is a descriptive model?

A. Classification

B. Regression

C. Sequence discovery

D. Association rules

Correct : C. Sequence discovery

10. A_________model identifies patterns or relationships.

A. Descriptive

B. Predictive

C. Regression

D. Time series analysis

Correct : A. Descriptive

11. A predictive model makes use of______.

A. Current data.

B. Historical data.

C. Both current and historical data.

D. Assumptions

Correct : B. Historical data.

12. ______ maps data into predefined groups.

A. Regression

B. Time series analysis

C. Prediction

D. Classification

Correct : D. Classification

13. _____ is used to map a data item to a real valued prediction variable.

A. Regression

B. Time series analysis

C. Prediction

D. Classification

Correct : A. Regression

14. In _____ , the value of an attribute is examined as it varies over time.

A. Regression

B. Time series analysis

C. Sequence discovery

D. Prediction

Correct : B. Time series analysis

15. In ______ the groups are not predefined.

A. Association rules

B. Summarization

C. Clustering

D. Prediction

Correct : C. Clustering

16. Link Analysis is otherwise called as ____.

A. Affinity analysis

B. Association rules

C. Both A & B

D. Prediction

Correct : C. Both A & B

17. ______ is the input to KDD.

A. Data

B. Information

C. Query

D. Process

Correct : A. Data

18. The output of KDD is ______.

A. Data

B. Information

C. Query

D. Useful information

Correct : D. Useful information

19. The KDD process consists of ____steps.

A. Three

B. Four

C. Five

D. Six

Correct : C. Five

20. Treating incorrect or missing data is called as________.

A. Selection

B. Preprocessing

C. Transformation

D. Interpretation

Correct : B. Preprocessing

21. Converting data from different sources into a common format for processing is called as____ .

A. Selection

B. Preprocessing

C. Transformation

D. Interpretation

Correct : C. Transformation

22. Various visualization techniques are used in_________step of KDD.

A. Selection

B. Transformation

C. Data mining

D. Interpretation

Correct : D. Interpretation

23. Extreme values that occur infrequently are called as___________.

A. Outliers

B. Rare values

C. Dimensionality reduction

D. All of the above

Correct : A. Outliers

24. Box plot and scatter diagram techniques are_________.

A. Graphical

B. Geometric

C. Icon-based

D. Pixel-based

Correct : B. Geometric

25. _____ is used to proceed from very specific knowledge to more general information.

A. Induction

B. Compression

C. Approximation

D. Substitution

Correct : A. Induction

26. Describing some characteristics of a set of data by a general model is viewed as___________.

A. Induction.

B. Compression

C. Approximation

D. Summarization

Correct : B. Compression

27. ______ helps to uncover hidden information about the data.

A. Induction

B. Compression

C. Approximation

D. Summarization

Correct : C. Approximation

28. ______ are needed to identify training data and desired results.

A. Programmers

B. Designers

C. Users

D. Administrators

Correct : C. Users

29. Over fitting occurs when a model_________.

A. Does fit in future states

B. Does not fit in future states

C. Does fit in current state

D. Does not fit in current state

Correct : B. Does not fit in future states

30. The problem of dimensionality curse involves___________.

A. The use of some attributes may interfere with the correct completion of a data mining task.

B. The use of some attributes may simply increase the overall complexity.

C. Some may decrease the efficiency of the algorithm.

D. All of the above

Correct : D. All of the above

31. Incorrect or invalid data is known as _______.

A. Changing data

B. Noisy data

C. Outliers

D. Missing data

Correct : B. Noisy data

32. ROI is an acronym of _______.

A. Return on Investment

B. Return on Information

C. Repetition of Information

D. Runtime of Instruction

Correct : A. Return on Investment

33. The ______of data could result in the disclosure of information that is deemed to be confidential.

A. Authorized use

B. Unauthorized use

C. Authenticated use

D. Unauthenticated use

Correct : B. Unauthorized use

34. _________data are noisy and have many missing attribute values.

A. Preprocessed

B. Cleaned

C. Real-world

D. Tr

Correct : C. Real-world

35. The rise of DBMS occurred in early _______.

A. 1950's

B. 1960's

C. 1970's

D. 1980's

Correct : C. 1970's

36. SQL stands for_________.

A. Standard Query Language

B. Structured Query Language

C. Standard Quick List.

D. Structured Query list

Correct : B. Structured Query Language

37. Which of the following is not a data mining metric?

A. Space complexity

B. Time complexity

C. ROI

D. All of the above

Correct : D. All of the above

38. Reducing the number of attributes to solve the high dimensionality problem is called as_____________.

A. Dimensionality curse

B. Dimensionality reduction

C. Cleaning

D. Over fitting

Correct : B. Dimensionality reduction

39. Data that are not of interest to the data mining task is called as _____.

A. Missing data

B. Changing data

C. Irrelevant data

D. Noisy data

Correct : C. Irrelevant data

40. _________are effective tools to attack the scalability problem.

A. Sampling

B. Parallelization

C. Both A & B

D. None of the above

Correct : C. Both A & B

41. Market-basket problem was formulated by____________.

A. Agrawal et al

B. Steve et al

C. Toda et al

D. Simon et al

Correct : A. Agrawal et al

42. Data mining helps in________.

A. Inventory management

B. Sales promotion strategies

C. Marketing strategies

D. All of the above

Correct : D. All of the above

43. The proportion of transactions supporting X in T is called_____________.

A. Confidence

B. Support

C. Support count

D. All of the above

Correct : B. Support

44. The absolute number of transactions supporting X in T is called _______.

A. Confidence

B. Support

C. Support count

D. None of the above

Correct : C. Support count

45. The value that says that transactions in D that support X also support Y is called__________.

A. Confidence

B. Support

C. Support count

D. None of the above

Correct : A. Confidence

46. If T consists of 500000 transactions, of which 20000 contain bread, 30000 contain jam, and 10000 contain both bread and jam, then the support of bread and jam is_________.

A. 2%

B. 20%

C. 3%

D. 30%

Correct : A. 2%

47. If T consists of 500000 transactions, of which 20000 contain bread, 30000 contain jam, and 10000 contain both bread and jam, then the confidence of buying bread with jam (rule bread → jam) is____________.

A. 33.33%

B. 66.66%

C. 45%

D. 50%

Correct : D. 50%
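The arithmetic behind questions 46 and 47 can be checked with a short sketch. The counts come from the questions; the variable names are illustrative, and reading question 47 as the rule bread → jam is an assumption chosen because it matches the stated answer.

```python
# Counts taken from questions 46 and 47.
total_transactions = 500_000
bread = 20_000
jam = 30_000
bread_and_jam = 10_000

# Support: proportion of all transactions that contain both items.
support = bread_and_jam / total_transactions

# Confidence of bread -> jam: of the transactions with bread, the share with jam.
conf_bread_to_jam = bread_and_jam / bread

# Confidence of the reverse rule, shown for contrast (assumed reading: jam -> bread).
conf_jam_to_bread = bread_and_jam / jam

print(f"support = {support:.2%}")
print(f"confidence(bread -> jam) = {conf_bread_to_jam:.2%}")
print(f"confidence(jam -> bread) = {conf_jam_to_bread:.2%}")
```

Support is symmetric in the two items, while confidence depends on which item is the antecedent, which is why options A (33.33%) and D (50%) both appear in question 47.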

48. The left hand side of an association rule is called________.

A. Consequent

B. Onset

C. Antecedent

D. Precedent

Correct : C. Antecedent

49. The right hand side of an association rule is called__________.

A. Consequent

B. Onset

C. Antecedent

D. Precedent

Correct : A. Consequent

50. Which of the following is not a desirable feature of any efficient algorithm?

A. To reduce number of input operation

B. To reduce number of output operations

C. To be efficient in computing

D. To have maximal code length

Correct : D. To have maximal code length

51. All sets of items whose support is greater than the user-specified minimum support are called_____________

A. Border set

B. Frequent set

C. Maximal frequent set

D. Lattice

Correct : B. Frequent set

52. If a set is a frequent set and no superset of this set is a frequent set, then it is called____________

A. Maximal frequent set

B. Border set

C. Lattice

D. Infrequent sets

Correct : A. Maximal frequent set

53. Any subset of a frequent set is a frequent set. This is_________

A. Upward closure property

B. Downward closure property

C. Maximal frequent set

D. Border set

Correct : B. Downward closure property

54. Any superset of an infrequent set is an infrequent set. This is ___________

A. Maximal frequent set

B. Border set

C. Upward closure property

D. Downward closure property

Correct : C. Upward closure property
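Frequent sets, maximal frequent sets and the downward closure property (questions 51 to 54) can be illustrated by brute force on a toy database. The transactions and the threshold below are made-up assumptions, and the sketch uses "at least" the minimum support count as its threshold.

```python
from itertools import combinations

# Hypothetical transaction database and threshold.
transactions = [
    {"bread", "jam"},
    {"bread", "jam", "milk"},
    {"bread", "milk"},
    {"jam"},
]
min_support_count = 2


def support_count(itemset, db):
    """Absolute number of transactions that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)


items = sorted(set().union(*transactions))
all_itemsets = [
    frozenset(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
]

# Frequent sets: itemsets meeting the minimum support count.
frequent = {s for s in all_itemsets if support_count(s, transactions) >= min_support_count}

# Maximal frequent sets: frequent sets with no frequent proper superset.
maximal = {s for s in frequent if not any(s < other for other in frequent)}

# Downward closure: every non-empty proper subset of a frequent set is frequent.
closure_holds = all(
    frozenset(sub) in frequent
    for s in frequent
    for k in range(1, len(s))
    for sub in combinations(s, k)
)

print(sorted(sorted(s) for s in frequent))
print(sorted(sorted(s) for s in maximal))
print(closure_holds)
```

Enumerating every itemset is exponential in the number of items, so this is only a way to check the definitions on small examples.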

55. If an itemset is not a frequent set but all its proper subsets are frequent sets, then it is called a_________

A. Maximal frequent set

B. Border set

C. Upward closure property

D. Downward closure property

Correct : B. Border set

56. A priori algorithm is otherwise called as_________

A. Width-wise algorithm

B. Level-wise algorithm

C. Pincer-search algorithm

D. FP growth algorithm

Correct : B. Level-wise algorithm

57. The A Priori algorithm is a____________

A. Top-down search

B. Breadth first search

C. Depth first search

D. Bottom-up search

Correct : D. Bottom-up search

58. The first phase of A Priori algorithm is___________

A. Candidate generation

B. Itemset generation

C. Pruning

D. Partitioning

Correct : A. Candidate generation

59. The second phase of A Priori algorithm is____________

A. Candidate generation

B. Itemset generation

C. Pruning

D. Partitioning

Correct : C. Pruning

60. The ________ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support.

A. Candidate generation

B. Pruning

C. Partitioning

D. Itemset eliminations

Correct : B. Pruning
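The two phases named in questions 58 to 60 can be sketched as follows. This is a simplified illustration, not the exact apriori-gen procedure: it joins any two frequent (k-1)-itemsets whose union has k items instead of joining on a shared prefix, and the function name and sample itemsets are assumptions.

```python
from itertools import combinations


def apriori_gen(frequent_prev):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets."""
    prev = set(frequent_prev)
    if not prev:
        return set()
    k = len(next(iter(prev))) + 1

    # Phase 1, candidate generation: join pairs whose union has exactly k items.
    candidates = {a | b for a in prev for b in prev if len(a | b) == k}

    # Phase 2, pruning: drop any candidate that has an infrequent (k-1)-subset,
    # so its support never needs to be counted.
    return {
        c
        for c in candidates
        if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
    }


# Hypothetical frequent 2-itemsets.
frequent_2 = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]}
candidates_3 = apriori_gen(frequent_2)
print(sorted(sorted(c) for c in candidates_3))
```

In this example the join produces {a, b, c}, {a, b, d} and {b, c, d}; pruning removes the last two because {a, d} and {c, d} are not frequent.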

61. The a priori frequent itemset discovery algorithm moves ________ in the lattice

A. Upward

B. Downward

C. Breadthwise

D. Both upward and downward

Correct : A. Upward

62. After the pruning of a priori algorithm,__________will remain

A. Only candidate set

B. No candidate set

C. Only border set

D. No border set

Correct : B. No candidate set

63. The number of iterations in a priori ________

A. Increases with the size of the maximum frequent set

B. Decreases with increase in size of the maximum frequent set

C. Increases with the size of the data

D. Decreases with the increase in size of the data

Correct : A. Increases with the size of the maximum frequent set

64. MFCS is the acronym of____________

A. Maximum Frequency Control Set

B. Minimal Frequency Control Set

C. Maximal Frequent Candidate Set

D. Minimal Frequent Candidate Set

Correct : C. Maximal Frequent Candidate Set

65. Dynamic Itemset Counting Algorithm was proposed by

A. Brin et al

B. Agrawal et al

C. Toda et al

D. Simon et al

Correct : A. Brin et al

66. Itemsets in the ________ category of structures have a counter and the stop number with them

A. Dashed

B. Circle

C. Box

D. Solid

Correct : A. Dashed

67. The itemsets in the_________category structures are not subjected to any counting

A. Dashed

B. Box

C. Solid

D. Circle

Correct : C. Solid

68. Certain itemsets in the dashed circle whose support count reaches the support value during an iteration move into the______________

A. Dashed box

B. Solid circle

C. Solid box

D. None of the above

Correct : A. Dashed box

69. Certain itemsets enter afresh into the system and get into the ________, which are essentially the supersets of the itemsets that move from the dashed circle to the dashed box

A. Dashed box

B. Solid circle

C. Solid box

D. Dashed circle

Correct : D. Dashed circle

70. The itemsets that have completed one full pass move from dashed circle to________

A. Dashed box

B. Solid circle

C. Solid box

D. None of the above

Correct : B. Solid circle

71. The FP-growth algorithm has ________ phases

A. One

B. Two

C. Three

D. Four

Correct : B. Two

72. A frequent pattern tree is a tree structure consisting of ________

A. An item-prefix-tree

B. A frequent-item-header table

C. A frequent-item-node

D. Both A & B

Correct : D. Both A & B

73. The non-root node of item-prefix-tree consists of ________ fields

A. Two

B. Three

C. Four

D. Five

Correct : B. Three

74. The frequent-item-header-table consists of ________ fields

A. Only one.

B. Two.

C. Three.

D. Four

Correct : B. Two.

75. The paths from root node to the nodes labelled 'a' are called_________

A. Transformed prefix path

B. Suffix subpath

C. Transformed suffix path

D. Prefix subpath

Correct : D. Prefix subpath

76. The transformed prefix paths of a node 'a' form a truncated database of patterns which co-occur with 'a'. This is called the________

A. Suffix path

B. FP-tree

C. Conditional pattern base

D. Prefix path

Correct : C. Conditional pattern base

77. The goal of________is to discover both the dense and sparse regions of a data set

A. Association rule

B. Classification

C. Clustering

D. Genetic Algorithm

Correct : C. Clustering

78. Which of the following is a clustering algorithm?

A. A priori

B. CLARA

C. Pincer-Search

D. FP-growth

Correct : B. CLARA

79. ________ clustering techniques start with as many clusters as there are records, with each cluster having only one record

A. Agglomerative

B. Divisive

C. Partition

D. Numeric

Correct : A. Agglomerative

80. ________ clustering techniques start with all records in one cluster and then try to split that cluster

A. Agglomerative.

B. Divisive.

C. Partition.

D. Numeric

Correct : B. Divisive.

81. Which of the following is a data set in the popular UCI machine-learning repository?

A. CLARA.

B. CACTUS.

C. STIRR.

D. MUSHROOM

Correct : D. MUSHROOM

82. In the ________ algorithm, each cluster is represented by the center of gravity of the cluster

A. K-medoid

B. K-means

C. Stirr

D. Rock

Correct : B. K-means

83. In ________, each cluster is represented by one of the objects of the cluster located near the center

A. K-medoid

B. K-means

C. Stirr

D. Rock

Correct : A. K-medoid
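The difference between the two cluster representatives in questions 82 and 83 can be shown on a single cluster. The points below are made-up, Euclidean distance is an assumed choice of metric, and the helper names are illustrative. `math.dist` needs Python 3.8 or later.

```python
import math

# Hypothetical cluster with one far-away point.
cluster = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (2.0, 2.0), (12.0, 5.0)]


def centroid(points):
    """Center of gravity: the coordinate-wise mean (k-means representative)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def medoid(points):
    """The cluster member with the smallest total distance to all members
    (k-medoid representative)."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))


center = centroid(cluster)
representative = medoid(cluster)
print(center, representative)
```

The centroid need not be an actual record and is pulled toward the distant point, while the medoid is always one of the objects in the cluster.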

84. Pick out a k-medoid algorithm

A. DBSCAN

B. BIRCH

C. PAM

D. CURE

Correct : C. PAM

85. Pick out a hierarchical clustering algorithm

A. DBSCAN

B. CURE

C. PAM

D. BIRCH

Correct : D. BIRCH

86. CLARANS stands for

A. CLARA Net Server

B. Clustering Large Application Range Network Search

C. Clustering Large Applications based on Randomized Search

D. Clustering Application Randomized Search

Correct : C. Clustering Large Applications based on Randomized Search

87. BIRCH is a________

A. Agglomerative clustering algorithm

B. Hierarchical algorithm

C. Hierarchical-agglomerative algorithm

D. Divisive

Correct : C. Hierarchical-agglomerative algorithm

88. The cluster features of different subclusters are maintained in a tree called_________

A. CF tree

B. FP tree

C. FP growth tree

D. B tree

Correct : A. CF tree

89. The_______algorithm is based on the observation that the frequent sets are normally very few in number compared to the set of all itemsets

A. A priori

B. Clustering

C. Association rule

D. Partition

Correct : D. Partition

90. The partition algorithm uses ________ scans of the database to discover all frequent sets

A. Two

B. Four

C. Six

D. Eight

Correct : A. Two

91. The basic idea of the Apriori algorithm is to generate_____item sets of a particular size and then scan the database

A. Candidate

B. Primary

C. Secondary

D. Superkey

Correct : A. Candidate

92. ________ is the most well-known association rule algorithm and is used in most commercial products

A. Apriori algorithm

B. Partition algorithm

C. Distributed algorithm

D. Pincer-search algorithm

Correct : A. Apriori algorithm

93. An algorithm called________is used to generate the candidate item sets for each pass after the first

A. Apriori

B. Apriori-gen

C. Sampling

D. Partition

Correct : B. Apriori-gen

94. The basic partition algorithm reduces the number of database scans to __________ and divides the database into partitions

A. One

B. Two

C. Three

D. Four

Correct : B. Two

95. ________ and prediction may be viewed as types of classification

A. Decision.

B. Verification.

C. Estimation.

D. Illustration

Correct : C. Estimation.

96. ________ can be thought of as classifying an attribute value into one of a set of possible classes

A. Estimation.

B. Prediction.

C. Identification.

D. Clarification

Correct : B. Prediction.

97. Prediction can be viewed as forecasting a ________ value

A. Non-continuous.

B. Constant.

C. Continuous.

D. variable

Correct : C. Continuous.

98. ________ data consists of sample input data as well as the classification assignment for the data

A. Missing.

B. Measuring.

C. Non-training.

D. Training

Correct : D. Training

99. Rule based classification algorithms generate_________rule to perform the classification

A. If-then

B. While

C. Do while

D. Switch

Correct : A. If-then
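As a rough illustration of question 99, a rule-based classifier can be written as an ordered list of if-then rules. The attributes (`income`, `debt`), thresholds and class labels are hypothetical, and the rules are written by hand here, not generated by a learning algorithm.

```python
def classify(record, rules, default="unknown"):
    """Return the label of the first rule whose condition holds for the record."""
    for condition, label in rules:
        if condition(record):  # IF the condition matches ...
            return label       # ... THEN assign this class
    return default


# Hypothetical hand-written rules, checked in order.
rules = [
    (lambda r: r["income"] >= 50_000 and r["debt"] < 10_000, "low risk"),
    (lambda r: r["debt"] >= 10_000, "high risk"),
]

print(classify({"income": 60_000, "debt": 2_000}, rules))
print(classify({"income": 30_000, "debt": 15_000}, rules))
print(classify({"income": 30_000, "debt": 0}, rules))
```

A record that matches no rule falls through to the default class, which is one common way to handle incomplete rule sets.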

100. ________ are a different paradigm for computing which draws its inspiration from neuroscience

A. Computer networks

B. Neural networks

C. Mobile networks

D. Artificial networks

Correct : B. Neural networks