Quiznetik

Data Mining | Set 1

1. Adaptive system management is

Correct : A. it uses machine-learning techniques. here programs can learn from past experience and adapt themselves to new situations.

2. A Bayesian classifier is

Correct : A. a class of learning algorithms that try to find an optimum classification of a set of examples using probability theory.

3. Algorithm is

Correct : B. computational procedure that takes some value as input and produces some value as output.

4. Bias is

Correct : B. any mechanism employed by a learning system to constrain the search space of a hypothesis.

5. Background knowledge refers to

Correct : A. additional knowledge used by a learning algorithm to facilitate the learning process.

6. Case-based learning is

Correct : C. an approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation.

7. Classification is

Correct : A. a subdivision of a set of examples into a number of classes.

8. A binary attribute is

Correct : A. this takes only two values. in general, these values will be 0 and 1, and they can be coded as one bit.

9. Classification accuracy is

Correct : B. measure of the accuracy of the classification of a concept that is given by a certain theory.

10. A biotope is

Correct : B. the natural environment of a certain species

11. Cluster is

Correct : A. group of similar objects that differ significantly from other objects

12. Black boxes are

Correct : C. systems that can be used without knowledge of internal operations

13. A definition of a concept is-----if it recognizes all the instances of that concept

Correct : A. complete

14. A definition of a concept is------------- if it does not classify any negative examples as coming within the concept

Correct : B. consistent
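Questions 13 and 14 can be made concrete with a minimal Python sketch. It assumes the usual reading in which a definition is complete when it recognizes every positive example and consistent when it accepts no negative example. The helper names and the toy "even number" concept are illustrative assumptions, not part of the quiz.

```python
def is_complete(definition, positives):
    """Complete: the definition recognizes every positive example."""
    return all(definition(x) for x in positives)

def is_consistent(definition, negatives):
    """Consistent: the definition accepts no negative example."""
    return not any(definition(x) for x in negatives)

# Hypothetical concept "even number" with two candidate definitions.
positives, negatives = [2, 4, 6], [1, 3, 5]
divisible_by_two = lambda n: n % 2 == 0
greater_than_one = lambda n: n > 1

# The first definition is both complete and consistent.
print(is_complete(divisible_by_two, positives), is_consistent(divisible_by_two, negatives))
# The second is complete but not consistent: it also accepts 3 and 5.
print(is_complete(greater_than_one, positives), is_consistent(greater_than_one, negatives))
```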

15. Data selection is

Correct : B. the stage of selecting the right data for a KDD process

16. DNA (Deoxyribonucleic acid)

Correct : C. an extremely complex molecule that occurs in human chromosomes and that carries genetic information in the form of genes.

17. Hybrid is

Correct : A. combining different types of method or information

18. Discovery is

Correct : B. the process of extracting implicit, previously unknown and potentially useful information from data.

19. Euclidean distance measure is

Correct : C. the distance between two points as calculated using the Pythagorean theorem.
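As a minimal illustration of this definition, the Python sketch below applies the Pythagorean theorem coordinate by coordinate. The function name `euclidean_distance` is an illustrative choice, not something the quiz defines.

```python
import math

def euclidean_distance(p, q):
    """Return the straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Pythagorean theorem generalized to n dimensions:
    # square the coordinate differences, sum them, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```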

20. Hidden knowledge refers to

Correct : C. information that is hidden in a database and that cannot be recovered by a simple SQL query.

21. Enrichment is

Correct : A. a stage of the KDD process in which new data is added to the existing selection

22. Heterogeneous databases refers to

Correct : A. a set of databases from different vendors, possibly using different database paradigms

23. Enumeration refers to

Correct : B. the process of finding a solution for a problem simply by enumerating all possible solutions according to some pre-defined order and then testing them
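The enumerate-then-test idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the helper `enumerate_and_test` and the digit puzzle are hypothetical examples, not taken from the quiz.

```python
from itertools import product

def enumerate_and_test(candidates, is_solution):
    """Try candidates in their given order; return the first one that passes the test."""
    for candidate in candidates:
        if is_solution(candidate):
            return candidate
    return None

# Hypothetical toy problem: find digits (x, y) with x + y == 10 and x * y == 21.
pairs = product(range(10), repeat=2)  # pre-defined order: (0, 0), (0, 1), ...
solution = enumerate_and_test(pairs, lambda p: p[0] + p[1] == 10 and p[0] * p[1] == 21)
print(solution)  # (3, 7)
```

The approach is simple and exhaustive, but the number of candidates grows quickly with problem size, which is why heuristics are often preferred.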

24. Heuristic is

Correct : B. an approach to a problem that is not guaranteed to work but performs well in most cases

25. Hybrid learning is

Correct : A. machine-learning involving different techniques

26. Kohonen self-organizing map refers to

Correct : B. it automatically maps an external signal space into a system's internal representational space. they are useful in the performance of classification tasks

27. Incremental learning refers to

Correct : B. the learning algorithm analyzes the examples on a systematic basis and makes incremental adjustments to the theory that is learned

28. Knowledge engineering is

Correct : A. the process of finding the right formal representation of a certain body of knowledge in order to represent it in a knowledge-based system

29. Information content is

Correct : A. the amount of information within data as opposed to the amount of redundancy or noise.

30. Inductive learning is

Correct : C. learning by generalizing from examples

31. Inclusion dependencies

Correct : C. restriction that requires data in one column of a database table to be a subset of another column

32. KDD (Knowledge Discovery in Databases) refers to
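An inclusion dependency can be checked with a simple subset test. The sketch below is a minimal Python illustration. The column names (`orders.customer_id`, `customers.id`) and the helper function are hypothetical.

```python
def satisfies_inclusion_dependency(child_values, parent_values):
    """True if every value in the child column also appears in the parent column."""
    return set(child_values) <= set(parent_values)

# Hypothetical tables: orders.customer_id must be a subset of customers.id.
customer_ids = [1, 2, 3, 4]
order_customer_ids = [2, 2, 4]

print(satisfies_inclusion_dependency(order_customer_ids, customer_ids))        # True
print(satisfies_inclusion_dependency(order_customer_ids + [9], customer_ids))  # False
```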

Correct : A. non-trivial extraction of implicit previously unknown and potentially useful information from data

33. Learning is

Correct : C. a process where an individual learns how to carry out a certain task when making a transition from a situation in which the task cannot be carried out to a situation in which the same task under the same circumstances can be carried out.

34. Naive prediction is

Correct : C. a prediction made using an extremely simple method, such as always predicting the same output.
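A common form of naive prediction is a majority-class baseline, sketched below in Python. The function names and the toy labels are assumptions made for illustration. The point is only that the predictor ignores its input and always returns the same output.

```python
from collections import Counter

def naive_predictor(training_labels):
    """Build a predictor that ignores its input and always returns the most common label."""
    most_common_label = Counter(training_labels).most_common(1)[0][0]
    return lambda _example: most_common_label

def accuracy(predictor, examples, labels):
    """Fraction of examples for which the predictor returns the true label."""
    correct = sum(predictor(x) == y for x, y in zip(examples, labels))
    return correct / len(labels)

labels = ["no", "no", "yes", "no"]
predict = naive_predictor(labels)
print(predict({"age": 30}))                   # no
print(accuracy(predict, [None] * 4, labels))  # 0.75
```

A baseline like this is useful as a yardstick: a learned model is only interesting if it beats the naive prediction.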

35. Learning algorithm refers to

Correct : A. an algorithm that can learn

36. Knowledge is referred to

Correct : C. collection of interesting and useful patterns in a database

37. Node is

Correct : A. a component of a network

38. Machine learning is

Correct : B. a sub-discipline of computer science that deals with the design and implementation of learning algorithms

39. Projection pursuit is

Correct : C. discipline in statistics that studies ways to find the most interesting projections of multi-dimensional spaces

40. Inductive logic programming is

Correct : A. a class of learning algorithms that try to derive a Prolog program from examples

41. Statistical significance is

Correct : B. measure of the probability that a certain hypothesis is incorrect given certain observations.

42. Multi-dimensional knowledge is

Correct : B. a table with n independent attributes can be seen as an n-dimensional space

43. Prediction is

Correct : A. the result of the application of a theory or a rule in a specific case

44. Query tools are

Correct : C. tools designed to query a database.

45. Operational database is

Correct : B. a database containing volatile data used for the daily operation of an organization

46. ...................... is an essential process where intelligent methods are applied to extract data patterns.

Correct : B. data mining

47. Which of the following is not a data mining functionality?

Correct : C. selection and interpretation

48. ............................. is a summarization of the general characteristics or features of a target class of data.

Correct : A. data characterization

49. ............................. is a comparison of the general features of the target class data objects against the general features of objects from one or multiple contrasting classes.

Correct : data discrimination

50. Strategic value of data mining is ......................

Correct : C. time-sensitive

51. ............................. is the process of finding a model that describes and distinguishes data classes or concepts.

Correct : data classification

52. iv) Handling uncertainty, noise, or incompleteness of data

Correct : D. all i, ii, iii and iv

53. The full form of KDD is ..................

Correct : knowledge discovery in databases

54. The output of KDD is .............

Correct : useful information

55. The full form of OLAP is

Correct : online analytical processing

56. ......................... is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions.

Correct : data warehouse

57. The data is stored, retrieved and updated in ....................

Correct : B. OLTP

58. An .................. system is market-oriented and is used for data analysis by knowledge workers, including managers, executives, and analysts.

Correct : A. OLAP

59. ........................ is a good alternative to the star schema.

Correct : fact constellation

60. The ............................ exposes the information being captured, stored, and managed by operational systems.

Correct : C. data source view

61. The type of relationship in star schema is ...............

Correct : one to many

62. The .................. allows the selection of the relevant information necessary for the data warehouse.

Correct : top-down view

63. Which of the following is not a component of a data warehouse?

Correct : C. lightly summarized data

64. Which of the following is not a kind of data warehouse application?

Correct : D. transaction processing

65. Data warehouse architecture is based on .......................

Correct : B. RDBMS

66. .......................... supports basic OLAP operations, including slice and dice, drill-down, roll-up and pivoting.

Correct : analytical processing
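Two of the OLAP operations named in question 66, slice and roll-up, can be sketched over a tiny in-memory fact table. This is a minimal Python illustration only. The fact table, its column order, and the helper names are all hypothetical, and a real OLAP server would do this over a data cube rather than a list of tuples.

```python
from collections import defaultdict

# Hypothetical fact table: (year, region, product, sales)
facts = [
    (2023, "east", "tea", 10),
    (2023, "west", "tea", 5),
    (2023, "east", "coffee", 7),
    (2024, "east", "tea", 12),
]

def slice_cube(rows, year):
    """Slice: fix one dimension (year) to a single value."""
    return [r for r in rows if r[0] == year]

def roll_up_by_year(rows):
    """Roll-up: aggregate sales to the year level, dropping region and product."""
    totals = defaultdict(int)
    for year, _region, _product, sales in rows:
        totals[year] += sales
    return dict(totals)

print(len(slice_cube(facts, 2023)))  # 3
print(roll_up_by_year(facts))        # {2023: 22, 2024: 12}
```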

67. The core of the multidimensional model is the ....................... , which consists of a large set of facts and a number of dimensions.

Correct : data cube

68. The data from the operational environment enter ........................ of data warehouse.

Correct : A. current detail data

69. A data warehouse is ......................

Correct : organized around important subject areas.

70. Business Intelligence and data warehousing is used for ..............

Correct : B. data mining

71. Data warehouse contains ................ data that is never found in the operational environment.

Correct : summary

72. ................... are responsible for running queries and reports against data warehouse tables.

Correct : end users

73. The biggest drawback of the level indicator in the classic star schema is that it limits ............

Correct : flexibility

74. ............................. are designed to overcome any limitations placed on the warehouse by the nature of the relational data model.

Correct : multidimensional databases

75. KDD describes the _________.

Correct : A. whole process of extraction of knowledge from data

76. SQL helps to find _______.

Correct : D. data under constraints that are already known

77. Translation of a problem to a learning technique is called _______.

Correct : C. representational engineering.

78. Which one of the following is not a part of the empirical cycle in scientific research?

Correct : C. Self learning.

79. ________and __________ are the important qualities of a good learning algorithm.

Correct : A. Consistent, Complete.

80. Redundancy refers to the elements of a message that can be derived from other parts of _________.

Correct : C. same message.

81. Metadata describes __________.

Correct : B. structure of contents of database.

82. A partition of the overall data warehouse is _______.

Correct : C. data mart.

83. __________ is used to load the information from the operational database.

Correct : A. Replication technique.

84. ___________ multiprocessing machines share the same hard disk and internal memory.

Correct : B. Symmetric.

85. A trivial result that is obtained by an extremely simple method is called _______.

Correct : A. naive prediction.

86. The information on two attributes is displayed in ____________ in a scatter diagram.

Correct : C. Cartesian space.

87. OLAP stands for ________.

Correct : A. Online Analytical Processing.

88. K-nearest neighbor is one of the _______.

Correct : C. purest search techniques.
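K-nearest neighbor counts as a search technique because it builds no model: it stores the labeled examples and, for each query, searches them for the closest ones. A minimal Python sketch follows. It assumes Euclidean distance and majority voting, and the function name and toy data are illustrative assumptions.

```python
import math
from collections import Counter

def knn_classify(query, examples, k=3):
    """Label a query point by majority vote among its k nearest labeled examples."""
    # examples is a list of (point, label) pairs; distance is Euclidean.
    nearest = sorted(examples, key=lambda e: math.dist(query, e[0]))[:k]
    votes = Counter(label for _point, label in nearest)
    return votes.most_common(1)[0][0]

training = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"), ((8, 8), "b"), ((9, 8), "b")]
print(knn_classify((2, 2), training))       # a
print(knn_classify((8, 9), training, k=1))  # b
```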

89. The intermediate unit in a perceptron is ________.

Correct : B. associators.

90. OLAP is used to explore the ___________ knowledge.

Correct : C. multidimensional.

91. A natural way to visualize the process of training a self-organizing map is called __________.

Correct : A. Kohonen movie.

92. Hidden knowledge can be found by using ________.

Correct : B. pattern recognition algorithm.

93. Deep knowledge can be found only by using ________.

Correct : A. clues.

94. The next stage after data selection in the KDD process is ______.

Correct : C. cleaning.

95. Enrichment means ____.

Correct : A. adding external data.

96. The decision support system is used only for _______.

Correct : D. queries.

97. In the _________ approach, the data warehouse is built first and all information needed is selected.

Correct : A. top-down.

98. The DB vendor who is able to operate massively parallel computers is ________.

Correct : B. IBM.

99. Which of the following is closely related to statistical significance and transparency?

Correct : B. Transparency.

100. ________ is a creative activity that has to be performed repeatedly in order to get best results.

Correct : C. Coding.