Quiznetik

Data Mining and Data Warehouse | Set 4

1. Which is not the type of attribute used in distance measure?

Correct : D. rank

2. _____ method is used to find the distance between two objects represented by numerical attributes.

Correct : D. all of these

3. Contingency table is prepared for _______ attribute data.

Correct : C. binary
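As a rough illustration of the contingency table for two objects described by binary attributes, the sketch below counts the four cells and derives two common dissimilarity values from them. The names `contingency_table`, `symmetric_dissimilarity`, `asymmetric_dissimilarity`, and the sample objects are assumptions made for this example; they do not come from the quiz.

```python
def contingency_table(x, y):
    """Count the four cells of the 2x2 table for two binary vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("objects need the same, non-zero number of attributes")
    pairs = list(zip(x, y))
    q = sum(1 for a, b in pairs if a == 1 and b == 1)  # 1 in both objects
    r = sum(1 for a, b in pairs if a == 1 and b == 0)  # 1 in x, 0 in y
    s = sum(1 for a, b in pairs if a == 0 and b == 1)  # 0 in x, 1 in y
    t = sum(1 for a, b in pairs if a == 0 and b == 0)  # 0 in both objects
    return q, r, s, t


def symmetric_dissimilarity(x, y):
    """Share of mismatching attributes; 0/0 matches count as agreement."""
    q, r, s, t = contingency_table(x, y)
    return (r + s) / (q + r + s + t)


def asymmetric_dissimilarity(x, y):
    """Same idea, but 0/0 matches are ignored (Jaccard-style)."""
    q, r, s, t = contingency_table(x, y)
    denominator = q + r + s
    return (r + s) / denominator if denominator else 0.0


# Hypothetical objects with six binary attributes each.
o1 = [1, 0, 1, 0, 0, 0]
o2 = [1, 0, 1, 0, 1, 0]
print(contingency_table(o1, o2))  # (2, 0, 1, 3)
```

For these two objects only one attribute differs, so the symmetric value is 1/6 and the asymmetric value is 1/3.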

4. Which are the applications of proximity measures?

Correct : D. all of these

5. _________ matrix represents the distance between all objects in the dataset

Correct : B. dissimilarity
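A dissimilarity matrix can be sketched as a symmetric table of pairwise distances with zeros on the diagonal. The example below is only an illustration: the function names, the choice of Manhattan distance as the default, and the sample points are assumptions, and any other distance function could be passed in.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def dissimilarity_matrix(objects, distance=manhattan):
    """Return an n x n list of lists holding all pairwise distances."""
    n = len(objects)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = float(distance(objects[i], objects[j]))
            matrix[i][j] = d  # distance from object i to object j
            matrix[j][i] = d  # the matrix is symmetric
    return matrix


# Hypothetical objects; the first and third are identical.
points = [(1, 2), (3, 5), (1, 2)]
m = dissimilarity_matrix(points)
for row in m:
    print(row)
```

The entry for the first and third object is 0.0, which matches the idea that a distance of zero means the two objects are identical on the measured attributes.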

6. If o1 and o2 are two objects and distance between these objects is zero then it means_____

Correct : A. o1 and o2 are totally similar

7. Identify the correct subtype of Binary attribute.

Correct : D. both b and c

8. _____ is lower when objects are more alike.

Correct : A. dissimilarity

9. Adaptive system management is

Correct : A. It uses machine-learning techniques. Here, programs can learn from past experience and adapt themselves to new situations

10. Algorithm is

Correct : B. Computational procedure that takes some value as input and produces some value as output

11. Background knowledge refers to

Correct : A. Additional acquaintance used by a learning algorithm to facilitate the learning process

12. A back propagation network is

Correct : B. A neural network that makes use of a hidden layer

13. Bayesian classifiers are

Correct : A. A class of learning algorithms that try to find an optimum classification of a set of examples using probability theory.

14. Bias is

Correct : B. Any mechanism employed by a learning system to constrain the search space of a hypothesis.

15. Case-based learning is

Correct : C. An approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation.

16. Binary attributes are

Correct : A. These take only two values. In general, these values will be 0 and 1, and they can be coded as one bit

17. Biotope is

Correct : B. The natural environment of a certain species

18. Black boxes

Correct : C. Systems that can be used without knowledge of internal operations

19. Artificial intelligence is

Correct : C. Science of making machines perform tasks that would require intelligence when performed by humans

20. Cache is

Correct : A. It is a memory buffer that is used to store data that is needed frequently by an algorithm in order to minimize input/output traffic

21. Cardinality of an attribute is

Correct : B. The number of different values that a given attribute can take
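A small sketch of how cardinality might be estimated from data: count the distinct values an attribute actually takes in a table. Note the assumption here: this measures the values observed in the sample rows, which can be smaller than the number of values the attribute is allowed to take. The function name and rows are invented for the example.

```python
def cardinality(records, attribute):
    """Number of distinct values observed for one attribute."""
    return len({record[attribute] for record in records})


# Hypothetical table with two attributes.
rows = [
    {"city": "Pune", "grade": "A"},
    {"city": "Mumbai", "grade": "B"},
    {"city": "Pune", "grade": "A"},
    {"city": "Delhi", "grade": "B"},
]

print(cardinality(rows, "city"))   # 3
print(cardinality(rows, "grade"))  # 2
```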

22. Cartesian space is

Correct : C. A mathematical conception of space where the location of a point is given by reference to its distance from two or three axes intersecting at right angles

23. Classification is

Correct : A. A subdivision of a set of examples into a number of classes

24. Classification accuracy is

Correct : B. Measure of the accuracy of the classification of a concept that is given by a certain theory

25. Cluster is

Correct : A. Group of similar objects that differ significantly from other objects

26. Data is

Correct : C. Symbolic representation of facts or ideas from which information can potentially be extracted

27. A definition of a concept is _____ if it recognizes all the instances of that concept.

Correct : A. Complete

28. A definition of a concept is _____ if it does not classify any negative examples as coming within the concept

Correct : B. Consistent

29. Classification task refers to

Correct : C. The task of assigning a classification to a set of examples

30. Database is

Correct : A. Large collection of data mostly stored in a computer system

31. Data cleaning is

Correct : B. The removal of noise, errors, and incorrect input from a database

32. Data dictionary is

Correct : C. The systematic description of the syntactic structure of a specific database. It describes the structure of the attributes, the tables, and the foreign key relationships.

33. Data mining is

Correct : A. The actual discovery phase of a knowledge discovery process

34. Data selection is

Correct : B. The stage of selecting the right data for a KDD process

35. Data warehouse is

Correct : C. A subject-oriented, integrated, time-variant, non-volatile collection of data in support of management

36. Coding is

Correct : B. Operations on a database to transform or simplify data in order to prepare it for a machine-learning algorithm

37. DB/2 is

Correct : A. A family of relational database management systems marketed by IBM

38. Decision support systems (DSS) is

Correct : B. Interactive systems that enable decision makers to use databases and models on a computer in order to solve ill-structured problems

39. A decision tree is

Correct : C. It consists of nodes and branches starting from a single root node. Each node represents a test, or decision.

40. Deep knowledge refers to

Correct : A. It is hidden within a database and can only be recovered if one is given certain clues (an example is encrypted information)

41. Discovery is

Correct : B. The process of extracting implicit, previously unknown, and potentially useful information from data

42. DNA (Deoxyribonucleic acid)

Correct : C. An extremely complex molecule that occurs in human chromosomes and that carries genetic information in the form of genes.

43. Enrichment is

Correct : A. A stage of the KDD process in which new data is added to the existing selection

44. Enumeration is referred to

Correct : B. The process of finding a solution for a problem simply by enumerating all possible solutions according to some pre-defined order and then testing them
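Enumeration in this sense can be sketched as a generate-and-test loop: walk through every candidate in a fixed order and return the first one that passes the test. The helper name `enumerate_and_test` and the toy problem (find digits x and y with x + y = 10 and x * y = 21) are assumptions chosen only to keep the example small.

```python
from itertools import product


def enumerate_and_test(candidates, is_solution):
    """Try candidates in the given order; return the first that passes."""
    for candidate in candidates:
        if is_solution(candidate):
            return candidate
    return None  # the whole search space was exhausted without success


def satisfies_toy_problem(pair):
    """Hypothetical test: the two digits sum to 10 and multiply to 21."""
    x, y = pair
    return x + y == 10 and x * y == 21


# Pre-defined order: (0, 0), (0, 1), ..., (9, 9).
all_digit_pairs = product(range(10), repeat=2)
solution = enumerate_and_test(all_digit_pairs, satisfies_toy_problem)
print(solution)  # (3, 7)
```

The approach is simple but its cost grows with the size of the search space, which is why it is usually reserved for small problems.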

45. Euclidean distance measure is

Correct : C. The distance between two points as calculated using the Pythagorean theorem
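A minimal sketch of the Euclidean distance, written out so the link to the Pythagorean theorem is visible: square the coordinate differences, add them, and take the square root. The function name and sample points are assumptions for illustration.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    squared_differences = ((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(sum(squared_differences))


# The classic 3-4-5 right triangle: legs 3 and 4, hypotenuse 5.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The same function works for any number of dimensions, since the sum simply runs over more coordinates.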

46. Heuristic is

Correct : B. An approach to a problem that is not guaranteed to work but performs well in most cases.

47. Heterogeneous databases refers to

Correct : A. A set of databases from different vendors, possibly using different database paradigms

48. Hidden knowledge refers to

Correct : C. Information that is hidden in a database and that cannot be recovered by a simple SQL query.

49. Hybrid is

Correct : A. Combining different types of method or information

50. Evolutionary computation is

Correct : B. Approach to the design of learning algorithms that is structured along the lines of the theory of evolution.

51. Expert systems

Correct : C. Decision support systems that contain an information base filled with the knowledge of an expert formulated in terms of if-then rules

52. Extendible architecture is

Correct : A. Modular design of a software application that facilitates the integration of new modules

53. Falsification is

Correct : B. Showing a universal law or rule to be invalid by providing a counter example

54. Foreign key is

Correct : C. A set of attributes in a database table that refers to data in another table

55. Hybrid learning is

Correct : A. Machine-learning involving different techniques

56. Incremental learning refers to

Correct : B. The learning algorithm analyzes the examples on a systematic basis and makes incremental adjustments to the theory that is learned

57. Information content is

Correct : A. The amount of information within data as opposed to the amount of redundancy or noise

58. Inclusion dependencies

Correct : C. Restriction that requires data in one column of a database table to be a subset of another column
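An inclusion dependency can be checked on sample data with a plain subset test, as in the sketch below. The function name and the two columns (customer ids referenced by orders, and the customer table's own ids) are hypothetical.

```python
def satisfies_inclusion_dependency(dependent_column, referenced_column):
    """True if every value in the first column also occurs in the second."""
    return set(dependent_column) <= set(referenced_column)


# Hypothetical columns from an orders table and a customers table.
customer_ids = [101, 102, 103]
order_customer_ids = [101, 102, 101]
broken_order_customer_ids = [101, 104]  # 104 has no matching customer

print(satisfies_inclusion_dependency(order_customer_ids, customer_ids))         # True
print(satisfies_inclusion_dependency(broken_order_customer_ids, customer_ids))  # False
```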

59. KDD (Knowledge Discovery in Databases) is referred to

Correct : A. Non-trivial extraction of implicit, previously unknown, and potentially useful information from data

60. Key is referred to

Correct : B. Set of columns in a database table that can be used to identify each record within this table uniquely
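The uniqueness requirement behind a key can be illustrated by checking whether a set of columns identifies every row in a sample table. This is only a sketch under an assumption: it tests the rows at hand, whereas a real key is a constraint that must hold for all data the table may ever contain. The function name and rows are invented.

```python
def is_key(rows, columns):
    """True if no two rows share the same values in the given columns."""
    seen = set()
    for row in rows:
        identifier = tuple(row[column] for column in columns)
        if identifier in seen:
            return False  # two records cannot be told apart
        seen.add(identifier)
    return True


# Hypothetical employee table.
employees = [
    {"id": 1, "name": "Asha", "dept": "HR"},
    {"id": 2, "name": "Ravi", "dept": "HR"},
    {"id": 3, "name": "Asha", "dept": "IT"},
]

print(is_key(employees, ["id"]))            # True
print(is_key(employees, ["name"]))          # False
print(is_key(employees, ["name", "dept"]))  # True
```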

61. Inductive learning is

Correct : C. Learning by generalizing from examples

62. Integrated is

Correct : B. One of the defining aspects of a data warehouse

63. Knowledge engineering is

Correct : A. The process of finding the right formal representation of a certain body of knowledge in order to represent it in a knowledge-based system

64. Kohonen self-organizing map refers to

Correct : B. It automatically maps an external signal space into a system’s internal representational space. It is useful in the performance of classification tasks

65. Learning is

Correct : C. A process where an individual learns how to carry out a certain task when making a transition from a situation in which the task cannot be carried out to a situation in which the same task under the same circumstances can be carried out.

66. Learning algorithm refers to

Correct : A. An algorithm that can learn

67. Meta-learning is

Correct : C. A machine-learning approach that abstracts from the actual strategy of an individual algorithm and can therefore be applied to any other form of machine learning.

68. Machine learning is

Correct : B. A sub-discipline of computer science that deals with the design and implementation of learning algorithms.

69. Inductive logic programming is

Correct : A. A class of learning algorithms that try to derive a Prolog program from examples

70. Multi-dimensional knowledge is

Correct : B. A table with n independent attributes can be seen as an n-dimensional space

71. Naive prediction is

Correct : C. A prediction made using an extremely simple method, such as always predicting the same output.
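One way to picture a naive prediction is a baseline that always outputs the most frequent label seen in training, whatever the input. The sketch below assumes that particular baseline; the function name and labels are invented for the example.

```python
from collections import Counter


def fit_naive_predictor(training_labels):
    """Return a predictor that always answers with the most common label."""
    most_common_label, _count = Counter(training_labels).most_common(1)[0]

    def predict(_example):
        # The input is ignored on purpose: the output never changes.
        return most_common_label

    return predict


# Hypothetical training labels: "no" occurs three times out of four.
predict = fit_naive_predictor(["no", "yes", "no", "no"])
print(predict({"age": 30}))  # no
print(predict({"age": 65}))  # no
```

A baseline like this is mainly useful as a yardstick: a learned model that cannot beat it has not extracted anything useful from the attributes.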

72. Knowledge is referred to

Correct : C. collection of interesting and useful patterns in a database

73. Node is

Correct : A. A component of a network

74. Projection pursuit is

Correct : C. Discipline in statistics that studies ways to find the most interesting projections of multi-dimensional spaces

75. Statistical significance is

Correct : B. Measure of the probability that a certain hypothesis is incorrect given certain observations.

76. Prediction is

Correct : A. The result of the application of a theory or a rule in a specific case

77. Primary key is

Correct : B. One of several possible keys within a database table that is chosen by the designer as the primary means of accessing the data in the table

78. Noise is

Correct : B. In the context of KDD and data mining, this refers to random errors in a database table.

79. Quadratic complexity is

Correct : A. A reference to the speed of an algorithm, which is quadratically dependent on the size of the data

80. Query tools are

Correct : C. Tools designed to query a database.

81. Prolog is

Correct : A. A programming language based on logic

82. Massively parallel machine is

Correct : B. A computer where each processor has its own operating system, its own memory, and its own hard disk

83. Meta-data is

Correct : C. Describes the structure of the contents of a database

84. n(log n) is referred to

Correct : A. A measure of the desired maximal complexity of data mining algorithms

85. Operational database is

Correct : B. A database containing volatile data used for the daily operation of an organization

86. Oracle is referred to

Correct : C. Relational database management system

87. Paradigm is

Correct : A. General class of approaches to a problem.

88. Patterns are

Correct : C. Structures in a database that are statistically relevant

89. Parallelism is

Correct : B. Performing several computations simultaneously

90. Perceptron is

Correct : D. Simple forerunner of modern neural networks, without hidden layers.
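To make the "no hidden layers" point concrete, here is a minimal sketch of a single-layer perceptron trained with the classic error-correction rule on the logical AND function. The function names, the learning rate, the epoch limit, and the choice of AND as training data are all assumptions for illustration.

```python
def activation(weights, bias, inputs):
    """Step function applied to the weighted sum of the inputs."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Adjust weights after each misclassified sample until none remain."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - activation(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every training sample is classified correctly
    return weights, bias


# Truth table of logical AND, which is linearly separable.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
for inputs, target in AND_SAMPLES:
    print(inputs, activation(weights, bias, inputs), target)
```

Inputs connect directly to the output unit, so the model can only draw a single linear boundary between the two classes.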

91. Shallow knowledge

Correct : B. The information stored in a database that can be retrieved with a single query.

92. Statistics

Correct : A. The science of collecting, organizing, and applying numerical facts

93. Subject orientation

Correct : C. One of the defining aspects of a data warehouse, which is specially built around all the existing applications of the operational data

94. Search space

Correct : A. The large set of candidate solutions possible for a problem

95. Transparency

Correct : C. Worth of the output of a machine-learning program that makes it understandable for humans

96. Quantitative attributes are

Correct : B. Attributes of a database table that can take only numerical values.

97. Unsupervised algorithms

Correct : A. They do not need the control of the human operator during their execution.

98. Vector

Correct : B. An arrow in a multi-dimensional space. It is a quantity usually characterized by an ordered set of scalars.

99. Verification

Correct : C. The validation of a theory on the basis of a finite number of examples

100. Visualization techniques are

Correct : A. A class of graphic techniques used to visualize the contents of a database