Quiznetik

Machine Learning (ML) | Set 2

1. MLE estimates are often undesirable because

Correct : B. they have high variance

2. The difference between the actual Y value and the predicted Y value found using a regression equation is called the

Correct : A. slope

3. Neural networks

Correct : C. can be used for regression as well as classification

4. Linear Regression is a _______ machine learning algorithm.

Correct : A. supervised

5. Which of the following methods do we use to find the best fit line for data in Linear Regression?

Correct : A. least square error
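As a minimal sketch of the least-squares idea (the data points below are made up for illustration), the closed-form slope and intercept are the ones that minimise the sum of squared errors:

```python
# Closed-form least-squares fit of y = a + b*x (illustrative data)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# the slope that minimises the sum of squared residuals
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
a = mean_y - b * mean_x
print(a, b)   # intercept ~0.09, slope ~1.99
```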

6. Which of the following methods do we use to best fit the data in Logistic Regression?

Correct : B. maximum likelihood

7. Lasso can be interpreted as least-squares linear regression where

Correct : A. weights are regularized with the l1 norm

8. Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?

Correct : D. mean-squared-error

9. Simple regression assumes a __________ relationship between the input attribute and output attribute.

Correct : C. linear

10. In the regression equation Y = 75.65 + 0.50X, the intercept is

Correct : B. 75.65
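Reading the intercept off the equation in question 10: it is the predicted Y when X = 0, which a one-line check confirms:

```python
# Regression equation from question 10: Y = 75.65 + 0.50 * X
def predict(x):
    return 75.65 + 0.50 * x

print(predict(0))   # X = 0 gives the intercept, 75.65
```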

11. The selling price of a house depends on many factors. For example, it depends on the number of bedrooms, the number of kitchens, the number of bathrooms, the year the house was built, and the square footage of the lot. Given these factors, predicting the selling price of the house is an example of a ____________ task.

Correct : D. multiple linear regression

12. Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?

Correct : A. you will add more features

13. We have been given a dataset with n records, in which the input attribute is x and the output attribute is y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. Now we increase the training set size gradually. As the training set size increases, what do you expect will happen to the mean training error?

Correct : D. can’t say

14. We have been given a dataset with n records, in which the input attribute is x and the output attribute is y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. What do you expect will happen to the bias and variance as you increase the size of the training data?

Correct : D. bias increases and variance decreases

15. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.) (i) Models which overfit are more likely to have high bias (ii) Models which overfit are more likely to have low bias (iii) Models which overfit are more likely to have high variance (iv) Models which overfit are more likely to have low variance

Correct : B. (ii) and (iii)

16. Which of the following indicates the fundamental of least squares?

Correct : D. arithmetic mean should be minimized

17. Suppose that we have N independent variables (X1, X2, …, Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best fit line using least square error on this data. You find that the correlation coefficient for one of its variables (say X1) with Y is 0.95.

Correct : B. relation between the x1 and y is strong

18. In terms of bias and variance. Which of the following is true when you fit degree 2 polynomial?

Correct : C. bias will be high, variance will be low

19. Which of the following statements are true for a design matrix X ∈ R^(n×d) with d > n? (The rows are n sample points and the columns represent d features.)

Correct : D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points

20. Point out the wrong statement.

Correct : C. least squares is not an estimation tool

21. Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?

Correct : A. you will add more features

22. If X and Y in a regression model are totally unrelated,

Correct : B. the coefficient of determination would be 0

23. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.) (i) Models which overfit are more likely to have high bias (ii) Models which overfit are more likely to have low bias (iii) Models which overfit are more likely to have high variance (iv) Models which overfit are more likely to have low variance

Correct : B. (ii) and (iii)

24. Which of the following statements are true for a design matrix X ∈ R^(n×d) with d > n? (The rows are n sample points and the columns represent d features.)

Correct : D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points

25. A problem in multiple regression is?

Correct : C. both multicollinearity & overfitting

26. How can we best represent ‘support’ for the following association rule: “If X and Y, then Z”.

Correct : C. {z}/{x,y}

27. Choose the correct statement with respect to ‘confidence’ metric in association rules

Correct : A. it is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.

28. What are tree based classifiers?

Correct : C. both options except none

29. What is gini index?

Correct : B. it is a measure of purity
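For reference, the Gini index of a node is 1 minus the sum of squared class proportions, so a pure node scores 0. A minimal sketch with made-up class counts:

```python
# Gini index: 1 - sum(p_i ** 2) over class proportions p_i
def gini(counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini([10, 0]))  # pure node -> 0.0
print(gini([5, 5]))   # evenly mixed two-class node -> 0.5
```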

30. Which of the following sentences are correct in reference to Information gain? a. It is biased towards single-valued attributes b. It is biased towards multi-valued attributes c. ID3 makes use of information gain d. The approach used by ID3 is greedy

Correct : C. b, c and d

31. Multivariate split is where the partitioning of tuples is based on a combination of attributes rather than on a single attribute.

Correct : A. true

32. Gain ratio tends to prefer unbalanced splits in which one partition is much smaller than the other

Correct : A. true

33. The Gini index is not biased towards multi-valued attributes.

Correct : B. false

34. Gini index does not favour equal sized partitions.

Correct : B. false

35. When the number of classes is large, the Gini index is not a good choice.

Correct : A. true

36. Attribute selection measures are also known as splitting rules.

Correct : A. true

37. This clustering approach initially assumes that each data instance represents a single cluster.

Correct : C. agglomerative clustering

38. Which statement is true about the K-Means algorithm?

Correct : C. all attributes must be numeric

39. KDD represents extraction of

Correct : B. knowledge

40. The most general form of distance is

Correct : B. euclidean

41. Which of the following algorithms comes under classification?

Correct : D. k-nearest neighbor

42. Hierarchical agglomerative clustering is typically visualized as?

Correct : A. dendrogram

43. The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support

Correct : D. pruning

44. The distance between two points calculated using Pythagoras theorem is

Correct : B. euclidean distance
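The formula behind this answer, sketched in Python (the 3-4-5 triangle is just an illustrative input):

```python
import math

# Euclidean distance: the Pythagorean formula generalised to any dimension
def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean((0, 0), (3, 4)))   # 3-4-5 right triangle -> 5.0
```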

45. Which one of these is not a tree based learner?

Correct : C. bayesian classifier

46. Which one of these is a tree based learner?

Correct : D. random forest

47. What is the approach of basic algorithm for decision tree induction?

Correct : A. greedy

48. Which of the following classifications would best suit the student performance classification systems?

Correct : A. if...then... analysis

49. Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees?

Correct : B. no

50. This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration

Correct : A. k-means clustering
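A minimal 1-D sketch of that termination rule (the points and starting centers below are made up): the loop stops exactly when the means computed in an iteration match those of the previous iteration.

```python
# Minimal 1-D k-means sketch; stops when the cluster means stop changing
def kmeans_1d(points, centers):
    while True:
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:   # means identical to previous iteration
            return centers
        centers = new_centers

print(kmeans_1d([1.0, 1.2, 0.8, 10.0, 10.5, 9.5], [0.0, 5.0]))
```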

51. The number of iterations in apriori ___________

Correct : C. increases with the size of the maximum frequent set

52. Frequent item sets is

Correct : D. superset of both closed frequent item sets and maximal frequent item sets

53. A good clustering method will produce high quality clusters with

Correct : C. high intra class similarity

54. Which statement is true about neural network and linear regression models?

Correct : D. both models require input attributes to be numeric

55. Which Association Rule would you prefer?

Correct : C. low support and high confidence

56. In a Rule based classifier, if there is a rule for each combination of attribute values, what do you call that rule set R?

Correct : A. exhaustive

57. The apriori property means

Correct : A. if a set cannot pass a test, its supersets will also fail the same test

58. If an item set ‘XYZ’ is a frequent item set, then all subsets of that frequent item set are

Correct : C. frequent

59. Clustering is ___________ and is an example of ____________ learning

Correct : D. descriptive and unsupervised

60. To determine association rules from frequent item sets

Correct : C. both minimum support and confidence are needed

61. If {A,B,C,D} is a frequent itemset, the candidate rule which is not possible is

Correct : B. d -> abcd

62. Which Association Rule would you prefer?

Correct : B. low support and high confidence

63. This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration

Correct : B. k-means clustering

64. Classification rules are extracted from _____________

Correct : A. decision tree

65. What does K refer to in the K-Means algorithm, which is a non-hierarchical clustering approach?

Correct : D. number of clusters

66. How will you counter over-fitting in a decision tree?

Correct : A. by pruning the longer rules

67. What are the two steps of tree pruning?

Correct : B. postpruning and prepruning

68. Which of the following sentences are true?

Correct : D. all of the above

69. Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on optimising?

Correct : C. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output

70. Which of the following properties are characteristic of decision trees? (a) High bias (b) High variance (c) Lack of smoothness of prediction surfaces (d) Unbounded parameter set

Correct : C. b, c and d

71. To control the size of the tree, we need to control the number of regions. One approach to do this would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true? (a) It would, in general, help restrict the size of the trees (b) It has the potential to affect the performance of the resultant regression/classification model (c) It is computationally infeasible

Correct : A. a and b

72. Which among the following statements best describes our approach to learning decision trees?

Correct : D. identify the model which gives performance close to the best greedy approximation performance (option (b)) with the smallest partition scheme

73. Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, on the left branch, there are 3 training data points with the following outputs: 5, 7, 9.6 and for the right branch, there are four training data points with the following outputs: 8.7, 9.8, 10.5, 11. What were the original responses for data points along the two branches (left & right respectively) and what is the new response after collapsing the node?

Correct : C. 7.2, 10, 8.8
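The arithmetic behind answer C: each leaf predicts the mean of its training outputs, and the collapsed node predicts the mean of all seven points.

```python
# Leaf responses before and after collapsing the node (question 73)
left = [5, 7, 9.6]
right = [8.7, 9.8, 10.5, 11]

left_mean = sum(left) / len(left)                            # 21.6 / 3 = 7.2
right_mean = sum(right) / len(right)                         # 40.0 / 4 = 10.0
merged_mean = sum(left + right) / (len(left) + len(right))   # 61.6 / 7 = 8.8
print(left_mean, right_mean, merged_mean)
```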

74. Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed? (a) The collapsed node helped overcome the effect of one or more noise affected data points in the training set (b) The validation set had one or more noise affected data points in the region corresponding to the collapsed node (c) The validation set did not have any data points along at least one of the collapsed branches (d) The validation set did have data points adversely affected by the collapsed node

Correct : D. all of the above

75. Time Complexity of k-means is given by

Correct : B. o(tkn)

76. In the Apriori algorithm, if there are 100 frequent 1-itemsets, then the number of candidate 2-itemsets is

Correct : C. 4950
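The count is simply the number of unordered pairs of the 100 frequent 1-itemsets, C(100, 2):

```python
from math import comb

# Candidate 2-itemsets = all unordered pairs of the 100 frequent 1-itemsets
n_candidates = comb(100, 2)
print(n_candidates)   # 4950
```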

77. Machine learning techniques differ from statistical techniques in that machine learning methods

Correct : A. are better able to deal with missing and noisy data

78. The probability that a person owns a sports car given that they subscribe to an automotive magazine is 40%. We also know that 3% of the adult population subscribes to the automotive magazine. The probability of a person owning a sports car given that they don't subscribe to the automotive magazine is 30%. Use this information to compute the probability that a person subscribes to the automotive magazine given that they own a sports car.

Correct : B. 0.0396
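The computation behind answer B is Bayes' theorem with the three given probabilities:

```python
# Bayes' theorem with the numbers from question 78
p_car_given_sub = 0.40    # P(owns sports car | subscribes)
p_sub = 0.03              # P(subscribes)
p_car_given_nosub = 0.30  # P(owns sports car | does not subscribe)

# total probability of owning a sports car
p_car = p_car_given_sub * p_sub + p_car_given_nosub * (1 - p_sub)
p_sub_given_car = p_car_given_sub * p_sub / p_car
print(round(p_sub_given_car, 4))   # 0.0396
```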

79. What is the final resultant cluster size in Divisive algorithm, which is one of the hierarchical clustering approaches?

Correct : C. singleton

80. Given a frequent itemset L, if |L| = k, then there are

Correct : C. 2^k - 2 candidate association rules
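This count comes from taking every non-empty proper subset of L as an antecedent, which a brute-force enumeration confirms for small itemsets:

```python
from itertools import combinations

# Rules A -> (L - A) with A a non-empty proper subset of L: 2^k - 2 of them
def n_rules(itemset):
    k = len(itemset)
    return sum(1 for r in range(1, k)
                 for _ in combinations(itemset, r))

print(n_rules({'a', 'b', 'c'}))   # 2**3 - 2 = 6
```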

81. Which statement is not a true statement?

Correct : B. k-means clustering aims to partition n observations into k clusters

82. In which of the following cases will K-Means clustering give poor results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes

Correct : C. 2 and 4

83. What is Decision Tree?

Correct : D. none of the above

84. What are the two steps of tree pruning?

Correct : B. postpruning and prepruning

85. A database has 5 transactions. Of these, 4 transactions include milk and bread. Further, of the given 4 transactions, 2 transactions include cheese. Find the support percentage for the following association rule “if milk and bread are purchased, then cheese is also purchased”.

Correct : D. 0.4 (2 of the 5 transactions contain milk, bread and cheese together)

86. Which of the following options is true about the k-NN algorithm?

Correct : C. it can be used in both classification and regression

87. How do you select the best hyperparameters in tree-based models?

Correct : B. measure performance over validation data

88. What is true about K-Means clustering? 1. K-means is extremely sensitive to cluster center initializations 2. Bad initialization can lead to poor convergence speed 3. Bad initialization can lead to bad overall clustering

Correct : D. 1, 2 and 3

89. What are tree based classifiers?

Correct : C. both options except none

90. What is gini index?

Correct : D. all (1,2 and 3)

91. Tree/Rule based classification algorithms generate ... rule to perform the classification.

Correct : A. if-then.

92. Decision Tree is

Correct : C. both a & b

93. Which of the following is true about Manhattan distance?

Correct : A. it can be used for continuous variables

94. A company has built a kNN classifier that gets 100% accuracy on training data. When they deployed this model on the client side, it was found that the model is not at all accurate. Which of the following things might have gone wrong? Note: The model was successfully deployed and no technical issues were found at the client side except the model performance

Correct : A. it is probably an overfitted model

95. Which of the following classifications would best suit the student performance classification systems?

Correct : A. if...then... analysis

96. Which statement is true about the K-Means algorithm?

Correct : C. all attributes must be numeric

97. Which of the following can act as possible termination conditions in K-Means? 1. For a fixed number of iterations. 2. Assignment of observations to clusters does not change between iterations. Except for cases with a bad local minimum. 3. Centroids do not change between successive iterations. 4. Terminate when RSS falls below a threshold.

Correct : D. 1,2,3,4

98. Which of the following statements are true about the k-NN algorithm? 1) k-NN performs much better if all of the data have the same scale 2) k-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large 3) k-NN makes no assumptions about the functional form of the problem being solved

Correct : D. 1,2 and 3

99. In which of the following cases will K-means clustering fail to give good results? 1) Data points with outliers 2) Data points with different densities 3) Data points with nonconvex shapes

Correct : C. 1, 2, and 3

100. How will you counter over-fitting in a decision tree?

Correct : A. by pruning the longer rules