2. The difference between the actual Y value and the predicted Y value found using a regression equation is called the
Correct : A. slope
3. Neural networks
Correct : C. can be used for regression as well as classification
4. Linear Regression is a _______ machine learning algorithm.
Correct : A. supervised
5. Which of the following methods/methods do we use to find the best fit line for data in Linear Regression?
Correct : A. least square error
6. Which of the following methods do we use to best fit the data in Logistic Regression?
Correct : B. maximum likelihood
7. Lasso can be interpreted as least-squares linear regression where
Correct : A. weights are regularized with the l1 norm
8. Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?
Correct : D. mean-squared-error
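The mean-squared-error metric from Q8 can be sketched in a few lines of Python (the data values below are illustrative, not from the quiz):

```python
# Minimal sketch of mean-squared-error for a continuous output variable.
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(mean_squared_error([2.5, 3.5], [2.0, 3.0]))  # 0.25
```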
9. Simple regression assumes a __________ relationship between the input attribute and output attribute.
Correct : C. linear
10. In the regression equation Y = 75.65 + 0.50X, the intercept is
Correct : B. 75.65
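The equation from Q10 also illustrates the residual asked about in Q2. A minimal sketch (the actual value 80.0 is illustrative, not from the quiz):

```python
# Sketch using the regression equation from Q10: Y = 75.65 + 0.50X.
# The intercept (75.65) is the predicted Y at X = 0; the residual from Q2
# is the actual Y minus the predicted Y.
intercept, slope = 75.65, 0.50

def predict(x):
    return intercept + slope * x

print(predict(0))          # the intercept, 75.65
print(80.0 - predict(10))  # a residual: actual minus predicted
```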
11. The selling price of a house depends on many factors. For example, it depends on the number of bedrooms, the number of kitchens, the number of bathrooms, the year the house was built, and the square footage of the lot. Given these factors, predicting the selling price of the house is an example of a ____________ task.
Correct : D. multiple linear regression
12. Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?
Correct : A. you will add more features
13. We have been given a dataset with n records in which we have input attribute x and output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. Now we increase the training set size gradually. As the training set size increases, what do you expect will happen to the mean training error?
Correct : D. can’t say
14. We have been given a dataset with n records in which we have input attribute x and output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly. What do you expect will happen to bias and variance as you increase the size of the training data?
Correct : D. bias increases and variance decreases
15. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.)
(i) Models which overfit are more likely to have high bias
(ii) Models which overfit are more likely to have low bias
(iii) Models which overfit are more likely to have high variance
(iv) Models which overfit are more likely to have low variance
Correct : B. (ii) and (iii)
16. Which of the following states the fundamental principle of least squares?
Correct : D. arithmetic mean should be minimized
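The least-squares fit referenced in Q5 and Q16 minimises the sum of squared errors; for simple linear regression this has a closed-form solution, sketched below with illustrative points:

```python
# Minimal closed-form least-squares fit for simple linear regression.
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Points lying exactly on y = 2x + 1 recover slope 2 and intercept 1.
print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))  # (2.0, 1.0)
```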
17. Suppose that we have N independent variables (X1, X2, … Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best-fit line using least square error on this data. You found that the correlation coefficient of one of its variables (say X1) with Y is 0.95.
Correct : B. relation between the x1 and y is strong
18. In terms of bias and variance, which of the following is true when you fit a degree-2 polynomial?
Correct : C. bias will be high, variance will be low
19. Which of the following statements are true for a design matrix X ∈ R^(n×d) with d > n? (The rows are n sample points and the columns represent d features.)
Correct : D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points
20. Point out the wrong statement.
Correct : C. least squares is not an estimation tool
21. Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?
Correct : A. you will add more features
22. If X and Y in a regression model are totally unrelated,
Correct : B. the coefficient of determination would be 0
23. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.)
(i) Models which overfit are more likely to have high bias
(ii) Models which overfit are more likely to have low bias
(iii) Models which overfit are more likely to have high variance
(iv) Models which overfit are more likely to have low variance
Correct : B. (ii) and (iii)
24. Which of the following statements are true for a design matrix X ∈ R^(n×d) with d > n? (The rows are n sample points and the columns represent d features.)
Correct : D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points
25. Which of the following is a problem in multiple regression?
Correct : C. both multicollinearity & overfitting
26. How can we best represent ‘support’ for the following association rule: “If X and Y, then Z”.
Correct : C. {z}/{x,y}
27. Choose the correct statement with respect to ‘confidence’ metric in association rules
Correct : A. it is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
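The support and confidence metrics from Q26 and Q27 can be sketched over a toy transaction list (the items and transactions below are illustrative):

```python
# Support = fraction of transactions containing an itemset; confidence of
# "if X and Y, then Z" = support({x, y, z}) / support({x, y}), i.e. the
# conditional probability of the consequent given the antecedent.
transactions = [
    {"x", "y", "z"},
    {"x", "y"},
    {"x", "z"},
    {"y", "z"},
]

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

antecedent, rule_items = {"x", "y"}, {"x", "y", "z"}
confidence = support(rule_items) / support(antecedent)
print(support(rule_items), confidence)  # 0.25 0.5
```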
28. What are tree based classifiers?
Correct : C. both options except none
29. What is gini index?
Correct : B. it is a measure of purity
30. Which of the following sentences are correct in reference to information gain?
a. It is biased towards single-valued attributes
b. It is biased towards multi-valued attributes
c. ID3 makes use of information gain
d. The approach used by ID3 is greedy
Correct : C. b, c and d
31. Multivariate split is where the partitioning of tuples is based on a combination of attributes rather than on a single attribute.
Correct : A. true
32. Gain ratio tends to prefer unbalanced splits in which one partition is much smaller than the other
Correct : A. true
33. The gini index is not biased towards multivalued attributes.
Correct : B. false
34. Gini index does not favour equal sized partitions.
Correct : B. false
35. When the number of classes is large Gini index is not a good choice.
Correct : A. true
36. Attribute selection measures are also known as splitting rules.
Correct : A. true
37. This clustering approach initially assumes that each data instance represents a single cluster.
Correct : C. agglomerative clustering
38. Which statement is true about the K-Means algorithm?
Correct : C. all attributes must be numeric
39. KDD represents extraction of
Correct : B. knowledge
40. The most general form of distance is
Correct : B. euclidean
41. Which of the following algorithms comes under classification?
Correct : D. k-nearest neighbor
42. Hierarchical agglomerative clustering is typically visualized as?
Correct : A. dendrogram
43. The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support
Correct : D. pruning
44. The distance between two points calculated using Pythagoras theorem is
Correct : B. euclidean distance
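The link between Pythagoras' theorem and Euclidean distance in Q44 can be sketched directly:

```python
import math

# Euclidean distance: Pythagoras' theorem generalised to any number of
# dimensions -- the square root of the sum of squared coordinate differences.
def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean((0, 0), (3, 4)))  # 5.0, the classic 3-4-5 right triangle
```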
45. Which one of these is not a tree based learner?
Correct : C. bayesian classifier
46. Which one of these is a tree based learner?
Correct : D. random forest
47. What is the approach of basic algorithm for decision tree induction?
Correct : A. greedy
48. Which of the following classifications would best suit the student performance classification systems?
Correct : A. if...then... analysis
49. Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow trees to grow to their maximum size) when building decision trees?
Correct : B. no
50. This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration
Correct : A. k-means clustering
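The termination rule in Q50 can be sketched with a tiny one-dimensional k-means (toy data and starting means are illustrative; the sketch assumes no cluster ever becomes empty):

```python
# Minimal 1-D k-means: assign each point to its nearest mean, recompute the
# means, and stop when the means are identical to the previous iteration's.
def kmeans_1d(points, means):
    while True:
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
            clusters[nearest].append(p)
        new_means = [sum(c) / len(c) for c in clusters]  # assumes non-empty clusters
        if new_means == means:  # means unchanged -> algorithm terminates
            return means
        means = new_means

print(kmeans_1d([1.0, 2.0, 10.0, 11.0], [0.0, 5.0]))  # [1.5, 10.5]
```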
51. The number of iterations in Apriori ___________
Correct : C. increases with the size of the maximum frequent set
52. Frequent item sets is
Correct : D. superset of both closed frequent item sets and maximal frequent item sets
53. A good clustering method will produce high quality clusters with
Correct : C. high intra class similarity
54. Which statement is true about neural network and linear regression models?
Correct : D. both models require input attributes to be numeric
55. Which Association Rule would you prefer
Correct : C. low support and high confidence
56. In a rule-based classifier, if there is a rule for each combination of attribute values, what do you call that rule set R?
Correct : A. exhaustive
57. The apriori property means
Correct : A. if a set cannot pass a test, its supersets will also fail the same test
58. If an item set ‘XYZ’ is a frequent item set, then all subsets of that frequent item set are
Correct : C. frequent
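The Apriori property from Q57 and Q58 can be illustrated by enumerating the subsets of a frequent itemset, each of which must itself be frequent:

```python
from itertools import combinations

# If 'XYZ' is frequent, every non-empty subset of it is frequent too; the
# contrapositive is Q57: if a set fails the support test, so do its supersets.
def nonempty_subsets(itemset):
    items = sorted(itemset)
    for r in range(1, len(items) + 1):
        yield from (frozenset(c) for c in combinations(items, r))

subsets = sorted("".join(sorted(s)) for s in nonempty_subsets(frozenset("XYZ")))
print(subsets)  # ['X', 'XY', 'XYZ', 'XZ', 'Y', 'YZ', 'Z']
```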
59. Clustering is ___________ and is an example of ____________ learning
Correct : D. descriptive and unsupervised
60. To determine association rules from frequent item sets
Correct : C. both minimum support and confidence are needed
61. If {A,B,C,D} is a frequent itemset, candidate rules which is not possible is
Correct : B. d → abcd
62. Which Association Rule would you prefer
Correct : B. low support and high confidence
63. This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration
Correct : B. k-means clustering
64. Classification rules are extracted from _____________
Correct : A. decision tree
65. What does K refer to in the K-Means algorithm, which is a non-hierarchical clustering approach?
Correct : D. number of clusters
66. How will you counter over-fitting in a decision tree?
Correct : A. by pruning the longer rules
67. What are two steps of tree pruning work?
Correct : B. postpruning and prepruning
68. Which of the following sentences are true?
Correct : D. all of the above
69. Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on optimising?
Correct : C. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output
70. Which of the following properties are characteristic of decision trees?
(a) High bias
(b) High variance
(c) Lack of smoothness of prediction surfaces
(d) Unbounded parameter set
Correct : C. b, c and d
71. To control the size of the tree, we need to control the number of regions. One approach to do this would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true?
(a) It would, in general, help restrict the size of the trees
(b) It has the potential to affect the performance of the resultant regression/classification model
(c) It is computationally infeasible
Correct : A. a and b
72. Which among the following statements best describes our approach to learning decision trees?
Correct : D. identify the model which gives performance close to the best greedy approximation performance (option (b)) with the smallest partition scheme
73. Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, on the left branch, there are 3 training data points with the following outputs: 5, 7, 9.6 and for the right branch, there are four training data points with the following outputs: 8.7, 9.8, 10.5, 11. What were the original responses for data points along the two branches (left & right respectively) and what is the new response after collapsing the node?
Correct : C. 7.2, 10, 8.8
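The arithmetic behind this answer: a regression-tree leaf predicts the mean of the training outputs that reach it, and collapsing the node pools both branches into one mean. A quick check:

```python
# Branch means before collapsing, and the pooled mean after (values from Q73).
left = [5, 7, 9.6]
right = [8.7, 9.8, 10.5, 11]

mean = lambda xs: sum(xs) / len(xs)
print(round(mean(left), 1), round(mean(right), 1), round(mean(left + right), 1))
# 7.2 10.0 8.8
```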
74. Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set.
Which among the following statements are possible in light of the performance improvement observed?
(a) The collapsed node helped overcome the effect of one or more noise affected data points in the training set
(b) The validation set had one or more noise affected data points in the region corresponding to the collapsed node
(c) The validation set did not have any data points along at least one of the collapsed branches
(d) The validation set did have data points adversely affected by the collapsed node
Correct : D. all of the above
75. The time complexity of k-means is given by
Correct : B. O(tkn)
76. In Apriori algorithm, if 1 item-sets are 100, then the number of candidate 2 item-sets are
Correct : C. 4950
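The count in Q76 is just the number of unordered pairs of the 100 frequent 1-itemsets, C(100, 2):

```python
import math

# Apriori joins every pair of frequent 1-itemsets into a candidate 2-itemset.
print(math.comb(100, 2))  # 4950 candidate pairs, i.e. 100 * 99 / 2
```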
77. Machine learning techniques differ from statistical techniques in that machine learning methods
Correct : A. are better able to deal with missing and noisy data
78. The probability that a person owns a sports car given that they subscribe to automotive magazine is 40%. We also know that 3% of the adult population subscribes to automotive magazine. The probability of a person owning a sports car given that they don’t subscribe to automotive magazine is 30%. Use this information to compute the probability that a person subscribes to automotive magazine given that they own a sports car.
Correct : B. 0.0396
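The answer to Q78 follows from Bayes' theorem with the total-probability rule in the denominator:

```python
# Q78: P(S|M) = 0.40, P(M) = 0.03, P(S|not M) = 0.30; find P(M|S),
# where S = owns a sports car, M = subscribes to the magazine.
p_s_given_m, p_m, p_s_given_not_m = 0.40, 0.03, 0.30

p_s = p_s_given_m * p_m + p_s_given_not_m * (1 - p_m)  # total probability
p_m_given_s = p_s_given_m * p_m / p_s                  # Bayes' theorem
print(round(p_m_given_s, 4))  # 0.0396
```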
79. What is the final resultant cluster size in Divisive algorithm, which is one of the hierarchical clustering approaches?
Correct : C. singleton
80. Given a frequent itemset L, if |L| = k, then there are
Correct : C. 2^k − 2 candidate association rules
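The 2^k − 2 count in Q80 comes from letting every non-empty proper subset of L be a rule antecedent, with the remaining items as the consequent:

```python
from itertools import combinations

# Enumerate candidate rules from a frequent itemset: each non-empty proper
# subset is an antecedent; the complement is the consequent.
def candidate_rules(itemset):
    items = sorted(itemset)
    for r in range(1, len(items)):  # excludes the empty and full subsets
        for ante in combinations(items, r):
            yield set(ante), set(items) - set(ante)

rules = list(candidate_rules({"A", "B", "C", "D"}))
print(len(rules))  # 14, i.e. 2**4 - 2
```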
81. Which statement is not a true statement?
Correct : B. k-means clustering aims to partition n observations into k clusters
82. In which of the following cases will K-Means clustering give poor results?
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
Correct : C. 2 and 4
83. What is a Decision Tree?
Correct : D. none of the above
84. What are two steps of tree pruning work?
Correct : B. postpruning and prepruning
85. A database has 5 transactions. Of these, 4 transactions include milk and bread. Further, of the given 4 transactions, 2 transactions include cheese. Find the support percentage for the following association rule “if milk and bread are purchased, then cheese is also purchased”.
Correct : D. 0.42
86. Which of the following option is true about k-NN algorithm?
Correct : C. it can be used in both classification and regression
87. How to select best hyperparameters in tree based models?
Correct : B. measure performance over validation data
88. What is true about K-Means Clustering?
1. K-means is extremely sensitive to cluster center initializations
2. Bad initialization can lead to Poor convergence speed
3. Bad initialization can lead to bad overall clustering
Correct : D. 1, 2 and 3
89. What are tree based classifiers?
Correct : C. both options except none
90. What is gini index?
Correct : D. all (1,2 and 3)
91. Tree/Rule based classification algorithms generate ... rule to perform the classification.
Correct : A. if-then.
92. Decision Tree is
Correct : C. both a & b
93. Which of the following is true about Manhattan distance?
Correct : A. it can be used for continuous variables
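Manhattan distance (Q93) is the sum of absolute coordinate differences, and works on continuous variables just like Euclidean distance:

```python
# Manhattan (city-block) distance between two points.
def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

print(manhattan((1.0, 2.0), (4.0, 6.0)))  # 7.0, i.e. |1-4| + |2-6|
```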
94. A company has built a kNN classifier that gets 100% accuracy on training data. When they deployed this model on the client side, it was found that the model is not at all accurate. Which of the following things might have gone wrong?
Note: The model was successfully deployed and no technical issues were found on the client side except the model performance.
Correct : A. it is probably an overfitted model
95. Which of the following classifications would best suit the student performance classification systems?
Correct : A. if...then... analysis
96. Which statement is true about the K-Means algorithm?
Correct : C. all attributes must be numeric
97. Which of the following can act as possible termination conditions in K-Means?
1. For a fixed number of iterations.
2. Assignment of observations to clusters does not change between iterations. Except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. Terminate when RSS falls below a threshold.
Correct : D. 1,2,3,4
98. Which of the following statements are true about the k-NN algorithm?
1) k-NN performs much better if all of the data have the same scale
2) k-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large
3) k-NN makes no assumptions about the functional form of the problem being solved
Correct : D. 1,2 and 3
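The classification-and-regression claim in Q86 and the statements in Q98 can be illustrated with a minimal k-NN majority-vote sketch (toy data; in practice all features should share a scale, per statement 1):

```python
from collections import Counter
import math

# k-NN classification: take the k training points nearest the query by
# Euclidean distance and return the majority class label among them.
def knn_predict(train, query, k=3):
    neighbours = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (1, 1)))  # a
print(knn_predict(train, (5, 5)))  # b
```

For regression, the same neighbour search would return the mean of the neighbours' numeric outputs instead of a majority vote.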
99. In which of the following cases will K-means clustering fail to give good results?
1) Data points with outliers
2) Data points with different densities
3) Data points with nonconvex shapes
Correct : C. 1, 2, and 3
100. How will you counter over-fitting in a decision tree?