1. The average squared difference between classifier predicted output and actual output.
Correct : A. mean squared error
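As a quick illustration of question 1, a minimal NumPy sketch (hypothetical values, not from the source) of the mean squared error computation:

    import numpy as np

    y_true = np.array([3.0, 2.5, 4.0])      # actual outputs (hypothetical)
    y_pred = np.array([2.8, 2.9, 3.6])      # predicted outputs (hypothetical)
    mse = np.mean((y_pred - y_true) ** 2)   # average squared difference
    print(mse)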
2. Simple regression assumes a __________ relationship between the input attribute and output
attribute.
Correct : A. linear
3. Regression trees are often used to model _______ data.
Correct : B. nonlinear
4. The leaf nodes of a model tree are
Correct : C. linear regression equations.
5. Logistic regression is a ________ regression technique that is used to model data having a _____ outcome.
Correct : D. nonlinear, binary
6. This technique associates a conditional probability value with each data instance.
Correct : B. logistic regression
7. This supervised learning technique can process both numeric and categorical input attributes.
Correct : A. linear regression
8. With Bayes classifier, missing data items are
Correct : B. treated as unequal compares.
9. This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
Correct : D. k-means clustering
10. This clustering algorithm initially assumes that each data instance represents a single cluster.
Correct : C. k-means clustering
11. This unsupervised clustering algorithm terminates when mean values computed for the current
iteration of the algorithm are identical to the computed mean values for the previous iteration.
Correct : C. k-means clustering
12. Machine learning techniques differ from statistical techniques in that machine learning methods
Correct : B. are better able to deal with missing and noisy data.
13. In reinforcement learning, if the feedback is negative one (-1), it is defined as ____.
Correct : A. Penalty
14. According to ____, it's a key success factor for the survival and evolution of all species.
Correct : C. Darwin’s theory
15. What is ‘Training set’?
Correct : B. A set of data is used to discover the potentially predictive relationship.
16. Common deep learning applications include____
Correct : D. All above
17. Reinforcement learning is particularly efficient when______________.
Correct : D. All above
18. If there is only a discrete number of possible outcomes (called categories), the process becomes a ______.
Correct : B. Classification.
19. Which of the following are supervised learning applications
Correct : A. Spam detection, pattern detection, Natural Language Processing
20. During the last few years, many ______ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state.
Correct : D. None of above
21. What is ‘Overfitting’ in Machine learning?
Correct : A. 'Overfitting' occurs when a statistical model describes random error or noise instead of the underlying relationship.
22. What is ‘Test set’?
Correct : A. Test set is used to test the accuracy of the hypotheses generated by the learner.
23. ________ is much more difficult because it's necessary to determine a supervised strategy to train a model for each feature and, finally, to predict their value.
Correct : B. Creating sub-model to predict those features
24. It is possible to use a different placeholder through the parameter _______.
Correct : D. missing_values
25. If you need a more powerful scaling feature, with a superior control on outliers and the possibility to select a quantile range, there's also the class ________.
Correct : A. RobustScaler
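A minimal sketch of question 25's answer, assuming the class is scikit-learn's RobustScaler (hypothetical data):

    import numpy as np
    from sklearn.preprocessing import RobustScaler

    X = np.array([[1.0], [2.0], [3.0], [100.0]])        # hypothetical data with one outlier
    scaler = RobustScaler(quantile_range=(25.0, 75.0))  # scale by the interquartile range
    X_scaled = scaler.fit_transform(X)                  # the outlier barely affects the scaling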
26. scikit-learn also provides a class for per-sample normalization, Normalizer. It can apply ________ to each element of a dataset.
Correct : B. max, l1 and l2 norms
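A short sketch of question 26, assuming scikit-learn's Normalizer (hypothetical samples); each row is rescaled independently with the chosen norm:

    import numpy as np
    from sklearn.preprocessing import Normalizer

    X = np.array([[1.0, 2.0, 2.0], [4.0, 0.0, 3.0]])    # hypothetical per-sample data
    for norm in ('max', 'l1', 'l2'):                    # the three supported norms
        print(norm, Normalizer(norm=norm).fit_transform(X))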
27. There are also many univariate methods that can be used in order to select the best features according to specific criteria based on ________.
Correct : A. F-tests and p-values
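For question 27, a minimal sketch of univariate selection using scikit-learn's SelectKBest with an ANOVA F-test (which exposes F-scores and p-values):

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectKBest, f_classif

    X, y = load_iris(return_X_y=True)
    selector = SelectKBest(score_func=f_classif, k=2)   # keep the 2 best features by F-test
    X_new = selector.fit_transform(X, y)
    print(selector.scores_, selector.pvalues_)          # F-scores and p-values per feature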
28. ________ performs a PCA with non-linearly separable data sets.
Correct : B. KernelPCA
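A sketch of question 28's KernelPCA on a non-linearly separable toy dataset (hypothetical parameters):

    from sklearn.datasets import make_circles
    from sklearn.decomposition import KernelPCA

    X, _ = make_circles(n_samples=200, factor=0.3, noise=0.05)   # two concentric rings
    kpca = KernelPCA(n_components=2, kernel='rbf', gamma=10.0)   # RBF kernel handles the nonlinearity
    X_kpca = kpca.fit_transform(X)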
29. A feature F1 can take the values A, B, C, D, E, and F, and represents the grade of students from a college.
Which of the following statements is true in this case?
Correct : B. Feature F1 is an example of ordinal variable.
30. The parameter ______ allows specifying the percentage of elements to put into the test/training set.
Correct : C. All above
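The options for question 30 are not listed here, but the description matches scikit-learn's train_test_split, whose test_size and train_size parameters both set split percentages; a minimal sketch:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    # test_size / train_size set the fraction of elements in each split
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, train_size=0.75, random_state=0)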
31. In many classification problems, the target ______ is made up of categorical labels which cannot immediately be processed by any algorithm.
Correct : B. dataset
32. _______adopts a dictionary-oriented approach, associating to each category label a progressive integer number.
Correct : A. LabelEncoder class
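A minimal sketch of question 32's LabelEncoder, which maps each category label to a progressive integer (hypothetical labels):

    from sklearn.preprocessing import LabelEncoder

    labels = ['bird', 'cat', 'dog', 'dog']   # hypothetical categorical labels
    encoder = LabelEncoder()
    encoded = encoder.fit_transform(labels)  # e.g. array([0, 1, 2, 2])
    print(encoder.classes_, encoded)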
33. The function used for linear regression in R is __________.
Correct : A. lm(formula, data)
34. In the syntax of the linear model lm(formula, data, ...), data refers to ______.
Correct : B. Vector
35. Which of the following methods do we use to find the best fit line for data in Linear Regression?
Correct : A. Least Square Error
36. Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?
Correct : D. Mean-Squared-Error
37. Which of the following is true about residuals?
Correct : A. Lower is better
38. Naive Bayes classifiers are a collection of ------------------ algorithms.
Correct : A. Classification
39. The Naive Bayes classifier is a _______________ learning method.
Correct : A. Supervised
40. The features being classified are independent of each other in the Naïve Bayes classifier.
Correct : B. true
41. The features being classified are __________ of each other in the Naïve Bayes classifier.
Correct : A. Independent
42. Conditional probability is a measure of the probability of an event given that another event has already occurred.
Correct : A. True
43. Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.
Correct : A. True
44. The Bernoulli Naïve Bayes classifier assumes a ___________ distribution.
Correct : C. Binary
45. The Multinomial Naïve Bayes classifier assumes a ___________ distribution.
Correct : B. Discrete
46. The Gaussian Naïve Bayes classifier assumes a ___________ distribution.
Correct : A. Continuous
47. The binarize parameter in scikit-learn's BernoulliNB sets the threshold for binarizing the sample features.
Correct : A. True
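For question 47, a sketch showing the binarize threshold of scikit-learn's BernoulliNB (hypothetical data):

    import numpy as np
    from sklearn.naive_bayes import BernoulliNB

    X = np.array([[0.2, 0.9], [0.8, 0.1], [0.7, 0.6], [0.1, 0.3]])  # continuous features (hypothetical)
    y = np.array([0, 1, 1, 0])
    clf = BernoulliNB(binarize=0.5)     # values above 0.5 are treated as 1, the rest as 0
    clf.fit(X, y)
    print(clf.predict([[0.9, 0.2]]))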
48. Gaussian distribution when plotted, gives a bell shaped curve which is symmetric about the _______ of the feature values.
Correct : A. Mean
49. SVMs directly give us the posterior probabilities P(y = 1 | x) and P(y = −1 | x).
Correct : B. false
50. Any linear combination of the components of a multivariate Gaussian is a univariate Gaussian.
Correct : A. True
51. Solving a nonlinear separation problem with a hard-margin kernelized SVM (Gaussian RBF kernel) might lead to overfitting.
Correct : A. True
52. SVM is a ------------------ algorithm
Correct : A. Classification
53. SVM is a ------------------ learning technique.
Correct : A. Supervised
54. The linear SVM classifier works by drawing a straight line between two classes
Correct : A. True
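For questions 52-54, a minimal sketch of a linear SVM classifier in scikit-learn (hypothetical toy data); the fitted coefficients define the separating straight line:

    from sklearn.datasets import make_blobs
    from sklearn.svm import SVC

    X, y = make_blobs(n_samples=100, centers=2, random_state=0)  # two separable classes
    clf = SVC(kernel='linear', C=1.0)    # C trades off misclassification against model simplicity
    clf.fit(X, y)
    print(clf.coef_, clf.intercept_)     # parameters of the separating line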
55. What is Model Selection in Machine Learning?
Correct : A. The process of selecting models among different mathematical models, which are used to describe the same data set
56. Which are two techniques of Machine Learning?
Correct : A. Genetic Programming and Inductive Learning
57. Even if there are no actual supervisors, ________ learning is also based on feedback provided by the environment.
Correct : B. Reinforcement
58. It is necessary to allow the model to develop a generalization ability and avoid a common problem called ______.
Correct : A. Overfitting
59. Techniques that involve the usage of both labeled and unlabeled data are called ___.
Correct : B. Semi-supervised
60. A supervised scenario is characterized by the concept of a _____.
Correct : B. Teacher
61. Overlearning occurs due to an excessive ______.
Correct : A. Capacity
62. Which of the following are models for feature extraction?
Correct : C. None of the above
63. _____ provides some built-in datasets that can be used for testing purposes.
Correct : A. scikit-learn
64. While using _____, all labels are turned into sequential numbers.
Correct : A. LabelEncoder class
65. _______ produce sparse matrices of real numbers that can be fed into any machine learning model.
Correct : C. Both A & B
66. scikit-learn offers the class ______, which is responsible for filling the holes using a strategy based on the mean, median, or frequency.
Correct : D. Imputer
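For questions 24 and 66, a sketch of hole filling; the Imputer class belongs to older scikit-learn releases, and newer versions expose the same idea as SimpleImputer, with a missing_values placeholder and a strategy:

    import numpy as np
    from sklearn.impute import SimpleImputer   # successor of the older Imputer class

    X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan]])      # hypothetical data with holes
    imp = SimpleImputer(missing_values=np.nan, strategy='mean')   # also 'median' or 'most_frequent'
    X_filled = imp.fit_transform(X)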
67. Which of the following scales data by removing elements that don't belong to a given range or by considering a maximum absolute value?
Correct : C. Both A & B
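The answer options for question 67 are not shown here, but the description matches scikit-learn's MinMaxScaler (range-based) and MaxAbsScaler (maximum absolute value); a hedged sketch:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, MaxAbsScaler

    X = np.array([[-2.0, 4.0], [1.0, -8.0], [3.0, 2.0]])            # hypothetical feature matrix
    X_range = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)   # rescale into a given range
    X_abs = MaxAbsScaler().fit_transform(X)                         # divide by the max absolute value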
68. scikit-learn also provides a class for per-sample normalization, _____.
Correct : A. Normalizer
69. An ______ dataset with many features contains information proportional to the independence of all features and their variance.
Correct : B. unnormalized
70. In order to assess how much information is brought by each component, and the correlation among them, a useful tool is the_____.
Correct : D. Covariance matrix
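For question 70, a minimal NumPy sketch of the covariance matrix of a dataset (hypothetical data):

    import numpy as np

    X = np.random.RandomState(0).normal(size=(100, 3))   # 100 samples, 3 features (hypothetical)
    cov = np.cov(X, rowvar=False)   # 3x3 matrix: variances on the diagonal, covariances elsewhere
    print(cov)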
71. The _____ parameter can assume different values which determine how the data matrix is initially processed.
Correct : C. init
72. ______ allows exploiting the natural sparsity of data while extracting principal components.
Correct : A. SparsePCA
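A sketch of question 72's SparsePCA (hypothetical parameters); the alpha penalty enforces sparse components:

    from sklearn.datasets import load_digits
    from sklearn.decomposition import SparsePCA

    X, _ = load_digits(return_X_y=True)
    spca = SparsePCA(n_components=10, alpha=1.0, random_state=0)  # alpha controls sparsity
    X_spca = spca.fit_transform(X[:200])    # a subset keeps the example fast
    print(spca.components_.shape)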
73. Which of the following statement is true about outliers in Linear regression?
Correct : A. Linear regression is sensitive to outliers
74. Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and found that there is a relationship between them. Which of the following conclusions do you draw about this situation?
Correct : A. Since there is a relationship, it means our model is not good
75. Let’s say a “Linear regression” model perfectly fits the training data (training error is zero). Now, which of the following statements is true?
Correct : C. None of the above
76. In a linear regression problem, we are using “R-squared” to measure goodness-of-fit. We add a feature to the linear regression model and retrain the same model. Which of the following options is true?
Correct : C. Individually R squared cannot tell about variable importance. We can’t say anything about it right now.
77. To test a linear relationship between y (dependent) and x (independent) continuous variables, which of the following plots is best suited?
Correct : A. Scatter plot
78. Which of the following steps/assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most?
Correct : A. The polynomial degree
79. Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?
Correct : B. Lasso regression uses subset selection of features
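For questions 79 and 94, a sketch contrasting Lasso and Ridge on synthetic data; the L1 penalty drives some coefficients exactly to zero, which is why Lasso performs a form of feature subset selection:

    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    rng = np.random.RandomState(0)
    X = rng.normal(size=(100, 10))                     # only the first 2 features are informative
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
    lasso = Lasso(alpha=0.1).fit(X, y)
    ridge = Ridge(alpha=0.1).fit(X, y)
    print(np.sum(lasso.coef_ == 0), np.sum(ridge.coef_ == 0))   # Lasso zeroes coefficients, Ridge does not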
80. Which of the following statement(s) can be true post adding a variable in a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
Correct : A. 1 and 2
81. What is/are true about kernels in SVM?
1. Kernel functions map low-dimensional data to a high-dimensional space
2. A kernel is a similarity function
Correct : C. 1 and 2
82. Suppose you are building an SVM model on data X. The data X can be error prone, which means that you should not trust any specific data point too much. Now suppose you want to build an SVM model that has a quadratic kernel function (polynomial of degree 2) and uses the slack variable C as one of its hyperparameters. What would happen when you use a very small C (C~0)?
Correct : A. Misclassification would happen
83. The cost parameter in the SVM means:
Correct : C. The tradeoff between misclassification and simplicity of the model
84. How do you handle missing or corrupted data in a dataset?
Correct : D. All of the above
85. Which of the following statements about Naive Bayes is incorrect?
Correct : B. Attributes are statistically dependent on one another given the class value.
86. SVMs are less effective when:
Correct : C. The data is noisy and contains overlapping points
87. If there is only a discrete number of possible outcomes, these are called _____.
Correct : B. Categories
88. Some people are using the term ___ instead of prediction only to avoid the weird idea that machine learning is a sort of modern magic.
Correct : A. Inference
89. The term _____ can be freely used, but with the same meaning adopted in physics or system theory.
Correct : D. Prediction
90. Common deep learning applications / problems can also be solved using ____.
Correct : B. Classic approaches
91. What is the function of ‘Unsupervised Learning’?
Correct : D. All
92. What are the two methods used for the calibration in Supervised Learning?
Correct : A. Platt Calibration and Isotonic Regression
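For question 92, a sketch of both calibration methods via scikit-learn's CalibratedClassifierCV (method='sigmoid' is Platt scaling, method='isotonic' is isotonic regression):

    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import load_breast_cancer
    from sklearn.svm import LinearSVC

    X, y = load_breast_cancer(return_X_y=True)
    platt = CalibratedClassifierCV(LinearSVC(), method='sigmoid', cv=3)      # Platt calibration
    isotonic = CalibratedClassifierCV(LinearSVC(), method='isotonic', cv=3)  # isotonic regression
    platt.fit(X, y)
    print(platt.predict_proba(X[:2]))   # calibrated class probabilities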
93. Suppose we fit “Lasso Regression” to a data set which has 100 features (X1, X2, ..., X100). Now we rescale one of these features by multiplying it by 10 (say that feature is X1), and then refit Lasso regression with the same regularization parameter. Now, which of the following options will be correct?
Correct : B. It is more likely for X1 to be included in the model
94. Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?
Correct : B. Lasso regression uses subset selection of features
95. Which of the following statement(s) can be true post adding a variable in a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases
Correct : A. 1 and 2
96. We can also compute the coefficients of linear regression with the help of an analytical method called the “Normal Equation”. Which of the following is/are true about the “Normal Equation”?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. No need to iterate
Correct : D. 1,2 and 3.
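For question 96, a minimal NumPy sketch of the normal equation on synthetic data; note there is no learning rate and no iteration, but it requires inverting a matrix that grows with the number of features:

    import numpy as np

    rng = np.random.RandomState(0)
    X = rng.normal(size=(50, 3))                              # hypothetical design matrix
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
    Xb = np.hstack([np.ones((50, 1)), X])                     # prepend the intercept column
    theta = np.linalg.inv(Xb.T @ Xb) @ (Xb.T @ y)             # closed-form least-squares solution
    print(theta)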
97. If two variables are correlated, is it necessary that they have a linear relationship?
Correct : B. No
98. When the C parameter is set to infinity, which of the following holds true?
Correct : A. The optimal hyperplane, if it exists, will be the one that completely separates the data
99. Suppose you are building an SVM model on data X. The data X can be error prone, which means that you should not trust any specific data point too much. Now suppose you want to build an SVM model that has a quadratic kernel function (polynomial of degree 2) and uses the slack variable C as one of its hyperparameters. What would happen when you use a very large value of C (C -> infinity)?
Correct : A. We can still classify data correctly for the given setting of the hyperparameter C