Quiznetik

Bigdata | Set 1

1. Data whose size is in ___________ bytes is called Big Data.

Correct : C. Peta

2. How many V's of Big Data are there?

Correct : D. 5

3. Transaction data of a bank is?

Correct : A. structured data

4. In how many forms can Big Data be found?

Correct : B. 3

5. Which of the following are Benefits of Big Data Processing?

Correct : D. All of the above

6. Which of the following are incorrect Big Data Technologies?

Correct : D. Apache Pytarch

7. What percentage of the world’s total data has been created within just the past two years?

Correct : C. 90%

8. Apache Kafka is an open-source platform that was created by?

Correct : A. LinkedIn

9. What was Hadoop named after?

Correct : C. The toy elephant of Cutting’s son

10. What are the main components of Big Data?

Correct : D. All of the above

11. All of the following accurately describe Hadoop, EXCEPT ____________

Correct : B. Real-time

12. __________ has the world’s largest Hadoop cluster.

Correct : C. Facebook

13. Facebook Tackles Big Data With _______ based on Hadoop.

Correct : A. Project Prism

14. ___________ is a general-purpose computing model and runtime system for distributed data analytics.

Correct : A. MapReduce

15. The examination of large amounts of data to see what patterns or other useful information can be found is known as

Correct : C. Big data analytics

16. Big data analysis does the following except?

Correct : D. Analyzes data

17. What makes Big Data analysis difficult to optimize?

Correct : B. Both data and cost effective ways to mine data to make business sense out of it

18. The new source of big data that will trigger a Big Data revolution in the years to come is?

Correct : C. Transactional data and sensor data

19. The unit of data that flows through a Flume agent is

Correct : D. Event

20. All of the following are steps followed to deploy a Big Data solution, except

Correct : B. Data dissemination

21. Who popularized the term "big data"?

Correct : B. John Mashey

22. Numbers, text, image, audio, and video data are ____

Correct : D. Variety

23. Real time data is ______.

Correct : C. unique

24. ______ is the term used to describe data that is high volume, high velocity, and/or high variety.

Correct : B. Big Data

25. According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?

Correct : A. Big data management and data mining

26. Point out the wrong statement.

Correct : C. The programming model, MapReduce, used by Hadoop is difficult to write and test

27. __________ can best be described as a programming model used to develop Hadoop-based applications that can process massive amounts of data.

Correct : A. MapReduce

28. __________ has the world’s largest Hadoop cluster.

Correct : C. Facebook

29. Facebook Tackles Big Data With _______ based on Hadoop.

Correct : A. ‘Project Prism’

30. Data science is the process of handling a diverse set of data through?

Correct : D. All of the above

31. The modern conception of data science as an independent discipline is sometimes attributed to?

Correct : A. William S.

32. Which of the following languages is used in data science?

Correct : C. R

33. Which of the following is false?

Correct : B. Raw data should be processed only one time.

34. What is the work of a Data Architect?

Correct : C. build data solutions that are optimized for performance and design applications

35. Which of the following are correct skills for a Data Scientist?

Correct : D. All of the above

36. Which of the following are correct components of data science?

Correct : D. All of the above

37. Which of the following is not a part of data science process?

Correct : C. Communication Building

38. Which of the following are the Data Sources in data science?

Correct : C. Both A and B

39. Which of the following is not an application of data science?

Correct : D. Privacy Checker

40. Point out the correct statement.

Correct : A. Raw data is original source of data

41. Which of the following is one of the key data science skills?

Correct : D. All of the above

42. Which of the following is a key characteristic of a hacker?

Correct : B. Willing to find answers on their own

43. Raw data should be processed only one time.

Correct : B. False

44. Which of the following is the common goal of statistical modelling?

Correct : A. Inference

45. Causal analysis is commonly applied to census data.

Correct : B. False

46. Which of the following models is usually the gold standard for data analysis?

Correct : C. Causal

47. Which of the following is a revision control system?

Correct : A. Git

48. Which of the following steps is performed by a data scientist after acquiring the data?

Correct : A. Data Cleaning

49. Which of the following focuses on the discovery of (previously) unknown properties in the data?

Correct : A. Data mining

50. Which of the following can be used to create sub-samples using a maximum dissimilarity approach?

Correct : B. maxDissim

51. Which of the following can be used to impute data sets based only on information in the training set?

Correct : B. preProcess

52. Which of the following models includes a backwards elimination feature selection routine?

Correct : B. MARS

53. Which of the following is a categorical outcome?

Correct : C. Accuracy

54. Which of the following functions provides unsupervised prediction?

Correct : D. None of the Mentioned

55. What is true about Machine Learning?

Correct : D. All of the above

56. ML is a field of AI consisting of learning algorithms that?

Correct : D. All of the above

57. p → 0q is not a?

Correct : B. horn clause

58. The robot arm action _______ specifies placing block A on block B.

Correct : A. STACK(A,B)

59. A __________ begins by hypothesizing a sentence (the symbol S) and successively predicting lower level constituents until individual preterminal symbols are written.

Correct : C. top-down parser

60. A model of language consists of categories, which do not include ________.

Correct : B. structural units.

61. Different learning methods do not include?

Correct : A. Introduction

62. Training the model with all the data in one single batch is known as?

Correct : C. Both A and B

63. Which of the following are ML methods?

Correct : A. based on human supervision

64. In model-based learning methods, an iterative process takes place on the ML models that are built based on various model parameters, called?

Correct : C. hyperparameters

65. Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?

Correct : D. Random Forest
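
To make the bagging idea in question 65 concrete, here is a minimal sketch of bootstrap aggregating in plain Python. It is not a Random Forest: a real Random Forest bags decision trees and also samples features at random. The names `fit_stump` and `bagged_predictor`, the one-dimensional threshold "stump", and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Fit a 1-D threshold classifier (hypothetical weak learner).

    Assumes class 1 tends to have larger x values than class 0.
    The threshold is the midpoint between the two class means.
    """
    zeros = [x for x, label in sample if label == 0]
    ones = [x for x, label in sample if label == 1]
    if not zeros or not ones:
        # Degenerate bootstrap sample containing one class: always predict it.
        only = 0 if zeros else 1
        return lambda x: only
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2.0
    return lambda x: 1 if x > threshold else 0


def bagged_predictor(data, n_models=25, seed=0):
    """Train n_models stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) points with replacement.
        bootstrap = [rng.choice(data) for _ in data]
        models.append(fit_stump(bootstrap))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy, made-up data: (feature, label) pairs.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
        (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
predict = bagged_predictor(data)
print(predict(2.0), predict(8.0))
```

Each model sees a slightly different resampled dataset, and the vote averages out their individual errors, which is the variance-reduction effect bagging relies on.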

66. To find the minimum or the maximum of a function, we set the gradient to zero because:

Correct : A. The value of the gradient at extrema of a function is always zero
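
As a small illustration of question 66, the sketch below runs gradient descent on a made-up function, f(x) = (x - 3)^2 + 1, and checks that the gradient vanishes at the minimum it finds. The function, step size, and helper names are assumptions chosen for the example; the "gradient is zero" property applies to differentiable functions at interior extrema.

```python
def f(x):
    # Hypothetical example function with a single minimum at x = 3.
    return (x - 3.0) ** 2 + 1.0


def grad_f(x):
    # Analytic derivative of f.
    return 2.0 * (x - 3.0)


def gradient_descent(start, lr=0.1, steps=200):
    """Repeatedly step against the gradient; the updates shrink as the gradient approaches zero."""
    x = start
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


x_min = gradient_descent(start=-5.0)
print(x_min, grad_f(x_min))
```

The iteration stops moving exactly where `grad_f` is zero, which is why setting the gradient to zero and solving is the standard way to locate candidate extrema.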

67. Which of the following is a disadvantage of decision trees?

Correct : C. Decision trees are prone to overfitting

68. How do you handle missing or corrupted data in a dataset?

Correct : D. All of the above

69. When performing regression or classification, which of the following is the correct way to preprocess the data?

Correct : A. Normalize the data -> PCA -> training
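
The first step in the answer to question 69, normalizing before PCA, can be sketched as a z-score standardization. PCA follows directions of largest variance, so a feature measured in large units would otherwise dominate. The function name `standardize_columns` and the sample rows are assumptions for illustration; the PCA and training steps are not shown.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Rescale every column to mean 0 and (population) standard deviation 1."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        [(value - mu) / sigma if sigma > 0 else 0.0
         for value, (mu, sigma) in zip(row, stats)]
        for row in rows
    ]


# Made-up data: the second feature is on a much larger scale than the first.
rows = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0], [4.0, 6000.0]]
scaled = standardize_columns(rows)
for row in scaled:
    print(row)
```

After this step both columns contribute on an equal footing, so the principal components reflect structure in the data and not the choice of units.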

70. Which of the following statements about regularization is not correct?

Correct : D. None of the above

71. Which of the following techniques cannot be used for normalization in text mining?

Correct : C. Stop Word Removal

72. In which of the following cases will K-means clustering fail to give good results? 1) Data points with outliers 2) Data points with different densities 3) Data points with nonconvex shapes

Correct : D. All of the above

73. Which of the following is a reasonable way to select the number of principal components "k"?

Correct : A. Choose k to be the smallest value so that at least 99% of the variance is retained.
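
The rule in question 73 can be sketched as follows, assuming the per-component variances (the eigenvalues of the covariance matrix) have already been computed. The helper name `smallest_k` and the example values are made up for illustration.

```python
def smallest_k(eigenvalues, retain=0.99):
    """Return the smallest k whose top-k components retain at least `retain` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if not ordered or total <= 0:
        raise ValueError("need at least one positive eigenvalue")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= retain:
            return k
    return len(ordered)


# Hypothetical variances per principal component (they sum to 10.0).
example = [6.0, 2.5, 1.0, 0.45, 0.04, 0.01]
print(smallest_k(example))        # 4: the first four retain 99.5% of the variance
print(smallest_k(example, 0.90))  # 3: the first three retain 95%
```

Lowering `retain` trades reconstruction fidelity for fewer dimensions; 99% is the threshold named in the answer, not a universal constant.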

74. What is a sentence parser typically used for?

Correct : B. It is used to parse sentences to derive their most likely syntax tree structures.

75. Data Analysis is a process of?

Correct : D. All of the above

76. Which of the following is not a major data analysis approach?

Correct : B. Predictive Intelligence

77. How many main statistical methodologies are used in data analysis?

Correct : A. 2

78. In descriptive statistics, data from the entire population or a sample is summarized with?

Correct : C. numerical descriptors

79. Data analysis was defined by which statistician?

Correct : D. John Tukey

80. Which of the following is true about hypothesis testing?

Correct : A. answering yes/no questions about the data

81. The goal of business intelligence is to allow easy interpretation of large volumes of data to identify new opportunities.

Correct : A. TRUE

82. The branch of statistics which deals with development of particular statistical methods is classified as

Correct : D. applied statistics

83. Which of the following is true about regression analysis?

Correct : C. modeling relationships within the data

84. Text Analytics is also referred to as Text Mining.

Correct : A. TRUE

85. What is true about Data Visualization?

Correct : D. All of the above

86. Data can be visualized using?

Correct : D. All of the above

87. Data visualization is also an element of the broader _____________.

Correct : B. data presentation architecture

88. Which method shows hierarchical data in a nested format?

Correct : A. Treemaps

89. Which function is used for inference on one proportion using the normal approximation?

Correct : D. prop.test()

90. Which is used to find the factor congruence coefficients?

Correct : C. factor.congruence

91. Which of the following is a tool for checking normality?

Correct : A. qqline()

92. Which of the following is false?

Correct : C. Data visualization decreases insights and leads to slower decisions

93. Common use cases for data visualization include?

Correct : D. All of the above

94. Which of the following plots are often used for checking randomness in time series?

Correct : C. Autocorrelation
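
An autocorrelation plot, as in question 94, shows the sample autocorrelation at each lag; for a random series the values at non-zero lags stay close to zero. The sketch below computes that quantity with the usual sample estimator. The function name and the alternating test series are assumptions for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mu = sum(series) / n
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant")
    num = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return num / denom


# A deliberately non-random series: it alternates sign at every step.
alternating = [1.0, -1.0] * 10
print(autocorrelation(alternating, 1))  # strongly negative
print(autocorrelation(alternating, 2))  # strongly positive
```

Large values at non-zero lags, as with this alternating series, indicate structure that a random series would not show.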

95. To find the minimum or the maximum of a function, we set the gradient to zero because:

Correct : A. The value of the gradient at extrema of a function is always zero

96. Which of the following techniques cannot be used for normalization in text mining?

Correct : C. Stop Word Removal

97. In which of the following cases will K-means clustering fail to give good results? 1) Data points with outliers 2) Data points with different densities 3) Data points with nonconvex shapes

Correct : D. All of the above

98. Which of the following is a reasonable way to select the number of principal components "k"?

Correct : A. Choose k to be the smallest value so that at least 99% of the variance is retained.

99. Which of the following is false?

Correct : B. Raw data should be processed only one time.

100. According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?

Correct : A. Big data management and data mining