Big Data / Introduction to Big Data MCQ Set 2: Sample Test, Sample Questions

Question:
 Data can be visualized using?

1.graphs

2.charts

3.maps

4.All of the above


Question:
 How do you handle missing or corrupted data in a dataset?

1.Drop missing rows or columns

2.Replace missing values with mean/median/mode

3.Assign a unique category to missing values

4.All of the above
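
As an illustration of the three strategies listed in the options, here is a minimal Python sketch. The records, field names, and the "Unknown" label are hypothetical and chosen only for the example; `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 25, "city": "Pune"},
    {"age": None, "city": "Delhi"},
    {"age": 35, "city": None},
]

# Strategy 1: drop every row that has any missing value.
dropped = [r for r in rows if all(v is not None for v in r.values())]

# Strategy 2: replace missing numeric values with the mean of the observed ones.
age_mean = mean(r["age"] for r in rows if r["age"] is not None)
imputed = [
    {**r, "age": r["age"] if r["age"] is not None else age_mean}
    for r in rows
]

# Strategy 3: give missing categorical values their own category.
labelled = [
    {**r, "city": r["city"] if r["city"] is not None else "Unknown"}
    for r in rows
]
```

Which strategy is appropriate depends on how much data is missing and why; the sketch only shows the mechanics.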


Question:
In descriptive statistics, data from the entire population or a sample is summarized with?

1.integer descriptors

2.floating descriptors

3.numerical descriptors

4.decimal descriptors


Question:
In model-based learning methods, an iterative process takes place on the ML models that are built based on various model parameters, called?

1.mini-batches

2.optimized parameters

3.hyperparameters

4.superparameters


Question:
The branch of statistics which deals with the development of particular statistical methods is classified as

1.industry statistics

2.economic statistics

3.applied statistics

4.mathematical statistics


Question:
 What is true about Data Visualization?

1.Data Visualization is used to communicate information clearly and efficiently to users by the usage of information graphics such as tables and charts.

2.Data Visualization helps users in analyzing a large amount of data in a simpler way.

3.Data Visualization makes complex data more accessible, understandable, and usable.

4.All of the above


Question:
Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?

1.Decision Tree

2.Regression

3.Classification

4.Random Forest


Question:
According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?

1.Big data management and data mining

2.Data warehousing and business intelligence

3.Management of Hadoop clusters

4.Collecting and storing unstructured data


Question:
Common use cases for data visualization include?

1.Politics

2.Sales and marketing

3.Healthcare

4.All of the above


Question:
Data Analysis is a process of?

1.inspecting data

2.cleaning data

3.transforming data

4.All of the above


Question:
Data analysis was defined by which statistician?

1.William S.

2.Hans Peter Luhn

3.Gregory Piatetsky-Shapiro

4.John Tukey


Question:
Data visualization is also an element of the broader _____________.

1.deliver presentation architecture

2.data presentation architecture

3.dataset presentation architecture

4.data process architecture


Question:
Files containing R scripts end with the extension _______.

1.R

2.S

3.bigdata

4.All of the above


Question:
With how many layers are deep learning algorithms constructed?

1.2

2.3

3.4

4.5


Question:
How many main statistical methodologies are used in data analysis?

1.2

2.3

3.4

4.5


Question:
In which of the following cases will K-means clustering fail to give good results? 1) Data points with outliers 2) Data points with different densities 3) Data points with nonconvex shapes

1.1 and 2

2.2 and 3

3.1 and 3

4.All of the above
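
To illustrate the outlier case only: K-means represents each cluster by the mean of its points, so a single extreme value can drag a centroid away from the bulk of the cluster. A minimal one-dimensional sketch, with made-up numbers:

```python
from statistics import mean

# Hypothetical 1-D cluster and the same cluster with one extreme point added.
cluster = [1.0, 2.0, 3.0]
with_outlier = cluster + [100.0]

# A K-means centroid is the arithmetic mean of the points assigned to it.
centroid_clean = mean(cluster)
centroid_shifted = mean(with_outlier)

print(centroid_clean, centroid_shifted)
```

The shifted centroid no longer sits near any of the original three points. Differing densities and nonconvex shapes cause problems for other reasons that this sketch does not cover.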


Question:
Text analytics is also referred to as text mining.

1.True

2.False

3.Can be true or false

4.Cannot say


Question:
The goal of business intelligence is to allow easy interpretation of large volumes of data to identify new opportunities.

1.True

2.False

3.Can be true or false

4.Cannot say


Question:
Training a model with all of the data in one single batch is known as?

1.Batch learning

2.Offline learning

3.Both 1 and 2

4.None of the above


Question:
To find the minimum or the maximum of a function, we set the gradient to zero because:

1.The value of the gradient at extrema of a function is always zero

2.Depends on the type of problem

3.Both 1 and 2

4.None of the above
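
As a small numerical check of the idea behind this question, the sketch below estimates slopes with a central difference. The functions `f` and `g` and the step size `h` are assumptions made for the example. It relies on the standard fact that a differentiable function has zero gradient at an interior extremum, and it also shows that a zero gradient alone does not guarantee an extremum.

```python
def derivative(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f(x):
    # Parabola with its minimum at x = 3.
    return (x - 3) ** 2 + 1

def g(x):
    # Cubic with zero slope at x = 0, which is neither a minimum nor a maximum.
    return x ** 3

slope_at_min = derivative(f, 3.0)   # close to 0 at the minimum
slope_away = derivative(f, 4.0)     # close to 2 away from the minimum
slope_g = derivative(g, 0.0)        # also close to 0, yet x = 0 is no extremum

print(slope_at_min, slope_away, slope_g)
```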


Question:
What is a sentence parser typically used for?

1.It is used to parse sentences to check if they are UTF-8 compliant.

2.It is used to parse sentences to derive their most likely syntax tree structures.

3.It is used to parse sentences to assign POS tags to all tokens.

4.It is used to check if sentences can be parsed into meaningful tokens.


Question:
When performing regression or classification, which of the following is the correct way to preprocess the data?

1.Normalize the data -> PCA -> training

2.PCA -> normalize PCA output -> training

3.Normalize the data -> PCA -> normalize PCA output -> training

4.None of the above


Question:
Which is used to find the factor congruence coefficients?

1.factor.mosaicplot

2.factor.xyplot

3.factor.congruence

4.factor.cumsum


Question:
Which function is used for inference on one proportion using the normal approximation?

1.fisher.test()

2.chisq.test()

3.Lm.test()

4.prop.test()
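
For readers who want to see what a normal-approximation test for one proportion computes, here is a minimal Python sketch of the plain two-sided z-test. The function name `one_proportion_z_test` and the sample numbers are hypothetical. It is not a reimplementation of R's `prop.test()`, which by default applies a continuity correction and reports a chi-squared statistic, so the numbers will differ somewhat.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0, without continuity correction."""
    p_hat = successes / n
    # Standard error under the null hypothesis.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 60 successes in 100 trials, tested against p0 = 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(z, p_value)
```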


Question:
Which method shows hierarchical data in a nested format?

1.Treemaps

2.Scatter plots

3.Population pyramids

4.Area charts


Question:
Which of the following is a disadvantage of decision trees?

1.Factor analysis

2.Decision trees are robust to outliers

3.Decision trees are prone to overfitting

4.None of the above


Question:
Which of the following is a reasonable way to select the number of principal components "k"?

1.Choose k to be the smallest value so that at least 99% of the variance is retained.

2.Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).

3.Choose k to be the largest value so that 99% of the variance is retained.

4.Use the elbow method.
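
The "retained variance" criterion mentioned in the options can be sketched as follows. The sketch assumes the eigenvalues of the covariance matrix (the variance explained by each principal component) are already available; the function name `smallest_k` and the eigenvalues are made up for the example.

```python
def smallest_k(eigenvalues, threshold=0.99):
    """Smallest number of components whose cumulative variance share
    reaches the threshold."""
    values = sorted(eigenvalues, reverse=True)
    total = sum(values)
    running = 0.0
    for k, value in enumerate(values, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(values)

# Hypothetical eigenvalues: the first three components carry 99.5% of the variance.
k = smallest_k([6.0, 3.0, 0.95, 0.03, 0.02])
print(k)
```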


Question:
Which of the following is a subset of machine learning?

1.NumPy

2.SciPy

3.Deep Learning

4.All of the above


Question:
Which of the following is false?

1.Data visualization includes the ability to absorb information quickly

2.Data visualization is another form of visual art

3.Data visualization decreases insights and leads to slower decisions

4.None of the above


Question:
Which of the following is false?

1.Subsetting can be used to select and exclude variables and observations

2.Raw data should be processed only one time.

3.Merging concerns combining datasets on the same observations to produce a result with more variables

4.None of the above


Question:
Which of the following is not a major data analysis approach?

1.Data Mining

2.Predictive Intelligence

3.Business Intelligence

4.Text Analytics


Question:
Which of the following is a tool for checking normality?

1.qqline()

2.qline()

3.anova()

4.lm()


Question:
Which of the following is true about hypothesis testing?

1.answering yes/no questions about the data

2.estimating numerical characteristics of the data

3.describing associations within the data

4.modeling relationships within the data


Question:
Which of the following is true about regression analysis?

1.answering yes/no questions about the data

2.estimating numerical characteristics of the data

3.modeling relationships within the data

4.describing associations within the data


Question:
Which of the following plots are often used for checking randomness in time series?

1.Autocausation

2.Autorank

3.Autocorrelation

4.None of the above
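
To show what an autocorrelation plot is built from, the sketch below computes the sample autocorrelation at a single lag. For a random series the values at nonzero lags should be close to zero; the strictly alternating series used here is a made-up, clearly non-random example, and the function name `autocorrelation` is an assumption for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum(
        (series[t] - mean) * (series[t + lag] - mean)
        for t in range(n - lag)
    )
    return num / denom

# Hypothetical non-random series that flips sign at every step.
alternating = [1, -1] * 10

r1 = autocorrelation(alternating, 1)   # strongly negative
r2 = autocorrelation(alternating, 2)   # strongly positive
print(r1, r2)
```

Plotting these values against the lag gives the kind of chart the question refers to.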


Question:
Which of the following statements about regularization is not correct?

1.Using too large a value of lambda can cause your hypothesis to underfit the data.

2.Using too large a value of lambda can cause your hypothesis to overfit the data.

3.Using a very large value of lambda cannot hurt the performance of your hypothesis.

4.None of the above
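
The effect of lambda can be seen in a one-weight ridge regression, sketched below under simplifying assumptions: a single feature, no intercept, and the penalty `lam * w**2` added to the squared error. The function name `ridge_slope` and the data are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form minimizer of sum((y - w*x)**2) + lam * w**2 for one weight w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

# Hypothetical data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_none = ridge_slope(xs, ys, 0.0)   # no penalty: recovers the true slope
w_huge = ridge_slope(xs, ys, 1e6)   # very large penalty: slope shrinks toward 0

print(w_none, w_huge)
```

With the very large penalty the fitted slope is almost zero, so the model predicts roughly 0 for every input and fits the data poorly.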


Question:
Which of the following techniques cannot be used for normalization in text mining?

1.Stemming

2.Lemmatization

3.Stop Word Removal

4.None of the above


Question:
The ________ programming language is a dialect of S.

1.B

2.C

3.R

4.None of the above

