
CompTIA DA0-001 CompTIA Data+ Certification Exam Practice Test

Demo: 118 questions
Total 396 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

Which of the following is the median of the number set: 3, 7, 5, 6, 9?

Options:

A.

5

B.

6

C.

7

D.

9
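
As a quick check of the arithmetic, the median can be worked out by sorting the values and taking the middle one. The sketch below is not part of the original question; it uses Python's standard `statistics` module, and the variable names are illustrative only.

```python
import statistics

values = [3, 7, 5, 6, 9]

# Sorting makes the middle element easy to see: [3, 5, 6, 7, 9]
ordered = sorted(values)

# With an odd number of values, the median is the single middle element
median = statistics.median(values)

print(ordered, median)
```

With an even number of values, `statistics.median` returns the average of the two middle elements instead.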

Question 2

A research analyst collects ten data points from 1,000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Question 3

Which of the following is an example of a discrete variable?

Options:

A.

The temperature of a hot tub

B.

The height of a horse

C.

The time to complete a task

D.

The number of people in an office

Question 4

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?

Options:

A.

Data merge

B.

Data append

C.

Data blending

D.

Data imputation

Question 5

Which of the following are the first steps a company should take after discovering a data breach? (Select two).

Options:

A.

Delete data.

B.

Notify affected users.

C.

Assess the breach.

D.

Back up the system.

E.

Issue a press release.

F.

Delay reporting.

Question 6

A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?

Options:

A.

Systematic sampling

B.

Convenience sampling

C.

Stratified sampling

D.

Random sampling

Question 7

A user imports a data file into the accounts payable system each day. On a regular basis, the field input is not what the system is expecting, so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts, though. Which of the following changes should be made to this process to reduce the number of errors?

Options:

A.

Delete all incorrect inputs and upload the corrected file.

B.

Have the user manually review the file for data completeness before loading it

C.

Create a data field to data type validator to run the file through prior to import.

D.

Spot-check the file prior to import to catch and correct field errors.

Question 8

You are working with a dataset and need to swap the values in rows with those in columns.

What action do you need to perform?

Options:

A.

Recording

B.

Filtering.

C.

Aggregation.

D.

Transposition.
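
To make the idea of swapping rows and columns concrete, here is a minimal Python sketch. The small table is hypothetical (it does not come from the question), and `zip(*rows)` is just one common way to do this in plain Python.

```python
# Hypothetical table: each inner list is one row
rows = [
    ["region", "q1", "q2"],
    ["north", 10, 12],
    ["south", 7, 9],
]

# zip(*rows) pairs up the first items of every row, then the second items,
# and so on, so each original column becomes a row
transposed = [list(column) for column in zip(*rows)]

for row in transposed:
    print(row)
```

The original 3 x 3 table keeps its values, but what used to be read across is now read down.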

Question 9

A data analyst has been asked to organize the table below in the following ways:

By sales from high to low -

By state in alphabetic order -

Which of the following functions will allow the data analyst to organize the table in this manner?

Options:

A.

Conditional formatting

B.

Grouping

C.

Filtering

D.

Sorting

Question 10

Given the table below:

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

Options:

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question 11

A data analyst has been asked to create an ad-hoc sales report for the Chief Executive Officer (CEO).

Which of the following should be included in the report?

Options:

A.

The sales representatives' home addresses.

B.

Line-item SKU numbers.

C.

YTD total sales.

D.

The customers' first and last names.

Question 12

A customer list from a financial services company is shown below:

A data analyst wants to create a likely-to-buy score on a scale from 0 to 100, based on an average of the three numerical variables: number of credit cards, age, and income. Which of the following should the analyst do to the variables to ensure they all have the same weight in the score calculation?

Options:

A.

Recode the variables.

B.

Calculate the percentiles of the variables.

C.

Calculate the standard deviations of the variables.

D.

Normalize the variables.
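
The reason variables on different scales distort an average can be seen in a small sketch. The customer table from the question is not reproduced here, so the values below are hypothetical, and min-max scaling is only one possible way to put variables on a common 0 to 1 scale.

```python
def min_max(values):
    """Rescale values to the 0..1 range (assumes the values are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical data for three customers
credit_cards = [1, 3, 5]
ages = [25, 40, 55]
incomes = [30_000, 60_000, 90_000]

# Without rescaling, income would dominate a plain average of the raw values
scaled = [min_max(column) for column in (credit_cards, ages, incomes)]

# Average the three rescaled variables per customer and express on a 0..100 scale
scores = [100 * sum(vals) / len(vals) for vals in zip(*scaled)]
print(scores)
```

After rescaling, each variable contributes at most one third of the score regardless of its original units.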

Question 13

Which of the following data analysis tools increases the efficiency of data visualizations?

Options:

A.

SQL

B.

Microsoft Excel

C.

SAS

D.

RapidMiner

Question 14

A data set was recorded using multimedia technology. Which of the following is a necessary step on the way to interpretation?

Options:

A.

Structural equation modeling

B.

Transcription

C.

Sequential analysis

D.

Sampling

Question 15

Which of the following is a relational database?

Options:

A.

SQL

B.

Excel

C.

JSON

D.

NoSQL

Question 16

Which of the following is an example of structured data?

Options:

A.

A credit card number

B.

An email

C.

A photo

D.

Social media correspondence

Question 17

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the standard deviation for the five dogs?

Options:

A.

147mm

B.

154mm

C.

394 mm

D.

21,704mm
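
The calculation can be traced step by step with Python's standard `statistics` module. This is a sketch, not part of the original question. The question does not say whether the five dogs are a whole population or a sample, so both versions are computed; which one is intended is an assumption the reader has to make.

```python
import statistics

heights_mm = [300, 430, 170, 470, 600]

# Step 1: mean of the heights
mean = statistics.mean(heights_mm)

# Step 2: population variance = average squared distance from the mean
population_variance = statistics.pvariance(heights_mm)

# Step 3: population standard deviation = square root of that variance
population_sd = statistics.pstdev(heights_mm)

# The sample standard deviation divides by n - 1 instead of n
sample_sd = statistics.stdev(heights_mm)

print(mean, population_variance, round(population_sd, 1), round(sample_sd, 1))
```

Treating the five dogs as the full population gives a mean of 394, a variance of 21,704, and a standard deviation of roughly 147 mm; the sample formula gives a somewhat larger value.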

Question 18

Which of the following will MOST likely be streamed live?

Options:

A.

Machine data

B.

Key-value pairs

C.

Delimited rows

D.

Flat files

Question 19

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 20

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

Options:

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question 21

Which of the following is used for calculations and pivot tables?

Options:

A.

IBM SPSS

B.

SAS

C.

Microsoft Excel

D.

Domo

Question 22

Which of the following types of dashboards should a business intelligence engineer develop in order to provide information about failed data pipelines?

Options:

A.

Referencing

B.

Strategic

C.

Operational

D.

Technical

Question 23

Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?

Options:

A.

A dashboard with a continuous data stream and saved searches

B.

A report of test scores by classroom, emailed to the superintendent at the end of the month

C.

A report of test scores with pie charts showing student performance

D.

A dashboard with a scheduled delivery, the ability to filter scores by school, and bar charts for comparison

Question 24

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company's year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

Options:

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question 25

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

Options:

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question 26

Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).

Options:

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period the report covers

F.

The date on which the report was run

Question 27

An analyst wants to include a graph in a quarterly sales report that shows the comparison between two quantitative variables. Which of the following visual diagrams can the analyst use to most effectively represent this relationship?

Options:

A.

Bar graph

B.

Heat map

C.

Pie chart

D.

Scatter plot

Question 28

Which of the following best describes the use of a tab sequence?

Options:

A.

\t

B.

\\t

C.

\l

D.

\\l
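
For readers unfamiliar with escape sequences, the short Python sketch below shows how a single backslash followed by `t` behaves compared with a doubled backslash. The sample line is hypothetical, and other languages may treat escapes slightly differently.

```python
# "\t" is one character: a tab
line = "state\tsales\tyear"
fields = line.split("\t")

# "\\t" is two characters: a literal backslash followed by the letter t
escaped = "\\t"

print(fields)
print(len("\t"), len(escaped))
```

Splitting on the tab character recovers the three fields, which is how tab-delimited text is typically read.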

Question 29

Which of the following differentiates a flat text file from other data types?

Options:

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question 30

The process of performing initial investigations on data to spot outliers, discover patterns, and test assumptions with statistical insight and graphical visualization is called:

Options:

A.

a t-test.

B.

a performance analysis.

C.

an exploratory data analysis.

D.

a link analysis.

Question 31

An organization would like to add a secondary email field to its customer database in order to enrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?

Options:

A.

Blend

B.

Merge

C.

Append

D.

Aggregate

Question 32

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to prevent confusion as she seeks feedback before publishing the report?

Choose the best answer.

Options:

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question 33

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power BI

B.

R

C.

SQL

D.

Tableau

Question 34

An analyst in a consumer bank department wants to showcase the concentration of accounts opened in the United States by ZIP Code to describe the effectiveness of the bank's marketing campaigns. Which of the following would be the best way to visualize the data?

Options:

A.

A stacked chart

B.

A tree map

C.

A waterfall chart

D.

A geographic map

Question 35

Which of the following is the best description of the term "data governance"?

Options:

A.

Data governance governs the development of a data visualization dashboard in an organization.

B.

Data governance is the policy that protects against data breaches by cybercriminals.

C.

Data governance is the process of analyzing, manipulating, and reporting data in an organization.

D.

Data governance is the availability, usability, integrity, and security of data in an enterprise.

Question 36

A data analyst needs to perform a full outer join of a customer's orders using the tables below:

Which of the following is the mean of the order quantity?

Options:

A.

73.5

B.

76.5

C.

78.8

D.

81.5

Question 37

An analyst is compiling a series of reports for the new executive board to review. Which of the following elements provides a snapshot of what is contained in the reports for the executives who do not have time to focus on the details?

Options:

A.

Tables

B.

Reference data sources

C.

Observations and insights

D.

Instruction page

Question 38

An analyst has been asked to validate data quality. Which of the following are the BEST reasons to validate data for quality control purposes? (Choose two.)

Options:

A.

Retention

B.

Integrity

C.

Transmission

D.

Consistency

E.

Encryption

F.

Deletion

Question 39

Which of the following is a KPI metric for tracking sales performance?

Options:

A.

Order status percentage

B.

Customer acquisition percentage

C.

Gross profit percentage

D.

Click-through rate percentage

Question 40

Which of the following reports can be used when insight into operational performance is needed each Wednesday?

Options:

A.

Static report

B.

Tactical report

C.

Recurring report

D.

Ad hoc report

Question 41

A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:

Options:

A.

non-relational schema.

B.

galaxy schema.

C.

snowflake schema.

D.

star schema.

Question 42

Which of the following is an example of a flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 43

An analyst has generated a report that includes the number of months in the first two quarters of 2019 when sales exceeded $50,000:

Which of the following functions did the analyst use to generate the data in the Sales_indicator column?

Options:

A.

Aggregate

B.

Logical

C.

Date

D.

Sort

Question 44

Which of the following are reasons to create and maintain a data dictionary? (Choose two.)

Options:

A.

To improve data acquisition

B.

To remember specifics about data fields

C.

To specify user groups for databases

D.

To provide continuity through personnel turnover

E.

To confine breaches of PHI data

F.

To reduce processing power requirements

Question 45

An analyst reviews the following data:

7

3

5

2

3

7

7

10

Which of the following is the value of the mode?

Options:

A.

3

B.

5

C.

7

D.

10
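
The mode is the value that occurs most often, which is easy to verify by counting. The following sketch is not part of the original question; it uses the eight values listed above with Python's standard library.

```python
import statistics
from collections import Counter

data = [7, 3, 5, 2, 3, 7, 7, 10]

# Count how many times each value appears
counts = Counter(data)

# The mode is the value with the highest count
mode = statistics.mode(data)

print(dict(counts), mode)
```

Here 7 appears three times, 3 appears twice, and every other value appears once.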

Question 46

The total values in this month's revenue report are twice as much as last month's. Which of the following most likely occurred during the ETL process?

Options:

A.

The data cleansing processes failed to execute.

B.

The database connectivity failed.

C.

The report included the previous month's data.

D.

The data normalization processes failed.

Question 47

What SQL command is used to delete an entire table from a database?

Options:

A.

DROP.

B.

MODIFY.

C.

DELETE.

D.

ALTER.

Question 48

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

Options:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question 49

A data analyst needs to write a SQL query measuring last month's website visits and distribute a summary report to the marketing team. Which of the following is the analyst creating?

Options:

A.

Date range

B.

Distribution list

C.

Data content

D.

Report view

Question 50

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

Options:

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question 51

An analyst needs to join two data sets that compare vehicle weights. One data set is in pounds, and the other has various units of measure. Which of the following should the analyst do first to the data prior to any type of join?

Options:

A.

Blend

B.

Reduce

C.

Concatenate

D.

Normalize

Question 52

Given the following data:

Which of the following BEST describes the data set?

Options:

A.

There is data bias.

B.

The data is incomplete.

C.

The data is inconsistent.

D.

The data is outliers.

Question 53

Consider this dataset showing the retirement age of 11 people, in whole years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

This table shows a simple frequency distribution of the retirement age data.

Options:

A.

56

B.

55

C.

57

D.

54
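
The frequency table itself is not reproduced above, but it can be rebuilt from the 11 listed ages. The sketch below does that with `collections.Counter` and also computes the median and mode, two summaries commonly read off such a table. Which statistic the question is actually asking for is not stated in the text, so no assumption is made about that here.

```python
import statistics
from collections import Counter

ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

# Frequency distribution: how many people retired at each age
frequency = sorted(Counter(ages).items())

for age, count in frequency:
    print(age, count)

# Two common summaries of the same data
median_age = statistics.median(ages)  # the 6th of the 11 sorted values
mode_age = statistics.mode(ages)      # the most frequent age

print(median_age, mode_age)
```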

Question 54

Which of the following best describes a difference between JSON and XML?

Options:

A.

JSON is quicker to read and write.

B.

JSON has to use an end tag.

C.

JSON strings are longer

D.

JSON is much more difficult to parse.

Question 55

Exhibit.

Which of the following logical statements results in Table B?

A)

B)

C)

D)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question 56

What subset of Structured Query Language (SQL) is used to add, remove, modify, or retrieve the information stored within a relational database?

Options:

A.

DDL.

B.

DSL.

C.

DQL.

D.

DML.

Question 57

An analyst has received the requirements for an internal user dashboard. The analyst confirms the data sources and then creates a wireframe. Which of the following is the NEXT step the analyst should take in the dashboard creation process?

Options:

A.

Optimize the dashboard.

B.

Create subscriptions.

C.

Get stakeholder approval.

D.

Deploy to production.

Question 58

The duration of a phone call in milliseconds is an example of:

Options:

A.

ordinal data.

B.

nominal data.

C.

boolean data.

D.

continuous data.

Question 59

An analyst wants to create a historical data set for the past five years with each year in its own data set. Which of the following methods is the best way to create this historical data set?

Options:

A.

Data transpose

B.

Data concatenation

C.

Data append

D.

Data normalization

Question 60

Which of the following BEST describes the issue in which character values are mixed with integer values in a data set column?

Options:

A.

Duplicate data

B.

Missing data

C.

Data outliers

D.

Invalid data type

Question 61

A data analyst was asked to create a visual representation of sales for the first quarter of 2020. Which of the following visualizations should be used when a time element is present?

Options:

A.

A bubble chart

B.

A line chart

C.

A scatter plot

D.

An infographic

Question 62

Encryption is a mechanism for protecting data.

When should encryption be applied to data?

Choose the best answer.

Options:

A.

When data is at rest.

B.

When data is at rest or in transit.

C.

When data is in transit.

D.

When data is at rest, unless you are using local storage.

Question 63

Q3 2020 has just ended, and now a data analyst needs to create an ad-hoc sales report that demonstrates how well the Q3 2020 promotion went versus last year's Q3 promotion.

Which of the following date parameters should the analyst use?

Options:

A.

2019 vs. YTD 2020

B.

Q3 2019 vs. Q3 2020

C.

YTD 2019 vs. YTD 2020

D.

Q4 2019 vs. Q3 2020

Question 64

For which of the following test statistics would a low value imply a potentially meaningful result?

Options:

A.

Chi-squared

B.

p-value

C.

t-test

D.

F-test

Question 65

Which of the following values is the range (a measure of dispersion) of the scores of ten students in a test?

The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.

Options:

A.

90

B.

60

C.

70

D.

80
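
The range is simply the largest value minus the smallest. A minimal Python sketch using the ten scores from the question (the variable names are illustrative):

```python
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum - minimum
score_range = max(scores) - min(scores)

print(max(scores), min(scores), score_range)
```

The highest score is 77 and the lowest is 17, so the range is 77 - 17 = 60.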

Question 66

An analyst modified a data set that had a number of issues. Given the original and modified versions:

Which of the following data manipulation techniques did the analyst use?

Options:

A.

Imputation

B.

Recoding

C.

Parsing

D.

Deriving

Question 67

A database administrator is required to mask certain table columns containing PII in order to comply with the company privacy policy. Which of the following are the most likely types of information the administrator should mask? (Select two).

Options:

A.

Government-issued ID

B.

Address

C.

Order ID

D.

Order date

E.

Customer ID

F.

Referral number

Question 68

A data analyst is working with a data set and would like to combine two fields into a single field. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Data merge

B.

Transpose

C.

Data append

D.

Concatenation

Question 69

Which of the following database schemas features normalized dimension tables?

Options:

A.

Flat

B.

Snowflake

C.

Hierarchical

D.

Star

Question 70

A company wants to know how its customers interact with an e-commerce website based on clicks over items. Which of the following is the primary requirement for this report?

Options:

A.

Data content

B.

Frequency

C.

Filtering

D.

Views

Question 71

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency

D.

Complete a trend analysis to be included in the report.

Question 72

An analyst has conducted a review of business questions. Which of the following should the analyst do next to conduct an analysis?

Options:

A.

Determine the data needs and review the observations.

B.

Determine the data needs and sources for analysis.

C.

Determine the data needs and schedule interviews.

D.

Determine the data needs and begin the analysis.

Question 73

Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?

Options:

A.

Numeric

B.

Date

C.

Float

D.

Text

Question 74

Given a product sales table, which of the following aggregate functions are suitable for retrieving the total quantity sold by each employee? (Select two).

Options:

A.

MAX

B.

SELECT

C.

HAVING

D.

GROUP BY

E.

ORDER BY

F.

SUM
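
To see how a per-employee total is produced in SQL, the sketch below runs a query against an in-memory SQLite database through Python's standard `sqlite3` module. The `sales` table, its columns, and its rows are hypothetical and only stand in for the product sales table mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical product sales table
conn.execute("CREATE TABLE sales (employee TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Ana", 3), ("Ben", 5), ("Ana", 4)],
)

# SUM adds up the quantities; GROUP BY makes it do so once per employee
totals = conn.execute(
    "SELECT employee, SUM(quantity) FROM sales "
    "GROUP BY employee ORDER BY employee"
).fetchall()
conn.close()

print(totals)
```

Strictly speaking, `SUM` is the aggregate function and `GROUP BY` is a clause that defines the groups it is applied to; the two are used together here.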

Question 75

Which of the following would be considered non-personally identifiable information?

Options:

A.

Cell phone device name

B.

Customer’s name

C.

Government ID number

D.

Telephone number

Question 76

An analyst wants to test the association between the number of doors in a car and the number of gears in the car. Which of the following is the best test to use?

Options:

A.

F-test

B.

Acceptance test

C.

Chi-squared test

D.

Z-test

Question 77

Given the following report:

Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).

Options:

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period the report covers

F.

The date on which the report was run

Question 78

A data analyst needs to create a dashboard using the company's yearly revenue data sets. Which of the following would be the best way to plot the information to show the top-performing region?

Options:

A.

A line chart

B.

A waterfall chart

C.

A heat map

D.

A stacked bar chart

Question 79

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 80

A data governance analyst who is reviewing a retailer's data set notices that sales data is captured at the regional level but not at the individual store level. Which of the following best describes the issue with this data set?

Options:

A.

Data attribute limitations

B.

Data accuracy

C.

Data integrity

D.

Data consistency

Question 81

A data analyst is asked on the morning of April 9, 2020, to create a sales report that identifies sales year to date. The daily sales data is current through the end of the day. Which of the following date ranges should be on the report?

Options:

A.

January 1, 2020 to April 1, 2020

B.

January 1, 2020 to April 7, 2020

C.

January 1, 2020 to April 8, 2020

D.

January 1, 2020 to April 9, 2020

Question 82

An employer needs to maintain adequate office staffing during the winter and wants to track storm data. Which of the following data collection methods should the employer use?

Options:

A.

Web scraping

B.

Public databases

C.

Observations

D.

Weather surveys

Question 83

A data analyst is developing a dashboard to track and monitor metrics. Which of the following best practices should be taken into account FIRST during the development process?

Options:

A.

Create a wireframe.

B.

Deploy to production.

C.

Copy a dashboard design from the Internet.

D.

Develop a dashboard.

Question 84

Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:

Using this information, which of the following students had the BEST score?

Options:

A.

Randy

B.

Katie

C.

Ralph

D.

Jean

Question 85

Which of the following is a best practice when updating a legacy data source?

Options:

A.

Placing old data in new fields

B.

Keeping only the most recent data

C.

Creating a codebook to document field changes

D.

Removing the data source from production

Question 86

Given the table below:

Which of the following boxes indicates that a Type II error has occurred?

Options:

A.

1

B.

2

C.

3

D.

4

Question 87

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

Options:

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation

Question 88

Given the following data:

CustomerID

ItemBought

Date

Tre_234

Sofa

2022-09-08

216_Tre

Shoes

08/02/2021

215/Tre

Blanket

2021/06/20

045/Tre

Mug

12-26-2021

Tre-345

Lamp

31/08/2022

TREJD19

Bucket

2022'08/01

Which of the following best describes the main issue in the data set?

Options:

A.

Inconsistent data

B.

Data mismatch

C.

Invalid data

D.

Redundant data

Question 89

Which of the following variable name formats would be problematic if used in the majority of data software programs?

Options:

A.

First_Name_

B.

FirstName

C.

First_Name

D.

First Name

Question 90

An analyst wants to combine two data sets into a single spreadsheet. Column names from the first spreadsheet are listed in rows in the second spreadsheet. Which of the following is the first step the analyst should take to combine the data sets?

Options:

A.

Blend

B.

Merge

C.

Concatenate

D.

Transpose

Question 91

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

Options:

A.

Logical

B.

Date

C.

Aggregate

D.

System

Question 92

A data analyst is using a two-tailed, independent t-test to determine whether the type of stretching, dynamic or static, has any influence on a dancer's flexibility. Which of the following is the alternative hypothesis?

Options:

A.

A dancer's flexibility is improved through static stretching.

B.

The change in a dancer's flexibility is not equal to zero.

C.

There is a difference in a dancer's flexibility between static and dynamic stretching.

D.

The means of the static and dynamic stretching groups do not differ from each other.

Question 93

A financial analyst is creating a daily billing report for a company. One night, the company's data warehouse did not update the data, which caused the data to be reported incorrectly the next day. Which of the following documentation elements should the analyst add to catch this error?

Options:

A.

Version number

B.

Data refresh

C.

Frequently asked questions tab

D.

Summary

Question 94

An analyst wants to include a graph in a quarterly sales report that shows the comparison between two quantitative variables. Which of the following visual diagrams can the analyst use to most effectively represent this relationship?

Options:

A.

Bar graph

B.

Heat map

C.

Pie chart

D.

Histogram

Question 95

Given the following tables:

Which of the following will be the dimensions from a FULL JOIN of the tables above?

Options:

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Question 96

Which one of the following is not considered an aggregate function?

Options:

A.

SUM

B.

MIN

C.

SELECT

D.

MAX

Question 97

What would be an example of an acceptable form of primary identification for the Data+ exam?

Options:

A.

Passport.

B.

School ID card.

C.

Employee ID card.

D.

Credit card with photo and signature.

Question 98

A data analyst is attempting to understand how ice cream consumption is affected by different attributes, such as cost, temperature, and income level. Which of the following regression analyses should the data analyst perform to understand this relationship?

Options:

A.

Logistic

B.

Ordinary least squares

C.

Cox

D.

Polynomial

Question 99

Which one of the following would not normally be considered a summary statistic?

Options:

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question 100

A junior web developer is developing a new application where users can upload short videos. The first task is to create a homepage that shows the headline "Upload Your Short Videos" and a clickable button that says "upload now".

Which of the following HTML commands would help the developer to complete the task successfully?

Options:

A.

<span>Upload Your Short Videos</span><button>upload now</button>

B.

<p>Upload Your Short Videos</p><p>upload now</p>

C.

<h1>Upload Your Short Videos</h1><button>upload now</button>

D.

<h1>Upload Your Short Videos</h1><h1>upload now</h1>

Question 101

Which of the following types of analyses should be used to evaluate the connections and anomalies in a data set when either known patterns are being violated or new patterns are emerging?

Options:

A.

Correlation

B.

Descriptive

C.

Graph

D.

Regression

Question 102

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

Options:

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question 103

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the mean height for the five dogs?

Options:

A.

394mm

B.

405mm

C.

493mm

D.

504mm
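
The mean is the sum of the values divided by how many there are. A minimal Python sketch using the five heights from the question (the variable names are illustrative):

```python
heights_mm = [300, 430, 170, 470, 600]

# Mean = sum of values / number of values
total = sum(heights_mm)
mean_height = total / len(heights_mm)

print(total, mean_height)
```

The heights add up to 1,970 mm, and 1,970 / 5 = 394 mm.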

Question 104

Given the following table:

Date of visit

Age

Gender

6/1/22

30

Male

6/15/22

65F

Fem.

6/19/2022

24

M

Which of the following describes the data quality issues with the age data?

Options:

A.

Completeness

B.

Consistency

C.

Accuracy

D.

Manipulation

Question 105

Which one of the following programming languages is specifically designed for use in analytics applications?

Options:

A.

Python.

B.

R

C.

C++

D.

Java.

Question 106

Which of the following types of analysis would be best for an analyst to use to examine the relationships between authors who cited other authors in a library of research papers?

Options:

A.

Linguistic analysis

B.

Trend analysis

C.

Link analysis

D.

Performance analysis

Question 107

Given the image below:

The data should be cleaned because of the presence of:

Options:

A.

outlier

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Question 108

Which of the following occurs if a 90% confidence interval increases to 95%?

Options:

A.

The margin of error does not change.

B.

The interval remains the same.

C.

The interval becomes narrower.

D.

The margin of error doubles.

Question 109

Given the following graph:

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 110

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPIs

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Question 111

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to best display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and percentage of new customers data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 112

A quality assurance manager is examining tolerances in Internet of Things sensors. Which of the following is the best measure for the manager to calculate?

Options:

A.

Standard deviation

B.

Quartile range

C.

Median

D.

Mean

Question 113

Consider the following dataset which contains information about houses that are for sale:

Which of the following string manipulation commands will combine the address and region name columns to create a full address?

full_address
-------------------------
85 Turner St, Northern Metropolitan
25 Bloomburg St, Northern Metropolitan
5 Charles St, Northern Metropolitan
40 Federation La, Northern Metropolitan
55a Park St, Northern Metropolitan

Options:

A.

SELECT CONCAT(address, ' , ' , regionname) AS full_address FROM melb LIMIT 5;

B.

SELECT CONCAT(address, '-' , regionname) AS full_address FROM melb LIMIT 5;

C.

SELECT CONCAT(regionname, ' , ' , address) AS full_address FROM melb LIMIT 5

D.

SELECT CONCAT(regionname, '-' , address) AS full_address FROM melb LIMIT 5;
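
The effect of argument order and separator in a concatenation can be illustrated outside SQL as well. The Python sketch below is only an analogy to `CONCAT`, not the query itself; it uses two of the address values shown in the expected output, and the tuple layout is an assumption made for illustration.

```python
# (address, regionname) pairs taken from the expected output above
rows = [
    ("85 Turner St", "Northern Metropolitan"),
    ("25 Bloomburg St", "Northern Metropolitan"),
]

# Address first, then a comma separator, then the region name
full_addresses = [f"{address}, {regionname}" for address, regionname in rows]

# Swapping the order or using a hyphen yields a visibly different string
reversed_with_hyphen = [f"{regionname}-{address}" for address, regionname in rows]

print(full_addresses[0])
print(reversed_with_hyphen[0])
```

Only the first form matches the `full_address` values listed in the question.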

Question 114

A data profiling rule checks the quality of all email addresses in a database. The rule returns a value with the number of email addresses that conformed to the rule. Which of the following options describes this value?

Options:

A.

Columns passed

B.

Rows passed

C.

Rows failed

D.

Columns failed

Question 115

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed
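
What parameterization looks like in practice can be sketched with Python's standard `sqlite3` module, which supports `?` placeholders. The `customers` table, its rows, and the `find_by_name` helper are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben")],
)

def find_by_name(name):
    # The ? placeholder passes the input as data, never as SQL text
    query = "SELECT id, name FROM customers WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()

normal = find_by_name("Ana")

# A typical injection attempt is treated as an ordinary (non-matching) name
attack = find_by_name("Ana' OR '1'='1")

conn.close()
print(normal, attack)
```

Because the input is bound as a value rather than spliced into the query string, the injection attempt returns no rows instead of every row.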

Question 116

Which of the following should an analyst do to best summarize the data on a data set?

Options:

A.

Filtering

B.

Aggregation

C.

Sorting

D.

Concatenation

Question 117

A cereal manufacturer wants to determine whether the sugar content of its cereal has increased over the years. Which of the following is the appropriate descriptive statistic to use?

Options:

A.

Frequency

B.

Percent change

C.

Variance

D.

Mean

Question 118

Which of the following is the correct extension for a tab-delimited spreadsheet file?

Options:

A.

tap

B.

tar

C.

sv

D.

az
