March Sale Special Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: 70percent

CompTIA DA0-001 CompTIA Data+ Certification Exam Exam Practice Test

Demo: 78 questions
Total 262 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

Exhibit.

Which of the following logical statements results in Table B?

A)

B)

C)

D)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question 2

An analyst reviews the following data:

7

3

5

2

3

7

7

10

Which of the following is the value of the mode?

Options:

A.

3

B.

5

C.

7

D.

10

Question 3

Which of the following is a process that is used during data integration to collect, blend, and load data?

Options:

A.

MDM

B.

ETL

C.

OLTP

D.

BI

Question 4

You should always choose the analytics tool that is most appropriate for any given situation, even if that means acquiring a new tool.

Options:

A.

True.

B.

False.

Question 5

A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?

Options:

A.

The data analyst is not querying the databases correctly.

B.

The databases are recording different events.

C.

The databases are recording the event in different time zones.

D.

The second database is logging incorrectly.

Question 6

Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?

Options:

A.

Numeric

B.

Date

C.

Float

D.

Text

Question 7

Given the table below:

Which of the following boxes indicates that a Type Il error has occurred?

Options:

A.

1

B.

2

C.

3

D.

4

Question 8

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

Options:

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question 9

Daniel is using the structured Query language to work with data stored in relational database.

He would like to add several new rows to a database table.

What command should he use?

Options:

A.

SELECT.

B.

ALTER.

C.

INSERT.

D.

UPDATE.

Question 10

During data profiling, an analyst decides to recode the status column in the following data set:

Which of the following data concerns explains why the analyst wants to take this action?

Options:

A.

Redundancy

B.

Duplication

C.

Invalidity

D.

Inconsistency

Question 11

A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:

Options:

A.

non-relational schema.

B.

galaxy schema.

C.

snowflake schema.

D.

star schema.

Question 12

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 13

While reviewing survey data, a research analyst notices data is missing from all the responses to a single question. Which of the following methods would BEST address this issue?

Options:

A.

Replace missing data.

B.

Remove duplicate data.

C.

Replace redundant data.

D.

Remove invalid data.

Question 14

A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:

Income category 1: less than $1.

Income category 2: more than $1 and less than $20,000.

Income category 3: more than $20,001 and less than $40,000.

Income category 4: more than $40,001.

Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?

Options:

A.

Data merge

B.

Derived variables

C.

Data blending

D.

Data append

Question 15

Andy is a pricing analyst for a retailer. Using a hypothesis test, he wants to assess whether people who receive electronic coupons spend more on average.

What should Andy's null hypothesis be?

Options:

A.

People who receive electronic coupons spend more on average.

B.

People who receive electronic coupons spend less on average.

C.

People who receive electronic coupons do not spend more on average.

D.

People who do not receive electronic coupons spend more on average.

Question 16

Given the below:

Which of the following numbers represents a Type I error?

Options:

A.

1

B.

2

C.

3

D.

4

Question 17

A development company is constructing a new Init in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans. which of the following should be the price of the Rose Init?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 18

A data analyst needs to create a data visualization that aids in un the cumulative impact of sequentially introduced values that are positive or negative. Which of the following

data visualization methods should the analyst use?

Options:

A.

A bubble chart

B.

A waterfall chart

C.

A scatter plot

D.

A line chart

Question 19

A JSON file is an example of:

Options:

A.

structured data.

B.

web data.

C.

machine data.

D.

processed data.

Question 20

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D. over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 21

Given the image below:

Which of the following file formats is depicted?

Options:

A.

JSON

B.

CSV

C.

XML

D.

HTML

Question 22

Encryption is a mechanism for protecting data.

When should encryption be applied to data?

Choose the best answer.

Options:

A.

When data is at rest.

B.

When data is at rest or in transit.

C.

When data is in transit.

D.

When data is at rest, unless you are using local storage.

Question 23

You would like to measure how well an organization is achieving its goals.

What type of analysis should you perform?

Options:

A.

Performance analysis.

B.

Outlier analysis.

C.

Predictive analysis.

D.

Trend analysis.

Question 24

A research analyst wants to determine whether the data being analyzed is connected to other datapoints. Which of the following is the BEST type of analysis to conduct?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory analysis

Question 25

Which one of the following would not normally be considered a summary statistic?

Options:

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question 26

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

Options:

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question 27

Joe. an analyst. tests the loading time on a dashboard he is preparing to go live and finds it is slower than he would like. Which of the following must occur to decrease the loading time?

Options:

A.

Deploy the dashboard to production.

B.

Change the field definitions.

C.

Update the dashboard subscribers.

D.

Optimize the dashboard.

Question 28

What category of data stewardship work is focused on ensuring that the organization respects the wishes of data subjects?

Options:

A.

Data quality.

B.

Data privacy.

C.

Data security.

D.

Regulatory compliance.

Question 29

Angela is aggregating data from CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issues is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Question 30

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

Options:

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question 31

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Question 32

Which of the ing is the correct ion for a tab-delimited spre file?

Options:

A.

tap

B.

tar

C.

sv

D.

az

Question 33

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 34

What role in a data governance is typically responsible for day-to-day oversight of data use?

Options:

A.

Data processors.

B.

Data custodians

C.

Data owners.

D.

Data stewards.

Question 35

Which of the following best describes how discrete data differs from continuous data?

Options:

A.

Discrete data cannot create a sloped line.

B.

Discrete data can only be a finite number of values.

C.

Discrete data can have decimal points.

D.

Discrete data applies only to numbers.

Question 36

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 37

Given the following graph:

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 38

Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?

  • Python

Options:

A.

R

B.

Microsoft Power Bl

C.

SAS

Question 39

An analyst is required to run a text analysis of data that is found in articles from a digital news outlet. Which of the following would be the BEST technique for the analyst to apply to acquire the data?

Options:

A.

Web scraping

B.

Sampling

C.

Data wrangling

D.

ETL

Question 40

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

Options:

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate

Question 41

Given the image below:

The data should be cleaned because of the presence of:

Options:

A.

outlier

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Question 42

A data analyst is asked on the morning of April 9, 2020, to create a sales report that identifies sales year to date. The daily sales data is current through the end of the day. Which of the following date ranges should be on the report?

Options:

A.

January 1, 2020 to April 1, 2020

B.

January 1, 2020 to April 7, 2020

C.

January 1, 2020 to April 8, 2020

D.

January 1, 2020 to April 9, 2020

Question 43

Which one of the following programming languages is specifically designed for use in analytics applications?

Options:

A.

Python.

B.

R

C.

C++

D.

Java.

Question 44

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

Options:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question 45

A collections manager has a team calling customers who are past due on their accounts in an attempt to collect payments. The manager receives the call list in the form of a printed report that is generated by the accounting department at the beginning of each week. Consequently, the collections team calls some customers who have made payments in the time since the report was last printed. Which of the following reporting enhancements could the accounting department implement to best reduce the number of calls on current accounts?

Options:

A.

Modify the date range on the report

B.

Include a time stamp on the report.

C.

Increase the frequency of report generation.

D.

Add a report run date to the report.

Question 46

An analyst is creating a resource to improve users' experience when they select specific records based on particular dates. Which of the following should the analyst use to create a resource that best meets user needs?

Options:

A.

Drop-down menu

B.

Date range

C.

Text field

D.

Frequency

Question 47

A customer list from a financial services company is shown below:

A data analyst wants to create a likely-to-buy score on a scale from 0 to 100, based on an average of the three numerical variables: number of credit cards, age, and income. Which of the following should the analyst do to the variables to ensure they all have the same weight in the score calculation?

Options:

A.

Recode the variables.

B.

Calculate the percentiles of the variables.

C.

Calculate the standard deviations of the variables.

D.

Normalize the variables.

Question 48

An analyst needs to create an analytics dashboard for an employee intranet site to improve the search functionality, display relevant information, and maintain an updated FAQ page. Which of the following visualizations would best represent what employees are searching for?

Options:

A.

A word cloud

B.

A histogram

C.

A pie chart

D.

A scatter plot

Question 49

Which of the following is a control measure for preventing a data breach?

Options:

A.

Data transmission

B.

Data attribution

C.

Data retention

D.

Data encryption

Question 50

What R package makes it easy to work with dates?

Options:

A.

Lubridate.

B.

Datemath.

C.

Stringr.

D.

ggplot.

Question 51

Which of the following is a difference between a primary key and a unique key?

Options:

A.

A unique key cannot take null values, whereas a primary key can take null values.

B.

There can be only one primary key in a data set, whereas there can be multiple unique keys.

C.

A primary key can take a value more than once, whereas a unique key cannot take a value more than once.

D.

A primary key cannot be a date variable, whereas a unique key can be.

Question 52

You have two databases tables that you would like to join together using a foreign key relationship.

What term best describes this action?

Options:

A.

Blending.

B.

Appending.

C.

Mixing.

D.

Merging.

Question 53

Which of the following is a KPI metric for tracking sales performance?

Options:

A.

Order status percentage

B.

Customer acquisition percentage

C.

Gross profit percentage

D.

Click-through rate percentage

Question 54

Given the following tables:

Which of the following will be the dimensions from a FULL JOIN of the tables above?

Options:

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Question 55

Taylor wants to investigate how manufacturing, marketing, and sales expenditures impact overall profitability for her company.

Which of the following systems is the most appropriate?

Options:

A.

OLTP.

B.

OLAP.

C.

Data warehouse.

D.

Data mart.

Question 56

Given the following grocery store orders:

If a query is made to the table with the following logic:

Order_Total > 132 OR (Order Total >= 25 AND Order_Total < 74)

Which of the following is the number of orders that will be returned by the query?

Options:

A.

Four

B.

Five

C.

Six

D.

Seven

Question 57

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed

Question 58

Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?

Options:

A.

A dashboard with a continuous data stream and saved searches

B.

A report of test scores by classroom, emailed to the superintendent at the end of the month

C.

A report of test scores with pie charts showing student performance

D.

A dashboard with a scheduled delivery, the ability to filter scores by school, and bar charts for comparison

Question 59

A cereal manufacturer wants to determine whether the sugar content of its cereal has increased over the years. Which of the following is the appropriate descriptive statistic to use?

Options:

A.

Frequency

B.

Percent change

C.

Variance

D.

Mean

Question 60

Which of the following data manipulation techniques is an example of a logical function?

Options:

A.

WHERE

B.

AGGREGATE

C.

BOOLEAN

D.

IF

Question 61

An analyst has been tracking company intranet usage and has been asked to create a chat to show the most-used/most-clicked portions of a homepage that contains more than 30 links. Which of the following visualizations would BEST illustrate this information?

Options:

A.

Scatter plot

B.

Heat map

C.

Pie chart

D.

Infographic

Question 62

Which of the following variable name formats would be problematic if used in the majority of data software programs?

Options:

A.

First_Name_

B.

FirstName

C.

First_Name

D.

First Name

Question 63

Which of following is a non-relational database?

Options:

A.

Neo4j

B.

SQLite

C.

MySQL

D.

PostgreSQL

Question 64

Which of the following is an example of a discrete data type?

Options:

A.

8in (20cm)

B.

5 kids

C.

2.5mi (4km)

D.

10.7lbs (4.9kg)

Question 65

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 66

Which of the following concepts should be applied if a data set with 40 fields needs to be pared down to 20 fields and contains similar data across multiple fields?

Options:

A.

Duplication

B.

Consolidation

C.

Compliance

D.

Standardization

Question 67

An analyst must obtain the average daily sales for the following week:

Which of the following must the analyst perform to obtain this value?

Options:

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending

Question 68

A data analyst is working with a team to create a dashboard for a client who requires on-demand access. Which of the following is the best delivery method to support the clients’ requirement?

Options:

A.

Email

B.

Scheduled

C.

Subscription

D.

Static

Question 69

Which of the following is a best practice when updating a legacy data source?

Options:

A.

Placing old data in new fields

B.

Keeping only the most recent data

C.

Creating a codebook to document field changes

D.

Removing the data source from production

Question 70

A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?

Options:

A.

Append all date columns and parse the strings.

B.

Impute all three date columns and then merge.

C.

Merge all date columns and unify the format.

D.

Separate the columns into a table and merge.

Question 71

The senior management team at a company receives a detailed sales report at the end of each quarter. The report is several pages long and includes data from dozens of offices across the country. The team wants a better way to get a quick snapshot of what is included in the report. Which of the following modifications would best meet this requirement?

Options:

A.

Modifying documentation elements to include reference data sources

B.

Modifying the font size and style so important data points are more visible

C.

Modifying the report to include a summary section with observations and insights

D.

Modifying the report layout so it is easier to follow and understand

Question 72

Which of the following BEST describes standard deviation?

Options:

A.

A measure that is used to establish a relationship between two variables

B.

A measure of how data is distributed

C.

A measure of the amount of dispersion of a set of values

D.

A measure that is used to find the significant difference between variables

Question 73

A data analyst needs to calculate the mean for Q1 sales using the data set below:

Which of the following is the mean?

Options:

A.

$2,466.18

B.

$2,667.60

C.

$3,082.72

D.

$12,330.88

Question 74

Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?

Options:

A.

Systematic

B.

Simple random

C.

Convenience

D.

Stratified

Question 75

Which of the following best describes the process of examining data for statistics and information about the data?

  • Cleansing

Options:

A.

search

B.

Profiling

C.

Governance

Question 76

A data analyst has received a data set that contains actual and projected sales for the fourth quarter of 2019. Which of the following statistical methods should the analyst use to find the measure of dispersion?

Options:

A.

Mean

B.

Variance

C.

Correlation

D.

Confidence interval

Question 77

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

Which of the following types of functions would be the most appropriate to use?

Options:

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question 78

An analyst is working on a project for a director. During this process. the analyst pulled the data. created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency

D.

Complete a trend analysis to be included in the report.

Demo: 78 questions
Total 262 questions