Databricks Certified Data Analyst Associate Exam (Databricks-Certified-Data-Analyst-Associate) Practice Test

Databricks Certified Data Analyst Associate Exam Questions and Answers

Question 1

A data analyst wants to create a dashboard with three main sections: Development, Testing, and Production. They want all three sections on the same dashboard, but they want to clearly designate the sections using text on the dashboard.

Which of the following tools can the data analyst use to designate the Development, Testing, and Production sections using text?

Options:

A.

Separate endpoints for each section

B.

Separate queries for each section

C.

Markdown-based text boxes

D.

Direct text written into the dashboard in editing mode

E.

Separate color palettes for each section

Question 2

A stakeholder has provided a data analyst with a lookup dataset in the form of a 50-row CSV file. The data analyst needs to upload this dataset for use as a table in Databricks SQL.

Which approach should the data analyst use to quickly upload the file into a table for use in Databricks SQL?

Options:

A.

Create a table by uploading the file using the Create page within Databricks SQL

B.

Create a table via a connection between Databricks and the desktop facilitated by Partner Connect.

C.

Create a table by uploading the file to cloud storage and then importing the data to Databricks.

D.

Create a table by manually copying and pasting the data values into cloud storage and then importing the data to Databricks.

Question 3

Data professionals with varying titles use the Databricks SQL service as the primary touchpoint with the Databricks Lakehouse Platform. However, some users will use other services like Databricks Machine Learning or Databricks Data Science and Engineering.

Which of the following roles uses Databricks SQL as a secondary service while primarily using one of the other services?

Options:

A.

Business analyst

B.

SQL analyst

C.

Data engineer

D.

Business intelligence analyst

E.

Data analyst

Question 4

Query History provides Databricks SQL users with a lot of benefits. A data analyst has been asked to share all of these benefits with their team as part of a training exercise. One of the benefit statements the analyst provided to their team is incorrect.

Which statement about Query History is incorrect?

Options:

A.

It can be used to view the query plan of queries that have run.

B.

It can be used to debug queries.

C.

It can be used to automate query execution on multiple warehouses (formerly endpoints).

D.

It can be used to troubleshoot slow running queries.

Question 5

What describes Partner Connect in Databricks?

Options:

A.

It allows for free use of Databricks partner tools through a common API.

B.

It makes multi-directional connections between Databricks and Databricks partners easier.

C.

It exposes connection information to third-party tools via Databricks partners.

D.

It is a feature that runs Databricks partner tools on a Databricks SQL Warehouse (formerly known as a SQL endpoint).

Question 6

Which of the following is an advantage of using a Delta Lake-based data lakehouse over common data lake solutions?

Options:

A.

ACID transactions

B.

Flexible schemas

C.

Data deletion

D.

Scalable storage

E.

Open-source formats
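
One of the options above, ACID transactions, can be illustrated with a small sketch. This is not Delta Lake code: it uses Python's built-in sqlite3 module as a stand-in, and the accounts table, the transfer helper, and the balances are hypothetical. It only shows the atomicity idea, where a multi-statement change either applies completely or not at all.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for a transactional table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(connection, src, dst, amount):
    """Move amount from src to dst as one atomic unit; return True on success."""
    try:
        with connection:  # commits if the block succeeds, rolls back on an exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above is undone too.
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    """Return the current balances as a dict keyed by account name."""
    return dict(connection.execute("SELECT name, balance FROM accounts"))


overdraft_ok = transfer(conn, "a", "b", 150)  # would overdraw account 'a'
after_overdraft = balances(conn)
valid_ok = transfer(conn, "a", "b", 40)
after_valid = balances(conn)
print(overdraft_ok, after_overdraft)
print(valid_ok, after_valid)
```

The failed transfer leaves both rows untouched even though its first statement had already run, which is the guarantee that plain file-based data lakes typically do not give.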

Question 7

Which of the following should data analysts consider when working with personally identifiable information (PII) data?

Options:

A.

Organization-specific best practices for PII data

B.

Legal requirements for the area in which the data was collected

C.

None of these considerations

D.

Legal requirements for the area in which the analysis is being performed

E.

All of these considerations

Question 8

A data analyst wants to create a Databricks SQL dashboard with multiple data visualizations and multiple counters. What must be completed before adding the data visualizations and counters to the dashboard?

Options:

A.

All data visualizations and counters must be created using Queries.

B.

A SQL warehouse (formerly known as SQL endpoint) must be turned on and selected.

C.

A markdown-based tile must be added to the top of the dashboard displaying the dashboard's name.

D.

The dashboard owner must also be the owner of the queries, data visualizations, and counters.

Question 9

Which statement describes descriptive statistics?

Options:

A.

A branch of statistics that uses a variety of data analysis techniques to infer properties of an underlying distribution of probability.

B.

A branch of statistics that uses summary statistics to categorically describe and summarize data.

C.

A branch of statistics that uses summary statistics to quantitatively describe and summarize data.

D.

A branch of statistics that uses quantitative variables that must take on a finite or countably infinite set of values.
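
As a small illustration of what summary statistics look like in practice, the sketch below computes a few of them with Python's standard statistics module. The daily_orders values are hypothetical and serve only as sample data.

```python
import statistics

# Hypothetical sample: daily order counts for one week (illustrative values only).
daily_orders = [12, 15, 11, 20, 18, 15, 14]

# Each entry condenses the whole sample into a single number.
summary = {
    "count": len(daily_orders),
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "mode": statistics.mode(daily_orders),
    "min": min(daily_orders),
    "max": max(daily_orders),
    "stdev": statistics.stdev(daily_orders),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

None of these numbers makes a claim about a wider population; they only describe the seven values in the sample.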

Question 10

Which of the following is a benefit of Databricks SQL using ANSI SQL as its standard SQL dialect?

Options:

A.

It has increased customization capabilities

B.

It is easy to migrate existing SQL queries to Databricks SQL

C.

It allows for the use of Photon's computation optimizations

D.

It is more performant than other SQL dialects

E.

It is more compatible with Spark's interpreters

Question 11

A data analyst has been asked to configure an alert for a query that returns the income in the accounts_receivable table for a date range. The date range is configurable using a Date query parameter.

The Alert does not work.

Which of the following describes why the Alert does not work?

Options:

A.

Alerts don't work with queries that access tables.

B.

Queries that return results based on dates cannot be used with Alerts.

C.

The wrong query parameter is being used. Alerts only work with Date and Time query parameters.

D.

Queries that use query parameters cannot be used with Alerts.

E.

The wrong query parameter is being used. Alerts only work with dropdown list query parameters, not dates.

Question 12

A data engineering team has created a Structured Streaming pipeline that processes data in micro-batches and populates gold-level tables. The micro-batches are triggered every minute.

A data analyst has created a dashboard based on this gold-level data. The project stakeholders want to see the results in the dashboard updated within one minute or less of new data becoming available within the gold-level tables.

Which of the following cautions should the data analyst share prior to setting up the dashboard to complete this task?

Options:

A.

The required compute resources could be costly

B.

The gold-level tables are not appropriately clean for business reporting

C.

The streaming data is not an appropriate data source for a dashboard

D.

The streaming cluster is not fault tolerant

E.

The dashboard cannot be refreshed that quickly

Question 13

A data analyst has been asked to produce a visualization that shows the flow of users through a website.

Which of the following is used for visualizing this type of flow?

Options:

A.

Heatmap

B.

Choropleth

C.

Word Cloud

D.

Pivot Table

E.

Sankey
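
Flow-style visualizations are usually fed with rows of (source, target, value). The sketch below shows one way such rows could be derived from raw page-visit sequences. The sessions data and the flow_links helper are hypothetical, and the code is independent of any particular charting tool.

```python
from collections import Counter

# Hypothetical page-visit sequences, one list per user session.
sessions = [
    ["home", "search", "product", "checkout"],
    ["home", "product", "checkout"],
    ["home", "search", "product"],
    ["home", "search", "exit"],
]


def flow_links(paths):
    """Count transitions between consecutive pages as (source, target) -> value."""
    links = Counter()
    for path in paths:
        # Pair each page with the page visited immediately after it.
        for source, target in zip(path, path[1:]):
            links[(source, target)] += 1
    return links


links = flow_links(sessions)
for (source, target), value in sorted(links.items()):
    print(f"{source} -> {target}: {value}")
```

Each resulting triple corresponds to one band in a flow diagram, with the count determining the band's width.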

Question 14

A data team has been given a series of projects by a consultant that need to be implemented in the Databricks Lakehouse Platform.

Which of the following projects should be completed in Databricks SQL?

Options:

A.

Testing the quality of data as it is imported from a source

B.

Tracking usage of feature variables for machine learning projects

C.

Combining two data sources into a single, comprehensive dataset

D.

Segmenting customers into like groups using a clustering algorithm

E.

Automating complex notebook-based workflows with multiple tasks

Question 15

A data analyst is processing a complex aggregation on a table with zero null values and their query returns the following result:

[Result table image not available]

Which of the following queries did the analyst run to obtain the above result?

[Query images for options A through E not available]

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E

Question 16

An analyst writes a query that contains a query parameter. They then add an area chart visualization to the query. While adding the area chart visualization to a dashboard, the analyst chooses "Dashboard Parameter" for the query parameter associated with the area chart.

Which of the following statements is true?

Options:

A.

The area chart will use whatever is selected in the Dashboard Parameter while all of the other visualizations will remain unchanged regardless of their parameter use.

B.

The area chart will use whatever is selected in the Dashboard Parameter along with all of the other visualizations in the dashboard that use the same parameter.

C.

The area chart will use whatever value is chosen on the dashboard at the time the area chart is added to the dashboard.

D.

The area chart will use whatever value is input by the analyst when the visualization is added to the dashboard. The parameter cannot be changed by the user afterwards.

E.

The area chart will convert to a Dashboard Parameter.

Question 17

Which of the following benefits of using Databricks SQL is provided by Data Explorer?

Options:

A.

It can be used to run UPDATE queries to update any tables in a database.

B.

It can be used to view metadata and data, as well as view/change permissions.

C.

It can be used to produce dashboards that allow data exploration.

D.

It can be used to make visualizations that can be shared with stakeholders.

E.

It can be used to connect to third-party BI tools.

Question 18

Which of the following statements describes descriptive statistics?

Options:

A.

A branch of statistics that uses summary statistics to quantitatively describe and summarize data.

B.

A branch of statistics that uses a variety of data analysis techniques to infer properties of an underlying distribution of probability.

C.

A branch of statistics that uses quantitative variables that must take on a finite or countably infinite set of values.

D.

A branch of statistics that uses summary statistics to categorically describe and summarize data.

E.

A branch of statistics that uses quantitative variables that must take on an uncountable set of values.