
Microsoft DP-100 Designing and Implementing a Data Science Solution on Azure Exam Practice Test

Demo: 53 questions
Total 407 questions

Designing and Implementing a Data Science Solution on Azure Questions and Answers

Question 1

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

Options:

A.

Apply an analysis of variance (ANOVA).

B.

Apply a Pearson correlation coefficient.

C.

Apply a Spearman correlation coefficient.

D.

Apply a linear discriminant analysis.

Question 2

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 3

You need to implement a new cost factor scenario for the ad response models as illustrated in the performance curve exhibit.

Which technique should you use?

Options:

A.

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

B.

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

C.

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

D.

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Question 4

You need to implement a model development strategy to determine a user’s tendency to respond to an ad.

Which technique should you use?

Options:

A.

Use a Relative Expression Split module to partition the data based on centroid distance.

B.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

C.

Use a Split Rows module to partition the data based on distance travelled to the event.

D.

Use a Split Rows module to partition the data based on centroid distance.

Question 5

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 6

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 7

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 8

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 9

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 10

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

Options:

A.

Streaming

B.

Weight

C.

Batch

D.

Cosine

Question 11

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 12

You need to resolve the local machine learning pipeline performance issue. What should you do?

Options:

A.

Increase Graphic Processing Units (GPUs).

B.

Increase the learning rate.

C.

Increase the training iterations.

D.

Increase Central Processing Units (CPUs).

Question 13

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 14

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Options:

A.

Azure HDInsight with Spark MLlib

B.

Azure Cognitive Services

C.

Azure Machine Learning Studio

D.

Microsoft Machine Learning Server

Question 15

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

Options:

Question 16

You use Azure Machine Learning studio to analyze an mltable data asset containing a decimal column named column1. You need to verify that the column1 values are normally distributed.

Which statistic should you use?

Options:

A.

Max

B.

Type

C.

Profile

D.

Mean

Question 17

You create an Azure Machine Learning workspace. You train a classification model by using automated machine learning (automated ML) in Azure Machine Learning studio. The training data contains multiple classes that have significantly different numbers of samples.

You must use a metric type to avoid labeling negative samples as positive and an averaging method that will minimize the class imbalance.

You need to configure the metric type and the averaging method.

Which configurations should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:
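
As background to the terms in this question, the following is a minimal sketch in plain Python of how precision (which penalizes false positives) behaves under two averaging methods. The function names and the sample labels are illustrative assumptions, not part of the exam material or of the automated ML API.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision}, where precision = TP / (TP + FP)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
        scores[label] = tp / (tp + fp) if (tp + fp) else 0.0
    return scores


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision: every class counts equally."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_precision(y_true, y_pred):
    """Mean of per-class precision weighted by the number of true samples."""
    scores = per_class_precision(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support.get(label, 0) for label in scores) / len(y_true)


# Hypothetical imbalanced data: 8 samples of class "a", 2 of class "b".
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 8 + ["a", "b"]

macro = macro_precision(y_true, y_pred)
weighted = weighted_precision(y_true, y_pred)
```

With macro averaging the small class contributes as much to the score as the large one, whereas the weighted average is dominated by the majority class.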

Question 18

You are using a Git repository to track work in an Azure Machine Learning workspace.

You need to authenticate a Git account by using SSH.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 19

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are using Azure Machine Learning to run an experiment that trains a classification model.

You want to use Hyperdrive to find parameters that optimize the AUC metric for the model. You configure a HyperDriveConfig for the experiment by running the following code:

You plan to use this configuration to run a script that trains a random forest model and then tests it with validation data. The label values for the validation data are stored in a variable named y_test, and the predicted probabilities from the model are stored in a variable named y_predicted.

You need to add logging to the script to allow Hyperdrive to optimize hyperparameters for the AUC metric.

Solution: Run the following code:

Does the solution meet the goal?

Options:

A.

Yes

B.

No
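
The code referred to in this question is not reproduced in the text. As a rough illustration of the idea only, the sketch below computes AUC from y_test and y_predicted in plain Python and records it under a metric name. The `log_metric` helper and the `PRIMARY_METRIC_NAME` constant are hypothetical stand-ins; the assumption is that the logged name has to match the primary metric name configured for the sweep.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


logged_metrics = {}


def log_metric(name, value):
    """Hypothetical stand-in for the run's metric-logging call."""
    logged_metrics[name] = float(value)


# Assumed to be the same string as the primary metric name in the sweep config.
PRIMARY_METRIC_NAME = "AUC"

# Illustrative validation labels and predicted probabilities.
y_test = [0, 0, 1, 1]
y_predicted = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_test, y_predicted)
log_metric(PRIMARY_METRIC_NAME, auc)
```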

Question 20

You use the Azure Machine Learning SDK v2 for Python and notebooks to train a model. You use Python code to create a compute target, an environment, and a training script. You need to prepare information to submit a training job.

Which class should you use?

Options:

A.

MLClient

B.

command

C.

BuildContext

D.

EndpointConnection

Question 21

You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).

The remaining 1,000 rows represent class 1 (10 percent).

The training set is imbalanced between the two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.

You need to configure the module.

Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:
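
The arithmetic behind this scenario can be sketched as follows. The sketch assumes that the module's percentage setting is read as the percentage of synthetic minority rows to add relative to the current minority count (so 100 would double the minority class); the helper name is hypothetical.

```python
def smote_percentage(minority_count, target_count):
    """Percentage of synthetic rows to add, relative to the current minority count."""
    if target_count < minority_count:
        raise ValueError("target must be at least the current minority count")
    return 100 * (target_count - minority_count) / minority_count


majority_rows = 9000   # class 0
minority_rows = 1000   # class 1
target_minority = 4000

pct = smote_percentage(minority_rows, target_minority)

# Share of class 1 after oversampling, for reference.
minority_share = target_minority / (majority_rows + target_minority)
```

Under that assumption, growing class 1 from 1,000 to 4,000 rows means adding 3,000 synthetic rows, which is 300 percent of the original minority count.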

Question 22

You create an Azure Machine Learning workspace and load a Python training script named train.py in the src subfolder. The dataset used to train your model is available locally. You run the following script to train the model:

Instructions: For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

Options:

Question 23

You are building an intelligent solution using machine learning models.

The environment must support the following requirements:

  • Data scientists must build notebooks in a cloud environment
  • Data scientists must use automatic feature engineering and model building in machine learning pipelines.
  • Notebooks must be deployed to retrain using Spark instances with dynamic worker allocation.
  • Notebooks must be exportable to be version controlled locally.

You need to create the environment.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 24

You use Azure Machine Learning to implement hyperparameter tuning with a Bandit early termination policy.

The policy uses a slack_factor set to 0.1, an evaluation interval set to 1, and an evaluation delay set to 5.

You need to evaluate the outcome of the early termination policy.

What should you evaluate? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:
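
How a slack-factor rule of this kind decides whether to stop a run can be sketched in plain Python. This is not the SDK implementation: the function name is hypothetical, the goal is assumed to be maximization, the values 0.1 and 5 are illustrative, and the exact boundary of the delay (whether the interval equal to the delay is evaluated) is an assumption.

```python
def bandit_should_terminate(run_metric, best_metric, interval,
                            slack_factor=0.1, evaluation_interval=1,
                            delay_evaluation=5):
    """Return True if a run would be stopped at this interval (maximization goal)."""
    # No evaluation happens until the delay has passed (boundary is an assumption).
    if interval < delay_evaluation:
        return False
    # The policy is only applied every `evaluation_interval` intervals.
    if interval % evaluation_interval != 0:
        return False
    # A run is stopped when it falls below best / (1 + slack_factor).
    return run_metric < best_metric / (1 + slack_factor)


best_so_far = 0.90
threshold = best_so_far / (1 + 0.1)  # roughly 91% of the best metric

early = bandit_should_terminate(0.50, best_so_far, interval=3)
stopped = bandit_should_terminate(0.80, best_so_far, interval=6)
kept = bandit_should_terminate(0.85, best_so_far, interval=6)
```

With a slack factor of 0.1, a run survives as long as its metric stays within about 91 percent of the best run's metric once evaluation has started.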

Question 25

You use Azure Machine Learning Studio to build a machine learning experiment.

You need to divide data into two distinct datasets.

Which module should you use?

Options:

A.

Split Data

B.

Load Trained Model

C.

Assign Data to Clusters

D.

Group Data into Bins

Question 26

You are creating a machine learning model in Python. The provided dataset contains several numerical columns and one text column. The text column represents a product's category. The product category will always be one of the following:

  • Bikes
  • Cars
  • Vans
  • Boats

You are building a regression model using the scikit-learn Python package.

You need to transform the text data to be compatible with the scikit-learn Python package.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:
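
The code segment for this question is not reproduced in the text. As a general illustration, numeric estimators need the category text turned into numbers; the sketch below shows two common encodings in plain Python. The function names are hypothetical, and the category order is simply the order listed in the question.

```python
CATEGORIES = ["Bikes", "Cars", "Vans", "Boats"]


def ordinal_encode(values, categories=CATEGORIES):
    """Map each category name to its integer position in `categories`."""
    index = {name: i for i, name in enumerate(categories)}
    return [index[v] for v in values]


def one_hot_encode(values, categories=CATEGORIES):
    """Map each category name to a 0/1 vector with a single 1."""
    index = {name: i for i, name in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows


codes = ordinal_encode(["Cars", "Boats", "Bikes"])
vectors = one_hot_encode(["Vans", "Bikes"])
```

Integer codes imply an ordering between categories, which a regression model may interpret as meaningful; one-hot vectors avoid that at the cost of extra columns.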

Question 27

You use differential privacy to ensure your reports are private. The calculated value of the epsilon for your data is 1.8. You need to modify your data to ensure your reports are private. Which epsilon value should you accept for your data?

Options:

A.

between 0 and 1

B.

between 2 and 3

C.

between 3 and 10

D.

more than 10
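
The effect of epsilon can be illustrated with a minimal sketch of the Laplace mechanism, one common way to add differentially private noise. The function names and the sensitivity of 1 are assumptions for illustration; the question does not say which mechanism is used.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale of the Laplace mechanism: larger when epsilon is smaller."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def noisy_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Return the count plus Laplace(0, sensitivity / epsilon) noise."""
    scale = laplace_scale(sensitivity, epsilon)
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with that scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise


scale_low_eps = laplace_scale(1.0, 0.5)   # stronger privacy, more noise
scale_high_eps = laplace_scale(1.0, 1.8)  # weaker privacy, less noise
sample = noisy_count(100, epsilon=0.5, rng=random.Random(0))
```

Lower epsilon values mean more noise and stronger privacy, at the cost of less accurate reports.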

Question 28

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are analyzing a numerical dataset that contains missing values in several columns.

You must clean the missing values using an appropriate operation without affecting the dimensionality of the feature set.

You need to analyze a full dataset to include all values.

Solution: Use the Last Observation Carried Forward (LOCF) method to impute the missing data points.

Does the solution meet the goal?

Options:

A.

Yes

B.

No
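
For reference, last observation carried forward can be sketched in a few lines of plain Python. The function name and the use of None to mark a missing value are assumptions for illustration.

```python
def locf(values):
    """Replace each missing value (None) with the most recent observed value."""
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        # A leading gap has no earlier observation, so it stays None.
        filled.append(last_seen)
    return filled


column = [None, 1.0, None, None, 4.0, None]
imputed = locf(column)
```

The number of rows and columns is unchanged, but a gap at the start of a column has nothing to carry forward and remains missing.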

Question 29

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:

• /data/2018/Q1.csv

• /data/2018/Q2.csv

• /data/2018/Q3.csv

• /data/2018/Q4.csv

• /data/2019/Q1.csv

All files store data in the following format:

id,f1,f2,l

1,1,2,0

2,1,1,1

3,2,1,0

You run the following code:

You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:

Solution: Run the following code:

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 30

You use the following code to define the steps for a pipeline:

from azureml.core import Workspace, Experiment, Run

from azureml.pipeline.core import Pipeline

from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

. . .

step1 = PythonScriptStep(name="step1", ...)

step2 = PythonScriptStep(name="step2", ...)

pipeline_steps = [step1, step2]

You need to add code to run the steps.

Which two code segments can you use to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.

experiment = Experiment(workspace=ws,

name='pipeline-experiment')

run = experiment.submit(config=pipeline_steps)

B.

run = Run(pipeline_steps)

C.

pipeline = Pipeline(workspace=ws, steps=pipeline_steps)

experiment = Experiment(workspace=ws,

name='pipeline-experiment')

run = experiment.submit(pipeline)

D.

pipeline = Pipeline(workspace=ws, steps=pipeline_steps)

run = pipeline.submit(experiment_name='pipeline-experiment')

Question 31

You are developing a hands-on workshop to introduce Docker for Windows to attendees.

You need to ensure that workshop attendees can install Docker on their devices.

Which two prerequisite components should attendees install on the devices? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Microsoft Hardware-Assisted Virtualization Detection Tool

B.

Kitematic

C.

BIOS-enabled virtualization

D.

VirtualBox

E.

Windows 10 64-bit Professional

Question 32

You create an Azure Machine Learning workspace. The workspace contains a folder named src. The folder contains a Python script named script1.py.

You use the Azure Machine Learning Python SDK v2 to create a control script. You must use the control script to run script1.py as part of a training job.

You need to complete the section of script that defines the job parameters.

How should you complete the script? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 33

You are performing feature scaling by using the scikit-learn Python library for x1, x2, and x3 features.

Original and scaled data is shown in the following image.

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.

Options:
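
The image for this question is not reproduced in the text, so which scaler it shows is unknown. As general background, the two most common scaling schemes can be sketched in plain Python as follows; the function names are hypothetical and the sample values are illustrative.

```python
from statistics import mean, pstdev


def min_max_scale(xs):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def standardize(xs):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


x1 = [1.0, 2.0, 3.0]
scaled_01 = min_max_scale(x1)
scaled_z = standardize(x1)
```

Min-max scaling bounds every feature to the same interval, while standardization centers each feature at zero without bounding its range.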

Question 34

You plan to preprocess text from CSV files. You load the Azure Machine Learning Studio default stop words list.

You need to configure the Preprocess Text module to meet the following requirements:

  • Ensure that multiple related words are reduced to a single canonical form.
  • Remove pipe characters from text.
  • Remove words to optimize information retrieval.

Which three options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 35

You have the following Azure subscriptions and Azure Machine Learning service workspaces:

You need to obtain a reference to the ml-project workspace.

Solution: Run the following Python code:

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 36

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are using Azure Machine Learning to run an experiment that trains a classification model.

You want to use Hyperdrive to find parameters that optimize the AUC metric for the model. You configure a HyperDriveConfig for the experiment by running the following code:

You plan to use this configuration to run a script that trains a random forest model and then tests it with validation data. The label values for the validation data are stored in a variable named y_test, and the predicted probabilities from the model are stored in a variable named y_predicted.

You need to add logging to the script to allow Hyperdrive to optimize hyperparameters for the AUC metric. Solution: Run the following code:

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 37

You use Azure Machine Learning to deploy a model as a real-time web service.

You need to create an entry script for the service that ensures that the model is loaded when the service starts and is used to score new data as it is received.

Which functions should you include in the script? To answer, drag the appropriate functions to the correct actions. Each function may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Options:
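
The general shape of such an entry script can be sketched as below: one function that runs once at startup to load the model, and one that is called for each scoring request. The `_MeanModel` class and the `"data"` key in the request body are hypothetical stand-ins so that the sketch runs without a registered model.

```python
import json

model = None  # populated once by init()


class _MeanModel:
    """Hypothetical stand-in for a deserialized model."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


def init():
    """Called once when the service starts: load the model into memory."""
    global model
    # A real script would load the registered model file here instead.
    model = _MeanModel()


def run(raw_data):
    """Called for each request: parse the input and return predictions."""
    rows = json.loads(raw_data)["data"]
    return model.predict(rows)


init()
result = run(json.dumps({"data": [[1, 3], [2, 2]]}))
```

Loading the model in the startup function rather than in the scoring function avoids paying the load cost on every request.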

Question 38

You create an Azure Machine Learning workspace. You use the Azure Machine Learning SDK for Python.

You must create a dataset from remote paths. The dataset must be reusable within the workspace.

You need to create the dataset.

How should you complete the following code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 39

You need to record the row count as a metric named row_count that can be returned using the get_metrics method of the Run object after the experiment run completes. Which code should you use?

Options:

A.

run.upload_file('row_count', './data.csv')

B.

run.log('row_count', rows)

C.

run.tag('row_count', rows)

D.

run.log_table('row_count', rows)

E.

run.log_row('row_count', rows)

Question 40

You have an Azure Machine Learning workspace named workspace1.

You must add a datastore that connects an Azure Blob storage container to workspace1. You must be able to configure a privilege level.

You need to configure authentication.

Which authentication method should you use?

Options:

A.

Account key

B.

SAS token

C.

Service principal

D.

Managed identity

Question 41

You need to configure the Edit Metadata module so that the structure of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 42

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 43

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 44

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 45

You need to select a feature extraction method.

Which method should you use?

Options:

A.

Spearman correlation

B.

Mutual information

C.

Mann-Whitney test

D.

Pearson’s correlation

Question 46

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 47

You need to configure the Feature Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 48

You need to replace the missing data in the AccessibilityToHighway columns.

How should you configure the Clean Missing Data module? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 49

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 50

You need to select a feature extraction method.

Which method should you use?

Options:

A.

Mutual information

B.

Mood’s median test

C.

Kendall correlation

D.

Permutation Feature Importance

Question 51

You need to implement early stopping criteria as stated in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Options:

Question 52

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 53

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:
