
Google Professional Machine Learning Engineer (Professional-Machine-Learning-Engineer) Exam Practice Test

Google Professional Machine Learning Engineer Questions and Answers

Question 1

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

Options:

A.

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.

Separate each data scientist’s work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.

D.

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.

Question 2

You work for a pharmaceutical company based in Canada. Your team developed a BigQuery ML model to predict the number of flu infections for the next month in Canada. Weather data is published weekly, and flu infection statistics are published monthly. You need to configure a model retraining policy that minimizes cost. What should you do?

Options:

A.

Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model weekly.

B.

Download the weather and flu data each month. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model monthly.

C.

Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model every month.

D.

Download the weather data each week, and download the flu data each month. Deploy the model to a Vertex AI endpoint with feature drift monitoring, and retrain the model if a monitoring alert is detected.

Question 3

You have developed a BigQuery ML model that predicts customer churn and deployed the model to Vertex AI Endpoints. You want to automate the retraining of your model by using minimal additional code when model feature values change. You also want to minimize the number of times that your model is retrained to reduce training costs. What should you do?

Options:

A.

1. Enable request-response logging on Vertex AI Endpoints.

2. Schedule a TensorFlow Data Validation job to monitor prediction drift.

3. Execute model retraining if there is significant distance between the distributions.

B.

1. Enable request-response logging on Vertex AI Endpoints.

2. Schedule a TensorFlow Data Validation job to monitor training/serving skew.

3. Execute model retraining if there is significant distance between the distributions.

C.

1. Create a Vertex AI Model Monitoring job configured to monitor prediction drift.

2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.

3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.

D.

1. Create a Vertex AI Model Monitoring job configured to monitor training/serving skew.

2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.

3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.

Question 4

You are an ML engineer at a travel company. You have been researching customers’ travel behavior for many years, and you have deployed models that predict customers’ vacation patterns. You have observed that customers’ vacation destinations vary based on seasonality and holidays; however, these seasonal variations are similar across years. You want to quickly and easily store and compare the model versions and performance statistics across years. What should you do?

Options:

A.

Store the performance statistics in Cloud SQL. Query that database to compare the performance statistics across the model versions.

B.

Create versions of your models for each season per year in Vertex AI. Compare the performance statistics across the models in the Evaluate tab of the Vertex AI UI.

C.

Store the performance statistics of each pipeline run in Kubeflow under an experiment for each season per year. Compare the results across the experiments in the Kubeflow UI.

D.

Store the performance statistics of each version of your models using seasons and years as events in Vertex ML Metadata. Compare the results across the slices.

Question 5

You recently joined a machine learning team that will soon release a new project. As a lead on the project, you are asked to determine the production readiness of the ML components. The team has already tested features and data, model development, and infrastructure. Which additional readiness check should you recommend to the team?

Options:

A.

Ensure that training is reproducible

B.

Ensure that all hyperparameters are tuned

C.

Ensure that model performance is monitored

D.

Ensure that feature expectations are captured in the schema

Question 6

You recently used BigQuery ML to train an AutoML regression model. You shared results with your team and received positive feedback. You need to deploy your model for online prediction as quickly as possible. What should you do?

Options:

A.

Retrain the model by using BigQuery ML, and specify Vertex AI as the model registry. Deploy the model from Vertex AI Model Registry to a Vertex AI endpoint.

B.

Retrain the model by using Vertex AI. Deploy the model from Vertex AI Model Registry to a Vertex AI endpoint.

C.

Alter the model by using BigQuery ML, and specify Vertex AI as the model registry. Deploy the model from Vertex AI Model Registry to a Vertex AI endpoint.

D.

Export the model from BigQuery ML to Cloud Storage. Import the model into Vertex AI Model Registry. Deploy the model to a Vertex AI endpoint.

Question 7

You built a deep learning-based image classification model by using on-premises data. You want to use Vertex AI to deploy the model to production. Due to security concerns, you cannot move your data to the cloud. You are aware that the input data distribution might change over time. You need to detect model performance changes in production. What should you do?

Options:

A.

Use Vertex Explainable AI for model explainability. Configure feature-based explanations.

B.

Use Vertex Explainable AI for model explainability. Configure example-based explanations.

C.

Create a Vertex AI Model Monitoring job. Enable training-serving skew detection for your model.

D.

Create a Vertex AI Model Monitoring job. Enable feature attribution skew and drift detection for your model.

Question 8

You developed a Transformer model in TensorFlow to translate text. Your training data includes millions of documents in a Cloud Storage bucket. You plan to use distributed training to reduce training time. You need to configure the training job while minimizing the effort required to modify code and to manage the cluster configuration. What should you do?

Options:

A.

Create a Vertex AI custom training job with GPU accelerators for the second worker pool. Use tf.distribute.MultiWorkerMirroredStrategy for distribution.

B.

Create a Vertex AI custom distributed training job with Reduction Server. Use N1 high-memory machine type instances for the first and second worker pools, and use N1 high-CPU machine type instances for the third worker pool.

C.

Create a training job that uses Cloud TPU VMs. Use tf.distribute.TPUStrategy for distribution.

D.

Create a Vertex AI custom training job with a single worker pool of A2 GPU machine type instances. Use tf.distribute.MirroredStrategy for distribution.
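For context on the strategy named in option A, here is a minimal sketch of how little training code typically changes when model creation is wrapped in tf.distribute.MultiWorkerMirroredStrategy; the model architecture shown is a placeholder, not part of the question.

```python
import tensorflow as tf

# Strategy that mirrors variables across workers in a multi-worker job.
strategy = tf.distribute.MultiWorkerMirroredStrategy()

with strategy.scope():
    # Any model built inside the scope is replicated across the workers.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# model.fit(...) then trains across the worker pools described by the
# TF_CONFIG environment variable that Vertex AI custom training provides.
```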

Question 9

You want to migrate a scikit-learn classifier model to TensorFlow. You plan to train the TensorFlow classifier model using the same training set that was used to train the scikit-learn model, and then compare their performance using a common test set. You want to use the Vertex AI Python SDK to manually log the evaluation metrics of each model and compare them based on their F1 scores and confusion matrices. How should you log the metrics?

Options:

A.

B.

C.

D.
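As background for this question, a minimal sketch of manually logging evaluation metrics with the Vertex AI Python SDK might look like the following; the project, location, experiment name, and metric values are hypothetical, and log_classification_metrics is assumed to be available in a recent google-cloud-aiplatform release.

```python
from google.cloud import aiplatform

# Hypothetical project, location, and experiment names.
aiplatform.init(project="my-project", location="us-central1",
                experiment="classifier-migration")

aiplatform.start_run("sklearn-baseline")
aiplatform.log_metrics({"f1_score": 0.81})            # scalar metric
aiplatform.log_classification_metrics(
    labels=["negative", "positive"],
    matrix=[[512, 48], [39, 401]],                    # confusion matrix counts (made up)
    display_name="sklearn-confusion-matrix",
)
aiplatform.end_run()
```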

Question 10

You need to use TensorFlow to train an image classification model. Your dataset is located in a Cloud Storage directory and contains millions of labeled images. Before training the model, you need to prepare the data. You want the data preprocessing and model training workflow to be as efficient, scalable, and low-maintenance as possible. What should you do?

Options:

A.

1. Create a Dataflow job that creates sharded TFRecord files in a Cloud Storage directory.

2. Reference tf.data.TFRecordDataset in the training script.

3. Train the model by using Vertex AI Training with a V100 GPU.

B.

1. Create a Dataflow job that moves the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label.

2. Reference tfds.folder_dataset.ImageFolder in the training script.

3. Train the model by using Vertex AI Training with a V100 GPU.

C.

1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance.

2. Write a Python script that creates sharded TFRecord files in a directory inside the instance.

3. Reference tf.data.TFRecordDataset in the training script.

4. Train the model by using the Workbench instance.

D.

1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance.

2. Write a Python script that copies the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label.

3. Reference tfds.folder_dataset.ImageFolder in the training script.

4. Train the model by using the Workbench instance.
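As a reference for the tf.data.TFRecordDataset approach mentioned in options A and C, a minimal sketch of reading sharded TFRecord files in the training script might look like this; the Cloud Storage path and feature specification are hypothetical.

```python
import tensorflow as tf

# Hypothetical feature spec matching what the Dataflow job wrote.
feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    parsed = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    return image, parsed["label"]

# Hypothetical path to the sharded TFRecord files in Cloud Storage.
filenames = tf.io.gfile.glob("gs://my-bucket/tfrecords/train-*.tfrecord")

dataset = (
    tf.data.TFRecordDataset(filenames)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(10_000)
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)
)
```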

Question 11

You work for an advertising company and want to understand the effectiveness of your company's latest advertising campaign. You have streamed 500 MB of campaign data into BigQuery. You want to query the table, and then manipulate the results of that query with a pandas dataframe in an AI Platform notebook. What should you do?

Options:

A.

Use AI Platform Notebooks' BigQuery cell magic to query the data, and ingest the results as a pandas dataframe.

B.

Export your table as a CSV file from BigQuery to Google Drive, and use the Google Drive API to ingest the file into your notebook instance.

C.

Download your table from BigQuery as a local CSV file, and upload it to your AI Platform notebook instance. Use pandas.read_csv to ingest the file as a pandas dataframe.

D.

From a bash cell in your AI Platform notebook, use the bq extract command to export the table as a CSV file to Cloud Storage, and then use gsutil cp to copy the data into the notebook. Use pandas.read_csv to ingest the file as a pandas dataframe.
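As a reference for the BigQuery cell magic named in option A, a notebook cell along these lines returns the query result directly as a pandas dataframe; the project, dataset, and column names are hypothetical, and the google.cloud.bigquery magic is assumed to be loaded in the notebook.

```
%%bigquery campaign_df
SELECT campaign_id, clicks, impressions
FROM `my-project.marketing.campaign_events`
```

In the next cell, campaign_df is available as a pandas dataframe, so no export or manual download step is needed.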

Question 12

Your team has a model deployed to a Vertex AI endpoint. You have created a Vertex AI pipeline that automates the model training process and is triggered by a Cloud Function. You need to prioritize keeping the model up-to-date, but also minimize retraining costs. How should you configure retraining?

Options:

A.

Configure Pub/Sub to call the Cloud Function when a sufficient amount of new data becomes available.

B.

Configure a Cloud Scheduler job that calls the Cloud Function at a predetermined frequency that fits your team's budget.

C.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when anomalies are detected.

D.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when feature drift is detected.

Question 13

You have built a model that is trained on data stored in Parquet files. You access the data through a Hive table hosted on Google Cloud. You preprocessed this data with PySpark and exported it as a CSV file into Cloud Storage. After preprocessing, you execute additional steps to train and evaluate your model. You want to parametrize this model training in Kubeflow Pipelines. What should you do?

Options:

A.

Remove the data transformation step from your pipeline.

B.

Containerize the PySpark transformation step, and add it to your pipeline.

C.

Add a ContainerOp to your pipeline that spins a Dataproc cluster, runs a transformation, and then saves the transformed data in Cloud Storage.

D.

Deploy Apache Spark at a separate node pool in a Google Kubernetes Engine cluster. Add a ContainerOp to your pipeline that invokes a corresponding transformation job for this Spark instance.

Question 14

You work at a bank. You need to develop a credit risk model to support loan application decisions. You decide to implement the model by using a neural network in TensorFlow. Due to regulatory requirements, you need to be able to explain the model's predictions based on its features. When the model is deployed, you also want to monitor the model's performance over time. You decided to use Vertex AI for both model development and deployment. What should you do?

Options:

A.

Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution drift.

B.

Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution skew.

C.

Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution drift.

D.

Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution skew.

Question 15

You have deployed a scikit-learn model to a Vertex AI endpoint using a custom model server. You enabled autoscaling; however, the deployed model fails to scale beyond one replica, which led to dropped requests. You notice that CPU utilization remains low even during periods of high load. What should you do?

Options:

A.

Attach a GPU to the prediction nodes.

B.

Increase the number of workers in your model server.

C.

Schedule scaling of the nodes to match expected demand.

D.

Increase the minReplicaCount in your DeployedModel configuration.

Question 16

You recently deployed a pipeline in Vertex AI Pipelines that trains and pushes a model to a Vertex AI endpoint to serve real-time traffic. You need to continue experimenting and iterating on your pipeline to improve model performance. You plan to use Cloud Build for CI/CD. You want to quickly and easily deploy new pipelines into production, and you want to minimize the chance that the new pipeline implementations will break in production. What should you do?

Options:

A.

Set up a CI/CD pipeline that builds and tests your source code. If the tests are successful, use the Google Cloud console to upload the built container to Artifact Registry and upload the compiled pipeline to Vertex AI Pipelines.

B.

Set up a CI/CD pipeline that builds your source code and then deploys built artifacts into a pre-production environment. Run unit tests in the pre-production environment. If the tests are successful, deploy the pipeline to production.

C.

Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, deploy the pipeline to production.

D.

Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, rebuild the source code, and deploy the artifacts to production.

Question 17

You work at a large organization that recently decided to move their ML and data workloads to Google Cloud. The data engineering team has exported the structured data to a Cloud Storage bucket in Avro format. You need to propose a workflow that performs analytics, creates features, and hosts the features that your ML models use for online prediction. How should you configure the pipeline?

Options:

A.

Ingest the Avro files into Cloud Spanner to perform analytics. Use a Dataflow pipeline to create the features and store them in BigQuery for online prediction.

B.

Ingest the Avro files into BigQuery to perform analytics. Use a Dataflow pipeline to create the features, and store them in Vertex AI Feature Store for online prediction.

C.

Ingest the Avro files into BigQuery to perform analytics. Use BigQuery SQL to create features and store them in a separate BigQuery table for online prediction.

D.

Ingest the Avro files into Cloud Spanner to perform analytics. Use a Dataflow pipeline to create the features, and store them in Vertex AI Feature Store for online prediction.

Question 18

Your team needs to build a model that predicts whether images contain a driver's license, passport, or credit card. The data engineering team already built the pipeline and generated a dataset composed of 10,000 images with driver's licenses, 1,000 images with passports, and 1,000 images with credit cards. You now have to train a model with the following label map: ['drivers_license', 'passport', 'credit_card']. Which loss function should you use?

Options:

A.

Categorical hinge

B.

Binary cross-entropy

C.

Categorical cross-entropy

D.

Sparse categorical cross-entropy
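As background on how the label encoding drives the choice of loss, here is a minimal sketch contrasting the two categorical cross-entropy variants in Keras; the logits are made-up values.

```python
import tensorflow as tf

labels_int = tf.constant([0, 1, 2])              # integer class indices, as in the label map above
labels_onehot = tf.one_hot(labels_int, depth=3)  # the same labels as one-hot vectors
logits = tf.constant([[2.0, 0.5, 0.1],
                      [0.2, 1.8, 0.3],
                      [0.1, 0.4, 2.2]])

sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
dense_loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

print(sparse_loss(labels_int, logits).numpy())    # expects integer labels
print(dense_loss(labels_onehot, logits).numpy())  # expects one-hot labels
```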

Question 19

You are training a custom language model for your company using a large dataset. You plan to use the Reduction Server strategy on Vertex AI. You need to configure the worker pools of the distributed training job. What should you do?

Options:

A.

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to have GPUs, and use the reduction server container image.

B.

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to use the reduction server container image without accelerators, and choose a machine type that prioritizes bandwidth.

C.

Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool without accelerators, use the reduction server container image, and choose a machine type that prioritizes bandwidth.

D.

Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool to have TPUs, and use the reduction server container image.

Question 20

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

Options:

A.

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.

Separate each data scientist's work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.

D.

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.

Question 21

You need to quickly build and train a model to predict the sentiment of customer reviews with custom categories without writing code. You do not have enough data to train a model from scratch. The resulting model should have high predictive performance. Which service should you use?

Options:

A.

AutoML Natural Language

B.

Cloud Natural Language API

C.

AI Hub pre-made Jupyter Notebooks

D.

AI Platform Training built-in algorithms

Question 22

You work for a social media company. You want to create a no-code image classification model for an iOS mobile application to identify fashion accessories. You have a labeled dataset in Cloud Storage. You need to configure a training workflow that minimizes cost and serves predictions with the lowest possible latency. What should you do?

Options:

A.

Train the model by using AutoML, and register the model in Vertex AI Model Registry. Configure your mobile application to send batch requests during prediction.

B.

Train the model by using AutoML Edge and export it as a Core ML model. Configure your mobile application to use the .mlmodel file directly.

C.

Train the model by using AutoML Edge and export the model as a TFLite model. Configure your mobile application to use the .tflite file directly.

D.

Train the model by using AutoML, and expose the model as a Vertex AI endpoint. Configure your mobile application to invoke the endpoint during prediction.

Question 23

You are an ML engineer at a manufacturing company. You are creating a classification model for a predictive maintenance use case. You need to predict whether a crucial machine will fail in the next three days so that the repair crew has enough time to fix the machine before it breaks. Regular maintenance of the machine is relatively inexpensive, but a failure would be very costly. You have trained several binary classifiers to predict whether the machine will fail, where a prediction of 1 means that the ML model predicts a failure.

You are now evaluating each model on an evaluation dataset. You want to choose a model that prioritizes detection while ensuring that more than 50% of the maintenance jobs triggered by your model address an imminent machine failure. Which model should you choose?

Options:

A.

The model with the highest area under the receiver operating characteristic curve (AUC ROC) and precision greater than 0.5.

B.

The model with the lowest root mean squared error (RMSE) and recall greater than 0.5.

C.

The model with the highest recall where precision is greater than 0.5.

D.

The model with the highest precision where recall is greater than 0.5.

Question 24

You work for a toy manufacturer that has been experiencing a large increase in demand. You need to build an ML model to reduce the amount of time spent by quality control inspectors checking for product defects. Faster defect detection is a priority. The factory does not have reliable Wi-Fi. Your company wants to implement the new ML model as soon as possible. Which model should you use?

Options:

A.

AutoML Vision model

B.

AutoML Vision Edge mobile-versatile-1 model

C.

AutoML Vision Edge mobile-low-latency-1 model

D.

AutoML Vision Edge mobile-high-accuracy-1 model

Question 25

You work for a hotel and have a dataset that contains customers' written comments scanned from paper-based customer feedback forms, which are stored as PDF files. Every form has the same layout. You need to quickly predict an overall satisfaction score from the customer comments on each form. How should you accomplish this task?

Options:

A.

Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores.

B.

Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores.

C.

Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores.

D.

Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores.

Question 26

You recently deployed a model to a Vertex AI endpoint and set up online serving in Vertex AI Feature Store. You have configured a daily batch ingestion job to update your featurestore. During the batch ingestion jobs, you discover that CPU utilization is high in your featurestore's online serving nodes and that feature retrieval latency is high. You need to improve online serving performance during the daily batch ingestion. What should you do?

Options:

A.

Schedule an increase in the number of online serving nodes in your featurestore prior to the batch ingestion jobs.

B.

Enable autoscaling of the online serving nodes in your featurestore.

C.

Enable autoscaling for the prediction nodes of your DeployedModel in the Vertex AI endpoint.

D.

Increase the worker counts in the importFeatureValues request of your batch ingestion job.

Question 27

You are developing a model to help your company create more targeted online advertising campaigns. You need to create a dataset that you will use to train the model. You want to avoid creating or reinforcing unfair bias in the model. What should you do?

Choose 2 answers

Options:

A.

Include a comprehensive set of demographic features.

B.

Include only the demographic groups that most frequently interact with advertisements.

C.

Collect a random sample of production traffic to build the training dataset.

D.

Collect a stratified sample of production traffic to build the training dataset.

E.

Conduct fairness tests across sensitive categories and demographics on the trained model.

Question 28

As the lead ML Engineer for your company, you are responsible for building ML models to digitize scanned customer forms. You have developed a TensorFlow model that converts the scanned images into text and stores them in Cloud Storage. You need to use your ML model on the aggregated data collected at the end of each day with minimal manual intervention. What should you do?

Options:

A.

Use the batch prediction functionality of AI Platform.

B.

Create a serving pipeline in Compute Engine for prediction.

C.

Use Cloud Functions for prediction each time a new data point is ingested.

D.

Deploy the model on AI Platform and create a version of it for online inference.

Question 29

While monitoring your model training's GPU utilization, you discover that you have a naive synchronous implementation. The training data is split into multiple files. You want to reduce the execution time of your input pipeline. What should you do?

Options:

A.

Increase the CPU load

B.

Add caching to the pipeline

C.

Increase the network bandwidth

D.

Add parallel interleave to the pipeline
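As a reference for the parallel interleave approach in option D, a minimal tf.data sketch might look like this; the file pattern is hypothetical.

```python
import tensorflow as tf

# Hypothetical pattern matching the multiple input files mentioned above.
files = tf.data.Dataset.list_files("gs://my-bucket/data/part-*.tfrecord")

# interleave reads several files concurrently instead of one after another.
dataset = files.interleave(
    tf.data.TFRecordDataset,
    cycle_length=8,                       # number of files read at the same time
    num_parallel_calls=tf.data.AUTOTUNE,  # parallelize the reads
)
dataset = dataset.prefetch(tf.data.AUTOTUNE)
```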

Question 30

You work for a company that sells corporate electronic products to thousands of businesses worldwide. Your company stores historical customer data in BigQuery. You need to build a model that predicts customer lifetime value over the next three years. You want to use the simplest approach to build the model and you want to have access to visualization tools. What should you do?

Options:

A.

Create a Vertex AI Workbench notebook to perform exploratory data analysis. Use IPython magics to create a new BigQuery table with input features. Use the BigQuery console to run the CREATE MODEL statement. Validate the results by using the ML.EVALUATE and ML.PREDICT statements.

B.

Run the CREATE MODEL statement from the BigQuery console to create an AutoML model. Validate the results by using the ML.EVALUATE and ML.PREDICT statements.

C.

Create a Vertex AI Workbench notebook to perform exploratory data analysis and create input features. Save the features as a CSV file in Cloud Storage. Import the CSV file as a new BigQuery table. Use the BigQuery console to run the CREATE MODEL statement. Validate the results by using the ML.EVALUATE and ML.PREDICT statements.

D.

Create a Vertex AI Workbench notebook to perform exploratory data analysis. Use IPython magics to create a new BigQuery table with input features, create the model, and validate the results by using the CREATE MODEL, ML.EVALUATE, and ML.PREDICT statements.
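As background for the BigQuery ML workflow these options describe, a minimal sketch of running the CREATE MODEL and ML.EVALUATE statements from Python might look like this; the project, dataset, table, and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Train a BigQuery ML model directly over a feature table (hypothetical names).
client.query("""
    CREATE OR REPLACE MODEL `my-project.sales.customer_ltv_model`
    OPTIONS (model_type = 'linear_reg', input_label_cols = ['ltv_3y']) AS
    SELECT * FROM `my-project.sales.customer_features`
""").result()

# Inspect evaluation metrics as a pandas dataframe.
eval_df = client.query("""
    SELECT * FROM ML.EVALUATE(MODEL `my-project.sales.customer_ltv_model`)
""").to_dataframe()
print(eval_df)
```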

Question 31

You are developing an ML model in a Vertex AI Workbench notebook. You want to track artifacts and compare models during experimentation using different approaches. You need to rapidly and easily transition successful experiments to production as you iterate on your model implementation. What should you do?

Options:

A.

1. Initialize the Vertex SDK with the name of your experiment. Log parameters and metrics for each experiment, and attach dataset and model artifacts as inputs and outputs to each execution.

2. After a successful experiment, create a Vertex AI pipeline.

B.

1. Initialize the Vertex SDK with the name of your experiment. Log parameters and metrics for each experiment, save your dataset to a Cloud Storage bucket, and upload the models to Vertex AI Model Registry.

2. After a successful experiment, create a Vertex AI pipeline.

C.

1. Create a Vertex AI pipeline with parameters you want to track as arguments to your PipelineJob. Use the Metrics, Model, and Dataset artifact types from the Kubeflow Pipelines DSL as the inputs and outputs of the components in your pipeline.

2. Associate the pipeline with your experiment when you submit the job.

D.

1. Create a Vertex AI pipeline. Use the Dataset and Model artifact types from the Kubeflow Pipelines DSL as the inputs and outputs of the components in your pipeline.

2. In your training component, use the Vertex AI SDK to create an experiment run. Configure the log_params and log_metrics functions to track parameters and metrics of your experiment.

Question 32

You are an ML engineer on an agricultural research team working on a crop disease detection tool to detect leaf rust spots in images of crops to determine the presence of a disease. These spots, which can vary in shape and size, are correlated to the severity of the disease. You want to develop a solution that predicts the presence and severity of the disease with high accuracy. What should you do?

Options:

A.

Create an object detection model that can localize the rust spots.

B.

Develop an image segmentation ML model to locate the boundaries of the rust spots.

C.

Develop a template matching algorithm using traditional computer vision libraries.

D.

Develop an image classification ML model to predict the presence of the disease.

Question 33

You are training an object detection machine learning model on a dataset that consists of three million X-ray images, each roughly 2 GB in size. You are using Vertex AI Training to run a custom training application on a Compute Engine instance with 32-cores, 128 GB of RAM, and 1 NVIDIA P100 GPU. You notice that model training is taking a very long time. You want to decrease training time without sacrificing model performance. What should you do?

Options:

A.

Increase the instance memory to 512 GB and increase the batch size.

B.

Replace the NVIDIA P100 GPU with a v3-32 TPU in the training job.

C.

Enable early stopping in your Vertex AI Training job.

D.

Use the tf.distribute.Strategy API and run a distributed training job.

Question 34

You need to deploy a scikit-learn classification model to production. The model must be able to serve requests 24/7, and you expect millions of requests per second to the production application from 8 am to 7 pm. You need to minimize the cost of deployment. What should you do?

Options:

A.

Deploy an online Vertex AI prediction endpoint. Set the max replica count to 1.

B.

Deploy an online Vertex AI prediction endpoint. Set the max replica count to 100.

C.

Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 1.

D.

Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 100.

Question 35

You are developing a model to detect fraudulent credit card transactions. You need to prioritize detection because missing even one fraudulent transaction could severely impact the credit card holder. You used AutoML to train a model on users' profile information and credit card transaction data. After training the initial model, you notice that the model is failing to detect many fraudulent transactions. How should you adjust the training parameters in AutoML to improve model performance?

Choose 2 answers

Options:

A.

Increase the score threshold.

B.

Decrease the score threshold.

C.

Add more positive examples to the training set.

D.

Add more negative examples to the training set.

E.

Reduce the maximum number of node hours for training.

Question 36

You are developing ML models with AI Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?

Options:

A.

Use Cloud Functions to identify changes to your code in Cloud Storage and trigger a retraining job.

B.

Use the gcloud command-line tool to submit training jobs on AI Platform when you update your code.

C.

Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository.

D.

Create an automated workflow in Cloud Composer that runs daily and looks for changes in code in Cloud Storage using a sensor.

Question 37

You have recently created a proof-of-concept (POC) deep learning model. You are satisfied with the overall architecture, but you need to determine the value for a couple of hyperparameters. You want to perform hyperparameter tuning on Vertex AI to determine both the appropriate embedding dimension for a categorical feature used by your model and the optimal learning rate. You configure the following settings:

For the embedding dimension, you set the type to INTEGER with a minValue of 16 and maxValue of 64.

For the learning rate, you set the type to DOUBLE with a minValue of 10e-05 and maxValue of 10e-02.

You are using the default Bayesian optimization tuning algorithm, and you want to maximize model accuracy. Training time is not a concern. How should you set the hyperparameter scaling for each hyperparameter and the maxParallelTrials?

Options:

A.

Use UNIT_LINEAR_SCALE for the embedding dimension, UNIT_LOG_SCALE for the learning rate, and a large number of parallel trials.

B.

Use UNIT_LINEAR_SCALE for the embedding dimension, UNIT_LOG_SCALE for the learning rate, and a small number of parallel trials.

C.

Use UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the learning rate, and a large number of parallel trials.

D.

Use UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the learning rate, and a small number of parallel trials.
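As a reference for how these settings map onto the Vertex AI SDK, a minimal sketch of the tuning job configuration might look like this; the training script path, container image, and display names are hypothetical assumptions, not part of the question.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

# Hypothetical trainer; script path and container image are assumptions.
custom_job = aiplatform.CustomJob.from_local_script(
    display_name="poc-trainer",
    script_path="trainer/task.py",
    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-gpu.2-12.py310:latest",
)

hp_job = aiplatform.HyperparameterTuningJob(
    display_name="poc-tuning",
    custom_job=custom_job,
    metric_spec={"accuracy": "maximize"},
    parameter_spec={
        "embedding_dim": hpt.IntegerParameterSpec(min=16, max=64, scale="linear"),
        "learning_rate": hpt.DoubleParameterSpec(min=1e-5, max=1e-2, scale="log"),
    },
    max_trial_count=32,
    parallel_trial_count=2,  # fewer parallel trials let Bayesian optimization learn from completed trials
)
hp_job.run()
```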

Question 38

You are working on a Neural Network-based project. The dataset provided to you has columns with different ranges. While preparing the data for model training, you discover that gradient optimization is having difficulty moving weights to a good solution. What should you do?

Options:

A.

Use feature construction to combine the strongest features.

B.

Use the representation transformation (normalization) technique.

C.

Improve the data cleaning step by removing features with missing values.

D.

Change the partitioning step to reduce the dimension of the test set and have a larger training set.
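As background for the normalization technique in option B, here is a minimal sketch of rescaling columns that have very different ranges so that gradient descent sees comparably scaled features; the values are made up.

```python
import numpy as np

# Two features on very different scales (e.g., income vs. a ratio).
X = np.array([[25_000.0, 0.3],
              [180_000.0, 0.7],
              [95_000.0, 0.5]])

# Z-score normalization per column: zero mean, unit variance.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_norm)
```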

Question 39

You have been given a dataset with sales predictions based on your company’s marketing activities. The data is structured and stored in BigQuery, and has been carefully managed by a team of data analysts. You need to prepare a report providing insights into the predictive capabilities of the data. You were asked to run several ML models with different levels of sophistication, including simple models and multilayered neural networks. You only have a few hours to gather the results of your experiments. Which Google Cloud tools should you use to complete this task in the most efficient and self-serviced way?

Options:

A.

Use BigQuery ML to run several regression models, and analyze their performance.

B.

Read the data from BigQuery using Dataproc, and run several models using SparkML.

C.

Use Vertex AI Workbench user-managed notebooks with scikit-learn code for a variety of ML algorithms and performance metrics.

D.

Train a custom TensorFlow model with Vertex AI, reading the data from BigQuery featuring a variety of ML algorithms.

Question 40

You work for a food product company. Your company's historical sales data is stored in BigQuery. You need to use Vertex AI's custom training service to train multiple TensorFlow models that read the data from BigQuery and predict future sales. You plan to implement a data preprocessing algorithm that performs min-max scaling and bucketing on a large number of features before you start experimenting with the models. You want to minimize preprocessing time, cost, and development effort. How should you configure this workflow?

Options:

A.

Write the transformations in Spark using the spark-bigquery-connector, and use Dataproc to preprocess the data.

B.

Write SQL queries to transform the data in-place in BigQuery.

C.

Add the transformations as a preprocessing layer in the TensorFlow models.

D.

Create a Dataflow pipeline that uses the BigQueryIO connector to ingest the data, process it, and write it back to BigQuery.

Question 41

You work for a retail company. You have a managed tabular dataset in Vertex AI that contains sales data from three different stores. The dataset includes several features, such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon. You need to split the data between the training, validation, and test sets. What approach should you use to split the data?

Options:

A.

Use Vertex AI manual split, using the store name feature to assign one store for each set.

B.

Use Vertex AI default data split.

C.

Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable.

D.

Use Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set.

Question 42

Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?

Options:

A.

Vertex AI Pipelines and App Engine

B.

Vertex AI Pipelines, Vertex AI Prediction, and Vertex AI Model Monitoring

C.

Cloud Composer, BigQuery ML, and Vertex AI Prediction

D.

Cloud Composer, Vertex AI Training with custom containers, and App Engine

Question 43

You work for a startup that has multiple data science workloads. Your compute infrastructure is currently on-premises, and the data science workloads are native to PySpark. Your team plans to migrate their data science workloads to Google Cloud. You need to build a proof of concept to migrate one data science job to Google Cloud. You want to propose a migration process that requires minimal cost and effort. What should you do first?

Options:

A.

Create an n2-standard-4 VM instance and install Java, Scala, and Apache Spark dependencies on it.

B.

Create a Google Kubernetes Engine cluster with a basic node pool configuration, and install Java, Scala, and Apache Spark dependencies on it.

C.

Create a Standard (1 master, 3 workers) Dataproc cluster, and run a Vertex AI Workbench notebook instance on it.

D.

Create a Vertex AI Workbench notebook with instance type n2-standard-4.

Question 44

You are designing an ML recommendation model for shoppers on your company's ecommerce website. You will use Recommendations AI to build, test, and deploy your system. How should you develop recommendations that increase revenue while following best practices?

Options:

A.

Use the "Other Products You May Like" recommendation type to increase the click-through rate.

B.

Use the "Frequently Bought Together" recommendation type to increase the shopping cart size for each order.

C.

Import your user events and then your product catalog to make sure you have the highest quality event stream.

D.

Because it will take time to collect and record product data, use placeholder values for the product catalog to test the viability of the model.

Question 45

You work for a retail company. You have been tasked with building a model to determine the probability of churn for each customer. You need the predictions to be interpretable so the results can be used to develop marketing campaigns that target at-risk customers. What should you do?

Options:

A.

Build a random forest regression model in a Vertex AI Workbench notebook instance. Configure the model to generate feature importances after the model is trained.

B.

Build an AutoML tabular regression model. Configure the model to generate explanations when it makes predictions.

C.

Build a custom TensorFlow neural network by using Vertex AI custom training. Configure the model to generate explanations when it makes predictions.

D.

Build a random forest classification model in a Vertex AI Workbench notebook instance. Configure the model to generate feature importances after the model is trained.

Question 46

You have successfully deployed to production a large and complex TensorFlow model trained on tabular data. You want to predict the lifetime value (LTV) field for each subscription stored in the BigQuery table named subscription.subscriptionPurchase in the project named my-fortune500-company-project.

You have organized all your training code, from preprocessing data from the BigQuery table up to deploying the validated model to the Vertex AI endpoint, into a TensorFlow Extended (TFX) pipeline. You want to prevent prediction drift, i.e., a situation when a feature data distribution in production changes significantly over time. What should you do?

Options:

A.

Implement continuous retraining of the model daily using Vertex AI Pipelines.

B.

Add a model monitoring job where 10% of incoming predictions are sampled every 24 hours.

C.

Add a model monitoring job where 90% of incoming predictions are sampled every 24 hours.

D.

Add a model monitoring job where 10% of incoming predictions are sampled every hour.

Question 47

You are developing an ML model that predicts the cost of used automobiles based on data such as location, condition, model type, color, and engine/battery efficiency. The data is updated every night. Car dealerships will use the model to determine appropriate car prices. You created a Vertex AI pipeline that reads the data, splits the data into training/evaluation/test sets, performs feature engineering, trains the model by using the training dataset, and validates the model by using the evaluation dataset. You need to configure a retraining workflow that minimizes cost. What should you do?

Options:

A.

Compare the training and evaluation losses of the current run. If the losses are similar, deploy the model to a Vertex AI endpoint. Configure a cron job to redeploy the pipeline every night.

B.

Compare the training and evaluation losses of the current run. If the losses are similar, deploy the model to a Vertex AI endpoint with training/serving skew threshold model monitoring. When the model monitoring threshold is triggered, redeploy the pipeline.

C.

Compare the results to the evaluation results from a previous run. If the performance improved, deploy the model to a Vertex AI endpoint. Configure a cron job to redeploy the pipeline every night.

D.

Compare the results to the evaluation results from a previous run. If the performance improved, deploy the model to a Vertex AI endpoint with training/serving skew threshold model monitoring. When the model monitoring threshold is triggered, redeploy the pipeline.

Question 48

You work for a credit card company and have been asked to create a custom fraud detection model based on historical data using AutoML Tables. You need to prioritize detection of fraudulent transactions while minimizing false positives. Which optimization objective should you use when training the model?

Options:

A.

An optimization objective that minimizes Log loss

B.

An optimization objective that maximizes the Precision at a Recall value of 0.50

C.

An optimization objective that maximizes the area under the precision-recall curve (AUC PR) value

D.

An optimization objective that maximizes the area under the receiver operating characteristic curve (AUC ROC) value

Question 49

You work for a public transportation company and need to build a model to estimate delay times for multiple transportation routes. Predictions are served directly to users in an app in real time. Because different seasons and population increases impact the data relevance, you will retrain the model every month. You want to follow Google-recommended best practices. How should you configure the end-to-end architecture of the predictive model?

Options:

A.

Configure Kubeflow Pipelines to schedule your multi-step workflow from training to deploying your model.

B.

Use a model trained and deployed on BigQuery ML, and trigger retraining with the scheduled query feature in BigQuery.

C.

Write a Cloud Functions script that launches a training and deploying job on AI Platform that is triggered by Cloud Scheduler.

D.

Use Cloud Composer to programmatically schedule a Dataflow job that executes the workflow from training to deploying your model.

Question 50

You need to design an architecture that serves asynchronous predictions to determine whether a particular mission-critical machine part will fail. Your system collects data from multiple sensors from the machine. You want to build a model that will predict a failure in the next N minutes, given the average of each sensor’s data from the past 12 hours. How should you design the architecture?

Options:

A.

1. HTTP requests are sent by the sensors to your ML model, which is deployed as a microservice and exposes a REST API for prediction

2. Your application queries a Vertex AI endpoint where you deployed your model.

3. Responses are received by the caller application as soon as the model produces the prediction.

B.

1. Events are sent by the sensors to Pub/Sub, consumed in real time, and processed by a Dataflow stream processing pipeline.

2. The pipeline invokes the model for prediction and sends the predictions to another Pub/Sub topic.

3. Pub/Sub messages containing predictions are then consumed by a downstream system for monitoring.

C.

1. Export your data to Cloud Storage using Dataflow.

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into Cloud SQL.

D.

1. Export the data to Cloud Storage using the BigQuery command-line tool

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into BigQuery.

Question 51

You manage a team of data scientists who use a cloud-based backend system to submit training jobs. This system has become very difficult to administer, and you want to use a managed service instead. The data scientists you work with use many different frameworks, including Keras, PyTorch, Theano, scikit-learn, and custom libraries. What should you do?

Options:

A.

Use the AI Platform custom containers feature to receive training jobs using any framework.

B.

Configure Kubeflow to run on Google Kubernetes Engine and receive training jobs through TFJob.

C.

Create a library of VM images on Compute Engine, and publish these images in a centralized repository.

D.

Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.

Question 52

You are building a linear model with over 100 input features, all with values between -1 and 1. You suspect that many features are non-informative. You want to remove the non-informative features from your model while keeping the informative ones in their original form. Which technique should you use?

Options:

A.

Use Principal Component Analysis to eliminate the least informative features.

B.

Use L1 regularization to reduce the coefficients of uninformative features to 0.

C.

After building your model, use Shapley values to determine which features are the most informative.

D.

Use an iterative dropout technique to identify which features do not degrade the model when removed.
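As background for option B, here is a minimal sketch showing how L1 regularization drives the coefficients of uninformative features to exactly zero while leaving informative features in their original form; the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: 100 features in [-1, 1], only the first two are informative.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 100))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Lasso applies an L1 penalty; alpha controls how aggressively coefficients shrink to 0.
model = Lasso(alpha=0.05).fit(X, y)
print("non-zero coefficients:", np.count_nonzero(model.coef_))
```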

Question 53

You are analyzing customer data for a healthcare organization that is stored in Cloud Storage. The data contains personally identifiable information (PII). You need to perform data exploration and preprocessing while ensuring the security and privacy of sensitive fields. What should you do?

Options:

A.

Use the Cloud Data Loss Prevention (DLP) API to de-identify the PII before performing data exploration and preprocessing.

B.

Use customer-managed encryption keys (CMEK) to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.

C.

Use a VM inside a VPC Service Controls security perimeter to perform data exploration and preprocessing.

D.

Use Google-managed encryption keys to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.

Question 54

You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?

Options:

A.

Weight pruning

B.

Dynamic range quantization

C.

Model distillation

D.

Dimensionality reduction
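As a reference for option B, a minimal sketch of post-training dynamic range quantization with the TFLite converter might look like this; the SavedModel directory is hypothetical, and no retraining is required.

```python
import tensorflow as tf

# Hypothetical directory containing the already-trained SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables post-training dynamic range quantization
tflite_model = converter.convert()

with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```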

Question 55

You have developed an AutoML tabular classification model that identifies high-value customers who interact with your organization's website. You plan to deploy the model to a new Vertex AI endpoint that will integrate with your website application. You expect higher traffic to the website during nights and weekends. You need to configure the model endpoint's deployment settings to minimize latency and cost. What should you do?

Options:

A.

Configure the model deployment settings to use an n1-standard-32 machine type.

B.

Configure the model deployment settings to use an n1-standard-4 machine type. Set the minReplicaCount value to 1 and the maxReplicaCount value to 8.

C.

Configure the model deployment settings to use an n1-standard-4 machine type and a GPU accelerator. Set the minReplicaCount value to 1 and the maxReplicaCount value to 4.

D.

Configure the model deployment settings to use an n1-standard-8 machine type and a GPU accelerator.
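As a reference for how machine type and replica bounds are expressed in the Vertex AI SDK, a minimal deployment sketch might look like this; the project, location, and model resource name are hypothetical.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Hypothetical model resource name from Vertex AI Model Registry.
model = aiplatform.Model("projects/my-project/locations/us-central1/models/1234567890")

endpoint = model.deploy(
    machine_type="n1-standard-4",
    min_replica_count=1,   # one node covers quiet periods at low cost
    max_replica_count=8,   # additional nodes absorb night and weekend peaks
)
```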

Question 56

You are an ML engineer at a bank that has a mobile application. Management has asked you to build an ML-based biometric authentication for the app that verifies a customer's identity based on their fingerprint. Fingerprints are considered highly sensitive personal information and cannot be downloaded and stored into the bank databases. Which learning strategy should you recommend to train and deploy this ML model?

Options:

A.

Differential privacy

B.

Federated learning

C.

MD5 to encrypt data

D.

Data Loss Prevention API

Question 57

You need to execute a batch prediction on 100 million records in a BigQuery table with a custom TensorFlow DNN regressor model, and then store the predicted results in a BigQuery table. You want to minimize the effort required to build this inference pipeline. What should you do?

Options:

A.

Import the TensorFlow model with BigQuery ML, and run the ml.predict function.

B.

Use the TensorFlow BigQuery reader to load the data, and use the BigQuery API to write the results to BigQuery.

C.

Create a Dataflow pipeline to convert the data in BigQuery to TFRecords. Run a batch inference on Vertex AI Prediction, and write the results to BigQuery.

D.

Load the TensorFlow SavedModel in a Dataflow pipeline. Use the BigQuery I/O connector with a custom function to perform the inference within the pipeline, and write the results to BigQuery.
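As a reference for the imported-model approach named in option A, a minimal sketch of importing a TensorFlow SavedModel into BigQuery ML and scoring with ml.predict might look like this; the project, dataset, table names, and model path are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Import the TensorFlow SavedModel into BigQuery ML (hypothetical path).
client.query("""
    CREATE OR REPLACE MODEL `my-project.analytics.dnn_regressor`
    OPTIONS (model_type = 'TENSORFLOW',
             model_path = 'gs://my-bucket/saved_model/*')
""").result()

# Run batch inference in place and write the predictions to a BigQuery table.
client.query("""
    CREATE OR REPLACE TABLE `my-project.analytics.predictions` AS
    SELECT * FROM ML.PREDICT(
        MODEL `my-project.analytics.dnn_regressor`,
        TABLE `my-project.analytics.input_records`)
""").result()
```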

Question 58

You are developing a model to predict whether a failure will occur in a critical machine part. You have a dataset consisting of a multivariate time series and labels indicating whether the machine part failed. You recently started experimenting with a few different preprocessing and modeling approaches in a Vertex AI Workbench notebook. You want to log data and track artifacts from each run. How should you set up your experiments?

Options:

A.

B.

C.

D.

Question 59

Your data science team needs to rapidly experiment with various features, model architectures, and hyperparameters. They need to track the accuracy metrics for various experiments and use an API to query the metrics over time. What should they use to track and report their experiments while minimizing manual effort?

Options:

A.

Use Kubeflow Pipelines to execute the experiments. Export the metrics file, and query the results using the Kubeflow Pipelines API.

B.

Use AI Platform Training to execute the experiments. Write the accuracy metrics to BigQuery, and query the results using the BigQuery API.

C.

Use AI Platform Training to execute the experiments. Write the accuracy metrics to Cloud Monitoring, and query the results using the Monitoring API.

D.

Use AI Platform Notebooks to execute the experiments. Collect the results in a shared Google Sheets file, and query the results using the Google Sheets API.

Question 60

You are building a custom image classification model and plan to use Vertex AI Pipelines to implement the end-to-end training. Your dataset consists of images that need to be preprocessed before they can be used to train the model. The preprocessing steps include resizing the images, converting them to grayscale, and extracting features. You have already implemented some Python functions for the preprocessing tasks. Which components should you use in your pipeline?

Options:

A.

B.

C.

D.

Question 61

Your team frequently creates new ML models and runs experiments. Your team pushes code to a single repository hosted on Cloud Source Repositories. You want to create a continuous integration pipeline that automatically retrains the models whenever there is any modification of the code. What should be your first step to set up the CI pipeline?

Options:

A.

Configure a Cloud Build trigger with the event set as "Pull Request"

B.

Configure a Cloud Build trigger with the event set as "Push to a branch"

C.

Configure a Cloud Function that builds the repository each time there is a code change.

D.

Configure a Cloud Function that builds the repository each time a new branch is created.

Question 62

You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the performance of your classifier?

Options:

A.

Write your data in TFRecords.

B.

Z-normalize all the numeric features.

C.

Oversample the fraudulent transactions 10 times.

D.

Use one-hot encoding on all categorical features.
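As background for the oversampling strategy in option C, a minimal pandas sketch might look like this; the file and column names are hypothetical.

```python
import pandas as pd

df = pd.read_csv("transactions.csv")          # hypothetical transaction data
fraud = df[df["is_fraud"] == 1]

# Replicate the 1% fraudulent rows 10x so the classifier sees a less skewed class balance.
oversampled = pd.concat([df] + [fraud] * 9, ignore_index=True)
print(oversampled["is_fraud"].mean())
```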

Question 63

You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while assuring that the performance is uniform across the various languages and without changing the serving infrastructure.

You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?

Options:

A.

Add a regularization term such as the Min-Diff algorithm to the loss function.

B.

Train a classifier using the chat messages in their original language.

C.

Replace the in-house word2vec with GPT-3 or T5.

D.

Remove moderation for languages for which the false positive rate is too high.

Question 64

Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?

Options:

A.

Vertex AI Pipelines and App Engine

B.

Vertex AI Pipelines and AI Platform Prediction

C.

Cloud Composer, BigQuery ML, and AI Platform Prediction

D.

Cloud Composer, AI Platform Training with custom containers, and App Engine

Question 65

You are an ML engineer at a bank. You have developed a binary classification model using AutoML Tables to predict whether a customer will make loan payments on time. The output is used to approve or reject loan requests. One customer's loan request has been rejected by your model, and the bank's risk department is asking you to provide the reasons that contributed to the model's decision. What should you do?

Options:

A.

Use local feature importance from the predictions.

B.

Use the correlation with target values in the data summary page.

C.

Use the feature importance percentages in the model evaluation page.

D.

Vary features independently to identify the threshold per feature that changes the classification.

Question 66

You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?

Options:

A.

Create a Vertex AI Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables.

B.

Create a Vertex AI Workbench managed notebook to browse and query the tables directly from the JupyterLab interface.

C.

Create a Vertex AI Workbench user-managed notebook on a Dataproc Hub, and use the %%bigquery magic commands in Jupyter to query the tables.

D.

Create a Vertex AI Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables.
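For reference, here is a minimal sketch of pulling a BigQuery table into a pandas DataFrame from a notebook with the google-cloud-bigquery client, which is what the %%bigquery cell magic mentioned in the options does under the hood. The project, dataset, table, and column names are placeholders.

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table and columns; adjust to your schema.
query = """
    SELECT customer_id, product_id, purchase_amount
    FROM `my_project.retail.transactions`
    LIMIT 1000
"""
transactions_df = client.query(query).to_dataframe()
print(transactions_df.head())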

Question 67

Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What approach should you take?

Options:

A.

1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station.

2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.

B.

1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station.

2. Dispatch an available shuttle and provide the map with the required stops based on the prediction.

C.

1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints.

2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.

D.

1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents, and a reward function around a distance-based metric.

2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.

Question 68

You work for an organization that operates a streaming music service. You have a custom production model that is serving a "next song" recommendation based on a user's recent listening history. Your model is deployed on a Vertex AI endpoint. You recently retrained the same model by using fresh data. The model received positive test results offline. You now want to test the new model in production while minimizing complexity. What should you do?

Options:

A.

Create a new Vertex AI endpoint for the new model, and deploy the new model to that new endpoint. Build a service to randomly send 5% of production traffic to the new endpoint. Monitor end-user metrics such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new endpoint.

B.

Capture incoming prediction requests in BigQuery. Create an experiment in Vertex AI Experiments. Run batch predictions for both models using the captured data. Use the user's selected song to compare the models' performance side by side. If the new model's performance metrics are better than the previous model's, deploy the new model to production.

C.

Deploy the new model to the existing Vertex AI endpoint. Use traffic splitting to send 5% of production traffic to the new model. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new model.

D.

Configure a model monitoring job for the existing Vertex AI endpoint. Configure the monitoring job to detect prediction drift, and set a threshold for alerts. Update the model on the endpoint from the previous model to the new model. If you receive an alert of prediction drift, revert to the previous model.
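To illustrate the traffic-splitting approach described in option C, here is a minimal sketch using the google-cloud-aiplatform SDK. The project, endpoint and model resource names, and machine type are placeholders.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Placeholders: the existing endpoint and the newly retrained model.
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)
new_model = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/0987654321"
)

# Deploy the retrained model to the SAME endpoint and route 5% of traffic to it;
# the remaining 95% keeps flowing to the currently deployed model.
endpoint.deploy(
    model=new_model,
    machine_type="n1-standard-4",
    traffic_percentage=5,
)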

Question 69

You have been asked to develop an input pipeline for an ML training model that processes images from disparate sources at a low latency. You discover that your input data does not fit in memory. How should you create a dataset following Google-recommended best practices?

Options:

A.

Create a tf.data.Dataset.prefetch transformation

B.

Convert the images to tf.Tensor objects, and then run tf.data.Dataset.from_tensor_slices().

C.

Convert the images to tf.Tensor objects, and then run tf.data.Dataset.from_tensors().

D.

Convert the images into TFRecords, store the images in Cloud Storage, and then use the tf.data API to read the images for training.
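Here is a minimal sketch of the pattern in option D: images serialized as TFRecords in Cloud Storage and streamed with tf.data, so the dataset never has to fit in memory. The bucket path, feature keys, and image size are assumptions.

import tensorflow as tf

FILE_PATTERN = "gs://my-bucket/images/train-*.tfrecord"  # placeholder path
FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, FEATURES)
    image = tf.io.decode_jpeg(example["image"], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, example["label"]

files = tf.data.Dataset.list_files(FILE_PATTERN)
dataset = (
    files.interleave(tf.data.TFRecordDataset,
                     num_parallel_calls=tf.data.AUTOTUNE)  # parallel reads from Cloud Storage
         .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
         .shuffle(10_000)
         .batch(64)
         .prefetch(tf.data.AUTOTUNE)  # overlap preprocessing with training
)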

Question 70

You are working with a dataset that contains customer transactions. You need to build an ML model to predict customer purchase behavior. You plan to develop the model in BigQuery ML and export it to Cloud Storage for online prediction. You notice that the input data contains a few categorical features, including product category and payment method. You want to deploy the model as quickly as possible. What should you do?

Options:

A.

Use the TRANSFORM clause with the ML.ONE_HOT_ENCODER function on the categorical features at model creation, and select the categorical and non-categorical features.

B.

Use the ML.ONE_HOT_ENCODER function on the categorical features, and select the encoded categorical features and non-categorical features as inputs to create your model.

C.

Use the CREATE MODEL statement and select the categorical and non-categorical features.

D.

Use the ML.ONE_HOT_ENCODER function on the categorical features, and select the encoded categorical features and non-categorical features as inputs to create your model.
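To illustrate the plain CREATE MODEL route in option C, here is a sketch that trains a BigQuery ML classifier directly on the raw features; BigQuery ML encodes categorical columns such as product category and payment method automatically for this model type. Project, dataset, table, and column names are placeholders.

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder names; adjust dataset, table, and columns to your schema.
create_model_sql = """
CREATE OR REPLACE MODEL `my_project.retail.purchase_model`
OPTIONS (
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['purchased']
) AS
SELECT
  product_category,   -- categorical, encoded automatically by BigQuery ML
  payment_method,     -- categorical, encoded automatically by BigQuery ML
  transaction_amount,
  purchased
FROM `my_project.retail.transactions`
"""
client.query(create_model_sql).result()  # wait for training to finish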

Question 71

One of your models is trained using data provided by a third-party data broker. The data broker does not reliably notify you of formatting changes in the data. You want to make your model training pipeline more robust to issues like this. What should you do?

Options:

A.

Use TensorFlow Data Validation to detect and flag schema anomalies.

B.

Use TensorFlow Transform to create a preprocessing component that will normalize data to the expected distribution, and replace values that don’t match the schema with 0.

C.

Use tf.math to analyze the data, compute summary statistics, and flag statistical anomalies.

D.

Use custom TensorFlow functions at the start of your model training to detect and flag known formatting errors.
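Here is a minimal sketch of option A with TensorFlow Data Validation: infer a schema from a trusted historical batch, then flag anomalies whenever a new delivery from the broker arrives. The dataframe names are placeholders.

import tensorflow_data_validation as tfdv

# baseline_df: a trusted historical batch; new_df: the latest delivery from the broker.
baseline_stats = tfdv.generate_statistics_from_dataframe(baseline_df)
schema = tfdv.infer_schema(baseline_stats)

new_stats = tfdv.generate_statistics_from_dataframe(new_df)
anomalies = tfdv.validate_statistics(statistics=new_stats, schema=schema)

# Fail the pipeline (or alert) if the broker changed the data format.
if anomalies.anomaly_info:
    raise ValueError(
        f"Schema anomalies detected for features: {list(anomalies.anomaly_info)}"
    )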

Question 72

You are training an object detection model using a Cloud TPU v2. Training time is taking longer than expected. Based on this simplified trace obtained with the Cloud TPU profiler, what action should you take to decrease training time in a cost-efficient way?

Options:

A.

Move from Cloud TPU v2 to Cloud TPU v3 and increase batch size.

B.

Move from Cloud TPU v2 to 8 NVIDIA V100 GPUs and increase batch size.

C.

Rewrite your input function to resize and reshape the input images.

D.

Rewrite your input function using parallel reads, parallel processing, and prefetch.

Question 73

You recently trained an XGBoost model on tabular data. You plan to expose the model for internal use as an HTTP microservice. After deployment you expect a small number of incoming requests. You want to productionize the model with the least amount of effort and latency. What should you do?

Options:

A.

Deploy the model to BigQuery ML by using CREATE MODEL with the BOOSTED_TREE_REGRESSOR model type, and invoke the BigQuery API from the microservice.

B.

Build a Flask-based app. Package the app in a custom container on Vertex AI, and deploy it to Vertex AI Endpoints.

C.

Build a Flask-based app. Package the app in a Docker image, and deploy it to Google Kubernetes Engine in Autopilot mode.

D.

Use a prebuilt XGBoost Vertex AI container to create a model, and deploy it to Vertex AI Endpoints.
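As an illustration of option D, here is a sketch using the google-cloud-aiplatform SDK to upload a saved XGBoost model with a prebuilt serving container and deploy it to an endpoint. The project, artifact path, machine type, and container image URI are placeholders; check the current list of prebuilt prediction containers for the exact image to use.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Placeholders: GCS folder containing the saved model and a prebuilt XGBoost serving image.
model = aiplatform.Model.upload(
    display_name="xgboost-tabular-model",
    artifact_uri="gs://my-bucket/models/xgboost/",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/xgboost-cpu.1-7:latest"
    ),
)

endpoint = model.deploy(machine_type="n1-standard-2")
prediction = endpoint.predict(instances=[[0.3, 1.2, 5.0]])  # example feature vector
print(prediction.predictions)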

Question 74

You are creating a social media app where pet owners can post images of their pets. You have one million user-uploaded images with hashtags. You want to build a comprehensive system that recommends images to users that are similar in appearance to their own uploaded images.

What should you do?

Options:

A.

Download a pretrained convolutional neural network, and fine-tune the model to predict hashtags based on the input images. Use the predicted hashtags to make recommendations.

B.

Retrieve image labels and dominant colors from the input images using the Vision API. Use these properties and the hashtags to make recommendations.

C.

Use the provided hashtags to create a collaborative filtering algorithm to make recommendations.

D.

Download a pretrained convolutional neural network, and use the model to generate embeddings of the input images. Measure similarity between embeddings to make recommendations.
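Here is a minimal sketch of the approach in option D: use a pretrained CNN as a feature extractor, embed each image once, and recommend by cosine similarity between embeddings. The image-loading details and model choice are simplifying assumptions.

import numpy as np
import tensorflow as tf

# Pretrained CNN without the classification head -> one pooled embedding per image.
embedder = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg"
)

def embed(images: np.ndarray) -> np.ndarray:
    # images: float array of shape (n, 224, 224, 3), already resized.
    preprocessed = tf.keras.applications.resnet50.preprocess_input(images)
    return embedder.predict(preprocessed, verbose=0)

def most_similar(query_vec: np.ndarray, catalog_vecs: np.ndarray, top_k: int = 5):
    # Cosine similarity between the query embedding and all catalog embeddings.
    q = query_vec / np.linalg.norm(query_vec)
    c = catalog_vecs / np.linalg.norm(catalog_vecs, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(scores)[::-1][:top_k]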

Question 75

You are training an LSTM-based model on AI Platform to summarize text using the following job submission script:

You want to ensure that training time is minimized without significantly compromising the accuracy of your model. What should you do?

Options:

A.

Modify the 'epochs' parameter

B.

Modify the 'scale-tier' parameter

C.

Modify the 'batch size' parameter

D.

Modify the 'learning rate' parameter

Question 76

You need to design a customized deep neural network in Keras that will predict customer purchases based on their purchase history. You want to explore model performance using multiple model architectures, store training data, and be able to compare the evaluation metrics in the same dashboard. What should you do?

Options:

A.

Create multiple models using AutoML Tables

B.

Automate multiple training runs using Cloud Composer

C.

Run multiple training jobs on AI Platform with similar job names

D.

Create an experiment in Kubeflow Pipelines to organize multiple runs

Question 77

You are developing a model to identify traffic signs in images extracted from videos taken from the dashboard of a vehicle. You have a dataset of 100,000 images that were cropped to show one out of ten different traffic signs. The images have been labeled accordingly for model training and are stored in a Cloud Storage bucket. You need to be able to tune the model during each training run. How should you train the model?

Options:

A.

Train a model for object detection by using Vertex AI AutoML.

B.

Train a model for image classification by using Vertex AI AutoML.

C.

Develop the model training code for object detection, and train a model by using Vertex AI custom training.

D.

Develop the model training code for image classification, and train a model by using Vertex AI custom training.

Question 78

You trained a model, packaged it with a custom Docker container for serving, and deployed it to Vertex AI Model Registry. When you submit a batch prediction job, it fails with this error: "Error: model server never became ready. Please validate that your model file or container configuration are valid." There are no additional errors in the logs. What should you do?

Options:

A.

Add a logging configuration to your application to emit logs to Cloud Logging.

B.

Change the HTTP port in your model's configuration to the default value of 8080.

C.

Change the healthRoute value in your model's configuration to /healthcheck.

D.

Pull the Docker image locally, and use the docker run command to launch it locally. Use the docker logs command to explore the error logs.

Question 79

You created a model that uses BigQuery ML to perform linear regression. You need to retrain the model on the cumulative data collected every week. You want to minimize the development effort and the scheduling cost. What should you do?

Options:

A.

Use BigQuery's scheduling service to run the model retraining query periodically.

B.

Create a pipeline in Vertex AI Pipelines that executes the retraining query, and use the Cloud Scheduler API to run the query weekly.

C.

Use Cloud Scheduler to trigger a Cloud Function every week that runs the query for retraining the model.

D.

Use the BigQuery API Connector and Cloud Scheduler to trigger Workflows every week to retrain the model.
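For context on option A, here is a sketch of the retraining statement that a BigQuery scheduled query would run each week; CREATE OR REPLACE MODEL retrains the linear regression on the cumulative table every time it runs. The project, dataset, table, and label names are placeholders, and the weekly schedule itself would be configured in BigQuery's scheduled queries rather than in this snippet.

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder names; this statement retrains the model on all data collected to date.
retrain_sql = """
CREATE OR REPLACE MODEL `my_project.analytics.weekly_lr_model`
OPTIONS (model_type = 'LINEAR_REG', input_label_cols = ['target']) AS
SELECT * FROM `my_project.analytics.cumulative_training_data`
"""
client.query(retrain_sql).result()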

Question 80

You are training a ResNet model on AI Platform using TPUs to visually categorize types of defects in automobile engines. You capture the training profile using the Cloud TPU profiler plugin and observe that it is highly input-bound. You want to reduce the bottleneck and speed up your model training process. Which modifications should you make to the tf.data dataset?

Choose 2 answers

Options:

A.

Use the interleave option for reading data

B.

Reduce the value of the repeat parameter

C.

Increase the buffer size for the shuffle option.

D.

Set the prefetch option equal to the training batch size

E.

Decrease the batch size argument in your transformation
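As a short illustration of the two input-pipeline changes in options A and D, here is a compact sketch that interleaves parallel file reads and prefetches upcoming batches so the TPU is not starved waiting on input. The file pattern, batch size, and cycle length are placeholders, and the record-parsing step is omitted for brevity.

import tensorflow as tf

file_ds = tf.data.Dataset.list_files("gs://my-bucket/defects/train-*.tfrecord")  # placeholder path

dataset = (
    file_ds
    # Option A: read several TFRecord shards concurrently instead of one at a time.
    .interleave(tf.data.TFRecordDataset, cycle_length=16,
                num_parallel_calls=tf.data.AUTOTUNE)
    .batch(128, drop_remainder=True)
    # Option D in spirit: keep upcoming batches staged so the TPU never waits on input.
    .prefetch(tf.data.AUTOTUNE)
)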