Summer Special Flat 65% Limited Time Discount offer - Ends in 0d 00h 00m 00s - Coupon code: suredis

Google Professional-Cloud-DevOps-Engineer Google Cloud Certified - Professional Cloud DevOps Engineer Exam Exam Practice Test

Google Cloud Certified - Professional Cloud DevOps Engineer Exam Questions and Answers

Question 1

You use Cloud Build to build and deploy your application. You want to securely incorporate database credentials and other application secrets into the build pipeline. You also want to minimize the development effort. What should you do?

Options:

A.

Create a Cloud Storage bucket and use the built-in encryption at rest. Store the secrets in the bucket and grant Cloud Build access to the bucket.

B.

Encrypt the secrets and store them in the application repository. Store a decryption key in a separate repository and grant Cloud Build access to the repository.

C.

Use client-side encryption to encrypt the secrets and store them in a Cloud Storage bucket. Store a decryption key in the bucket and grant Cloud Build access to the bucket.

D.

Use Cloud Key Management Service (Cloud KMS) to encrypt the secrets and include them in your Cloud Build deployment configuration. Grant Cloud Build access to the KeyRing.

Question 2

Your organization recently adopted a container-based workflow for application development. Your team develops numerous applications that are deployed continuously through an automated build pipeline to the production environment. A recent security audit alerted your team that the code pushed to production could contain vulnerabilities and that the existing tooling around virtual machine (VM) vulnerabilities no longer applies to the containerized environment. You need to ensure the security and patch level of all code running through the pipeline. What should you do?

Options:

A.

Set up Container Analysis to scan and report Common Vulnerabilities and Exposures.

B.

Configure the containers in the build pipeline to always update themselves before release.

C.

Reconfigure the existing operating system vulnerability software to exist inside the container.

D.

Implement static code analysis tooling against the Docker files used to create the containers.

Question 3

You support a high-traffic web application that runs on Google Cloud Platform (GCP). You need to measure application reliability from a user perspective without making any engineering changes to it. What should you do?

Choose 2 answers

Options:

A.

Review current application metrics and add new ones as needed.

B.

Modify the code to capture additional information for user interaction.

C.

Analyze the web proxy logs only and capture response time of each request.

D.

Create new synthetic clients to simulate a user journey using the application.

E.

Use current and historic Request Logs to trace customer interaction with the application.

Question 4

You have an application running in Google Kubernetes Engine. The application invokes multiple services per request but responds too slowly. You need to identify which downstream service or services are causing the delay. What should you do?

Options:

A.

Analyze VPC flow logs along the path of the request.

B.

Investigate the Liveness and Readiness probes for each service.

C.

Create a Dataflow pipeline to analyze service metrics in real time.

D.

Use a distributed tracing framework such as OpenTelemetry or Stackdriver Trace.

Question 5

You support a stateless web-based API that is deployed on a single Compute Engine instance in the europe-west2-a zone . The Service Level Indicator (SLI) for service availability is below the specified Service Level Objective (SLO). A postmortem has revealed that requests to the API regularly time out. The time outs are due to the API having a high number of requests and running out memory. You want to improve service availability. What should you do?

Options:

A.

Change the specified SLO to match the measured SLI.

B.

Move the service to higher-specification compute instances with more memory.

C.

Set up additional service instances in other zones and load balance the traffic between all instances.

D.

Set up additional service instances in other zones and use them as a failover in case the primary instance is unavailable.

Question 6

You are managing an application that exposes an HTTP endpoint without using a load balancer. The latency of the HTTP responses is important for the user experience. You want to understand what HTTP latencies all of your users are experiencing. You use Stackdriver Monitoring. What should you do?

Options:

A.

• In your application, create a metric with a metricKind set to DELTA and a valueType set to DOUBLE.

• In Stackdriver's Metrics Explorer, use a Slacked Bar graph to visualize the metric.

B.

• In your application, create a metric with a metricKind set to CUMULATIVE and a valueType set to DOUBLE.

• In Stackdriver's Metrics Explorer, use a Line graph to visualize the metric.

C.

• In your application, create a metric with a metricKind set to gauge and a valueType set to distribution.

• In Stackdriver's Metrics Explorer, use a Heatmap graph to visualize the metric.

D.

• In your application, create a metric with a metricKind. set toMETRlc_KIND_UNSPECIFIEDanda valueType set to INT64.

• In Stackdriver's Metrics Explorer, use a Stacked Area graph to visualize the metric.

Question 7

You support a web application that runs on App Engine and uses CloudSQL and Cloud Storage for data storage. After a short spike in website traffic, you notice a big increase in latency for all user requests, increase in CPU use, and the number of processes running the application. Initial troubleshooting reveals:

After the initial spike in traffic, load levels returned to normal but users still experience high latency.

Requests for content from the CloudSQL database and images from Cloud Storage show the same high latency.

No changes were made to the website around the time the latency increased.

There is no increase in the number of errors to the users.

You expect another spike in website traffic in the coming days and want to make sure users don’t experience latency. What should you do?

Options:

A.

Upgrade the GCS buckets to Multi-Regional.

B.

Enable high availability on the CloudSQL instances.

C.

Move the application from App Engine to Compute Engine.

D.

Modify the App Engine configuration to have additional idle instances.

Question 8

Your company experiences bugs, outages, and slowness in its production systems. Developers use the production environment for new feature development and bug fixes. Configuration and experiments are done in the production environment, causing outages for users. Testers use the production environment for load testing, which often slows the production systems. You need to redesign the environment to reduce the number of bugs and outages in production and to enable testers to load test new features. What should you do?

Options:

A.

Create an automated testing script in production to detect failures as soon as they occur.

B.

Create a development environment with smaller server capacity and give access only to developers and testers.

C.

Secure the production environment to ensure that developers can't change it and set up one controlled update per year.

D.

Create a development environment for writing code and a test environment for configurations, experiments, and load testing.

Question 9

You are running an experiment to see whether your users like a new feature of a web application. Shortly after deploying the feature as a canary release, you receive a spike in the number of 500 errors sent to users, and your monitoring reports show increased latency. You want to quickly minimize the negative impact on users. What should you do first?

Options:

A.

Roll back the experimental canary release.

B.

Start monitoring latency, traffic, errors, and saturation.

C.

Record data for the postmortem document of the incident.

D.

Trace the origin of 500 errors and the root cause of increased latency.

Question 10

You are on-call for an infrastructure service that has a large number of dependent systems. You receive an alert indicating that the service is failing to serve most of its requests and all of its dependent systems with hundreds of thousands of users are affected. As part of your Site Reliability Engineering (SRE) incident management protocol, you declare yourself Incident Commander (IC) and pull in two experienced people from your team as Operations Lead (OLJ and Communications Lead (CL). What should you do next?

Options:

A.

Look for ways to mitigate user impact and deploy the mitigations to production.

B.

Contact the affected service owners and update them on the status of the incident.

C.

Establish a communication channel where incident responders and leads can communicate with each other.

D.

Start a postmortem, add incident information, circulate the draft internally, and ask internal stakeholders for input.

Question 11

Your company follows Site Reliability Engineering practices. You are the Incident Commander for a new. customer-impacting incident. You need to immediately assign two incident management roles to assist you in an effective incident response. What roles should you assign?

Choose 2 answers

Options:

A.

Operations Lead

B.

Engineering Lead

C.

Communications Lead

D.

Customer Impact Assessor

E.

External Customer Communications Lead

Question 12

You encounter a large number of outages in the production systems you support. You receive alerts for all the outages that wake you up at night. The alerts are due to unhealthy systems that are automatically restarted within a minute. You want to set up a process that would prevent staff burnout while following Site Reliability Engineering practices. What should you do?

Options:

A.

Eliminate unactionable alerts.

B.

Create an incident report for each of the alerts.

C.

Distribute the alerts to engineers in different time zones.

D.

Redefine the related Service Level Objective so that the error budget is not exhausted.

Question 13

Your company runs applications in Google Kubernetes Engine (GKE) that are deployed following a GitOps methodology.

Application developers frequently create cloud resources to support their applications. You want to give developers the ability to manage infrastructure as code, while ensuring that you follow Google-recommended practices. You need to ensure that infrastructure as code reconciles periodically to avoid configuration drift. What should you do?

Options:

A.

Install and configure Config Connector in Google Kubernetes Engine (GKE).

B.

Configure Cloud Build with a Terraform builder to execute plan and apply commands.

C.

Create a Pod resource with a Terraform docker image to execute terraform plan and terraform apply commands.

D.

Create a Job resource with a Terraform docker image to execute terraforrm plan and terraform apply commands.

Question 14

Your organization recently adopted a container-based workflow for application development. Your team develops numerous applications that are deployed continuously through an automated build pipeline to the production environment. A recent security audit alerted your team that the code pushed to production could contain vulnerabilities and that the existing tooling around virtual machine (VM) vulnerabilities no longer applies to the containerized environment. You need to ensure the security and patch level of all code running through the pipeline. What should you do?

Options:

A.

Set up Container Analysis to scan and report Common Vulnerabilities and Exposures.

B.

Configure the containers in the build pipeline to always update themselves before release.

C.

Reconfigure the existing operating system vulnerability software to exist inside the container.

D.

Implement static code analysis tooling against the Docker files used to create the containers.

Question 15

You are investigating issues in your production application that runs on Google Kubernetes Engine (GKE). You determined that the source Of the issue is a recently updated container image, although the exact change in code was not identified. The deployment is currently pointing to the latest tag. You need to update your cluster to run a version of the container that functions as intended. What should you do?

Options:

A.

Create a new tag called stable that points to the previously working container, and change the deployment to point to the new tag.

B.

Apply the latest tag to the previous container image, and do a rolling update on the deployment.

C.

Build a new container from a previous Git tag, and do a rolling update on the deployment to the new container.

D.

Alter the deployment to point to the sha2 56 digest of the previously working container.

Question 16

You support a service with a well-defined Service Level Objective (SLO). Over the previous 6 months, your service has consistently met its SLO and customer satisfaction has been consistently high. Most of your service’s operations tasks are automated and few repetitive tasks occur frequently. You want to optimize the balance between reliability and deployment velocity while following site reliability engineering best practices. What should you do? (Choose two.)

Options:

A.

Make the service’s SLO more strict.

B.

Increase the service’s deployment velocity and/or risk.

C.

Shift engineering time to other services that need more reliability.

D.

Get the product team to prioritize reliability work over new features.

E.

Change the implementation of your Service Level Indicators (SLIs) to increase coverage.

Question 17

Your application images are built using Cloud Build and pushed to Google Container Registry (GCR). You want to be able to specify a particular version of your application for deployment based on the release version tagged in source control. What should you do when you push the image?

Options:

A.

Reference the image digest in the source control tag.

B.

Supply the source control tag as a parameter within the image name.

C.

Use Cloud Build to include the release version tag in the application image.

D.

Use GCR digest versioning to match the image to the tag in source control.

Question 18

Your company's security team needs to have read-only access to Data Access audit logs in the _Required bucket You want to provide your security team with the necessary permissions following the principle of least privilege and Google-recommended practices. What should you do?

Options:

A.

Assign the roles/logging, viewer role to each member of the security team

B.

Assign the roles/logging. viewer role to a group with all the security team members

C.

Assign the roles/logging.privateLogViewer role to each member of the security team

D.

Assign the roles/logging.privateLogviewer role to a group with all the security team members

Question 19

Your company recently migrated to Google Cloud. You need to design a fast, reliable, and repeatable solution for your company to provision new projects and basic resources in Google Cloud. What should you do?

Options:

A.

Use the Google Cloud console to create projects.

B.

Write a script by using the gcloud CLI that passes the appropriate parameters from the request. Save the script in a Git repository.

C.

Write a Terraform module and save it in your source control repository. Copy and run the apply command to create the new project.

D.

Use the Terraform repositories from the Cloud Foundation Toolkit. Apply the code with appropriate parameters to create the Google Cloud project and related resources.

Question 20

You are writing a postmortem for an incident that severely affected users. You want to prevent similar incidents in the future. Which two of the following sections should you include in the postmortem? (Choose two.)

Options:

A.

An explanation of the root cause of the incident

B.

A list of employees responsible for causing the incident

C.

A list of action items to prevent a recurrence of the incident

D.

Your opinion of the incident’s severity compared to past incidents

E.

Copies of the design documents for all the services impacted by the incident

Question 21

You are building the Cl/CD pipeline for an application deployed to Google Kubernetes Engine (GKE) The application is deployed by using a Kubernetes Deployment, Service, and Ingress The application team asked you to deploy the application by using the blue'green deployment methodology You need to implement the rollback actions What should you do?

Options:

A.

Run the kubectl rollout undo command

B.

Delete the new container image, and delete the running Pods

C.

Update the Kubernetes Service to point to the previous Kubernetes Deployment

D.

Scale the new Kubernetes Deployment to zero

Question 22

Your organization is using Helm to package containerized applications Your applications reference both public and private charts Your security team flagged that using a public Helm repository as a dependency is a risk You want to manage all charts uniformly, with native access control and VPC Service Controls What should you do?

Options:

A.

Store public and private charts in OCI format by using Artifact Registry

B.

Store public and private charts by using GitHub Enterprise with Google Workspace as the identity provider

C.

Store public and private charts by using Git repository Configure Cloud Build to synchronize contents of the repository into a Cloud Storage bucket Connect Helm to the bucket by using https: // [bucket] .srorage.googleapis.com/ [holnchart] as the Helm repository

D.

Configure a Helm chart repository server to run in Google Kubernetes Engine (GKE) with Cloud Storage bucket as the storage backend

Question 23

You are the Operations Lead for an ongoing incident with one of your services. The service usually runs at around 70% capacity. You notice that one node is returning 5xx errors for all requests. There has also been a noticeable increase in support cases from customers. You need to remove the offending node from the load balancer pool so that you can isolate and investigate the node. You want to follow Google-recommended practices to manage the incident and reduce the impact on users. What should you do?

Options:

A.

1. Communicate your intent to the incident team.

2. Perform a load analysis to determine if the remaining nodes can handle the increase in traffic offloaded from the removed node, and scale appropriately.

3. When any new nodes report healthy, drain traffic from the unhealthy node, and remove the unhealthy node from service.

B.

1. Communicate your intent to the incident team.

2. Add a new node to the pool, and wait for the new node to report as healthy.

3. When traffic is being served on the new node, drain traffic from the unhealthy node, and remove the old node from service.

C.

1 . Drain traffic from the unhealthy node and remove the node from service.

2. Monitor traffic to ensure that the error is resolved and that the other nodes in the pool are handling the traffic appropriately.

3. Scale the pool as necessary to handle the new load.

4. Communicate your actions to the incident team.

D.

1 . Drain traffic from the unhealthy node and remove the old node from service.

2. Add a new node to the pool, wait for the new node to report as healthy, and then serve traffic to the new node.

3. Monitor traffic to ensure that the pool is healthy and is handling traffic appropriately.

4. Communicate your actions to the incident team.

Question 24

Your team is running microservices in Google Kubernetes Engine (GKE) You want to detect consumption of an error budget to protect customers and define release policies What should you do?

Options:

A.

Create SLIs from metrics Enable Alert Policies if the services do not pass

B.

Use the metrics from Anthos Service Mesh to measure the health of the microservices

C.

Create a SLO Create an Alert Policy on select_slo_bum_rate

D.

Create a SLO and configure uptime checks for your services Enable Alert Policies if the services do not pass

Question 25

Your team is writing a postmortem after an incident on your external facing application Your team wants to improve the postmortem policy to include triggers that indicate whether an incident requires a postmortem Based on Site Reliability Engineenng (SRE) practices, what triggers should be defined in the postmortem policy?

Choose 2 answers

Options:

A.

An external stakeholder asks for a postmortem

B.

Data is lost due to an incident

C.

An internal stakeholder requests a postmortem

D.

The monitoring system detects that one of the instances for your application has failed

E.

The CD pipeline detects an issue and rolls back a problematic release.

Question 26

Your organization wants to increase the availability target of an application from 99 9% to 99 99% for an investment of $2 000 The application's current revenue is S1,000,000 You need to determine whether the increase in availability is worth the investment for a single year of usage What should you do?

Options:

A.

Calculate the value of improved availability to be $900, and determine that the increase in availability is not worth the investment

B.

Calculate the value of improved availability to be $1 000 and determine that the increase in availability is not worth the investment

C.

Calculate the value of improved availability to be $1 000 and determine that the increase in availability is worth the investment

D.

Calculate the value of improved availability to be $9,000. and determine that the increase in availability is worth the investment

Question 27

You want to share a Cloud Monitoring custom dashboard with a partner team What should you do?

Options:

A.

Provide the partner team with the dashboard URL to enable the partner team to create a copy of the dashboard

B.

Export the metrics to BigQuery Use Looker Studio to create a dashboard, and share the dashboard with the partner team

C.

Copy the Monitoring Query Language (MQL) query from the dashboard; and send the MQL query to the partner team

D.

Download the JSON definition of the dashboard, and send the JSON file to the partner team

Question 28

You are the Site Reliability Engineer responsible for managing your company's data services and products. You regularly navigate operational challenges, such as unpredictable data volume and high cost, with your company's data ingestion processes. You recently learned that a new data ingestion product will be developed in Google Cloud. You need to collaborate with the product development team to provide operational input on the new product. What should you do?

Options:

A.

Deploy the prototype product in a test environment, run a load test, and share the results with the product development team.

B.

When the initial product version passes the quality assurance phase and compliance assessments, deploy the product to a staging environment. Share error logs and performance metrics with the product development team.

C.

When the new product is used by at least one internal customer in production, share error logs and monitoring metrics with the product development team.

D.

Review the design of the product with the product development team to provide feedback early in the design phase.

Question 29

Your company processes IOT data at scale by using Pub/Sub, App Engine standard environment, and an application written in GO. You noticed that the performance inconsistently degrades at peak load. You could not reproduce this issue on your workstation. You need to continuously monitor the application in production to identify slow paths in the code. You want to minimize performance impact and management overhead. What should you do?

  • Install a continuous profiling tool into Compute Engine. Configure the application to send profiling data to the tool.

Options:

A.

Periodically run the go tool pprof command against the application instance. Analyze the results by using flame graphs.

B.

Configure Cloud Profiler, and initialize the cloud.go@gle.com/go/profiler library in the application.

C.

Use Cloud Monitoring to assess the App Engine CPU utilization metric.

Question 30

Your company operates in a highly regulated domain that requires you to store all organization logs for seven years You want to minimize logging infrastructure complexity by using managed services You need to avoid any future loss of log capture or stored logs due to misconfiguration or human error What should you do?

Options:

A.

Use Cloud Logging to configure an aggregated sink at the organization level to export all logs into a BigQuery dataset

B.

Use Cloud Logging to configure an aggregated sink at the organization level to export all logs into Cloud Storage with a seven-year retention policy and Bucket Lock

C.

Use Cloud Logging to configure an export sink at each project level to export all logs into a BigQuery dataset

D.

Use Cloud Logging to configure an export sink at each project level to export all logs into Cloud Storage with a seven-year retention policy and Bucket Lock

Question 31

Your Cloud Run application writes unstructured logs as text strings to Cloud Logging. You want to convert the unstructured logs to JSON-based structured logs. What should you do?

Options:

A.

A Install a Fluent Bit sidecar container, and use a JSON parser.

B.

Install the log agent in the Cloud Run container image, and use the log agent to forward logs to Cloud Logging.

C.

Configure the log agent to convert log text payload to JSON payload.

D.

Modify the application to use Cloud Logging software development kit (SDK), and send log entries with a jsonPay10ad field.

Question 32

Your organization has a containerized web application that runs on-premises As part of the migration plan to Google Cloud you need to select a deployment strategy and platform that meets the following acceptance criteria

1 The platform must be able to direct traffic from Android devices to an Android-specific microservice

2 The platform must allow for arbitrary percentage-based traffic splitting

3 The deployment strategy must allow for continuous testing of multiple versions of any microservice

What should you do?

Options:

A.

Deploy the canary release of the application to Cloud Run Use traffic splitting to direct 10% of user traffic to the canary release based on the revision tag

B.

Deploy the canary release of the application to App Engine Use traffic splitting to direct a subset of user traffic to the new version based on the IP address

C.

Deploy the canary release of the application to Compute Engine Use Anthos Service Mesh with Compute Engine to direct 10% of user traffic to the canary release by configuring the virtual service.

D.

Deploy the canary release to Google Kubernetes Engine with Anthos Sen/ice Mesh Use traffic splitting to direct 10% of user traffic to the new version based on the user-agent header configured in the virtual service

Question 33

Your organization wants to implement Site Reliability Engineering (SRE) culture and principles. Recently, a service that you support had a limited outage. A manager on another team asks you to provide a formal explanation of what happened so they can action remediations. What should you do?

Options:

A.

Develop a postmortem that includes the root causes, resolution, lessons learned, and a prioritized list of action items. Share it with the manager only.

B.

Develop a postmortem that includes the root causes, resolution, lessons learned, and a prioritized list of action items. Share it on the engineering organization's document portal.

C.

Develop a postmortem that includes the root causes, resolution, lessons learned, the list of people responsible, and a list of action items for each person. Share it with the manager only.

D.

Develop a postmortem that includes the root causes, resolution, lessons learned, the list of people responsible, and a list of action items for each person. Share it on the engineering organization's document portal.

Question 34

You support an application running on GCP and want to configure SMS notifications to your team for the most critical alerts in Stackdriver Monitoring. You have already identified the alerting policies you want to configure this for. What should you do?

Options:

A.

Download and configure a third-party integration between Stackdriver Monitoring and an SMS gateway. Ensure that your team members add their SMS/phone numbers to the external tool.

B.

Select the Webhook notifications option for each alerting policy, and configure it to use a third-party integration tool. Ensure that your team members add their SMS/phone numbers to the external tool.

C.

Ensure that your team members set their SMS/phone numbers in their Stackdriver Profile. Select the SMS notification option for each alerting policy and then select the appropriate SMS/phone numbers from the list.

D.

Configure a Slack notification for each alerting policy. Set up a Slack-to-SMS integration to send SMS messages when Slack messages are received. Ensure that your team members add their SMS/phone numbers to the external integration.

Question 35

Your company experiences bugs, outages, and slowness in its production systems. Developers use the production environment for new feature development and bug fixes. Configuration and experiments are done in the production environment, causing outages for users. Testers use the production environment for load testing, which often slows the production systems. You need to redesign the environment to reduce the number of bugs and outages in production and to enable testers to load test new features. What should you do?

Options:

A.

Create an automated testing script in production to detect failures as soon as they occur.

B.

Create a development environment with smaller server capacity and give access only to developers and testers.

C.

Secure the production environment to ensure that developers can't change it and set up one controlled update per year.

D.

Create a development environment for writing code and a test environment for configurations, experiments, and load testing.

Question 36

You support a large service with a well-defined Service Level Objective (SLO). The development team deploys new releases of the service multiple times a week. If a major incident causes the service to miss its SLO, you want the development team to shift its focus from working on features to improving service reliability. What should you do before a major incident occurs?

Options:

A.

Develop an appropriate error budget policy in cooperation with all service stakeholders.

B.

Negotiate with the product team to always prioritize service reliability over releasing new features.

C.

Negotiate with the development team to reduce the release frequency to no more than once a week.

D.

Add a plugin to your Jenkins pipeline that prevents new releases whenever your service is out of SLO.

Question 37

You work for a global organization and run a service with an availability target of 99% with limited engineering resources. For the current calendar month you noticed that the service has 99 5% availability. You must ensure that your service meets the defined availability goals and can react to business changes including the upcoming launch of new features You also need to reduce technical debt while minimizing operational costs You want to follow Google-recommended practices What should you do?

Options:

A.

Add N+1 redundancy to your service by adding additional compute resources to the service

B.

Identify, measure and eliminate toil by automating repetitive tasks

C.

Define an error budget for your service level availability and minimize the remaining error budget

D.

Allocate available engineers to the feature backlog while you ensure that the sen/ice remains within the availability target

Question 38

Your organization uses a change advisory board (CAB) to approve all changes to an existing service You want to revise this process to eliminate any negative impact on the software delivery performance What should you do?

Choose 2 answers

Options:

A.

Replace the CAB with a senior manager to ensure continuous oversight from development to deployment

B.

Let developers merge their own changes but ensure that the team's deployment platform can roll back changes if any issues are discovered

C.

Move to a peer-review based process for individual changes that is enforced at code check-in time and supported by automated tests

D.

Batch changes into larger but less frequent software releases

E.

Ensure that the team's development platform enables developers to get fast feedback on the impact of their changes

Question 39

Your development team has created a new version of their service’s API. You need to deploy the new versions of the API with the least disruption to third-party developers and end users of third-party installed applications. What should you do?

Options:

A.

Introduce the new version of the API.

Announce deprecation of the old version of the API.

Deprecate the old version of the API.

Contact remaining users of the old API.

Provide best effort support to users of the old API.

Turn down the old version of the API.

B.

Announce deprecation of the old version of the API.

Introduce the new version of the API.

Contact remaining users on the old API.

Deprecate the old version of the API.

Turn down the old version of the API.

Provide best effort support to users of the old API.

C.

Announce deprecation of the old version of the API.

Contact remaining users on the old API.

Introduce the new version of the API.

Deprecate the old version of the API.

Provide best effort support to users of the old API.

Turn down the old version of the API.

D.

Introduce the new version of the API.

Contact remaining users of the old API.

Announce deprecation of the old version of the API.

Deprecate the old version of the API.

Turn down the old version of the API.

Provide best effort support to users of the old API.

Question 40

You need to run a business-critical workload on a fixed set of Compute Engine instances for several months. The workload is stable with the exact amount of resources allocated to it. You want to lower the costs for this workload without any performance implications. What should you do?

Options:

A.

Purchase Committed Use Discounts.

B.

Migrate the instances to a Managed Instance Group.

C.

Convert the instances to preemptible virtual machines.

D.

Create an Unmanaged Instance Group for the instances used to run the workload.

Question 41

Your team has recently deployed an NGINX-based application into Google Kubernetes Engine (GKE) and has exposed it to the public via an HTTP Google Cloud Load Balancer (GCLB) ingress. You want to scale the deployment of the application's frontend using an appropriate Service Level Indicator (SLI). What should you do?

Options:

A.

Configure the horizontal pod autoscaler to use the average response time from the Liveness and Readiness probes.

B.

Configure the vertical pod autoscaler in GKE and enable the cluster autoscaler to scale the cluster as pods expand.

C.

Install the Stackdriver custom metrics adapter and configure a horizontal pod autoscaler to use the number of requests provided by the GCLB.

D.

Expose the NGINX stats endpoint and configure the horizontal pod autoscaler to use the request metrics exposed by the NGINX deployment.

Question 42

You use Spinnaker to deploy your application and have created a canary deployment stage in the pipeline. Your application has an in-memory cache that loads objects at start time. You want to automate the comparison of the canary version against the production version. How should you configure the canary analysis?

Options:

A.

Compare the canary with a new deployment of the current production version.

B.

Compare the canary with a new deployment of the previous production version.

C.

Compare the canary with the existing deployment of the current production version.

D.

Compare the canary with the average performance of a sliding window of previous production versions.

Question 43

You are running a real-time gaming application on Compute Engine that has a production and testing environment. Each environment has their own Virtual Private Cloud (VPC) network. The application frontend and backend servers are located on different subnets in the environment's VPC. You suspect there is a malicious process communicating intermittently in your production frontend servers. You want to ensure that network traffic is captured for analysis. What should you do?

Options:

A.

Enable VPC Flow Logs on the production VPC network frontend and backend subnets only with a sample volume scale of 0.5.

B.

Enable VPC Flow Logs on the production VPC network frontend and backend subnets only with a sample volume scale of 1.0.

C.

Enable VPC Flow Logs on the testing and production VPC network frontend and backend subnets with a volume scale of 0.5. Apply changes in

testing before production.

D.

Enable VPC Flow Logs on the testing and production VPC network frontend and backend subnets with a volume scale of 1.0. Apply changes in testing before production.

Question 44

You support a Node.js application running on Google Kubernetes Engine (GKE) in production. The application makes several HTTP requests to dependent applications. You want to anticipate which dependent applications might cause performance issues. What should you do?

Options:

A.

Instrument all applications with Stackdriver Profiler.

B.

Instrument all applications with Stackdriver Trace and review inter-service HTTP requests.

C.

Use Stackdriver Debugger to review the execution of logic within each application to instrument all applications.

D.

Modify the Node.js application to log HTTP request and response times to dependent applications. Use Stackdriver Logging to find dependent applications that are performing poorly.

Question 45

You are configuring a Cl pipeline. The build step for your Cl pipeline integration testing requires access to APIs inside your private VPC network. Your security team requires that you do not expose API traffic publicly. You need to implement a solution that minimizes management overhead. What should you do?

Options:

A.

Use Cloud Build private pools to connect to the private VPC.

B.

Use Spinnaker for Google Cloud to connect to the private VPC.

C.

Use Cloud Build as a pipeline runner. Configure Internal HTTP(S) Load Balancing for API access.

D.

Use Cloud Build as a pipeline runner. Configure External HTTP(S) Load Balancing with a Google Cloud Armor policy for API access.

Question 46

You recently noticed that one Of your services has exceeded the error budget for the current rolling window period. Your company's product team is about to launch a new feature. You want to follow Site Reliability Engineering (SRE) practices.

What should you do?

Options:

A.

Notify the team that their error budget is used up. Negotiate with the team for a launch freeze or tolerate a slightly worse user experience.

B.

Look through other metrics related to the product and find SLOs with remaining error budget. Reallocate the error budgets and allow the feature launch.

C.

Escalate the situation and request additional error budget.

D.

Notify the team about the lack of error budget and ensure that all their tests are successful so the launch will not further risk the error budget.

Question 47

You use Cloud Build to build your application. You want to reduce the build time while minimizing cost and development effort. What should you do?

Options:

A.

Use Cloud Storage to cache intermediate artifacts.

B.

Run multiple Jenkins agents to parallelize the build.

C.

Use multiple smaller build steps to minimize execution time.

D.

Use larger Cloud Build virtual machines (VMs) by using the machine-type option.

Question 48

You are the on-call Site Reliability Engineer for a microservice that is deployed to a Google Kubernetes Engine (GKE) Autopilot cluster. Your company runs an online store that publishes order messages to Pub/Sub and a microservice receives these messages and updates stock information in the warehousing system. A sales event caused an increase in orders, and the stock information is not being updated quickly enough. This is causing a large number of orders to be accepted for products that are out of stock You check the metrics for the microservice and compare them to typical levels.

You need to ensure that the warehouse system accurately reflects product inventory at the time orders are placed and minimize the impact on customers What should you do?

Options:

A.

Decrease the acknowledgment deadline on the subscription

B.

Add a virtual queue to the online store that allows typical traffic levels

C.

Increase the number of Pod replicas

D.

Increase the Pod CPU and memory limits

Question 49

Your company runs applications in Google Kubernetes Engine (GKE). Several applications rely on ephemeral volumes. You noticed some applications were unstable due to the DiskPressure node condition on the worker nodes. You need

to identify which Pods are causing the issue, but you do not have execute access to workloads and nodes. What should you do?

Options:

A.

Check the node/ephemeral_storage/used_bytes metric by using Metrics Explorer.

B.

Check the metric by using Metrics Explorer.

C.

Locate all the Pods with emptyDir volumes. use the df-h command to measure volume disk usage.

D.

Locate all the Pods with emptyDir volumes. Use the du -sh * command to measure volume disk usage.

Question 50

Your team is designing a new application for deployment into Google Kubernetes Engine (GKE). You need to set up monitoring to collect and aggregate various application-level metrics in a centralized location. You want to use Google Cloud Platform services while minimizing the amount of work required to set up monitoring. What should you do?

Options:

A.

Publish various metrics from the application directly to the Slackdriver Monitoring API, and then observe these custom metrics in Stackdriver.

B.

Install the Cloud Pub/Sub client libraries, push various metrics from the application to various topics, and then observe the aggregated metrics in Stackdriver.

C.

Install the OpenTelemetry client libraries in the application, configure Stackdriver as the export destination for the metrics, and then observe the application's metrics in Stackdriver.

D.

Emit all metrics in the form of application-specific log messages, pass these messages from the containers to the Stackdriver logging collector, and then observe metrics in Stackdriver.

Question 51

You are creating and assigning action items in a postmodern for an outage. The outage is over, but you need to address the root causes. You want to ensure that your team handles the action items quickly and efficiently. How should you assign owners and collaborators to action items?

Options:

A.

Assign one owner for each action item and any necessary collaborators.

B.

Assign multiple owners for each item to guarantee that the team addresses items quickly

C.

Assign collaborators but no individual owners to the items to keep the postmortem blameless.

D.

Assign the team lead as the owner for all action items because they are in charge of the SRE team.

Question 52

You created a Stackdriver chart for CPU utilization in a dashboard within your workspace project. You want to share the chart with your Site Reliability Engineering (SRE) team only. You want to ensure you follow the principle of least privilege. What should you do?

Options:

A.

Share the workspace Project ID with the SRE team. Assign the SRE team the Monitoring Viewer IAM role in the workspace project.

B.

Share the workspace Project ID with the SRE team. Assign the SRE team the Dashboard Viewer IAM role in the workspace project.

C.

Click "Share chart by URL" and provide the URL to the SRE team. Assign the SRE team the Monitoring Viewer IAM role in the workspace project.

D.

Click "Share chart by URL" and provide the URL to the SRE team. Assign the SRE team the Dashboard Viewer IAM role in the workspace project.

Question 53

You are using Stackdriver to monitor applications hosted on Google Cloud Platform (GCP). You recently deployed a new application, but its logs are not appearing on the Stackdriver dashboard.

You need to troubleshoot the issue. What should you do?

Options:

A.

Confirm that the Stackdriver agent has been installed in the hosting virtual machine.

B.

Confirm that your account has the proper permissions to use the Stackdriver dashboard.

C.

Confirm that port 25 has been opened in the firewall to allow messages through to Stackdriver.

D.

Confirm that the application is using the required client library and the service account key has proper permissions.

Question 54

You are monitoring a service that uses n2-standard-2 Compute Engine instances that serve large files. Users have reported that downloads are slow. Your Cloud Monitoring dashboard shows that your VMS are running at peak network throughput. You want to improve the network throughput performance. What should you do?

Options:

A.

Deploy a Cloud NAT gateway and attach the gateway to the subnet of the VMS.

B.

Add additional network interface controllers (NICs) to your VMS.

C.

Change the machine type for your VMS to n2-standard-8.

D.

Deploy the Ops Agent to export additional monitoring metrics.