You are developing a prediction model. Your team indicates they need an algorithm that is fast and requires low memory and low processing power. Assuming the following algorithms have similar accuracy on your data, which is most likely to be an ideal choice for the job?
What is the open framework designed to help detect, respond to, and remediate threats in ML systems?
A classifier has been implemented to predict whether or not someone has a specific type of disease. Considering that only 1% of the population in the dataset has this disease, which measures will work the BEST to evaluate this model?
Which of the following text vectorization methods is appropriate and correctly defined for an English-to-Spanish translation machine?
You have a dataset with thousands of features, all of which are categorical. Using these features as predictors, you are tasked with creating a prediction model to accurately predict the value of a continuous dependent variable. Which of the following would be appropriate algorithms to use? (Select two.)
Which of the following describes a typical use case of video tracking?
Which of the following is a type 1 error in statistical hypothesis testing?
R-squared is a statistical measure that:
You create a prediction model with 96% accuracy. While the model's true positive rate (TPR) is performing well at 99%, the true negative rate (TNR) is only 50%. Your supervisor tells you that the TNR needs to be higher, even if it decreases the TPR. Upon further inspection, you notice that the vast majority of your data is truly positive.
What method could help address your issue?
We are using the k-nearest neighbors algorithm to classify the new data points. The features are on different scales.
Which method can help us to solve this problem?
You are implementing a support-vector machine on your data, and a colleague suggests you use a polynomial kernel. In what situation might this help improve the prediction of your model?
Which of the following can benefit from deploying a deep learning model as an embedded model on edge devices?
You and your team need to process large datasets of images as fast as possible for a machine learning task. The project will also use a modular framework with extensible code and an active developer community. Which of the following would BEST meet your needs?
A data scientist is tasked to extract business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist cannot forget to include?
In a self-driving car company, ML engineers want to develop a model for dynamic pathing. Which of following approaches would be optimal for this task?
Which of the following is a privacy-focused law that an AI practitioner should adhere to while designing and adapting an AI system that utilizes personal data?
Which two of the following decrease technical debt in ML systems? (Select two.)
Which type of regression represents the following formula: y = c + b*x, where y = estimated dependent variable score, c = constant, b = regression coefficient, and x = score on the independent variable?
For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.
Which assumption of linear regression is being violated?
Which of the following is the primary purpose of hyperparameter optimization?
Which of the following sentences is true about model evaluation and model validation in ML pipelines?
For a particular classification problem, you are tasked with determining the best algorithm among SVM, random forest, K-nearest neighbors, and a deep neural network. Each of the algorithms has similar accuracy on your data. The stakeholders indicate that they need a model that can convey each feature's relative contribution to the model's accuracy. Which is the best algorithm for this use case?
A classifier has been implemented to predict whether or not someone has a specific type of disease. Considering that only 1% of the population in the dataset has this disease, which measures will work the BEST to evaluate this model?
You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?
Which two encodes can be used to transform categories data into numerical features? (Select two.)
A company is developing a merchandise sales application The product team uses training data to teach the AI model predicting sales, and discovers emergent bias. What caused the biased results?
Which of the following models are text vectorization methods? (Select two.)