Databricks Certified Machine Learning Professional - Databricks-Machine-Learning-Professional Practice Exam
A data scientist has developed and logged a Spark ML random forest model named model, and then ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the featureImportances of the original model object. Which lines of code can be used to restore the model object so that featureImportances is available?
Correct answer: B
A machine learning engineer wants to deploy a model for real-time serving using MLflow Model Serving. For the model, the machine learning engineer currently has one model version in each of the stages in the MLflow Model Registry. The engineer wants to know which model versions can be queried once Model Serving is enabled for the model. Which of the following lists all of the MLflow Model Registry stages whose model versions are automatically deployed with Model Serving?
Correct answer: A
A machine learning engineering team wants to build a continuous pipeline for data preparation of a machine learning application. The team would like the data to be fully processed and made ready for inference in a series of equal-sized batches. Which tool can be used to provide this type of continuous processing?
Correct answer: C
Which of the following MLflow operations can be used to automatically calculate and log a Shapley feature importance plot?
Correct answer: A
A Machine Learning Engineer needs to deploy a production ML workflow that includes an MLflow experiment for tracking model training runs, a registered model in Unity Catalog for version management, and a model serving endpoint for real-time inference. The team requires a unified configuration approach that ensures consistent deployment across development and production environments while adhering to infrastructure-as-code best practices. Which approach should the Machine Learning Engineer use to define all three components together?
Correct answer: B
A machine learning engineer is monitoring categorical input variables for a production machine learning application. The engineer believes that missing values are becoming more prevalent in more recent data for a particular value in one of the categorical input variables. Which of the following tools can the machine learning engineer use to assess their theory?
Correct answer: A
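One way to check a theory like the one in the question above is a one-way chi-squared test that compares the recent missing/present counts for the category against the proportions observed in an earlier baseline window. The sketch below uses only the Python standard library. The helper name `chi_squared_missing_shift` and all counts are hypothetical, and it is not necessarily the tool the answer key refers to.

```python
import math


def chi_squared_missing_shift(baseline_missing, baseline_total,
                              recent_missing, recent_total):
    """One-way chi-squared test on missing vs. present counts.

    Asks whether the recent counts are consistent with the missing-value
    rate seen in the baseline window. Assumes the baseline rate is
    strictly between 0 and 1. Returns (statistic, p_value).
    """
    baseline_rate = baseline_missing / baseline_total
    expected = [recent_total * baseline_rate,
                recent_total * (1.0 - baseline_rate)]
    observed = [recent_missing, recent_total - recent_missing]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Two cells give 1 degree of freedom, for which the chi-squared
    # survival function has the closed form erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: 50 of 1000 baseline rows are missing for the
# category, versus 90 of 1000 rows in the recent window.
stat, p = chi_squared_missing_shift(50, 1000, 90, 1000)
print(f"chi-squared = {stat:.3f}, p-value = {p:.3g}")
```

A small p-value suggests the recent missing rate differs from the baseline rate; it does not by itself say why the values went missing.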
How can you save a trained Spark ML PipelineModel?
Correct answer: B
After a data scientist noticed that a column was missing from a production feature set stored as a Delta table, the machine learning engineering team has been tasked with determining when the column was dropped from the feature set. Which SQL command can be used to accomplish this task?
Correct answer: C
A data scientist has developed a scikit-learn model sklearn_model and they want to log the model using MLflow.
They write the following incomplete code block:

Which lines of code can be used to fill in the blank so the code block can successfully complete the task?
Correct answer: E
A Machine Learning Engineer has a real-time fraud detection model deployed that approves or blocks millions of transactions daily. They need to deploy a new version of the model with improved detection accuracy to this high-traffic, business-critical application. Because any model downtime could result in lost revenue or customer dissatisfaction, the engineer must ensure zero downtime and minimal disruption for end users. Leadership also requires that any rollback to the previous version be immediate if issues are detected with the new model in production. Which deployment strategy meets these requirements?
Correct answer: A
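The two requirements in the question above (zero downtime and immediate rollback) are commonly met by keeping the old and new model versions loaded side by side and switching a single "live" pointer between them. The sketch below illustrates that general pattern in memory only. The `ModelRouter` class, the version names, and the toy threshold models are all hypothetical; this is not the Databricks Model Serving API, and it may not match the answer key's exact wording.

```python
from typing import Callable, Dict, Optional

Model = Callable[[dict], str]


class ModelRouter:
    """Routes every request to the live version; both versions stay loaded."""

    def __init__(self, models: Dict[str, Model], live: str) -> None:
        if live not in models:
            raise KeyError(live)
        self._models = dict(models)
        self._live = live
        self._previous: Optional[str] = None

    @property
    def live(self) -> str:
        return self._live

    def predict(self, transaction: dict) -> str:
        return self._models[self._live](transaction)

    def promote(self, version: str) -> None:
        """Switch traffic to `version`, remembering the old one for rollback."""
        if version not in self._models:
            raise KeyError(version)
        self._previous, self._live = self._live, version

    def rollback(self) -> None:
        """Switch back to the previous version without reloading anything."""
        if self._previous is None:
            raise RuntimeError("no previous version to roll back to")
        self._live, self._previous = self._previous, self._live


# Hypothetical toy models: each blocks transactions above a threshold.
def model_v1(tx: dict) -> str:
    return "block" if tx["amount"] > 1000 else "approve"


def model_v2(tx: dict) -> str:
    return "block" if tx["amount"] > 500 else "approve"


router = ModelRouter({"v1": model_v1, "v2": model_v2}, live="v1")
before = router.predict({"amount": 700})
router.promote("v2")
after = router.predict({"amount": 700})
router.rollback()
restored = router.predict({"amount": 700})
print(before, after, restored)
```

Because the previous version is never unloaded, rollback is only a pointer swap, which is what makes it immediate in this sketch.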
A Machine Learning Engineer has deployed a fraud detection model that processes 10,000 transactions per hour. The model was trained on data from Q1 2024, but it's now Q4 2024. The ML team notices three concerning trends: (1) the model's precision has dropped from 92% to 78% over the past month, (2) the average transaction amount in recent data has increased from $150 to $220, and (3) the relationship between transaction frequency and fraud likelihood has weakened significantly due to new payment methods being introduced. The engineer needs to implement a monitoring solution that can detect the root cause of the performance degradation, identify why the precision dropped, and be able to do this at the scale needed. Which monitoring pipeline component will do this?
Correct answer: D
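Trends (2) and (3) in the question above describe two different kinds of drift: a shift in an input feature's distribution, and a change in the relationship between a feature and the label. The sketch below shows one simple statistic for each, using only the Python standard library: a Population Stability Index (PSI) for the transaction amount and a Pearson correlation between transaction frequency and the fraud label. All data, bin edges, and function names are hypothetical, and this is an illustration of the statistics involved rather than any particular monitoring product.

```python
import math


def psi(baseline, recent, bin_edges, eps=1e-6):
    """Population Stability Index between two samples over fixed bins."""
    def proportions(values):
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            # Index of the bin = number of edges the value has reached.
            counts[sum(v >= edge for edge in bin_edges)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(baseline), proportions(recent)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical transaction amounts: baseline mean 150, recent mean 220.
baseline_amounts = [100, 120, 140, 150, 160, 180, 200, 150, 130, 170]
recent_amounts = [a + 70 for a in baseline_amounts]
amount_psi = psi(baseline_amounts, recent_amounts, bin_edges=[125, 175, 225])

# Hypothetical transaction frequencies and fraud labels (1 = fraud).
frequency = [1, 2, 3, 4, 5, 6]
baseline_fraud = [0, 0, 0, 1, 1, 1]
recent_fraud = [0, 1, 0, 1, 0, 1]
baseline_corr = pearson(frequency, baseline_fraud)
recent_corr = pearson(frequency, recent_fraud)

print(f"amount PSI = {amount_psi:.3f}")
print(f"frequency/fraud correlation: {baseline_corr:.3f} -> {recent_corr:.3f}")
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a substantial shift, though the threshold is a convention rather than a formal test. Tracking both statistics alongside precision helps separate "the inputs changed" from "the input-to-label relationship changed".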