Databricks Certified Professional Data Engineer - Databricks-Certified-Professional-Data-Engineer Practice Exam
A data engineer is developing a Lakeflow Declarative Pipeline (LDP) using a Databricks notebook directly connected to their pipeline. After adding new table definitions and transformation logic in their notebook, they want to check for any syntax errors in the pipeline code without actually processing data or running the pipeline.
How should the data engineer perform this syntax check?
Correct answer: B
A data team is automating a daily multi-task ETL pipeline in Databricks. The pipeline includes a notebook for ingesting raw data, a Python wheel task for data transformation, and a SQL query to update aggregates. They want to trigger the pipeline programmatically and see previous runs in the GUI. They need to ensure tasks are retried on failure and stakeholders are notified by email if any task fails.
Which two approaches will meet these requirements? (Choose 2 answers)
Correct answer: C, E
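The requirements in this scenario (three task types, retries on failure, email on failure, programmatic triggering) can be pictured as a single multi-task job definition. The sketch below only builds the settings dictionary and makes no network call. The field names follow the Jobs API 2.1 request shape as best understood here and should be checked against the current API reference. The job name, paths, package name, cron schedule, and IDs are hypothetical placeholders, and compute settings are omitted for brevity. It is not meant to reproduce any of the answer options.

```python
import json


def build_job_settings(notify_emails):
    """Return a Jobs API 2.1-style settings dict for a three-task daily job.

    All names, paths, and IDs below are hypothetical placeholders.
    """
    retry_policy = {"max_retries": 2, "min_retry_interval_millis": 60000}
    return {
        "name": "daily-etl",
        # Quartz cron: run once a day at 02:00 in the given time zone.
        "schedule": {
            "quartz_cron_expression": "0 0 2 * * ?",
            "timezone_id": "UTC",
        },
        # Job-level notification sent when a run fails.
        "email_notifications": {"on_failure": list(notify_emails)},
        "tasks": [
            {
                "task_key": "ingest_raw",
                "notebook_task": {"notebook_path": "/Repos/etl/ingest_raw"},
                **retry_policy,
            },
            {
                "task_key": "transform",
                "depends_on": [{"task_key": "ingest_raw"}],
                "python_wheel_task": {
                    "package_name": "etl_transform",
                    "entry_point": "main",
                },
                **retry_policy,
            },
            {
                "task_key": "update_aggregates",
                "depends_on": [{"task_key": "transform"}],
                "sql_task": {
                    "query": {"query_id": "<query-id>"},
                    "warehouse_id": "<warehouse-id>",
                },
                **retry_policy,
            },
        ],
    }


settings = build_job_settings(["stakeholders@example.com"])
print(json.dumps(settings, indent=2))
```

A payload like this would typically be sent to the job-creation endpoint or expressed in an equivalent deployment file. Once the job exists, each triggered run appears in the job's run history in the UI.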
A Structured Streaming job deployed to production has been resulting in higher than expected cloud storage costs. At present, during normal execution, each micro-batch of data is processed in less than 3 seconds; at least 12 times per minute, a micro-batch is processed that contains 0 records. The streaming write was configured using the default trigger settings. The production job is currently scheduled alongside many other Databricks jobs in a workspace with instance pools provisioned to reduce start-up time for jobs with batch execution. Holding all other variables constant and assuming records need to be processed in less than 10 minutes, which adjustment will meet the requirement?
Correct answer: D
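To see why trigger frequency matters for cost here, the arithmetic below compares the number of micro-batches per day at the observed rate with the number at a few fixed trigger intervals. It is a rough sketch that assumes every micro-batch, including an empty one, issues at least some requests against cloud storage. The candidate intervals are illustrative and are not taken from the answer options. In PySpark, a fixed interval is set on the stream writer with `.trigger(processingTime="...")`.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def triggers_per_day(trigger_interval_seconds):
    """Number of times a fixed-interval trigger fires in one day."""
    return SECONDS_PER_DAY // trigger_interval_seconds


# From the scenario: at least 12 empty micro-batches every minute.
empty_batches_per_day = 12 * 60 * 24

# Illustrative fixed intervals (in seconds), all within the 10-minute requirement.
candidate_intervals = [60, 300, 600]
batches_by_interval = {s: triggers_per_day(s) for s in candidate_intervals}

print(f"empty batches per day at the observed rate: {empty_batches_per_day}")
for seconds, count in batches_by_interval.items():
    print(f"trigger every {seconds:>3} s: {count} batches per day")
```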
A data engineering team is setting up deployment automation. To deploy workspace assets remotely using the Databricks CLI command, they must configure it with proper authentication.
Which authentication approach will provide the highest level of security?
Correct answer: C
A junior data engineer has manually configured a series of jobs using the Databricks Jobs UI. Upon reviewing their work, the engineer realizes that they are listed as the "Owner" for each job. They attempt to transfer "Owner" privileges to the "DevOps" group, but cannot successfully accomplish this task.
Which statement explains what is preventing this privilege transfer?
Correct answer: B
The data architect has decided that once data has been ingested from external sources into the Databricks Lakehouse, table access controls will be leveraged to manage permissions for all production tables and views.
The following logic was executed to grant privileges for interactive queries on a production database to the core engineering group.
GRANT USAGE ON DATABASE prod TO eng;
GRANT SELECT ON DATABASE prod TO eng;
Assuming these are the only privileges that have been granted to the eng group and that these users are not workspace administrators, which statement describes their privileges?
Correct answer: C
A data engineer is designing a pipeline in Databricks that processes records from a Kafka stream where late-arriving data is common.
Which approach should the data engineer use?
Correct answer: B
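One common way to reason about late-arriving records in Structured Streaming is an event-time watermark, set in PySpark with `withWatermark("event_time", "10 minutes")` on a streaming DataFrame. The pure-Python sketch below is a deliberately simplified model of that idea, not Spark's implementation and not a statement of what option B says. It advances the watermark after every record, whereas Spark updates it per micro-batch and applies it to stateful operators. The function name and sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def split_by_watermark(event_times, allowed_lateness):
    """Classify events, in arrival order, as accepted or dropped.

    Simplified model: watermark = max event time seen so far - allowed_lateness.
    An event older than the current watermark is treated as too late.
    """
    max_event_time = None
    accepted, dropped = [], []
    for ts in event_times:
        too_late = (
            max_event_time is not None
            and ts < max_event_time - allowed_lateness
        )
        if too_late:
            dropped.append(ts)
            continue
        accepted.append(ts)
        if max_event_time is None or ts > max_event_time:
            max_event_time = ts
    return accepted, dropped


base = datetime(2024, 1, 1, 12, 0)
arrivals = [
    base,                          # 12:00
    base + timedelta(minutes=5),   # 12:05
    base + timedelta(minutes=1),   # 12:01, late but within 10 minutes
    base + timedelta(minutes=20),  # 12:20, moves the watermark to 12:10
    base + timedelta(minutes=8),   # 12:08, older than the watermark
]
accepted, dropped = split_by_watermark(arrivals, timedelta(minutes=10))
print("accepted:", [t.strftime("%H:%M") for t in accepted])
print("dropped: ", [t.strftime("%H:%M") for t in dropped])
```

The lateness threshold is the trade-off to tune: a larger value tolerates later records at the cost of holding more state for longer.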
A data team's Structured Streaming job is configured to calculate running aggregates for item sales to update a downstream marketing dashboard. The marketing team has introduced a new field to track the number of times this promotion code is used for each item. A junior data engineer suggests updating the existing query as follows. Note that proposed changes are in bold.

Which step must also be completed to put the proposed query into production?

Correct answer: A
How are the operational aspects of Lakeflow Declarative Pipelines different from Spark Structured Streaming?
Correct answer: A