Databricks Certified Data Engineer Professional - Databricks-Certified-Data-Engineer-Professional Practice Exam
A job runs four independent tasks (X, Y, Z, W) in parallel to process regional sales data. The Data Engineering team recently updated its cluster policy to ban cost-prohibitive instance types. Task Y now fails due to the newly enforced cluster policy restricting the use of a specific instance type.
A data engineer needs to resolve the failure quickly without disrupting the other tasks. How should the data engineer resolve the failure of Task Y?
Correct answer: B
A data engineering team is setting up deployment automation. To deploy workspace assets remotely with the Databricks CLI, they must configure it with proper authentication.
Which authentication approach will provide the highest level of security?
Correct answer: C
A data engineering workspace was automatically enabled for Unity Catalog, creating a workspace catalog. New team members report they can create tables in the default schema but cannot access tables in other schemas within the same workspace catalog. Why are the new team members unable to access tables in other schemas?
Correct answer: A
A data engineer inherits a Delta table with historical partitions by country that are badly skewed.
Queries often filter by high-cardinality customer_id and vary across dimensions over time. The engineer wants a strategy that avoids a disruptive full rewrite, reduces sensitivity to skewed partitions, and sustains strong query performance as access patterns evolve. Which two actions should the data engineer take? (Choose two.)
Correct answer: B, C
The security team is exploring whether or not the Databricks secrets module can be leveraged for connecting to an external database.
After testing the code with all Python variables being defined with strings, they upload the password to the secrets module and configure the correct permissions for the currently active user. They then modify their code to the following (leaving all other variables unchanged).

Which statement describes what will happen when the above code is executed?
Correct answer: D
The view updates represents an incremental batch of all newly ingested data to be inserted or updated in the customers table.
The following logic is used to process these records.

Which statement describes this implementation?
Correct answer: D
A view is registered with the following code:

Both users and orders are Delta Lake tables.
Which statement describes the results of querying recent_orders?

Correct answer: D
The following table consists of items found in user carts within an e-commerce website.

The following MERGE statement is used to update this table using an updates view, with schema evolution enabled on this table.

How would the following update be handled?


Correct answer: A
The data architect has mandated that all tables in the Lakehouse should be configured as external (also known as "unmanaged") Delta Lake tables.
Which approach will ensure that this requirement is met?
Correct answer: C
A company stores account transactions in a Delta Lake table. The company needs to apply frequent account-level corrections (e.g., UPDATE statements) but wants to avoid rewriting entire Parquet files for each change, in order to reduce file churn and improve write performance. Which Delta Lake feature should they enable?
Correct answer: D
A query is taking too long to run. After investigating the Spark UI, the data engineer discovered a significant amount of disk spill. The compute instance being used has a core-to-memory ratio of 1:2. What are the two steps the data engineer should take to minimize spillage? (Choose two.)
Correct answer: D, E
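The 1:2 core-to-memory ratio in the spill question above presumably means about 2 GB of memory per core. As a rough sketch of the arithmetic behind disk spill, the snippet below compares memory per core across two instance shapes and the average shuffle partition size for two partition counts. The instance shapes, the 200 GB shuffle volume, and the helper names are hypothetical and chosen only for illustration; real spill behavior also depends on Spark's memory fractions, data skew, and serialization overhead, and the sketch does not claim which answer options are correct.

```python
def memory_per_core_gb(cores: int, memory_gb: float) -> float:
    """Memory available per core, in GB, for a given instance shape."""
    return memory_gb / cores


def avg_partition_mb(shuffle_gb: float, num_partitions: int) -> float:
    """Average shuffle partition size in MB, assuming evenly sized partitions."""
    return shuffle_gb * 1024 / num_partitions


# Hypothetical instance shapes: (cores, memory in GB).
GENERAL_PURPOSE = (16, 32)     # 1:2 core-to-memory ratio, as in the question
MEMORY_OPTIMIZED = (16, 128)   # 1:8 ratio, an assumed alternative

SHUFFLE_GB = 200  # assumed total shuffle volume, not taken from the question

for name, (cores, mem) in [("general purpose", GENERAL_PURPOSE),
                           ("memory optimized", MEMORY_OPTIMIZED)]:
    print(f"{name}: {memory_per_core_gb(cores, mem):.1f} GB per core")

for partitions in (200, 2000):
    size = avg_partition_mb(SHUFFLE_GB, partitions)
    print(f"{partitions} partitions: about {size:.1f} MB per partition")
```

Both levers point the same way: each task either gets more memory to work with, or gets less data to hold at once.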
Each configuration below is identical in that each cluster has 400 GB of RAM in total, 160 cores in total, and only one executor per VM.
Given an extremely long-running job for which completion must be guaranteed, which cluster configuration will be able to guarantee completion of the job in light of one or more VM failures?
Correct answer: A
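The cluster configurations themselves are not reproduced in this text, so the following is only a sketch of the trade-off the question describes. It splits the fixed totals (400 GB of RAM, 160 cores, one executor per VM) across a few hypothetical VM counts and reports how much executor capacity would remain after a single VM failure. The VM counts and all names in the code are assumptions for illustration, and driver failures are out of scope.

```python
from dataclasses import dataclass

TOTAL_RAM_GB = 400   # total cluster RAM, from the question
TOTAL_CORES = 160    # total cluster cores, from the question


@dataclass(frozen=True)
class ClusterShape:
    vms: int
    ram_per_executor_gb: float
    cores_per_executor: float
    capacity_left_after_one_failure: float  # fraction between 0.0 and 1.0


def cluster_shape(vms: int) -> ClusterShape:
    """Split the fixed totals evenly across `vms` VMs, one executor each."""
    if vms < 1:
        raise ValueError("a cluster needs at least one VM")
    return ClusterShape(
        vms=vms,
        ram_per_executor_gb=TOTAL_RAM_GB / vms,
        cores_per_executor=TOTAL_CORES / vms,
        capacity_left_after_one_failure=(vms - 1) / vms,
    )


# Hypothetical VM counts; the real answer options are not shown in the source.
for n in (1, 2, 4, 8, 16):
    s = cluster_shape(n)
    print(f"{s.vms:>2} VMs: {s.ram_per_executor_gb:g} GB and "
          f"{s.cores_per_executor:g} cores per executor, "
          f"{s.capacity_left_after_one_failure:.1%} capacity left after one failure")
```

With a single VM, one failure removes all executor capacity; with more VMs, the surviving executors can rerun the lost tasks, at the cost of less memory per executor.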
An analytics team wants to run a short-term experiment in Databricks SQL on the customer transactions Delta table (about 20 billion records) created by the data engineering team. Which strategy should the data engineering team use to ensure minimal downtime and no impact on the ongoing ETL processes?
Correct answer: A