Snowflake SnowPro Advanced: Data Engineer (DEA-C02) - DEA-C02 模擬練習
You are implementing row access policies on a 'SALES DATA table to restrict access based on the 'REGION' column. Different users are allowed to see data only for specific regions. You have a mapping table 'USER REGION MAP' with columns 'USERNAME' and 'REGION'. You want to create a row access policy that dynamically filters the 'SALES DATA' based on the user and their allowed region. Which of the following options represents a correct approach to create and apply this row access policy?


正解: D
解説: (PassTest メンバーにのみ表示されます)
You have a Snowflake table, 'CUSTOMER ORDERS', with columns like 'CUSTOMER ID', 'ORDER DATE', 'ORDER AMOUNT', and 'REGION'. A Bl dashboard relies on a query that aggregates data from this table, but the query performance is consistently slow. The query frequently filters by 'ORDER DATE and groups by 'REGION'. Based on the following 'EXPLAIN' output, which combination of techniques should be considered to improve the performance the most?


正解: B,C
解説: (PassTest メンバーにのみ表示されます)
You're designing a near real-time data pipeline for clickstream data using Snowpipe Streaming. The data volume is extremely high, with bursts exceeding 1 million events per second. Your team reports intermittent ingestion failures and latency spikes. Considering the constraints of Snowpipe Streaming, which of the following strategies would be MOST effective in mitigating these issues, assuming the data format is optimized and network latency is minimal?
正解: A,E
解説: (PassTest メンバーにのみ表示されます)
You have a table 'EMPLOYEE DATA' containing Personally Identifiable Information (PII), including 'salary' and 'email'. You need to implement column-level security such that: 1) The 'salary' column is only visible to users in the 'FINANCE ROLE. 2) The 'email' column is masked with a SHA256 hash for all users except those in the 'HR ROLE. You create the following masking policies:

Which of the following SQL statements correctly applies these masking policies to the 'EMPLOYEE DATA table?

Which of the following SQL statements correctly applies these masking policies to the 'EMPLOYEE DATA table?
正解: B
解説: (PassTest メンバーにのみ表示されます)
You are designing a data pipeline that involves unloading large amounts of data (hundreds of terabytes) from Snowflake to AWS S3 for archival purposes. To optimize cost and performance, which of the following strategies should you consider? (Select ALL that apply)
正解: A,B,D
解説: (PassTest メンバーにのみ表示されます)
You are using Snowpark to perform a complex join operation between two large tables: 'ORDERS (1 OOGB) and 'CUSTOMER (50GB). The join is performed on 'ORDERS.CUSTOMER ID = CUSTOMER.ID. The query is running slower than expected. You have already confirmed that the warehouse size is adequate. Which of the following strategies, applied in combination , would most likely improve the join performance within a Snowpark context?
正解: C,E
解説: (PassTest メンバーにのみ表示されます)
You have a table 'CUSTOMERS' with columns 'CUSTOMER ID', 'FIRST NAME', 'LAST NAME, and 'EMAIL'. You need to transform this data into a semi-structured JSON format and store it in a VARIANT column named 'CUSTOMER DATA' in a table called 'CUSTOMER JSON'. The desired JSON structure should include a root element 'customer' containing 'id', 'name', and 'contact' fields. Which of the following SQL statements, used in conjunction with a CREATE TABLE and INSERT INTO statement for CUSTOMER JSON, correctly transforms the data?


正解: A
解説: (PassTest メンバーにのみ表示されます)
You are planning to monetize a dataset on the Snowflake Marketplace. You want to provide potential customers with sample data to evaluate before they purchase a full subscription. Which of the following strategies are valid and recommended for offering a free sample of your data within the Snowflake Marketplace? (Select all that apply)
正解: A,C
解説: (PassTest メンバーにのみ表示されます)
A Snowflake data warehouse contains a table named 'SALES TRANSACTIONS' with the following columns: 'TRANSACTION ID', 'PRODUCT D', 'CUSTOMER D', 'TRANSACTION DATE, and 'SALES AMOUNT'. You need to optimize a query that calculates the total sales amount per product for a given month. The 'SALES TRANSACTIONS' table is very large (billions of rows), and queries are slow. Given the following initial query: SELECT PRODUCT ID, SUM(SALES AMOUNT) AS TOTAL SALES FROM SALES TRANSACTIONS WHERE TRANSACTION DATE BETWEEN '2023-01-07' AND '2023-01-31' GäOUP BY PRODUCT ID; Which of the following actions, when combined, would MOST effectively improve the performance of this query?
正解: B,D
解説: (PassTest メンバーにのみ表示されます)
A data engineer is tasked with optimizing a Snowflake data pipeline that ingests data from multiple external sources, transforms it, and loads it into a reporting table. The pipeline uses a series of Snowflake tasks orchestrated with a root task and child tasks. Performance monitoring shows inconsistent execution times for the transformation tasks. Which of the following strategies would provide the MOST granular insights into the performance bottlenecks within the pipeline and allow for targeted optimization?
正解: B
解説: (PassTest メンバーにのみ表示されます)
You are building a data pipeline in Snowflake using Snowpark Python. As part of the pipeline, you need to create a dynamic SQL query to filter records from a table named 'PRODUCT REVIEWS based on a list of product categories. The list of categories is passed to a stored procedure as a string argument, where categories are comma separated. The filtered data needs to be further processed within the stored procedure. Which of the following approaches are MOST efficient and secure ways to construct and execute this dynamic SQL query using Snowpark?
正解: A,C
解説: (PassTest メンバーにのみ表示されます)
You are designing a data pipeline using Snowpipe to ingest data from multiple S3 buckets into a single Snowflake table. Each S3 bucket represents a different data source and contains files in JSON format. You want to use Snowpipe's auto-ingest feature and a single Snowpipe object for all buckets to simplify management and reduce overhead. However, each data source has a different JSON schem a. How can you best achieve this goal while ensuring data is loaded correctly and efficiently into the target table?
正解: C
解説: (PassTest メンバーにのみ表示されます)
You are tasked with setting up a Kafka Connector to ingest data into Snowflake. You need to ensure fault tolerance. Which of the following Kafka Connect configurations are essential for enabling fault tolerance and ensuring minimal data loss during connector failures? Select all that apply.
正解: A,C,E
解説: (PassTest メンバーにのみ表示されます)
A financial services company, 'Acme Finance', wants to share aggregated, anonymized transaction data with a research firm, 'Data Insights', through a Snowflake Data Clean Room. Acme Finance needs to ensure that Data Insights can only analyze the data using pre- defined aggregate functions and cannot access the raw, underlying transactional details. Acme Finance has already created a secure view to share the aggregated data'. Which of the following steps are necessary to grant Data Insights access to the data securely while enforcing the required restrictions?
正解: D
解説: (PassTest メンバーにのみ表示されます)
You have created a masking policy called which redacts salary information based on the user's role. You have applied this policy to the 'SALARY column in the 'EMPLOYEES table. However, after applying the policy, you notice that even users with the 'ACCOUNTADMIN' role are seeing the masked data, which is not the intended behavior. The intention is that 'ACCOUNTADMIN' and 'SECURITYADMIN' roles should always see the real salary data'. What is the MOST likely cause of this issue and what would you suggest fix that?
正解: D
解説: (PassTest メンバーにのみ表示されます)