CERTIFIED_DATA_ENGINEER_ASSOCIATE questions • Exam prepare

databricks CERTIFIED_DATA_ENGINEER_ASSOCIATE

Exam contains 173 questions

Question 127 🔥

A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at which table in their pipeline the data is being dropped. Which approach can the data engineer take to identify the table that is dropping the records?

A. They can set up separate expectations for each table when developing their DLT pipeline.

B. They can navigate to the DLT pipeline page, click on the “Error” button, and review the present errors.

C. They can set up DLT to notify them via email when records are dropped.

D. They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.

Highly voted
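As a rough illustration of why per-table data quality statistics localize where records disappear, the sketch below imitates a three-table pipeline in plain Python. It does not use the DLT API; the stage names, rules, and sample rows are all hypothetical, and the only point is that each table keeps its own count of dropped records.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Expectation = Callable[[Record], bool]


def apply_expectation(rows: List[Record], rule: Expectation) -> Tuple[List[Record], int]:
    """Keep rows that satisfy the rule; return them with the number of dropped rows."""
    kept = [row for row in rows if rule(row)]
    return kept, len(rows) - len(kept)


def run_pipeline(rows: List[Record], stages: List[Tuple[str, Expectation]]) -> Dict[str, int]:
    """Pass rows through each stage in order and record how many rows each stage drops."""
    dropped: Dict[str, int] = {}
    for name, rule in stages:
        rows, dropped[name] = apply_expectation(rows, rule)
    return dropped


# Hypothetical stages: each one drops records that fail its rule.
stages = [
    ("bronze", lambda r: r.get("id") is not None),
    ("silver", lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0),
    ("gold", lambda r: r.get("country") in {"US", "DE"}),
]

# Hypothetical sample data.
rows = [
    {"id": 1, "amount": 10.0, "country": "US"},
    {"id": 2, "amount": -5.0, "country": "DE"},
    {"id": None, "amount": 3.0, "country": "US"},
    {"id": 4, "amount": 7.5, "country": "FR"},
    {"id": 5, "amount": "n/a", "country": "US"},
]

stats = run_pipeline(rows, stages)
print(stats)  # per-table drop counts
```

Reading the per-stage counts side by side shows which table removes the most records, which is the same kind of comparison the per-table statistics in the pipeline UI allow.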

Question 128 🔥

What is used by Spark to record the offset range of the data being processed in each trigger in order for Structured Streaming to reliably track the exact progress of the processing so that it can handle any kind of failure by restarting and/or reprocessing?

A. Checkpointing and Write-ahead Logs

Highly voted

B. Replayable Sources and Idempotent Sinks

C. Write-ahead Logs and Idempotent Sinks

D. Checkpointing and Idempotent Sinks
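The idea of recording an offset range before processing it can be sketched with a toy write-ahead log in plain Python. This is not Spark code: the `OffsetLog` class, its method names, and the file layout are assumptions made for illustration, and the "crash" is simulated by simply skipping the commit step.

```python
import json
import os
import tempfile
from typing import Optional, Tuple


class OffsetLog:
    """Toy write-ahead log: one file per batch holding that batch's offset range."""

    def __init__(self, checkpoint_dir: str) -> None:
        self.offsets_dir = os.path.join(checkpoint_dir, "offsets")
        self.commits_dir = os.path.join(checkpoint_dir, "commits")
        os.makedirs(self.offsets_dir, exist_ok=True)
        os.makedirs(self.commits_dir, exist_ok=True)

    def plan(self, batch_id: int, start: int, end: int) -> None:
        # Persist the intended range BEFORE any processing happens.
        with open(os.path.join(self.offsets_dir, str(batch_id)), "w") as f:
            json.dump({"start": start, "end": end}, f)

    def commit(self, batch_id: int) -> None:
        # Mark the batch as fully processed with an empty marker file.
        open(os.path.join(self.commits_dir, str(batch_id)), "w").close()

    def pending(self) -> Optional[Tuple[int, int, int]]:
        """Return (batch_id, start, end) of the first planned but uncommitted batch."""
        planned = sorted(int(name) for name in os.listdir(self.offsets_dir))
        committed = {int(name) for name in os.listdir(self.commits_dir)}
        for batch_id in planned:
            if batch_id not in committed:
                with open(os.path.join(self.offsets_dir, str(batch_id))) as f:
                    offsets = json.load(f)
                return batch_id, offsets["start"], offsets["end"]
        return None


source = list(range(10))  # stand-in for a replayable source

with tempfile.TemporaryDirectory() as checkpoint_dir:
    log = OffsetLog(checkpoint_dir)
    log.plan(batch_id=0, start=0, end=5)
    # Simulated crash: the first process stops here, before commit().

    recovered = OffsetLog(checkpoint_dir)  # restart against the same directory
    replay = recovered.pending()
    batch_id, start, end = replay
    replayed_rows = source[start:end]  # reprocess exactly the recorded range
    recovered.commit(batch_id)
    after_commit = recovered.pending()

print(replay, replayed_rows, after_commit)
```

Because the range is written down before processing starts, a restarted process can find the uncommitted batch and reprocess the same offsets instead of guessing where it left off.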

Question 129 🔥

A Delta Live Tables pipeline includes two datasets defined using STREAMING LIVE TABLE. Three datasets are defined against Delta Lake table sources using LIVE TABLE. The pipeline is configured to run in Production mode using the Continuous Pipeline Mode. What is the expected outcome after clicking Start to update the pipeline, assuming previously unprocessed data exists and all definitions are valid?

A. All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will persist to allow for additional testing.

B. All datasets will be updated once and the pipeline will shut down. The compute resources will persist to allow for additional testing.

C. All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will be deployed for the update and terminated when the pipeline is stopped.

Highly voted

D. All datasets will be updated once and the pipeline will shut down. The compute resources will be terminated.

Question 130 🔥

Which types of workloads are compatible with Auto Loader?

A. Streaming workloads

B. Machine learning workloads

C. Serverless workloads

D. Batch workloads

Highly voted

Question 131 🔥

A data engineer has developed a data pipeline to ingest data from a JSON source using Auto Loader, but the engineer has not provided any type inference or schema hints in their pipeline. Upon reviewing the data, the data engineer has noticed that all of the columns in the target table are of the string type despite some of the fields only including float or boolean values. Why has Auto Loader inferred all of the columns to be of the string type?

A. Auto Loader cannot infer the schema of ingested data

B. JSON data is a text-based format

Highly voted

C. Auto Loader only works with string data

D. All of the fields had at least one null value
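The behavior in this question can be imitated with a toy schema-inference function in plain Python. It is only a sketch: `infer_schema`, its `infer_column_types` flag, and the type names are hypothetical and do not reproduce Auto Loader's actual rules. In Auto Loader itself, typed inference for JSON is enabled with the `cloudFiles.inferColumnTypes` option, and individual columns can be pinned with `cloudFiles.schemaHints`.

```python
import json
from typing import Dict, List, Set


def type_name(value: object) -> str:
    """Map a parsed JSON value to a simplified type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "double"
    return "string"


def infer_schema(lines: List[str], infer_column_types: bool = False) -> Dict[str, str]:
    """Toy inference over JSON lines.

    With infer_column_types=False every column is reported as "string".
    With True, a column keeps a narrower type only when all of its
    non-null values agree on it; otherwise it falls back to "string".
    """
    seen: Dict[str, Set[str]] = {}
    for line in lines:
        for key, value in json.loads(line).items():
            types = seen.setdefault(key, set())
            if value is not None:
                types.add(type_name(value) if infer_column_types else "string")
    return {
        key: next(iter(types)) if len(types) == 1 else "string"
        for key, types in seen.items()
    }


# Hypothetical sample records.
lines = [
    '{"price": 9.5, "in_stock": true, "name": "bolt"}',
    '{"price": 12, "in_stock": false, "name": "nut"}',
]

default_schema = infer_schema(lines)
typed_schema = infer_schema(lines, infer_column_types=True)
print(default_schema)
print(typed_schema)
```

With the flag off, every column comes back as a string even though the values are numbers and booleans, which matches the symptom the engineer observed.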

Question 132 🔥

Which statement regarding the relationship between Silver tables and Bronze tables is always true?

A. Silver tables contain a less refined, less clean view of data than Bronze data.

B. Silver tables contain aggregates while Bronze data is unaggregated.

C. Silver tables contain more data than Bronze tables.

D. Silver tables contain less data than Bronze tables.

Highly voted
