Create Next App

databricks CERTIFIED_DATA_ANALYST_ASSOCIATE

Exam contains 45 questions

Page 5 of 8

Question 25 🔥

A data analysis team is working with the table_bronze SQL table as a source for one of its most complex projects. A stakeholder of the project notices that some of the downstream data is duplicative. The analysis team identifies table_bronze as the source of the duplication.Which of the following queries can be used to deduplicate the data from table_bronze and write it to a new table table_silver?

Which database solution meets these requirements?

A. CREATE TABLE table_silver AS -SELECT DISTINCT *FROM table_bronze;

Highly voted

B. CREATE TABLE table_silver AS -INSERT *FROM table_bronze;

Highly voted

C. CREATE TABLE table_silver AS -MERGE DEDUPLICATE *FROM table_bronze;

Highly voted

D. INSERT INTO TABLE table_silver -SELECT * FROM table_bronze;

Highly voted

E. INSERT OVERWRITE TABLE table_silverSELECT * FROM table_bronze;

Highly voted

Discussion of the question

Question 26 🔥

A business analyst has been asked to create a data entity/object called sales_by_employee. It should always stay up-to-date when new data are added to the sales table. The new entity should have the columns sales_person, which will be the name of the employee from the employees table, and sales, which will be all sales for that particular sales person. Both the sales table and the employees table have an employee_id column that is used to identify the sales person.Which of the following code blocks will accomplish this task?

Which database solution meets these requirements?

Discussion of the question

Question 27 🔥

A data analyst has been asked to use the below table sales_table to get the percentage rank of products within region by the sales:The result of the query should look like this:Which of the following queries will accomplish this task?

Which database solution meets these requirements?

Discussion of the question

Question 28 🔥

In which of the following situations should a data analyst use higher-order functions?

Which database solution meets these requirements?

A. When custom logic needs to be applied to simple, unnested data

Highly voted

B. When custom logic needs to be converted to Python-native code

Highly voted

C. When custom logic needs to be applied at scale to array data objects

Highly voted

D. When built-in functions are taking too long to perform tasks

Highly voted

E. When built-in functions need to run through the Catalyst Optimizer

Highly voted

Discussion of the question

Question 29 🔥

Consider the following two statements:Statement 1:Statement 2:Which of the following describes how the result sets will differ for each statement when they are run in Databricks SQL?

Which database solution meets these requirements?

A. The first statement will return all data from the customers table and matching data from the orders table. The second statement will return all data from the orders table and matching data from the customers table. Any missing data will be filled in with NULL.

Highly voted

B. When the first statement is run, only rows from the customers table that have at least one match with the orders table on customer_id will be returned. When the second statement is run, only those rows in the customers table that do not have at least one match with the orders table on customer_id will be returned.

Highly voted

C. There is no difference between the result sets for both statements.

Highly voted

D. Both statements will fail because Databricks SQL does not support those join types.

Highly voted

E. When the first statement is run, all rows from the customers table will be returned and only the customer_id from the orders table will be returned. When the second statement is run, only those rows in the customers table that do not have at least one match with the orders table on customer_id will be returned.

Highly voted

Discussion of the question

Question 30 🔥

A data analyst has created a user-defined function using the following line of code:CREATE FUNCTION price(spend DOUBLE, units DOUBLE)RETURNS DOUBLE -RETURN spend / units;Which of the following code blocks can be used to apply this function to the customer_spend and customer_units columns of the table customer_summary to create column customer_price?

Which database solution meets these requirements?

A. SELECT PRICE customer_spend, customer_units AS customer_priceFROM customer_summary

Highly voted

B. SELECT price -FROM customer_summary

Highly voted

C. SELECT function(price(customer_spend, customer_units)) AS customer_priceFROM customer_summary

Highly voted

D. SELECT double(price(customer_spend, customer_units)) AS customer_priceFROM customer_summary

Highly voted

E. SELECT price(customer_spend, customer_units) AS customer_priceFROM customer_summary

Highly voted

Discussion of the question

Ready to Pass Your Certification Test

databricks CERTIFIED_DATA_ANALYST_ASSOCIATE

Exam contains 45 questions

Lorem ipsum dolor sit amet consectetur. Eget sed turpis aenean sit aenean. Integer at nam ullamcorper a.

Company

Product

Resources

Follow us