databricks CERTIFIED_MACHINE_LEARNING_ASSOCIATE

Exam contains 73 questions

Question 61 🔥

Which of the following tools can be used to distribute large-scale feature engineering without the use of a UDF or pandas Function API for machine learning pipelines?

A. Keras

B. Scikit-learn

C. PyTorch

D. Spark ML

Question 62 🔥

A machine learning engineer is using the following code block to scale the inference of a single-node model on a Spark DataFrame with one million records:

[code block missing from source]

Assuming the default Spark configuration is in place, which of the following is a benefit of using an Iterator?

A. The data will be limited to a single executor preventing the model from being loaded multiple times

B. The model will be limited to a single executor preventing the data from being distributed

C. The model only needs to be loaded once per executor rather than once per batch during the inference process

D. The data will be distributed across multiple executors during the inference process
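
To make the iterator idea concrete, here is a plain-Python sketch that mimics the pattern without Spark. It is not the question's code block. The names load_model, predict_per_batch, and predict_iterator are hypothetical, and the "model" is a stand-in that simply doubles its inputs. In Spark, the comparable pattern is the Iterator-of-Series to Iterator-of-Series pandas UDF, which the PySpark documentation describes as useful when UDF execution requires initializing some state.

```python
from typing import Callable, Dict, Iterator, List

# Counts how often the (hypothetical) expensive model load runs in each style.
load_count: Dict[str, int] = {"per_batch": 0, "iterator": 0}


def load_model(kind: str) -> Callable[[List[float]], List[float]]:
    """Stand-in for an expensive model load; the 'model' doubles each value."""
    load_count[kind] += 1
    return lambda batch: [2 * x for x in batch]


def predict_per_batch(batch: List[float]) -> List[float]:
    # Called once per batch, so the model is reloaded for every batch.
    model = load_model("per_batch")
    return model(batch)


def predict_iterator(batches: Iterator[List[float]]) -> Iterator[List[float]]:
    # Called once for the whole stream of batches, so the load happens once
    # and the same model object is reused inside the loop.
    model = load_model("iterator")
    for batch in batches:
        yield model(batch)


batches = [[1, 2], [3, 4], [5, 6]]
per_batch_out = [predict_per_batch(b) for b in batches]
iterator_out = list(predict_iterator(iter(batches)))
```

Both styles produce the same predictions. The difference is that load_count records three loads for the per-batch style and one for the iterator style.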

Question 63 🔥

Which statement describes a Spark ML transformer?

A. A transformer is an algorithm which can transform one DataFrame into another DataFrame

B. A transformer is a hyperparameter grid that can be used to train a model

C. A transformer chains multiple algorithms together to transform an ML workflow

D. A transformer is a learning algorithm that can use a DataFrame to train a model
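
For readers who want to see the vocabulary in action, the sketch below imitates the fit/transform contract in plain Python, using a list of dicts in place of a DataFrame. It is only an analogy: MeanCenter and MeanCenterModel are hypothetical classes, not part of Spark ML. The Spark ML documentation defines an estimator as an algorithm that is fit on a DataFrame to produce a transformer, and a transformer as an algorithm that turns one DataFrame into another.

```python
from typing import Dict, List

Row = Dict[str, float]


class MeanCenterModel:
    """Transformer-like object: maps a list of rows to a new list of rows."""

    def __init__(self, column: str, mean: float) -> None:
        self.column = column
        self.mean = mean

    def transform(self, rows: List[Row]) -> List[Row]:
        # Return new rows with an added column; the input rows are not modified.
        new_col = f"{self.column}_centered"
        return [{**r, new_col: r[self.column] - self.mean} for r in rows]


class MeanCenter:
    """Estimator-like object: learns from the rows and returns a transformer."""

    def __init__(self, column: str) -> None:
        self.column = column

    def fit(self, rows: List[Row]) -> MeanCenterModel:
        mean = sum(r[self.column] for r in rows) / len(rows)
        return MeanCenterModel(self.column, mean)


rows = [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}]
model = MeanCenter("x").fit(rows)   # fitting produces a transformer
centered = model.transform(rows)    # transforming produces new rows
```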

Question 64 🔥

Which of the following describes the relationship between native Spark DataFrames and pandas API on Spark DataFrames?

A. pandas API on Spark DataFrames are single-node versions of Spark DataFrames with additional metadata

B. pandas API on Spark DataFrames are more performant than Spark DataFrames

C. pandas API on Spark DataFrames are made up of Spark DataFrames and additional metadata

D. pandas API on Spark DataFrames are less mutable versions of Spark DataFrames

Question 65 🔥

A data scientist is using the following code block to tune hyperparameters for a machine learning model:

[code block missing from source]

Which change can they make to the above code block to improve the likelihood of a more accurate model?

A. Increase num_evals to 100

B. Change fmin() to fmax()

C. Change SparkTrials() to Trials()

D. Change tpe.suggest to random.suggest
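
As a small illustration of how the evaluation budget affects a search, the sketch below runs a seeded random search in plain Python. Random search is used here as a stand-in for Hyperopt, and objective and random_search are hypothetical names that do not come from the question's code block. The sketch makes only a narrow point: with the same seed, the first 10 candidates of a 100-evaluation run are the same as those of a 10-evaluation run, so the larger budget can never end with a worse best loss.

```python
import random


def objective(x: float) -> float:
    """Toy loss with its minimum at x = 3 (an assumption for illustration)."""
    return (x - 3.0) ** 2


def random_search(num_evals: int, seed: int = 0) -> float:
    """Return the lowest loss found after num_evals random candidates."""
    rng = random.Random(seed)  # fixed seed makes runs reproducible
    best = float("inf")
    for _ in range(num_evals):
        candidate = rng.uniform(-10.0, 10.0)
        best = min(best, objective(candidate))
    return best


best_10 = random_search(num_evals=10)
best_100 = random_search(num_evals=100)
```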

Question 66 🔥

A data scientist has been given an incomplete notebook from the data engineering team. The notebook uses a Spark DataFrame spark_df on which the data scientist needs to perform further feature engineering. Unfortunately, the data scientist has not yet learned the PySpark DataFrame API.

Which of the following blocks of code can the data scientist run to be able to use the pandas API on Spark?

A. import pyspark.pandas as ps
   df = ps.DataFrame(spark_df)

B. import pyspark.pandas as ps
   df = ps.to_pandas(spark_df)

C. spark_df.to_pandas()

D. import pandas as pd
   df = pd.DataFrame(spark_df)
