Course Description

This 2-day Machine Learning with Apache Spark training will teach you how to scale ML pipelines with Apache Spark™, including distributed training, hyperparameter tuning and inference. You’ll build and tune ML models with SparkML while leveraging MLflow to track, version and manage these models. We’ll cover the latest ML features in Apache Spark, such as pandas UDFs, pandas functions and the pandas API on Spark, as well as the latest ML product offerings such as Feature Store and AutoML.

This course will prepare you to take the Databricks Certified Machine Learning Associate exam.

Objectives

Perform scalable EDA with Spark
Build and tune machine learning models with SparkML
Track, version and deploy models with MLflow
Perform distributed hyperparameter tuning with HyperOpt
Use the Databricks Machine Learning workspace to create a Feature Store and AutoML experiments
Leverage the pandas API on Spark to scale your pandas code

Prerequisites

Intermediate experience with Python
Experience building machine learning models
Familiarity with the PySpark DataFrame API

Course Summary

Next Public Course Dates	UK & Europe: 4 - 5 February 2026; 18 - 19 February 2026; 4 - 5 March 2026. US & Canada: 11 - 12 February 2026; 25 - 26 February 2026; 11 - 12 March 2026
Duration	2 Days
Prerequisites	Intermediate experience with Python; experience building machine learning models; familiarity with the PySpark DataFrame API
Available Formats	Public Virtual Live Instructor-Led; Private Virtual Live Instructor-Led; Private Onsite
Audience	Developers, Data Engineers, Data Science Engineers or Programmers

Course Modules

Day 1 Outline

Spark/ML Overview
Exploratory Data Analysis (EDA) and Feature Engineering with Spark
Linear Regression with SparkML: Transformers, Estimators, Pipelines and Evaluators
MLflow Tracking and Model Registry
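
For readers new to the vocabulary in the Day 1 outline, the sketch below illustrates the general estimator/transformer/pipeline/evaluator pattern in plain Python. It is a toy under stated assumptions, not Spark's API and not course material: the class names (`Scaler`, `LinearRegressionEstimator`, `LinearModel`, `Pipeline`, `PipelineModel`), the `rmse` helper and the sample data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import sqrt

# Toy stand-ins for the pattern only; these are NOT the pyspark.ml classes.


@dataclass
class Scaler:
    """Transformer: maps rows to rows and needs no fitting."""
    factor: float

    def transform(self, rows):
        return [{**r, "x": r["x"] * self.factor} for r in rows]


@dataclass
class LinearModel:
    """Fitted model: itself a transformer that adds a 'prediction' field."""
    slope: float
    intercept: float

    def transform(self, rows):
        return [{**r, "prediction": self.slope * r["x"] + self.intercept}
                for r in rows]


class LinearRegressionEstimator:
    """Estimator: fit() learns parameters and returns a LinearModel."""

    def fit(self, rows):
        n = len(rows)
        mean_x = sum(r["x"] for r in rows) / n
        mean_y = sum(r["y"] for r in rows) / n
        sxx = sum((r["x"] - mean_x) ** 2 for r in rows)
        sxy = sum((r["x"] - mean_x) * (r["y"] - mean_y) for r in rows)
        slope = sxy / sxx  # ordinary least squares for a single feature
        return LinearModel(slope, mean_y - slope * mean_x)


class PipelineModel:
    """Result of fitting a Pipeline: applies every fitted stage in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Chains stages; estimators are fitted on the output of earlier stages."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)  # estimator -> fitted transformer
            rows = stage.transform(rows)
            fitted.append(stage)
        return PipelineModel(fitted)


def rmse(rows):
    """Evaluator: root-mean-squared error between 'prediction' and 'y'."""
    return sqrt(sum((r["prediction"] - r["y"]) ** 2 for r in rows) / len(rows))


# Hypothetical training data following y = 2x + 1.
train = [{"x": 1.0, "y": 3.0}, {"x": 2.0, "y": 5.0}, {"x": 3.0, "y": 7.0}]
model = Pipeline([Scaler(factor=10.0), LinearRegressionEstimator()]).fit(train)
predictions = model.transform(train)
error = rmse(predictions)
```

The design point to notice is that fitting a pipeline returns a separate fitted object, so the same sequence of steps can be re-applied to new data without refitting.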

Day 2 Outline

Testimonials

“Absolutely loved the enthusiasm and appreciate the knowledge he brought to class!!!”

★★★★★

- Shelly Fruits, KPERS

“Instructor was very passionate about Databricks and helped me to stay engaged. Great pace, great knowledge, and the trainer was fantastic overall :)”

★★★★★

- Emma Darling, Platform Engineer, Barclays Bank

“The Databricks Training was excellent”

★★★★★

- Spencer Martin, VP, Information Systems, RoyOMartin

Scalable Machine Learning with Apache Spark
