Course Description
This 2-day Machine Learning with Apache Spark training will teach you how to scale ML pipelines with Apache Spark™, including distributed training, hyperparameter tuning and inference. You’ll build and tune ML models with SparkML while leveraging MLflow to track, version and manage these models. We’ll cover the latest ML features in Apache Spark, such as pandas UDFs, pandas functions and the pandas API on Spark, as well as the latest ML product offerings such as Feature Store and AutoML.
This course will prepare you to take the Databricks Certified Machine Learning Associate exam.
Objectives
- Perform scalable EDA with Spark
- Build and tune machine learning models with SparkML
- Track, version and deploy models with MLflow
- Perform distributed hyperparameter tuning with HyperOpt
- Use the Databricks Machine Learning workspace to create a Feature Store and AutoML experiments
- Leverage the pandas API on Spark to scale your pandas code
Prerequisites
- Intermediate experience with Python
- Experience building machine learning models
- Familiarity with PySpark DataFrame API
Course Summary
| Next Public Course Dates | |
| Duration | |
| Prerequisites | |
| Available Formats | |
| Audience | |
Course Modules
Day 1
- Spark/ML Overview
- Exploratory Data Analysis (EDA) and Feature Engineering with Spark
- Linear Regression with SparkML: Transformers, Estimators, Pipelines and Evaluators
- MLflow Tracking and Model Registry
Testimonials
“The course was comprehensive and interactive, and the trainer really understood the content. I have been able to implement much of what I learned into my daily work activities and have saved a lot of time.”
- Susan Medina, Senior Business Analyst, Advocate Health Care
“It was a very enjoyable and informative course and we’re all looking forward to using the skills we learnt. We all agreed that the instructor was very good and was able to break down pretty complex issues to make them more understandable.”
- Grahame Welch, Head of BI, UK Home Office
“The trainer’s clear and obvious enthusiasm for number crunching, analytics, and teaching others is infectious. He doesn’t waste time, shows exactly what you need to know and is genuinely hilarious. Every one of my employees had tons of positive stuff to say.”
- Benjamin G, MXSG Analysis and Integration Chief, US Air Force