Course Description
Our 3-day Azure Data Factory training class covers all key aspects of the Azure Data Factory v2 platform. Special attention is paid to the Azure services commonly used alongside ADF v2 solutions: Azure Data Lake Storage Gen2, Azure SQL Database, Azure Databricks, Azure Key Vault, Azure Functions, and a few others.
Azure Data Factory Training Information
In this Azure Data Factory course, you will work on hands-on labs and learn how to:
- Build end-to-end ETL and ELT solutions using Azure Data Factory v2
- Architect, develop, and deploy sophisticated, high-performance, easy-to-maintain, and secure pipelines that integrate data from a variety of Azure and non-Azure data sources
- Apply the latest DevOps best practices available for the ADF v2 platform
Prerequisites
Introduction to Microsoft Azure (AZ-900), or equivalent experience.
Course Summary
| Next Public Course Dates | |
| Prerequisites | Introduction to Microsoft Azure (AZ-900), or equivalent experience |
| Duration | 3 days |
| Available Formats | |
| Post Training Support | Yes, our certified Data Factory trainers will be available for additional questions once you start working on your data integration project. |
Course Modules
Module 1: Introduction to ADF
- Historical background: SSIS, ADF v1, other ETL/ELT tools
- Key capabilities and benefits of ADF v2
- Recent feature updates and enhancements
Module 2: Core Architectural Components
- Connectors: Azure services, databases, NoSQL, files, generic protocols, services & apps, custom
- Pipelines
- Activities: data movement, data transformation, control flow
- Datasets: source, sink
- Integration Runtimes: Azure, Self-Hosted, Azure-SSIS
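Each integration runtime is itself an ADF resource defined in JSON. As a minimal sketch (the name and description here are illustrative, not part of the course materials), a self-hosted IR registration body looks like this:

```json
{
  "name": "SelfHostedIR1",
  "properties": {
    "type": "SelfHosted",
    "description": "Runs copy activities against on-premises data stores"
  }
}
```

After creating the resource, you install the self-hosted IR agent on an on-premises machine and register it with the key ADF generates for this resource.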
Module 3: Building and Executing Your First Pipeline
- Creating ADF v2 instance
- Creating a pipeline and associated activities
- Executing the pipeline
- Monitoring execution
- Reviewing results
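Under the hood, every pipeline you build in the ADF authoring UI is a JSON document. A minimal first pipeline, here containing only a single Wait activity (names are illustrative), looks roughly like this:

```json
{
  "name": "FirstPipeline",
  "properties": {
    "activities": [
      {
        "name": "WaitBriefly",
        "type": "Wait",
        "typeProperties": {
          "waitTimeInSeconds": 10
        }
      }
    ]
  }
}
```

Triggering this pipeline (via Debug or Trigger Now) produces a run you can then inspect in the Monitor tab, which is the workflow this module walks through.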
Module 4: Data Movement
Copying Tools and SDKs
- Copy Data Tool/Wizard
- Copy activity
- SDKs: Python, .NET
- Automation: PowerShell, REST API, ARM Templates
Copying Considerations
- File formats: Avro, binary, delimited, JSON, ORC, Parquet
- Data store support matrix
- Write behaviour: append, upsert, overwrite, write with custom logic
- Schema and data type mapping
- Fault tolerance options
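The copy considerations above all surface as properties of the Copy activity's JSON definition. The sketch below (dataset and linked-service names are hypothetical) shows a delimited-text-to-Parquet copy with fault tolerance enabled, redirecting incompatible rows to a logging store instead of failing the run:

```json
{
  "name": "CopyCsvToParquet",
  "type": "Copy",
  "inputs": [ { "referenceName": "SourceCsvDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "SinkParquetDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": { "type": "DelimitedTextSource" },
    "sink": { "type": "ParquetSink" },
    "enableSkipIncompatibleRow": true,
    "redirectIncompatibleRowSettings": {
      "linkedServiceName": { "referenceName": "LoggingStorage", "type": "LinkedServiceReference" },
      "path": "copy-errors"
    }
  }
}
```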
Module 5: Data Transformation
Transformation with Mapping Data Flows
- Introduction to mapping data flows
- Data flow canvas
- Debug mode
- Dealing with schema drift
- Expression builder & language
- Transformation types: Aggregate, Alter row, Conditional split, Derived column, Exists, Filter, Flatten, Join, Lookup, New branch, Pivot, Select, Sink, Sort, Source, Surrogate key, Union, Unpivot, Window
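To give a flavour of the expression language used in the expression builder, here are a few Derived column expressions (column names are invented for illustration):

```
/* Replace a null amount with zero, build a full name, and parse a date string */
cleanAmount = iif(isNull(amount), toDecimal(0), toDecimal(amount))
fullName    = concat(trim(firstName), ' ', trim(lastName))
orderDate   = toDate(orderDateString, 'yyyy-MM-dd')
```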
Transformation with External Services
- Databricks: Notebook, Jar, Python
- HDInsight: Hive, Pig, MapReduce, Streaming, Spark
- Azure Machine Learning service
- SQL Stored procedures
- Azure Data Lake Analytics U-SQL
- Custom activities with .NET or R
Module 6: Control Flow
- Purpose of activity dependencies: branching and chaining
- Activity dependency conditions: succeeded, failed, skipped, completed
- Control flow activities: Append Variable, Azure Function, Execute Pipeline, Filter, ForEach, Get Metadata, If Condition, Lookup, Set Variable, Until, Wait, Web
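As a sketch of how dependencies and iteration combine (activity and parameter names are hypothetical), the fragment below shows a ForEach activity that runs only after an upstream "GetFileList" activity succeeds and iterates over a pipeline parameter in parallel:

```json
{
  "name": "ForEachFile",
  "type": "ForEach",
  "dependsOn": [
    { "activity": "GetFileList", "dependencyConditions": [ "Succeeded" ] }
  ],
  "typeProperties": {
    "items": { "value": "@pipeline().parameters.fileNames", "type": "Expression" },
    "isSequential": false,
    "activities": [
      {
        "name": "WaitPerItem",
        "type": "Wait",
        "typeProperties": { "waitTimeInSeconds": 1 }
      }
    ]
  }
}
```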
Module 7: Runtime and Operations
- Debugging
- Monitoring: visual, Azure Monitor, SDKs, runtime-specific best practices
- Scheduling execution with triggers: event-based, schedule, tumbling window
- Performance, scalability, tuning
- Common troubleshooting scenarios in activities, connectors, data flows and integration runtimes
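Triggers are also JSON resources. A minimal schedule trigger that fires a pipeline once a day (names and times here are illustrative) looks like this:

```json
{
  "name": "DailyTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2024-01-01T02:00:00Z",
        "timeZone": "UTC"
      }
    },
    "pipelines": [
      { "pipelineReference": { "referenceName": "FirstPipeline", "type": "PipelineReference" } }
    ]
  }
}
```

Event-based and tumbling-window triggers follow the same resource shape with different `type` and `typeProperties` values.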
Module 8: DevOps with ADF
- Quick introduction to source control with Git
- Integration with GitHub and Azure DevOps platforms
- Environment management: Development, QA, Production
- Iterative development best practices
- Continuous Integration (CI) pipelines
- Continuous Delivery (CD) pipelines
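In the standard ADF CD flow, the factory is exported as an ARM template and promoted between environments by swapping a parameter file per environment. A sketch of a production parameter file (factory name and connection string are invented placeholders, not real values) looks like this:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "factoryName": { "value": "adf-contoso-prod" },
    "AzureSqlDatabase_connectionString": {
      "value": "Server=tcp:prod-sql.database.windows.net,1433;Database=SalesDb;"
    }
  }
}
```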
Module 9: Promoting Reuse
- Templates: out-of-the-box and organisational
- Parameters
- Naming convention
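Parameters are the main reuse mechanism inside a pipeline definition. As a minimal sketch (parameter, variable, and activity names are illustrative), a pipeline can declare a parameter with a default value and reference it through the expression language:

```json
{
  "name": "ParameterisedCopy",
  "properties": {
    "parameters": {
      "sourceFolder": { "type": "String", "defaultValue": "incoming" }
    },
    "variables": {
      "targetPath": { "type": "String" }
    },
    "activities": [
      {
        "name": "SetTargetPath",
        "type": "SetVariable",
        "typeProperties": {
          "variableName": "targetPath",
          "value": "@concat('archive/', pipeline().parameters.sourceFolder)"
        }
      }
    ]
  }
}
```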
Module 10: Security
- Data movement security
- Azure Key Vault
- Self-hosted IR considerations
- IP address blocks
- Managed identity
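A common pattern taught in this module is keeping secrets out of linked-service definitions by referencing Azure Key Vault. The sketch below (linked-service and secret names are hypothetical) resolves an Azure SQL connection string from a Key Vault secret at runtime:

```json
{
  "name": "AzureSqlDatabaseLS",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "AzureKeyVaultLS",
          "type": "LinkedServiceReference"
        },
        "secretName": "SqlConnectionString"
      }
    }
  }
}
```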
Testimonials
“Thoroughly enjoyed the training. The trainer was fantastic! It is rare but always an awesome experience when a trainer is also an experienced practitioner with a breadth of knowledge and hands-on experience… even well beyond the subject matter at hand. I had the feeling that the trainer could have answered in detail any question we might have had related to not only BDM but Hadoop and other relevant big data topics as well. Time well spent and I hope to encounter Tomi again.”
- Rick Kirk, CTO, Alliant Energy / Ernst & Young
“Mahesh was an EXCELLENT trainer. Very gifted instructor. Able to clearly communicate and teach objectives. Very engaging, made you want to keep learning, and was so patient and kind to all students. Would love to be taught by him anytime.”
- Jackie Calvo, Quality Improvement Manager, Horizon Health Center
“Trainer was amazing. I was worried about staying engaged for 4 days, but the pace and explanation was exceptional! Would love to know what other training he provides.”
- Natalie Chronister, Project Administrator III, Carnegie Mellon University Software Engineering Institute