This unique 4-day IBM InfoSphere Advanced DataStage bootcamp introduces advanced job development techniques in DataStage. It is intended for experienced developers who want training in more advanced techniques and an understanding of the parallel framework architecture. The course combines InfoSphere Advanced DataStage – Parallel Framework (3 days) and InfoSphere DataStage – Advanced Data Processing (2 days). Materials and an environment for hands-on labs are provided.


Prerequisites

  • You should have completed our IBM InfoSphere DataStage Essentials training course and have at least one year of experience developing parallel jobs using DataStage.

Skills taught

  • Describe the parallel processing architecture and development and runtime environments
  • Describe the compile process and the runtime job execution process
  • Describe how partitioning and collecting work in the parallel framework
  • Describe sorting and buffering in the parallel framework, along with optimization techniques
  • Describe and work with parallel framework data types
  • Create reusable job components
  • Use loop processing in a Transformer stage
  • Process groups in a Transformer stage
  • Extend the functionality of DataStage by building custom stages and creating new Transformer functions
  • Use Connector stages to read from and write to relational tables, and handle errors in Connector stages
  • Process XML data in DataStage jobs using the XML stage
  • Design a job that processes a star schema database with Type 1 and Type 2 slowly changing dimensions
  • List job and stage best practices

Course Outline

  • Module 1: Introduction to the Parallel Framework Architecture
  • Module 2: Compilation and Execution
  • Module 3: Partitioning and Collecting Data
  • Module 4: Sorting Data
  • Module 5: Buffering in Parallel Jobs
  • Module 6: Parallel Framework Data Types
  • Module 7: Reusable Components
  • Module 8: Advanced Transformer Logic
  • Module 9: Extending the Functionality of Parallel Jobs
  • Module 10: Accessing Databases (started if time permits)
  • Module 11: Processing XML Data
  • Module 12: Slowly Changing Dimensions Stages
  • Module 13: Best Practices