In this classic four-day IBM InfoSphere DataStage training, you will learn about the features of DataStage and how to build and run DataStage Extract, Transform, and Load (ETL) jobs. The course also covers DataStage in its IBM Information Server environment. You will learn how to build DataStage parallel jobs that read and write data to and from a variety of data stores, including sequential files, data sets, and relational tables. Additionally, you will learn how to build parallel jobs that process data in a variety of ways: business transformations, data filtering, data combining, data generation, sorting, and aggregating.


Prerequisites

  • You should have basic knowledge of the Windows operating system and some familiarity with database access techniques.

Skills taught

  • Describe the uses of DataStage and the DataStage workflow
  • Describe the Information Server architecture and how DataStage fits within it
  • Describe the Information Server and DataStage deployment options
  • Use the Information Server Web Console and the DataStage Administrator client to create DataStage users and to configure the DataStage environment
  • Import and export DataStage objects to a file
  • Import table definitions for sequential files and relational tables
  • Design, compile, run, and monitor DataStage parallel jobs
  • Design jobs that read and write to sequential files
  • Describe the DataStage parallel processing architecture
  • Design jobs that combine data using joins and lookups
  • Design jobs that sort and aggregate data
  • Implement complex business logic using the DataStage Transformer stage
  • Debug DataStage jobs using the DataStage PX Debugger
  • Read and write to database tables using DataStage ODBC and DB2 Connector stages
  • Work with the Repository functions such as search and impact analysis
  • Build job sequences that control batches of jobs
  • Understand how FastTrack and Metadata Workbench can be profitably used with DataStage

Course Outline

  • Module 1: Introduction to DataStage
  • Module 2: Deployment
  • Module 3: DataStage Administration
  • Module 4: Working with Metadata
  • Module 5: Creating Parallel Jobs
  • Module 6: Accessing Sequential Data
  • Module 7: Partitioning and Collecting
  • Module 8: Combining Data
  • Module 9: Group Processing Stages
  • Module 10: Transformer Stage
  • Module 11: Repository Functions
  • Module 12: Working with Relational Data
  • Module 13: Job Control
  • Module 14: Intersecting with Other Information Server Products
IBM InfoSphere Training: DataStage Essentials