This IBM DataStage training course gives project administrators and ETL developers the skills they need to develop parallel jobs in DataStage v11.3/11.5. The emphasis is on developers: only the administrative functions relevant to developers are discussed in full. Students learn to create parallel jobs that access sequential and relational data, then combine and transform that data using functions and other job components.

Audience

  • Project Administrators and ETL Developers responsible for data extraction and transformation

Prerequisites

  • Basic knowledge of the Windows operating system
  • Familiarity with database access techniques

Key topics

  • Introduction
  • Deployment
  • Administration
  • Work with Metadata
  • Create Parallel Jobs
  • Access Sequential Data
  • Partitioning and Collecting Algorithms
  • Combine Data
  • Group Processing Stages
  • Transformer Stage
  • Repository Functions
  • Work with Relational Data
  • Control Jobs

Objectives

  • Describe the uses of DataStage and its workflow
  • Describe the Information Server architecture and how DataStage fits within it
  • Describe the Information Server deployment options
  • Use the Information Server Web Console and the Administrator client to create users and to configure the environment
  • Import and export objects to a file
  • Import table definitions for sequential files and relational tables
  • Design, compile, run, and monitor parallel jobs
  • Design jobs that read from and write to sequential files
  • Describe the parallel processing architecture
  • Design jobs that combine data using joins and lookups
  • Design jobs that sort and aggregate data
  • Implement complex business logic using the Transformer stage
  • Debug jobs using the PX Debugger