This 1-day course (KM520G) teaches Information Server and/or DataStage administrators to configure, manage, and monitor the DataStage Engine, which plays a crucial role in Information Server. The Engine not only runs the high-performance parallel ETL jobs designed and built in DataStage, but also supports other Information Server products, including Information Analyzer, QualityStage, and Data Click. After introducing DataStage parallel jobs and the Engine that runs them, the course describes DataStage project configuration, the Engine’s development and runtime environments, and the Engine’s data source connectivity. In addition, the course explains how to import and export DataStage objects, how to run and monitor DataStage jobs through the command line and GUI, and how to use some important Engine utilities.

Audience

  • This course is for those who will be administering Information Server and DataStage.

Prerequisites

  • It is recommended that students first take the KM510 course, “IBM InfoSphere Administrative Tasks for Information Server v11.5”.

Key topics

Module 1: Introduction to the Information Server (DataStage) engine

  • Information Server architecture
  • How the DataStage Engine is used in Information Server
  • DataStage Engine features

Module 2: Elements of a DataStage job

  • Anatomy of a DataStage parallel job
  • Explore the elements of an example DataStage job, including its stages and job parameters
  • Understand the OSH that is generated for the job at compile time
  • Log into Designer
  • Import DataStage jobs
  • Open a DataStage job and explore its elements
  • Examine the OSH generated from the job

Module 3: Engine architecture

  • Partition parallelism
  • Runtime architecture
  • Configuration files
  • The “Score”
  • Examine the Configuration file
  • Run a DataStage job
  • Examine the Score

Module 4: Engine project configuration

  • DataStage project authorizations
  • Runtime Column Propagation (RCP)
  • Environment variables
  • Create a DataStage user and assign DataStage project roles
  • Set environment variables
  • Configure a DataStage project in DataStage Administrator

Module 5: Configuring database connectivity

  • Configuring ODBC data sources
  • Configuring native database connections
  • Enable a DataStage project to access DB2
  • Set up ODBC data source connections
  • Test ODBC connectivity
  • Create a data source connection in Metadata Asset Manager

Module 6: Running DataStage jobs

  • Running jobs from the command line
  • Monitoring jobs in Director and the Operations Console
  • Workload management
  • Starting and stopping the Engine
  • Run a job from the command line
  • View the job log from the command line
  • Monitor DataStage jobs in DataStage Director
  • Monitor jobs in the Operations Console
  • Explore Workload Manager
  • Start and stop the Engine

Module 7: Engine utilities

  • Data set utilities
  • Multiple job compile
  • Resource estimation tool
  • View a data set using the Data Set Editor
  • View a data set from the command line
  • Estimate the resources of a DataStage job using the Resource Estimator tool
  • Run the Multiple Job Compile utility

Module 8: Importing and exporting DataStage objects

  • DataStage Designer exports and imports
  • Command line exports and imports
  • Import and export DataStage objects in Designer
  • Import and export DataStage objects using the DSXImportService command