Course Description
ExistBI's certified IBM InfoSphere trainers deliver classroom-based, custom and virtual training services to meet the needs of your organization. We are trusted IBM partners and regularly deliver corporate training, consulting and support services in the US, Canada, UK and Europe. We provide the certified instructor, licensed environment, materials and hands-on labs.
This 3-day course (Code: KM402G) introduces advanced parallel job development techniques in DataStage V9.1. You will develop a deeper understanding of the DataStage architecture, including its development and runtime environments. This will enable you to design parallel jobs that are robust, less error-prone, reusable, and optimized for better performance.
Course Outcomes
At the end of the course, learners will be able to:
- Describe the parallel processing architecture
- Describe pipeline and partition parallelism
- Describe the role and elements of the DataStage configuration file
- Describe the compile process and how it is represented in the OSH
- Describe the runtime job execution process and how it is depicted in the Score
- Describe how data partitioning and collecting work in the parallel framework
- List and select partitioning and collecting algorithms
- Describe sorting in the parallel framework
- Describe optimization techniques for sorting
- Describe sort key and partitioner key logic in the parallel framework
- Describe buffering in the parallel framework
- Describe optimization techniques for buffering
- Describe and work with parallel framework data types and elements, including virtual data sets and schemas
- Describe the function and use of Runtime Column Propagation (RCP) in DataStage parallel jobs
- Create reusable job components using shared containers
- Describe the function and use of Balanced Optimization
- Optimize DataStage parallel jobs using Balanced Optimization
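To give a feel for the partitioning and collecting outcomes listed above, here is a minimal conceptual sketch in Python. It is not DataStage code and does not use any DataStage API; the function names, the sample rows, and the `cust_id` column are all assumptions made up for illustration. It shows the general idea of hash partitioning (rows with the same key land in the same partition), round-robin partitioning (rows are spread evenly), and a sort-merge style collection of sorted partitions.

```python
import heapq
from zlib import crc32

# Hypothetical sample rows; the column names are illustrative only.
ROWS = [
    {"cust_id": 3, "amount": 10},
    {"cust_id": 1, "amount": 25},
    {"cust_id": 3, "amount": 7},
    {"cust_id": 2, "amount": 40},
    {"cust_id": 1, "amount": 5},
    {"cust_id": 4, "amount": 12},
]


def hash_partition(rows, key, num_partitions):
    """Place each row in a partition chosen by a stable hash of its key value."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        idx = crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[idx].append(row)
    return partitions


def round_robin_partition(rows, num_partitions):
    """Deal rows out to partitions in turn, ignoring their content."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


def sort_merge_collect(partitions, key):
    """Sort each partition on its own, then merge them into one sorted stream."""
    sorted_parts = [sorted(p, key=lambda r: r[key]) for p in partitions]
    return list(heapq.merge(*sorted_parts, key=lambda r: r[key]))


if __name__ == "__main__":
    hashed = hash_partition(ROWS, "cust_id", 3)
    for n, part in enumerate(hashed):
        print("partition", n, part)
    print(sort_merge_collect(hashed, "cust_id"))
```

The sketch hints at why key-based partitioning matters for sorting and grouping: because all rows for a given key sit in one partition, each partition can be sorted or aggregated independently before the results are collected.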
Course Modules
- Module 1: Introduction to the Parallel Framework Architecture
- Module 2: Compilation and Execution
- Module 3: Partitioning and Collecting Data
- Module 4: Sorting Data
- Module 5: Buffering in Parallel Jobs
- Module 6: Parallel Framework Data Types
- Module 7: Reusable Components
- Module 8: Balanced Optimization
Testimonials
Thoroughly enjoyed the training. The trainer was fantastic! It is rare but always an awesome experience when a trainer is also an experienced practitioner with a breadth of knowledge and hands-on experience… even well beyond the subject matter at hand. I had the feeling that the trainer could have answered in detail any question we might have had related to not only BDM but Hadoop and other relevant big data topics as well. Time well spent and I hope to encounter Tomi again.
- Rick Kirk, CTO, Alliant Energy / Ernst & Young
The trainer was INCREDIBLE. He was extremely passionate, made sure to consistently ask if anybody needed help, logged on early to answer any questions, and was an overall great human being.
- Salvatore, Hilton Grand Vacations
The trainer's clear and obvious enthusiasm for number crunching, analytics, and teaching others is infectious. He doesn’t waste time, shows exactly what you need to know and is genuinely hilarious.
Every one of my employees had tons of positive stuff to say.
- Benjamin G, MXSG Analysis and Integration Chief, US Air Force