This 3-day course is designed to give you a better understanding of Big Data topics, with a focus on Hadoop. It covers the ins and outs of Big Data to clarify whether it is a buzzword, a catchphrase, or something useful in your daily business. It also describes when a system is considered a Big Data system, what the architecture of a typical Big Data system looks like, what the ecosystem is, and who the key players in the Big Data space are. Is Big Data defined by data volume or by the underlying technology? Does it replace existing technologies, or does it enhance them by working alongside them? The course also covers how to implement Hadoop jobs to extract business value from large and varied data sets, and how to develop queries that simplify data analysis (with Pig, Hive, Cassandra and Impala).


This training is for anyone who wants an overview of the key components of a Big Data environment and the broader Hadoop ecosystem. During the training, we will cover topics that can help decision-makers meet their business goals and see how Big Data can be integrated within their organization. The course is also designed for professionals who want to build a career in Big Data analytics using Hadoop. The target audience includes software professionals, analytics professionals, ETL developers, project managers, testing professionals, enterprise architects and others who are looking for a foundation in Big Data / Hadoop architecture.


Prerequisites: None. No programming experience is required for this training. All exercises in this course are designed to give a high-level overview of the capabilities of a specific part of the Big Data platform/ecosystem, and are not intended as a deep dive into the technologies.

Course Agenda:

  • Introduction to Big Data (definition through 3V, 4V, 6V …)
  • Hadoop overview and Ecosystem
  • Delivering business benefit from Big Data
  • Integrating Big Data with traditional data
  • Storing & analyzing data in a Big Data environment
  • Overview of Big Data stores and data models: key-value, graph, document, column-family
  • Deep dive into storage components: HDFS, HBase, Hive, Cassandra and Impala
  • Use cases for the different storage components
  • Comparing selected Big Data storage components to traditional databases
  • Relational data analysis within a Big Data platform
  • Limitations and Future Directions for storage components