A data warehouse comprises several components and layers that need to be created to form a complete data warehouse system. The section Data Warehouse Components briefly describes the different subsystems of a data warehouse system.
In addition, examples are given to explain how raw data from business systems is fed into the data warehouse, processed graphically, and displayed on a dashboard or business intelligence portal.
Operational Systems, also known as Online Transaction Processing (OLTP), form the basis for data analysis in the data warehouse.
Operational data is generated and processed by management, planning, and accounting systems, also known as bottom-up systems.
Much of this business data is generated in so-called online transaction processing systems (OLTP), where multiple users share the same systems and databases in peer-to-peer functions, such as IT systems, reservations, and orders.
The data is periodically retrieved from OLTP systems and temporarily stored in relational databases or flat files.
The data from heterogeneous sources is sent to the workplace, where it is prepared and processed.
Data Staging Area
The workspace, also known as the data storage area, is where data is collected, stored, archived, and used for transformation.
The entire ETL process takes place in the data warehouse and involves the extraction, transformation, and loading of data into the data warehouse. Please note that data will only be uploaded after the above processes have been completed.
This has the advantage that neither the operating systems nor the data warehouse is affected. In addition, incorrect data will not be transferred to the data warehouse. After completing the process, the data is deleted from the workspace.
The ETL process is the basis for data warehouse systems. Although not directly visible to users, the performance of the ETL process should not be underestimated.
The ETL process is a vital part of the data warehouse. In large data warehouse projects, the ETL process is often used for monthly backups, weekly storage of subsystems, and daily loading of essential components.
ETL processes are responsible for that data from the source systems to target systems. Data is extracted, cleaned, logically prepared, and loaded into the target tables.
In many data warehouse projects, ETL processes account for up to 60-80% of the total project cost. As this process is located in the back end of the data warehouse, it is sometimes unclear to the end user how much work needs to be done.
The data layer is the layer of access that enables users to access data. It is usually where the data store is located. In this layer, the data segments are separated according to who you want to give access to, so it can be a very granular organization. For example, you can’t give the sales team access to the HR department’s data, and vice versa.
Operational Data Store (ODS)
The operational aspect of the data warehouse depends on the limitations of the data warehouse architecture.
The integrated database is only available for analytical queries. For operational data, heterogeneous data from different operating systems must still be provided.
The concept of an operational data warehouse was born from the need for integrated operational data.
Data Warehouse Architecture
Data warehouse architectures are typically designed as centralized, distributed data warehouses or with or without a central data warehouse, depending on the application and the specific business requirements.
Each data warehouse architecture has its advantages and disadvantages in terms of implementation, operation, and maintenance.
At the beginning of the data warehouse development process, it is essential to know which architecture concept will be used for further development so that developers and users understand the same.
Implementing a business data model in a central data warehouse often fails due to such large projects’ short- to medium-term planning.
This has led to the emergence of so-called data warehouses, as data warehouse systems tailored to the needs of a specific person or function can be created quickly.
Individual datasets need to be harmonized with each other to ensure the consistency of data models and can be extended to a central data warehouse if necessary.
In this case, brands have predefined views for each business case, for example, data warehouses for the audit department where all relevant critical corporate data is prepared.
Data Warehouse Dashboards
Data warehouse dashboards are often used at the management level to present essential business information concisely and meaningfully.
This includes using red, yellow, and green in different strength or weakness situations.
As far as data warehouse systems are concerned, dashboards are seen as a complement to OLAP tools.
Business Intelligence Portal
The main task of a business intelligence portal is to integrate different thematic contents in a graphical interface.
In the BI portal, content and functions are compiled thematically from the data warehouse.
The prepared content is made available to users through centralized access to individual domains and related information and services.
The high degree of individualization from business intelligence portals can be counteracted by suitable filtering, preparation, and structuring information overloading.
Data warehouses have long been used in industry. They usually support BI with batch processing or process historical data for analysis.
There are also many other developments in data warehouses for real-time data analysis or collecting and processing heterogeneous data structures.
The data warehouse concept and related technologies are the latest trend in the industry and represent an evolution of the enterprise data warehouse concept.