Establishment of Chief Data Officer, Data Strategy and Data Governance Framework

Client Background and Situation

The client is a global universal bank offering services across retail, corporate, commercial and investment banking, and capital markets. The project focused on creating the Chief Data Officer function, agreeing a data strategy and target operating model, and rolling out a data governance framework across the bank.

The bank needed to establish the function in response to significant regulatory oversight following the 2008 global financial crisis. The programme also introduced Informatica products (Data Quality and Metadata Manager, now known as Enterprise Data Catalog, EDC).

Chief Data Officer

Scope of Services

• Establishing the department of the Chief Data Officer (CDO)

• Creating and agreeing on a data strategy

• Introducing a governance framework for managing the activities of the CDO

• Creating and agreeing on a new data governance framework

• Creating a data policy

• Creating workstreams to manage the delivery of the framework

– Business process mapping

– Data lineage

– Data glossary and dictionary

– Data ownership

– Data quality

– Root cause analysis

– Remediation

– Metrics and value measurement

– Communication and engagement

– Data cataloguing

– Data analytics

– Reference data management

– Solution architecture and tool selection

– PMO: agreeing a roadmap for delivery, including a mechanism for prioritisation

Challenge & Outcomes

• Understanding the culture and identifying a suitable data governance framework for the Bank

• Managing expectations – we might not be right the first time. Be prepared to fail, fail quickly, reset, and deliver. There is no such thing as failure – it’s just “Not Yet”

• Think big, start small, grow fast

• Identify influential leaders, align the programme with business outcomes

• Data transformation should not be led by technology but by the business; the business sets the tempo and the direction

• Don’t implement technology until it is very clear what value it is meant to bring; how the technology is used should be driven by business needs, not by what functionality the technology offers

• Technology is an enabler

• Focus on value creation not change for change’s sake


Data Management and Business Intelligence Consulting Services For Global Innovative Manufacturing Company

Boyd Corporation was established in 1928 in the Bay Area, California.  The company is now one of the world’s most innovative problem-solvers, helping its customers solve their toughest technical challenges.  Its expertise ranges from space exploration to brain surgery.  As you can imagine, this company produces a vast amount of data.  The Boyd team approached ExistBI regarding an Informatica data warehouse project.

In our initial communications, Boyd was unhappy with its external business intelligence consulting and Informatica support firm on a complex data warehouse project. We gathered the requirements quickly over multiple calls and began solving their complex problems within a few days of the initial enquiry.

The technology used included: Microsoft SQL Server database for Data Warehouse, Microsoft SQL Server Reporting Services for static reporting, Microsoft SQL Server Analysis Services for OLAP analysis of financial data, Microsoft.NET application for parsing and enrichment of unstructured datasets and Informatica Data Integration.


Our data consultants quickly understood the customer’s pain points and turned around high-value wins for the CTO and CEO. We then further developed the relationship and took on the lead role in their new data warehouse project.

Our engagement to date involves:

• BI & Data Warehouse strategy consulting

• Redesign their data warehouse model

• Provide analytical support

• Develop data masking templates for delivering sensitive content to the data warehouse

• Development of unstructured data set processing

• Provide operational analytics for procurement and finances 

• Support in development of quotes and billings

• Informatica Consulting

• Informatica Software Recommendations, Data Masking and MDM

• Informatica Maintenance & Support

• CRM/Salesforce Consulting and Integration

• Microsoft Power BI training

• Sales Demand Forecasting Tool Solution


Independent Bank, Data Governance Consulting and Implementation with Informatica software

Client Background

• 40+ years in business

• 125+ Locations

• Offices in 5 countries

• 3,200 + employees worldwide

• $49.4B in assets under management

Business Need

Data Governance Framework to establish policies, standards, architecture, decision-making structure, and issue resolution process

• Enhanced Data Integration and Quality

• Enrich data usability and analytics

• Better Data Security and Management


• Complex data environment across applications, cloud data lake, and databases including:

• 100+ external datasets including uncategorized third-party data from vendors

• 26 data stores – structured, unstructured, documents

• Legacy Banking Application

• 1.0B+ rows

Scope of Service, Solutions & Results

• Installed, configured, and tested Informatica EDC, Axon, and DPM on an on-premises Linux platform

• Scanned and profiled all data from Microsoft SQL Server, SQL Server Integration Services, SQL Server Reporting Services, and third-party banking software with custom scanners.

• Delivered end-to-end data lineage

• Identified, prioritized and standardized key metrics used in critical reports and business decisions.

• Established Data Quality process to identify root cause, prioritize and implement remediation efforts.

• Processed requests for GDPR, CCPA and state privacy laws

• The business gained a 360-degree view of data usage and was able to report on CCPA compliance


Case Study: Microsoft Data Warehouse Consulting For Large Bank

ExistBI’s data strategy team, together with our data warehouse consulting and implementation group, helped a long-term banking client recognize the need to tailor services to its clients quickly and accurately, with a more comprehensive offering.

Our client decided to become a more customer-centric organization by establishing single, proactive points of contact with its customers. To reach this goal, the bank needed to consolidate its customer data sources and extract more meaningful information from them. In addition, the bank’s marketing campaigns had to improve in quality and become more focused and relevant to targeted customers. The bank also needed to move toward the democratization of information and broaden decision-making responsibilities, which requires timely delivery of relevant information to each decision maker.

Therefore, it was decided that a business analytics solution was necessary, starting with a new Microsoft SQL Data Warehouse. This solution will provide the client with trend analysis and performance feedback, and facilitate decisive actions that result in measurable gains in cost efficiency or revenue growth.

Technology: Microsoft SQL Server, Informatica Cloud, Power BI, Tableau

To learn more about our Data Warehouse Consulting and Microsoft Azure & Business Intelligence Implementation Services go to our dedicated page or get in contact with us today.


Salesforce CRM Data Migration Success Story, Large US Insurance Company

Salesforce Data Migration Implementation Across 40 Countries

by ExistBI

ExistBI helped a large US insurance company migrate around 40 countries (markets) into its global Salesforce CRM implementation, with 15 datasets per market.  The migration for each market ran in three phases.  In SIT we focused on cleaning up the data and making sure it was technically accurate and in the correct format.  In UAT we focused on data quality and correct relationships.  We then loaded the data to production.

Additionally, we maintained ongoing interfaces and integrations for customer, product, and sales data with other systems. For migration and integration we used Informatica, with Oracle as the staging database, and Talend for other data-related tasks.

Depending on the Data Migration Services your organization requires, we can offer resources in your office or via onshore or nearshore remote access. For a FREE Assessment or Quote, Please Complete the Contact Us Form or Call: US/Canada +1 866 965 6332 | UK/Europe +44 (0)207 554 8568.


Hadoop vs Data Warehouse


by ExistBI

The majority of Hadoop experts believe an integrated data warehouse (IDW) is simply a huge pile of data. However, data volume has nothing to do with what makes a data warehouse: an IDW is a design pattern, an architecture for an analytics environment. In this article, we compare Hadoop and the data warehouse to better understand the differences between the two. First defined by Barry Devlin in 1988, the architecture was quickly called into question as implementers built huge databases with simple designs as well as small databases with complex designs.

In 1992, Bill Inmon published “Building the Data Warehouse,” which described two competing implementations: data warehouses and data marts. Gartner echoed Inmon’s position in 2005 in its research “Of Data Warehouses, Operational Data Stores, Data Marts and Data Outhouses.” Both are summarized, in oversimplified form, in the following table.

Integrated Data Warehouses | Data Marts
Subject oriented           | Subject oriented
Integrated                 | Denormalized
Nonvolatile                | Nonvolatile
Time variant               | Time variance and currency
Persistent                 | Virtualization option

“Subject-oriented” means the IDW is a digital reflection of the business. Subject areas contain tabular data about customers, inventory, financials, sales, suppliers, accounts, etc. The IDW contains many subject areas, each of which holds 250 to 5,000 relational tables. Having many subject areas enables cross-organizational analysis – often called the 360-degree view. The IDW can answer thousands of routine, ad hoc, and complex questions.

In contrast, a data mart deploys a small fraction of one or two subject areas (i.e., a few tables). With only a few tables, data marts answer far fewer questions and are poor at handling ad hoc requests from executives.

Integration in a data warehouse has many aspects. First is the standardization of data types. This means account balances contain only valid numbers, date fields have only valid dates, and so on. Integration also means rationalizing data from multiple operational applications. For example, say four corporate applications have Bill Franks, William Franks, W. J. Franks, and Frank Williams all at the same street address. Data-integration tools figure out which is the best data to put in the IDW. Data cleansing corrects messed-up data. For example, repairs are needed when “123 Oak St., Atlanta” is in the street address but the city field is blank. Data integration performs dozens of tasks to improve the quality and validity of the data. Coupled with subject areas, this is called “a single version of the truth.”
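The record-rationalization step described above can be illustrated with a toy Python sketch. The names, address, and the crude grouping heuristic are all invented for illustration; this is not how any particular data-integration tool works internally:

```python
import re
from collections import defaultdict

# Toy records from four operational applications: one person, four spellings.
records = [
    {"name": "Bill Franks",    "addr": "123 Oak St., Atlanta"},
    {"name": "William Franks", "addr": "123 Oak St., Atlanta"},
    {"name": "W. J. Franks",   "addr": "123 Oak St., Atlanta"},
    {"name": "Frank Williams", "addr": "123 Oak St., Atlanta"},
]

def normalize_addr(addr: str) -> str:
    """Crude match key: lowercase the address and strip punctuation/spaces."""
    return re.sub(r"[^a-z0-9]", "", addr.lower())

# Group candidate duplicates by address key; a real tool would then score
# name similarity and apply survivorship rules to pick the best record.
groups = defaultdict(list)
for rec in records:
    groups[normalize_addr(rec["addr"])].append(rec["name"])

for names in groups.values():
    print(len(names), "candidate duplicates at one address:", names)
```

A production tool follows the grouping step with fuzzy name scoring and survivorship rules to decide which single record enters the IDW.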

Does Hadoop Have What it Takes?

Hadoop was engineered to rely on the schema-on-read approach, in which data is parsed, reformatted, and cleansed at runtime in a manually written program. But Hadoop (and Hive) have limited to no ability to ensure valid dates and numeric account balances. In contrast, relational database management systems (RDBMS) ensure that input records conform to the database design – called the schema. According to Dr. Michael Stonebraker, “This is the best way to keep an application from adding ‘garbage’ to a data set.”
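The contrast can be sketched in a few lines of Python. The records and fields are invented for illustration; the point is that schema-on-read defers validation to every reader, whereas an RDBMS schema rejects bad rows once, at load time:

```python
from datetime import datetime

# "Schema-on-read": nothing validated these rows when they were written.
raw_rows = [
    "1001,2014-03-07,250.00",
    "1002,not-a-date,99.50",    # bad date slipped in at write time
    "1003,2014-03-09,garbage",  # bad balance slipped in at write time
]

def parse_row(line: str):
    """Parse and validate one record at read time; return None for garbage.

    An RDBMS with DATE and NUMERIC columns would have rejected the bad
    rows at load time, so every downstream reader could trust the data.
    """
    acct, date_s, bal_s = line.split(",")
    try:
        return {
            "account": int(acct),
            "date": datetime.strptime(date_s, "%Y-%m-%d").date(),
            "balance": float(bal_s),
        }
    except ValueError:
        return None

clean = [r for r in (parse_row(line) for line in raw_rows) if r is not None]
print(len(clean), "of", len(raw_rows), "rows survived read-time validation")
```

Every program that reads the raw files must repeat this validation logic; the schema-on-write database pays the cost once.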

The current rage in the Hadoop community is SQL-on-Hadoop. Those who have committed to open-source Apache are playing catch-up to databases by adding SQL language features. SQL-on-Hadoop offerings support a subset of the ANSI 1992 SQL language, meaning they lack features found in the SQL 1999, 2003, 2006, 2008, and 2011 standards. Therefore, the business user’s ability to perform self-service reporting and analytics is throttled. This, in turn, throws a substantial labor cost back into IT to develop reports in Java.

Additionally, the lack of a database foundation also prevents SQL-on-Hadoop from achieving fast performance. Missing from Hadoop are robust indexing strategies, in-database operators, advanced memory management, concurrency, and dynamic workload management.

A consistent – sometimes angry – complaint from Hadoop experts is the poor performance in large table joins, which the SQL-on-Hadoop tools do not fix. Remember those subject areas above? Some subject areas have two to 10 tables in the 50-1,000 terabyte range. Even for a mature analytic database, optimizing queries that combine 50TB with 500TB, sort the result, and do it fast is a challenging problem. Fortunately, RDBMS vendors have been innovating RDBMS engines and cost-based optimizers since the 1980s. A few Apache Hadoop committers are currently reinventing this wheel, intending to release a fledgling optimizer later in 2014. Again, self-service business user query and reporting suffers.

Hadoop, therefore, does not have what it takes to be a data warehouse. It is, however, nipping at the heels of data marts.

How Many Warehouses Has Hadoop Replaced?

As far as we know, Hadoop has never replaced a data warehouse, although we have witnessed a few failed attempts. Instead, Hadoop has been able to peel off a few workloads from an IDW. Migrating low-value data and workloads to Hadoop is not widespread, but neither is it rare.

One workload often offloaded is extract-transform-load (ETL). Technically, Hadoop is not an ETL solution. It’s a middleware infrastructure for parallelism. Hadoop requires hand coding of ETL transformations, which is expensive, especially when maintenance costs pile up in the years to come. Simple RDBMS tasks like referential integrity checks and match key lookup don’t exist in Hadoop or Hive. Hadoop does not provide typical ETL subsystem features out-of-the-box, such as:

*  Hundreds of built-in data-type conversions, transformers, look-up matching, and aggregations

*  Robust metadata, data lineage, and data modeling capabilities

*  Data quality and profiling subsystems

*  Workflow management, i.e., a GUI for generating ETL scripts and handling errors

*  Fine-grained, role-based security
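To make the hand-coding point concrete, here is a minimal Python sketch of a match-key lookup, one of the subsystem features listed above. The dimension and fact data are invented; an ETL tool ships this as a configurable component, while a hand-coded job must reimplement and maintain it for every feed:

```python
# Minimal hand-coded match-key lookup: resolve natural keys in incoming
# fact rows to surrogate keys from a dimension table.
dim_customer = {"CUST-17": 1, "CUST-42": 2}   # natural key -> surrogate key

fact_rows = [
    {"cust": "CUST-17", "amount": 120.0},
    {"cust": "CUST-99", "amount": 35.5},      # no match in the dimension
]

matched, rejects = [], []
for row in fact_rows:
    surrogate = dim_customer.get(row["cust"])
    if surrogate is None:
        # An RDBMS foreign-key constraint would catch this automatically;
        # here the referential-integrity check is our own code to maintain.
        rejects.append(row)
    else:
        matched.append({**row, "cust_sk": surrogate})

print(len(matched), "matched,", len(rejects), "rejected")
```

Multiply this by dozens of feeds and years of maintenance and the payroll cost of hand-coded ETL becomes clear.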

Because migrations often come with million-dollar price tags, there is not a stampede of ETL migrations to Hadoop. Many organizations keep the low-value ETL workload in the IDW because:

*  The IDW works (if it ain’t broke, don’t fix it)

*  Years of business logic must be recoded, debugged, and vetted in Hadoop (risk)

*  There are higher business value Hadoop projects to be implemented (ROI)

Nevertheless, some ETL workload migrations are justifiable. When they occur, the IDW resources freed up are quickly consumed by business users.

Similarly, Hadoop provides a parallel platform for analytics, but it does not provide the analytics. Hadoop downloads do not include report development tools, dashboards, OLAP cubes, hundreds of statistical functions, time series analysis, predictive analytics, optimization, and other analytics. These must be hand coded or acquired elsewhere and integrated into projects.

Hadoop Was Never Free

Where does this leave the cash-strapped CIO who is still under pressure? According to Phil Russom of The Data Warehousing Institute: “Hadoop is not free, as many people have mistakenly said about it. A number of Hadoop users speaking at recent TDWI conferences have explained that Hadoop incurs substantial payroll costs due to its intensive hand coding normally done by high-payroll personnel.”

This reflects the general agreement in the industry, which is that Hadoop is far from free. The $1,000/terabyte hardware costs are hype to begin with, and traditional vendors are closing in on Hadoop’s hardware price advantage anyway. Additionally, some SQL-on-Hadoop offerings are separately priced as open source vendors seek revenue. If you want Hadoop to be fast and functional, well, that part is moving away from free and toward becoming a proprietary, priced database.

Hadoop Jumps in the Lake

Mark Madsen, President of Third Nature, gives some direction on Hadoop benefits: “Some of the workloads, particularly when large data volumes are involved, require new storage layers in the data architecture and new processing engines. These are the problems Hadoop and alternate processing engines are equipped to solve.”

Hadoop defines a new market, called the data lake. Data lake workloads include the following:

  1. Many data centers have 50 million to 150 million files. Organizing this into a cohesive infrastructure, knowing where everything is, its age, its value, and its upstream/downstream uses is a formidable task. The data lake concept is uniquely situated to solve this.
  2. Hadoop can run parallel queries over flat files. This allows it to do basic operational reporting on data in its original form.
  3. Hadoop excels as an archival subsystem. Using low-cost disk storage, Hadoop can compress and hold onto data in its raw form for decades. This avoids the problem of crumbling magnetic tapes and current software versions that can’t read the tape they produced eight years earlier. A close cousin to archival is backup-to-disk. Again, magnetic tape is the competitor.
  4. Hadoop is ideal for temporary data that will be used for a month or two then discarded. There are many urgent projects that need data for a short time then never again. Using Hadoop avoids the lengthy process of getting data through committees into the data warehouse.
  5. Hadoop, most notably YARN from Hortonworks, is providing the first cluster operating system. This is amazing stuff. YARN improves Hadoop cluster management but does not change Hadoop’s position vis-à-vis the data warehouse.

Apples and Oranges

Bob Page, the VP of Development at Hortonworks, weighed in on the Hadoop versus IDW debate: “We don’t see anybody today trying to build an IDW with Hadoop. This is a capability issue, not a cost issue. Hadoop is not an IDW. Hadoop is not a database. Comparing these two for an IDW workload is comparing apples to oranges. I don’t know anybody who would try to build an IDW in Hadoop. There are many elements of the IDW on the technical side that are well refined and have been for 25 years. Things like workload management, the way concurrency works, and the way security works – there are many different aspects of a modern IDW that you are not going to see in Hadoop today. I would not see these two as equivalent.”

Hadoop’s success won’t come as a low-priced imitation of a data warehouse. Instead, we continue to be bullish on Hadoop as we witness the birth of the data lake, with predictable birthing pains. Over the next couple of years, the hype will quiet down and we can get to work exploiting the best Hadoop has to offer.


Integrating Clinical Data With Informatica B2B Data Exchange

ExistBI delivered a project integrating clinical data with Informatica B2B Data Exchange. The project involved helping with business analytics, sharing clinical data, getting the business users involved, and simplifying the process of creating new studies.

We helped deliver the Informatica B2B DX solution by tracking clinical data loading and partner interaction, restricting access in line with management policy, and defining the clinical studies. ExistBI is a leading systems integrator and Informatica consulting, training, and support organization with offices in the US, UK and Europe.

To learn more about our Informatica consulting services, give us a call or send inquiries to


Case Study: Data Warehouse Consulting for Boeing

Our client is one of the world’s pre-eminent manufacturers of airplanes and other aerospace engineering products.

Boeing sought to integrate an SAP environment with a number of other systems into an Enterprise Data Warehouse. They had hired a large Systems Integrator to design and develop the EDW. The tools chosen were an Oracle database and Informatica PowerCenter and Informatica Data Quality. We were brought in to audit the EDW design, architecture and implementation and to provide any recommendations.


ExistBI’s Data Warehouse consulting division’s best-practice methodology was used to analyze the EDW and provide a comprehensive review of the design, architecture, and development efforts and techniques.

ExistBI analyzed the client requirements, the EDW architecture, the data model, the Oracle and Informatica PowerCenter environments, the development and testing techniques, the deployment methodologies, and the overall process. Most of the design was determined to be fit for purpose; however, a number of significant recommendations were made that allowed for faster development and more efficient use of hardware, software, and human resources. ExistBI was also able to offer a number of automation techniques, as well as analysis, and alerted the SI to a potential process flaw.

Team Structure

A certified Senior Consultant was on site to analyze the requirements and implementation as well as to interview the key stakeholders in the project.


ExistBI was able to provide a quick and comprehensive audit of the EDW implementation, providing the end client with the necessary assurance and giving valuable advice to the implementing SI, keeping them on track, reducing overall development time, and keeping costs to a minimum.



Business Intelligence Market Study – 2013

Benefits of the Study

This Business Intelligence Market Study provides a wealth of information and analysis – offering value to both consumers and producers of Business Intelligence technology and services.

Download: Business Intelligence Market Study

Also read: Business Intelligence Consulting Services For Global Innovative Manufacturing Company


Enterprise Information Management Strengthens Your Information Value

Why Read This Report?

Businesses increasingly rely on information to make smarter, faster decisions for competitive advantage. Although business leaders want access to all kinds of information, structured data and unstructured content are often stored separately and have disconnected architectures. Enterprise information management (EIM) encompasses the processes, policies, technologies, and architectures that capture, consume, and govern the usage of an organization’s structured data and unstructured content.

EIM enables businesses to derive more value from their data and content, harmonizing what has traditionally been a dichotomy. Forrester proposes a logical representation for EIM that unifies data management and content management using a common set of foundational technologies.

Download Report: Enterprise Information Management Strengthens Your Information Value




Contact Us

Get in Touch with Your Closest Office

For a free assessment or quick quote, drop us a line.