Data ManagementAnalyticsTableauInsuranceETL

How to Build a Data Warehouse for the Insurance Industry

By August 4, 2017 No Comments
How to Build a Data Warehouse for Insurance

Insurance is a data-heavy industry with a huge upside to leveraging business intelligence. Today, I will discuss the approach we use at Aptitive to build out a data warehouse for insurance clients.

Understand the Value Chain and Create a Design

At its most basic, the insurance industry can be described by its cash inflows and outflows (e.g., the business will collect premiums based on effective policies and pay out claims resulting from accidents). From here, we can describe the measures that are relevant to these activities:

  • Policy Transactions: Quote, Written Premium, Fees, Commission
  • Billing Transactions: Invoice, Taxes
  • Claim Transactions: Payment, Reserve
  • Payment transactions: Received amount

From these four core facts, we can collaborate with subject matter experts to identify the primary “describers” of these measures. For example, a policy transaction will need to include information on the policyholder, coverage, covered items, dates, and connected parties. By working with the business users and analyzing the company’s front-end software like Guidewire or Dovetail, we can design a structure to optimize reporting performance and scalability.

Develop a Data Flow

I have written in detail about my preferred Extract-Transform-Load design in the past, but here is the quick overview:

  1. Isolate your source data in a “common landing area”: I have been working on an insurance client with 20+ data sources (many acquisitions). The first step of our process is to identify the source tables that we need to build out the warehouse and load the information in a staging database. (We create a schema per source and automate most of the development work.)
  2. Denormalize and combine data into a data hub: After staging the data in the CLA, our team creates “Get” Stored Procedures to combine the data into common tables. For example, at one client, I have 13 sources with policy information (policy number, holder, effective date, etc.) that I combined into a single [Business].[Policy] table in my database. I also created tables for tracking other dimensions and facts such as claims, billing, and payment.
  3. Create a star schema warehouse: Finally, the team loads the business layer into the data warehouse by assigning surrogate keys to the dimensions, creating references in the facts, and structuring the tables in a star schema. If designed correctly, any modern reporting tool, from Tableau to SSRS, will be able to connect to the data warehouse and generate high-performance reporting.

Produce Reports, Visualizations, and Analysis

By combining your sources into a centralized data warehouse for insurance, the business has created a single source of the truth. From here, users have a well of data to extract operational metrics, build predictive models, and generate executive dashboards. The potential for insurance analytics is endless: premium forecasting, geographic views, fraud detection, marketing, operational efficiency, call-center tracking, resource optimization, cost comparisons, profit maximization, and so much more!

This blog post was originally published on Medium.