A CTO’s Guide to a Modern Data Platform: How to Build a Data Warehouse in 6-8 Weeks

October 12, 2018

Snowflake Allows Businesses To See Value In An Extraordinarily Short Timeframe

It’s true that in a very short period of time, you can get an operational data warehouse loaded with all your source data. Snowflake and its technology partner ecosystem, including Attunity, Alooma, and Fivetran, let you replicate data from your databases and SaaS applications into Snowflake and see results in a significantly shorter timeframe than traditional approaches allow. Write some SQL views in Snowflake against this data, point your favorite BI tool at it, and get lightning-fast results.
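As a minimal sketch of that view-based pattern, the example below joins two replicated source tables into a single analytics view. The table and schema names (`raw.crm.customers`, `raw.erp.orders`) are hypothetical placeholders, not from the original text:

```sql
-- Hypothetical example: a first-pass view over replicated source tables.
-- Schema/table names are illustrative, not prescriptive.
CREATE OR REPLACE VIEW analytics.v_customer_orders AS
SELECT
    c.customer_id,
    c.customer_name,
    o.order_id,
    o.order_date,
    o.order_total
FROM raw.crm.customers AS c
JOIN raw.erp.orders    AS o
    ON o.customer_id = c.customer_id;
```

Because the logic lives in a view rather than a load job, it can be revised in minutes as business feedback comes in.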

With the right plan in place, you can (aggressively) deliver that first business ‘wow’ in 6-8 weeks. Aptitive typically recommends at least two analytical virtual warehouses in enterprise deployments: one for data discovery, and one for more structured and governed reporting.
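The two-warehouse split can be expressed directly in Snowflake DDL. The sizes and timeouts below are illustrative starting points under stated assumptions, not sizing recommendations:

```sql
-- One warehouse for ad hoc data discovery, one for governed reporting.
-- Separate warehouses isolate workloads and make costs attributable.
CREATE WAREHOUSE IF NOT EXISTS discovery_wh
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND   = 60        -- suspend after 60 seconds idle
    AUTO_RESUME    = TRUE;

CREATE WAREHOUSE IF NOT EXISTS reporting_wh
    WAREHOUSE_SIZE = 'LARGE'
    AUTO_SUSPEND   = 300       -- reporting queries arrive in bursts
    AUTO_RESUME    = TRUE;
```

Aggressive auto-suspend on the discovery warehouse keeps exploratory work from quietly accumulating credits.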

The practice of having both a data discovery layer and a governed layer in your Snowflake deployment not only saves you time in the initial build, but also creates a continuous integration and deployment pattern. It finally makes a dent in the notion that a data warehouse cannot be ‘agile’.

Future State Snowflake Data Architecture

With this approach, you not only achieve governance and speed-to-insight, but you also cut down your Snowflake consumption costs. Running complex queries at run-time, every time, can get expensive. Snowflake’s caching can help here, but if you’re constantly running a complex join across more than 20 tables, a physical layer might let you move from an XL-sized Snowflake warehouse to an L or an M. In the long run, those cost savings add up. When ‘best practice’ or risk avoidance isn’t enough to justify this approach, the dollar savings may speak for themselves.
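One way to build that physical layer is to persist the expensive join once and refresh it on a schedule, rather than recomputing it on every dashboard query. The sketch below assumes a hypothetical wide view `analytics.v_sales_wide` holding the multi-table join:

```sql
-- Persist the expensive join into a physical table (names are illustrative).
CREATE OR REPLACE TABLE analytics.fact_sales AS
SELECT * FROM analytics.v_sales_wide;

-- Refresh hourly with a scheduled task instead of at query time.
CREATE OR REPLACE TASK refresh_fact_sales
    WAREHOUSE = reporting_wh
    SCHEDULE  = 'USING CRON 0 * * * * UTC'   -- top of every hour
AS
    CREATE OR REPLACE TABLE analytics.fact_sales AS
    SELECT * FROM analytics.v_sales_wide;
```

Note that a newly created task is suspended until you run `ALTER TASK refresh_fact_sales RESUME;`. BI queries then hit the small, pre-built table, which is what allows the downsized warehouse.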

The initial 6-8 week path assumes a focus on the data discovery layer, as depicted below. Loading all your data into a data discovery layer should be the primary development activity in your Snowflake proof of concept (POC) or pilot. Here are some tips:

  • Find several source systems that have the data your stakeholders need
  • Begin the process of rapidly loading into your Snowflake data discovery layer
  • Write iterative SQL in Snowflake views to build your business logic
  • Connect your BI tool to Snowflake and build a very simple dashboard
  • Get feedback from business, with the new dashboard as a facilitation tool
  • Repeat this process – more feedback is better
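The loading and iteration steps above can be sketched in a few statements. The stage, file path, and column names here are hypothetical placeholders:

```sql
-- Minimal sketch of a raw load into the discovery layer.
CREATE STAGE IF NOT EXISTS raw.public.landing;

-- Bulk-load staged files into a raw table.
COPY INTO raw.public.orders
FROM @raw.public.landing/orders/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);

-- Keep first-pass business logic in a view so it can be
-- revised quickly as stakeholder feedback arrives.
CREATE OR REPLACE VIEW discovery.v_daily_revenue AS
SELECT
    order_date,
    SUM(order_total) AS revenue
FROM raw.public.orders
GROUP BY order_date;
```

Pointing a BI tool at `discovery.v_daily_revenue` gives the simple dashboard that drives the feedback loop.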

You’ve not only increased the speed to market, but you’ve also enabled developers and business users to execute an analytics project in a completely new way.

Generic Snowflake Data Architecture. Highlighted: speed-to-insight approach, with transformations in Snowflake

This all comes with a caveat: yes, you can write all your logic in Snowflake and do everything you need from an analytics standpoint, but it will limit you in the long run. Every data project has sacrifices to be made, whether in time, features, cost, or flexibility. You need to balance these against a long-term vision for how your data will be used across the organization. Snowflake will get you part of the way there, but a good plan and strategy will take you even further.

In part three, we’ll go over how to leverage Snowflake’s strengths to save you time, money, and re-work. In the meantime, if you have any questions or want to learn more about how Snowflake can fit into your organization, contact us today.


This blog originally appeared as a section of our ebook, “Snowflake Deployment Best Practices: A CTO’s Guide to a Modern Data Platform.” Click here to download the full ebook.

Related Content:

What is Snowflake, How is it Different, and Where Does it Fit in Your Ecosystem?

Data Strategy and Governance

Methods to Implement a Snowflake Project

Fred Bliss is the CTO at Aptitive. He brings over 15 years of experience solving complex business problems through data solutions including cloud integration, data warehouse modeling, ETL, and front-end reporting implementations.