A CTO’s Guide to a Modern Data Platform: What is Snowflake, How is it Different, and Where Does it Fit in Your Ecosystem?

October 12, 2018

Chances are, you’ve been here before: a groundbreaking new data and analytics technology has started making waves in the market, and you’re trying to separate the marketing hype from reality. Snowflake promises to be a self-managing data warehouse that can deliver insight in weeks rather than years. Does Snowflake live up to the hype? Do you still need to approach implementation with a well-defined strategy? The answer to both of these questions is ‘yes’.

This blog originally appeared as a section of our ebook, “Snowflake Deployment Best Practices: A CTO’s Guide to a Modern Data Platform.”

What Is Snowflake and How Is It Different?

Massive Scale, Low Overhead

Snowflake is one of the few enterprise-ready cloud data warehouses that bring simplicity without sacrificing features. It automatically scales, both up and down, to strike the right balance between performance and cost. Snowflake’s claim to fame is that it separates compute from storage. This is significant because almost every other database, Redshift included, couples the two, meaning you must size the whole system for your largest workload and incur the cost that comes with it.

With Snowflake, you can store all your data in a single place, and size your compute independently. For example, if you need near-real-time data loads for complex transformations, but have relatively few complex queries in your reporting, you can script a massive Snowflake warehouse for the data load, and scale it back down after it’s completed – all in real time. This saves on cost without sacrificing your solution goals.
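As a rough illustration of scripting this, the sketch below wraps a load in a scale-up and scale-down pair. It is a minimal sketch, not a definitive implementation: the function name `temporarily_resized`, the warehouse name `LOAD_WH`, and the chosen sizes are hypothetical, and `execute` stands in for whatever you use to send SQL to Snowflake (for example, a database cursor's `execute` method). The `ALTER WAREHOUSE ... SET WAREHOUSE_SIZE` statement is Snowflake's own resize command.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporarily_resized(
    execute: Callable[[str], object],
    warehouse: str,
    load_size: str = "XLARGE",   # hypothetical size for the heavy load
    idle_size: str = "SMALL",    # hypothetical size for everyday queries
) -> Iterator[None]:
    """Scale a warehouse up for the duration of a block, then back down."""
    # Warehouse names are interpolated as-is; only pass trusted identifiers.
    execute(f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{load_size}'")
    try:
        yield
    finally:
        # Scale back down even if the load fails, so the large size is not left on.
        execute(f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{idle_size}'")


if __name__ == "__main__":
    # Stand-in for a real connection: print each statement instead of running it.
    with temporarily_resized(print, "LOAD_WH"):
        print("-- run the data load and transformations here")
```

Putting the scale-down in a `finally` block is a deliberate choice: a failed load should not leave the larger, more expensive warehouse size in place.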

Elastic Development and Testing Environments

Development and testing environments no longer require duplicate database environments. Rather than creating a separate cluster for each environment, you can spin up test compute when you need it, point it at the same Snowflake storage, and run your tests before moving the code to production. With Redshift, you feel the maintenance and cost impact of three clusters (development, test, and production) all running at once. With Snowflake, you stop paying for compute as soon as your workload finishes, since Snowflake charges by the second.
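To make the billing difference concrete, here is a small back-of-the-envelope calculation. It is a sketch under stated assumptions: the credits-per-hour table and the 60-second minimum charge reflect Snowflake's published per-second billing model as we understand it, so check the current pricing documentation before relying on them. The function names are hypothetical, and no price per credit is assumed because that varies by edition, cloud, and region.

```python
# Assumed credits consumed per hour for each warehouse size (verify against
# current Snowflake documentation).
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

# Assumed minimum charge each time a warehouse starts or resumes.
MINIMUM_BILLED_SECONDS = 60


def billed_credits(size: str, run_seconds: float) -> float:
    """Credits for one run of a warehouse that suspends when the work is done."""
    billable_seconds = max(run_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billable_seconds / 3600


def always_on_credits(size: str, hours: float) -> float:
    """Credits for compute of the same size that is never suspended."""
    return CREDITS_PER_HOUR[size] * hours


if __name__ == "__main__":
    # Hypothetical scenario: one 20-minute test run per day on a MEDIUM warehouse.
    per_run = billed_credits("MEDIUM", 20 * 60)
    per_day_always_on = always_on_credits("MEDIUM", 24)
    print(f"on-demand test run: {per_run:.2f} credits")
    print(f"always-on for a day: {per_day_always_on:.2f} credits")
```

The comparison only holds if the test warehouse is actually suspended when the run ends, which is why auto-suspend settings matter as much as the billing granularity.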

With the right DevOps processes in place for CI/CD (Continuous Integration / Continuous Delivery), testing each release looks more like modern application development than traditional data warehouse work. Imagine trying to do this in Redshift.

Avoiding FTP with External Data Sharing

The separation of storage and compute also enables other differentiating features, such as Data Sharing. If you’re working with external vendors, partners, or customers, you can share your data, even if the recipient is not a Snowflake customer. Behind the scenes, Snowflake creates a pointer to your data (with your security requirements defined) rather than a copy. If you commonly write scripts to share your data via FTP, you now have a more streamlined, secure, and auditable way to give people outside the organization access to your data. Healthcare organizations, for example, can create a data share for their providers to access, rather than relying on cumbersome manual processes that can lead to data security nightmares.
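For readers who want to see what setting up a share involves, the sketch below generates the SQL a provider would typically run. Treat it as an illustration rather than a definitive procedure: the function name `share_statements` and every object and account name in the usage example are hypothetical, and the statement forms (`CREATE SHARE`, `GRANT ... TO SHARE`, `ALTER SHARE ... ADD ACCOUNTS`) follow Snowflake's data sharing commands as we understand them. Recipients without their own Snowflake account need an additional provisioning step that is not shown here.

```python
from typing import List, Sequence


def share_statements(
    share: str,
    database: str,
    schema: str,
    tables: Sequence[str],
    consumer_accounts: Sequence[str],
) -> List[str]:
    """Build the SQL statements that expose selected tables through a share."""
    # Identifiers are interpolated as-is; only pass trusted names.
    statements = [
        f"CREATE SHARE {share}",
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share}",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share}",
    ]
    # Grant read access table by table, so only the intended data is visible.
    for table in tables:
        statements.append(
            f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share}"
        )
    # Finally, make the share visible to the consuming accounts.
    accounts = ", ".join(consumer_accounts)
    statements.append(f"ALTER SHARE {share} ADD ACCOUNTS = {accounts}")
    return statements


if __name__ == "__main__":
    # Hypothetical names throughout.
    for statement in share_statements(
        "PROVIDER_SHARE", "CLINICAL_DB", "PUBLIC", ["VISITS"], ["PARTNER_ACCT"]
    ):
        print(statement + ";")
```

Because the share is built from explicit grants, the list of statements doubles as an audit record of exactly what was exposed and to whom.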

Where Snowflake Fits Into Your Ecosystem

Snowflake Is a Part of Your Data Ecosystem, But It’s Not in a Silo

Always keep this top of mind. A modern data platform involves not only analytics, but also application integration, data science, machine learning, and many other components that will evolve with your organization. Snowflake solves the analytics side of the house, but it’s not built for the rest.

When you’re considering your Snowflake deployment, be sure to map out the other likely components, even if the specific future tools are not yet known. Choosing which public cloud to run Snowflake on (Azure or AWS) will be the biggest decision you make. Do you see SQL Server, Azure ML, or other Azure PaaS services in the mix, or is the AWS ecosystem likely to be a better fit for the organization?

As a company, Snowflake has clearly recognized that its platform isn’t built for every type of workload. Snowflake partnered with Databricks to allow heavy data science and other complex workloads to run against your data. The recent partnership with Microsoft will ensure Azure services continue to expand their native Snowflake integrations. Expect to see a barrage of new partnership announcements over the next 12 months.

[Figure: Snowflake Data Architecture with Discovery Layer]

If you have any questions or want to learn more about how Snowflake can fit into your organization, contact us today.

Related Content:

How to Build a Data Warehouse in 6-8 Weeks

Fred Bliss is the CTO at Aptitive. He brings over 15 years of experience solving complex business problems through data solutions including cloud integration, data warehouse modeling, ETL, and front-end reporting implementations.