Serverless ETL in a Cloud Data Warehouse

Short on time? Here are the key facts.

  • An agricultural technology client needed a way to automatically schedule ETL jobs to run from AWS through the layers of the data warehouse without allocating resources to manually trigger the load
  • Aptitive leveraged their ETL framework and best practices to run ETL jobs using Google Cloud Composer
  • The resulting serverless ETL in a cloud data warehouse has automated previously manual processes, saving the company time and money

Industry

Agricultural Technology

Featured Technologies

AWS
Google Cloud Platform
Cloud Composer
Airflow

The Challenge

Aptitive’s client wanted to move their data, housed in AWS, into a data warehouse on Google Cloud Platform. To do so, they needed a way to automatically schedule the ETL jobs to run from AWS through the layers of the data warehouse without allocating resources to manually trigger the load.

The Solution

To solve this problem, Aptitive leveraged their ETL framework and best practices to run ETL jobs using Google Cloud Composer, a managed version of Apache Airflow. Aptitive created Python scripts that Composer runs daily, at scheduled times, to transform tables from AWS and load them into the data warehouse on Google Cloud Platform.
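As a rough illustration of the idea, the sketch below shows a daily load that moves through two warehouse layers in dependency order. It is not Aptitive's actual code: the table names, sample rows, and function names are all hypothetical, and it uses only the Python standard library rather than Airflow itself. In Cloud Composer, each function would typically become a task in a scheduled DAG, with the same "raw before business logic" ordering declared between tasks.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rows standing in for a table extracted from AWS.
SOURCE_ROWS = [
    {"field_id": "F1", "yield_kg": "1200"},
    {"field_id": "F2", "yield_kg": "800"},
    {"field_id": "F1", "yield_kg": "300"},
]


def load_raw(warehouse):
    """Raw layer: copy the source rows unchanged."""
    warehouse["raw_yield"] = [dict(row) for row in SOURCE_ROWS]


def load_business(warehouse):
    """Business-logic layer: cast yields to int and total them per field."""
    totals = {}
    for row in warehouse["raw_yield"]:
        field = row["field_id"]
        totals[field] = totals.get(field, 0) + int(row["yield_kg"])
    warehouse["field_yield_totals"] = totals


# Each job maps to the set of jobs that must finish before it starts.
DEPENDENCIES = {"load_raw": set(), "load_business": {"load_raw"}}
JOBS = {"load_raw": load_raw, "load_business": load_business}


def run_daily_load():
    """Run every job once, in dependency order, against an in-memory warehouse."""
    warehouse = {}  # stand-in for the real data warehouse
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        JOBS[name](warehouse)
    return order, warehouse


if __name__ == "__main__":
    order, warehouse = run_daily_load()
    print(order)
    print(warehouse["field_yield_totals"])
```

Declaring dependencies separately from the job functions is what lets a scheduler, rather than a person, decide when each layer is safe to run.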

The Outcome

The Google Cloud Composer solution manages the daily raw loads and business logic loads for the client, so employees no longer need to spend time manually triggering the jobs. It ensures that the correct data is available in a cloud-based data warehouse, saves the company time and money, and allows them to expand their Google Cloud Platform capabilities.

Interested in serverless solutions for your organization? Get started with a complimentary strategy session.
