Enterprise-Scale AWS Cloud Migration

Cloud migration and re-architecting using a custom automation toolset

Practice Area

  • Data Engineering

Business Impacts

  • Fast time-to-value
  • Lower cost of ownership
  • Increased agility and scalability

Challenges

  • Large and complex data sets
  • Parallel migration and re-architecting
  • Need for high migrated data quality

Technologies

  • Greenplum
  • Talend
  • HVR
  • Tableau

Background

Our client, a Fortune 50 energy company, needed to migrate all of its existing data and analytics platforms to Amazon Web Services and redesign the existing architecture to leverage AWS-native technologies. The initiative aimed to optimize performance and reduce long-term operating costs by taking advantage of recent advancements in AWS cloud-native solutions.

Challenge

The project involved lift-and-shifting over three thousand Talend ETL jobs, thousands of stored procedures and hundreds of Tableau reports from on-premises databases while altering relevant architecture to take advantage of AWS-native technologies. The migration had to ensure fast time-to-value, which necessitated retaining the quality of dashboards during the migration process to minimize the maintenance required after go-live.

Solution

Starschema conducted a preliminary assessment of critical factors such as workloads, peak times and potential issues. This reduced the associated burdens on the client during the migration, and we used the results to optimize the solution to ensure effectiveness. The Starschema team’s senior AWS architects designed the new architecture and led a “bottom-up” team focusing on the platforms, data integrity and ETL/UDF conversion and a “top-down” team handling report conversion and outlying data quality issues.

Through analysis of the client’s on-premises data lake we created a project plan that enabled continuous report delivery for each business unit. Our team developed a web-based application that allowed the client to monitor and test migration results, including object availability, data integration latency, data type mismatches, null columns and row counts. We were also able to rely on our close working relationship with AWS to recommend new features and introduce them on-the-fly during the migration.
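The checks listed above (object availability, data type mismatches, null columns and row counts) can be illustrated with a minimal Python sketch. This is not the web application described here: `TableSnapshot` and `compare_snapshots` are hypothetical names, and the sketch assumes that per-table summaries have already been collected from both environments and that data types have been normalized to a common vocabulary before comparison.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TableSnapshot:
    """Summary of one table, collected from either side of the migration (hypothetical shape)."""
    row_count: int
    column_types: dict[str, str]  # column name -> normalized data type
    null_columns: set[str] = field(default_factory=set)  # columns that are entirely NULL


def compare_snapshots(source: dict[str, TableSnapshot],
                      target: dict[str, TableSnapshot]) -> list[str]:
    """Return one human-readable finding per discrepancy between source and target."""
    findings: list[str] = []
    for name, src in sorted(source.items()):
        tgt = target.get(name)
        if tgt is None:
            # Object availability: the table never arrived in the target.
            findings.append(f"{name}: missing in target")
            continue
        if src.row_count != tgt.row_count:
            findings.append(f"{name}: row count {src.row_count} != {tgt.row_count}")
        for col, src_type in sorted(src.column_types.items()):
            tgt_type = tgt.column_types.get(col)
            if tgt_type is None:
                findings.append(f"{name}.{col}: column missing in target")
            elif tgt_type != src_type:
                findings.append(f"{name}.{col}: type {src_type} != {tgt_type}")
        # Columns that became entirely NULL only after migration often point to a broken mapping.
        for col in sorted(tgt.null_columns - src.null_columns):
            findings.append(f"{name}.{col}: entirely NULL in target only")
    return findings
```

An empty result means that none of these checks found a discrepancy for the compared tables; data integration latency would need timestamps from the ingestion layer and is left out of this sketch.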

The solution’s central component was a custom-developed toolset for automating and optimizing tedious and repeatable tasks to increase productivity when re-architecting the client’s on-premises data lake and its components in the AWS cloud, using AWS-native services where possible. The toolset includes tooling and frameworks for UDF conversion, DDL conversion and data type optimization and allows for the automated migration of databases, ETL pipelines, ingestion and visualization platforms. It supports Greenplum, Redshift, Snowflake, Tableau, Talend and HVR alongside other technologies and enables exceptional data quality to be retained during the migration. Using this toolset, we lift-and-shifted the client’s large and complex data sets from Oracle and Greenplum to AWS while simultaneously re-architecting them to fully leverage cloud-native technologies such as S3, Spectrum, Glue and Redshift, greatly reducing the manual labor involved.
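To make the idea of DDL conversion with data type optimization concrete, here is a minimal Python sketch. It does not reproduce the toolset itself: `TYPE_MAP`, `MAX_VARCHAR`, `convert_column` and `convert_table` are hypothetical names, the type mapping is illustrative rather than a verified compatibility table, and the sketch assumes that the longest observed value of each text column has been profiled beforehand.

```python
from typing import Optional

# Assumed upper bound for VARCHAR length in the target warehouse.
MAX_VARCHAR = 65535

# Illustrative source-to-target type mapping (an assumption, not a complete table).
TYPE_MAP = {
    "timestamp without time zone": "TIMESTAMP",
    "character varying": "VARCHAR",
}


def convert_column(name: str, source_type: str,
                   observed_max_len: Optional[int] = None) -> str:
    """Translate one column definition into the target dialect."""
    base = source_type.strip().lower()
    if base == "text":
        # Optimization step: size the column from profiled data instead of
        # always using the maximum. Unknown or zero length falls back to the maximum.
        length = min(observed_max_len or MAX_VARCHAR, MAX_VARCHAR)
        return f"{name} VARCHAR({length})"
    # Types without a mapping are passed through unchanged (upper-cased).
    return f"{name} {TYPE_MAP.get(base, source_type.strip().upper())}"


def convert_table(table: str, columns: list) -> str:
    """Build a CREATE TABLE statement from (name, type[, observed_max_len]) tuples."""
    body = ",\n".join("    " + convert_column(*col) for col in columns)
    return f"CREATE TABLE {table} (\n{body}\n);"
```

For example, `convert_table("orders", [("id", "integer"), ("note", "text", 120)])` produces a statement in which `note` is declared as `VARCHAR(120)` rather than as the maximum length.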

Our team paid special attention to making the maintenance and future optimization of the new architecture easier. The metadata inventory and version control system we employed for productivity during the migration would later serve this secondary purpose, and we used recorded video conferencing and extensive documentation to transfer knowledge to the client. After the go-live, Starschema provided hypercare for 30 days, using our state-of-the-art in-house operations monitoring systems to enable predictive maintenance, reduce the reliance on manual labor and future-proof the client’s new, cloud-based environment.

Outcome

In total, we migrated over 3,800 Talend ETL jobs, 3,150 PL/SQL stored procedures and 630 workbooks from 210 published data sources. Automation saved hundreds of man-hours during the migration, giving the client excellent time-to-value, and the client was able to rely on the new architecture during the end-of-year closing shortly after go-live. This was also possible because the migrated dashboards required very little maintenance, thanks to the high data quality retained by the automated processes. The client reduced the total cost of ownership by using end-to-end AWS services to achieve savings on hardware, license and maintenance costs, and the eventual fine-tuning of the architecture will help realize further cost optimization.

Data Platform Migration

When large organizations merge or divest, the new entity has business transformation thrust upon it. Our client, a power generation manufacturer, divested from a larger organization to become a brand-new company. With the looming deadline of a cut-over from the existing data platform, the client engaged Starschema to validate and test the existing technology stack and determine the best way to move forward.

Tableau Solutions

Hundreds of companies, from the Fortune 10 to startups, rely on Tableau and Starschema to see, understand, and use their data to drive business results. Our Tableau solutions are engineered to bridge the gap between the out-of-the-box Tableau capabilities and the needs of our Fortune 500 clients. These solutions are cost-effective and can be implemented rapidly and easily in the enterprise environment.

Starschema Antares iDL™

A fully automated, compliant-by-design intelligence data lake architecture with real-time ingestion and best-of-breed standardization and audit features.

Large Scale Data Replication Deployment

Our client, a global manufacturer in the power generation industry, faced challenges with an aging reporting environment based on multiple Oracle ODS systems and reporting straight from source databases.