Streamlined Enterprise Accounting with a Data as a Service Framework

Practice Area

  • Data Engineering

Business Impacts

  • Easier, more transparent and accurate accounting
  • Improved access to finance data lake

Challenges

  • Replacing ad hoc development with a flexible framework
  • Solution had to involve transition to cloud-native technology

Technologies

  • AWS Glue
  • Spark
  • Greenplum
  • S3
  • PostgreSQL
  • SingleStore

Background

A Fortune 50 conglomerate was not satisfied with its ability to serve internal customers’ data needs and track the costs associated with data requests. Its system made it difficult to provide accurate information for the chargeback process, which takes place after a company in the conglomerate accesses data and in which the associated costs get billed to the central corporate office.

The company needed a solution to achieve improved data access, as well as transparency and accuracy in intra-conglomerate financial affairs, with a unified audit log. Based on their experience with an earlier joint project, they selected Starschema to carry out the necessary development.


Challenge

The company was relying on a Starschema-designed finance data lake (FDL) to serve financial reporting and closing processes, but whenever internal customers not integrated into the FDL needed access to data stored in the FDL, unique ETL jobs needed to be developed, resulting in long turnaround times and additional costs.

The solution had to involve a service model that would enable users to subscribe to various data sources and retrieve data from them according to a fixed schedule to relieve the stress on the client’s finance data lake, while continuing to accommodate ad hoc requests. In addition to promoting easy access to data, this system needed to facilitate charging users for the costs associated with data requests by providing accurate data on transactions. The client also required that the solution be a metadata-driven framework, so that serving new requests or modifying existing ones would not require costly and time-consuming ad hoc development. The framework also had to be based on cloud-native technology for flexibility and scalability, which meant abandoning the legacy ETL tool for the new solution.
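To make the idea of a metadata-driven subscription concrete, here is a minimal Python sketch. It is not the client's implementation: the Subscription fields, the dataset names, the bucket paths and the argument keys are all hypothetical. It only illustrates how one row of metadata per subscriber could drive a single generic job for both scheduled and ad hoc requests, instead of a separate ETL job per customer.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Subscription:
    """One row of (hypothetical) subscription metadata."""
    subscriber: str               # internal customer that will be charged
    source: str                   # dataset in the finance data lake
    target: str                   # where the extract is delivered
    schedule_hour: Optional[int]  # hour of day for scheduled runs; None = ad hoc only


def due_subscriptions(subscriptions, now):
    """Return the scheduled subscriptions whose delivery hour matches `now`."""
    return [s for s in subscriptions if s.schedule_hour == now.hour]


def build_job_arguments(sub, ad_hoc=False):
    """Translate one metadata row into arguments for a single generic job."""
    return {
        "--subscriber": sub.subscriber,
        "--source": sub.source,
        "--target": sub.target,
        "--request_type": "ad_hoc" if ad_hoc else "scheduled",
    }


# Illustrative metadata only; names and paths are made up.
SUBSCRIPTIONS = [
    Subscription("treasury", "fdl.gl_balances", "s3://example-bucket/treasury/", 2),
    Subscription("tax", "fdl.invoices", "s3://example-bucket/tax/", None),
]

if __name__ == "__main__":
    for sub in due_subscriptions(SUBSCRIPTIONS, datetime(2024, 1, 1, 2, 0)):
        print(build_job_arguments(sub))
```

In a design like this, onboarding a new internal customer means adding a row of metadata rather than writing a new job.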

Solution

Starschema proposed a list of technologies and approaches to creating a data as a service (DaaS) framework. In line with the client’s requirements, the eventual solution would be built on existing use cases but generic enough to easily accommodate new internal customers and demands. In addition to identifying the use cases that would serve as the foundation for the framework, the client also provided project management support during the development process.

After reviewing the options and suggestions provided by Starschema, the client decided against implementing an always-running cluster, as it could have ensured lower costs only at the expense of accuracy in measuring the parameters of individual transactions. Ultimately, they chose a solution that would be more costly but provide uncompromising accuracy to prevent disputes about chargeback amounts.

The technology stack includes Apache Spark as the analytics engine, AWS Glue as the ETL service providing custom-tailored jobs and Amazon RDS for PostgreSQL as the database engine. Amazon S3 made it possible to measure the amount of storage used for a request, while Glue made it easier to tag every job run with the name of the subscriber making the request and follow the pricing of individual jobs, since each job run in Glue represents a separate instance. Because the key AWS-native technologies of the solution can clearly indicate the costs associated with a request, their combination made it easy to calculate exact chargeback values.
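The chargeback arithmetic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the framework's actual code: the JobRun record, the rate constants and the per-request storage charge are hypothetical simplifications. The only pricing fact it relies on is that Glue compute is metered in DPU-hours; the rates shown are placeholders, not quoted prices.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal

# Placeholder rates for illustration only; real values depend on region and contract.
DPU_HOUR_RATE = Decimal("0.44")     # assumed price per DPU-hour of compute
STORAGE_GB_RATE = Decimal("0.023")  # assumed price per GB of storage used


@dataclass(frozen=True)
class JobRun:
    """One job run, tagged with the subscriber that requested it (hypothetical record)."""
    subscriber: str
    dpu_seconds: int     # DPUs multiplied by run time in seconds
    storage_gb: Decimal  # storage attributed to the request


def run_cost(run):
    """Cost of a single run: compute plus storage, rounded to four decimals."""
    compute = Decimal(run.dpu_seconds) / Decimal(3600) * DPU_HOUR_RATE
    storage = run.storage_gb * STORAGE_GB_RATE
    return (compute + storage).quantize(Decimal("0.0001"))


def chargeback(runs):
    """Sum the cost of all runs per subscriber."""
    totals = defaultdict(Decimal)
    for run in runs:
        totals[run.subscriber] += run_cost(run)
    return dict(totals)


if __name__ == "__main__":
    runs = [
        JobRun("treasury", 7200, Decimal("10")),
        JobRun("treasury", 3600, Decimal("0")),
        JobRun("tax", 1800, Decimal("100")),
    ]
    for subscriber, amount in chargeback(runs).items():
        print(subscriber, amount)
```

Because every run carries its own subscriber tag and its own measured usage, the per-subscriber total is a plain sum with no apportioning of shared cluster costs, which is the accuracy argument made above for preferring separate job runs over an always-running cluster.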

Outcome

The flexible framework that Starschema built enabled the client to serve subscribers with the data they need from their FDL in a fully auditable, metadata-driven manner and calculate the exact costs associated with each request – whether scheduled or ad hoc. This greatly streamlined the related accounting processes and made them more transparent, which has helped the client avoid disputes about chargeback amounts. Development took a total of four months, but thanks to a gradual rollout schedule, certain elements of the solution were already deployed and in use by the second month.

The DaaS framework will also serve as the basis for a series of developments. To promote cost savings in shared environment usage, when the number of subscribers reaches critical mass, an update to the framework will enable moving part of the ingestion and the architecture to an EMR cluster. Another development will be a metadata-driven “pub-sub” subscription model, which will automatically move approved requests into the metadata layer and start the feed to improve user experience and make maintenance easier. Finally, a mirror stream will complement the current data-model-based system to enable true live streaming of data, fulfil requests more quickly and grant access to data that is not a part of the data model.

Ask the Expert

Norbert Ledényi

BI Developer Team Lead

Norbert has over 10 years of experience delivering cutting edge technology solutions for Fortune 500 companies as a data engineer, tech lead, project manager and, most recently, BI developer team lead. His main areas of interest are AWS-based cloud architectures, data lakes and data warehouses.
