Background
One of the world’s largest investment, advisory and risk management firms provides portfolio management software that helps customers visualize and act on investment data. To build on the software’s success, the company set out to create a marketplace for financial datasets that leverages the technology behind the software. During development, however, the company was unsatisfied with how long it took to onboard new datasets and make them usable on the platform: the underlying issue caused month-long delays before a dataset could be shared on the data marketplace. To ensure customer satisfaction at the product’s launch, the company wanted to improve the platform’s performance and reached out to Starschema to design and implement a solution.
Practice Area
- Data Engineering
Business Impact
- Faster insights via reduced data load times
- Improved customer satisfaction through better product performance
Challenges
- Moving from a legacy architecture and batch processing to a modern stream-based architecture
- Ensuring the system could handle the large increase in data volume that came with the stream-based architecture
- Designing and deploying a metadata-driven ingestion system to reduce the technical knowledge required for data stewards to onboard datasets
- Building data pipelines on highly volatile infrastructure
- Accommodating a wide range of data types
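To illustrate what "metadata-driven ingestion" can mean in practice, the sketch below shows one possible shape of the idea: a data steward describes a dataset declaratively, and generic code turns that description into a table definition. This is a minimal illustration under stated assumptions, not the system built in this project. The `Column` and `DatasetSpec` classes, the `build_create_table` function, and the `daily_prices` example are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column of a dataset, as a steward would describe it (hypothetical)."""
    name: str
    dtype: str            # a SQL type name, e.g. "DATE" or "NUMBER(18,4)"
    nullable: bool = True


@dataclass(frozen=True)
class DatasetSpec:
    """Hypothetical descriptor for a dataset; filling it in requires no pipeline code."""
    table: str
    columns: tuple


def build_create_table(spec: DatasetSpec) -> str:
    """Render a CREATE TABLE statement purely from the descriptor."""
    lines = []
    for col in spec.columns:
        null_clause = "" if col.nullable else " NOT NULL"
        lines.append(f"    {col.name} {col.dtype}{null_clause}")
    body = ",\n".join(lines)
    return f"CREATE TABLE IF NOT EXISTS {spec.table} (\n{body}\n);"


# Illustrative dataset description (assumed, not taken from the project).
spec = DatasetSpec(
    table="daily_prices",
    columns=(
        Column("trade_date", "DATE", nullable=False),
        Column("ticker", "VARCHAR", nullable=False),
        Column("close_price", "NUMBER(18,4)"),
    ),
)

ddl = build_create_table(spec)
print(ddl)
```

The design point is that onboarding a new dataset means writing a new descriptor rather than a new pipeline, which is how such a system can lower the technical knowledge required of data stewards.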
Technologies
- Snowflake
- dbt
- Python
- Apache Airflow
- DataHub
- Kubernetes