
Solution
Metadata-Driven Ingestion
Industry
CPG
Region
US
Technology
Microsoft Azure
Context
The client, a multinational manufacturer of confectionery and other consumer products, manages large volumes of structured and unstructured data across multiple business units, geographies, and sources. Its existing data ingestion processes were fragmented, code-driven, and effort-intensive, which made them difficult to scale and far less flexible than a metadata-driven approach. The result was delayed data availability, operational inefficiency, governance risk, and slower time-to-insight.
Approach
MathCo partnered with the client to implement a metadata-driven ingestion framework on Azure Data Factory, enabling scalable, reusable, and governed data onboarding:
- Developed reusable connectors to handle diverse data formats (CSV, Excel, API, Delta) and load patterns, including Append, Merge, Overwrite, and SCD Type 2.
- Standardized and centralized data into a unified data lake with automated Data Quality (DQ) rules.
- Leveraged Azure DevOps to enable seamless, automated deployment of notebooks, scripts, and workflows.
- Utilized Unity Catalog for data cataloging, lineage, and access control, complemented by Azure Purview for centralized catalog management.
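To illustrate the idea behind the points above, the following is a minimal, self-contained sketch of metadata-driven ingestion: a configuration record, rather than pipeline code, selects the load pattern (Append, Overwrite, Merge, or SCD Type 2) and a simple data quality rule. It is not the client's framework. It uses plain Python lists in place of Delta tables, and every name in it (SourceConfig, ingest, the valid_from / valid_to / is_current columns, the not-null rule) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[object]]
SCD_COLS = {"valid_from", "valid_to", "is_current"}  # assumed SCD2 tracking columns


@dataclass
class SourceConfig:
    """One metadata record describing how a source is loaded (hypothetical schema)."""
    name: str
    load_pattern: str                 # "append" | "overwrite" | "merge" | "scd2"
    key: str = "id"                   # business key used by merge and scd2
    not_null: List[str] = field(default_factory=list)  # simple DQ rule


def check_quality(rows: List[Row], cfg: SourceConfig) -> Tuple[List[Row], List[Row]]:
    """Split rows into (passed, rejected) using the configured not-null columns."""
    passed, rejected = [], []
    for row in rows:
        ok = all(row.get(col) is not None for col in cfg.not_null)
        (passed if ok else rejected).append(row)
    return passed, rejected


def append(target, rows, cfg, load_date):
    return target + rows


def overwrite(target, rows, cfg, load_date):
    return list(rows)


def merge(target, rows, cfg, load_date):
    # Upsert: incoming rows replace existing rows that share the business key.
    merged = {r[cfg.key]: r for r in target}
    merged.update({r[cfg.key]: r for r in rows})
    return list(merged.values())


def scd2(target, rows, cfg, load_date):
    # Close the current version of each changed key, then add a new current version.
    incoming = {r[cfg.key]: r for r in rows}
    result, unchanged = [], set()
    for old in target:
        new = incoming.get(old[cfg.key])
        if old["is_current"] and new is not None:
            attrs = {k: v for k, v in old.items() if k not in SCD_COLS}
            if attrs == new:
                unchanged.add(old[cfg.key])
            else:
                old = {**old, "valid_to": load_date, "is_current": False}
        result.append(old)
    for key, new in incoming.items():
        if key not in unchanged:
            result.append({**new, "valid_from": load_date,
                           "valid_to": None, "is_current": True})
    return result


LOADERS = {"append": append, "overwrite": overwrite, "merge": merge, "scd2": scd2}


def ingest(cfg: SourceConfig, target: List[Row], rows: List[Row], load_date: str):
    """Run DQ checks, then apply whichever load pattern the metadata names."""
    passed, rejected = check_quality(rows, cfg)
    return LOADERS[cfg.load_pattern](target, passed, cfg, load_date), rejected


# Example: onboarding a source is a new config record, not a new pipeline.
customers = SourceConfig(name="customers", load_pattern="scd2", not_null=["city"])
history = [{"id": 1, "city": "Austin", "valid_from": "2024-01-01",
            "valid_to": None, "is_current": True}]
batch = [{"id": 1, "city": "Dallas"}, {"id": 2, "city": None}]
history, rejected = ingest(customers, history, batch, "2024-02-01")
```

In this sketch, switching a source from SCD Type 2 to Merge means changing only the load_pattern value in its configuration record. In a setup like the one described, such records would typically live in a control table and drive parameterized pipelines, though the layout shown here is an assumption.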
Impact
- Successfully onboarded 8+ analytics products with streamlined data capture and transformation on Databricks.
- Saved 5,000+ hours in building ingestion pipelines by implementing a metadata-driven framework.
- Achieved 50% faster data ingestion for new data sources, accelerating time-to-insight.
- Reduced data quality issues by 40% through automated data quality checks, improving the trustworthiness and reliability of analytics.