Data Mesh: Building a Scalable Information Architecture
CID offers critical experience in establishing modern information architecture, including Data Mesh. This is how we typically do it.
Transformation
We start the journey towards a modern, state-of-the-art information architecture, specifically a Data Mesh, with two initial activities.
1. Workshops
In extensive workshop sessions with business and software stakeholders, we explain the data needs emerging from the business and the modern concept of a Data Mesh. We focus mainly on the urgency of fostering data culture and literacy, which benefits stakeholders across the organization.
2. Driving up data literacy
We use a structured approach to describing information offerings. This helps drive data literacy, maintain shared understanding and semantics, and establish change management for data products. The process supports the design of the new data architecture and the definition of the data products and data marts.
Case Study
Data Mesh at a Leading European Retailer
Building upon the existing software architecture, the foundational layer of the Data Mesh and the inaugural set of data products and data marts were successfully deployed. The swift implementation began a new era where business processes, analytics, and decision-making are continually enhanced through an increasing array of data products and insights.
Implementation
We typically form a “data platform” squad that is responsible for the infrastructure setup (e.g., Kubernetes, Snowflake), creates the required tooling (operators, drivers), and supports other squads in creating data products efficiently and in line with governance.
We introduce a data catalog, or build on an existing one, so squads can register their products. This enables other stakeholders to find and integrate data products, document usage, and comply with governance requirements, including lineage.
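To illustrate what registration and lineage mean in practice, here is a minimal in-memory sketch. It is not the API of any particular catalog product: `CatalogEntry`, `DataCatalog`, and the product names are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record for one data product or data mart."""
    name: str
    owner_squad: str
    schema_version: str
    upstream: tuple[str, ...] = ()  # names of products this one consumes


class DataCatalog:
    """Toy catalog: register entries, look them up, and trace lineage."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Governance rule (an assumption for this sketch): every upstream
        # product must already be registered.
        missing = [u for u in entry.upstream if u not in self._entries]
        if missing:
            raise ValueError(f"unregistered upstream products: {missing}")
        self._entries[entry.name] = entry

    def find(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def lineage(self, name: str) -> list[str]:
        """Return all direct and transitive upstream product names."""
        seen: list[str] = []
        stack = list(self._entries[name].upstream)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self._entries[current].upstream)
        return seen


catalog = DataCatalog()
catalog.register(CatalogEntry("order_history", "orders", "1.0.0"))
catalog.register(CatalogEntry("crm_customers", "crm", "2.1.0"))
catalog.register(
    CatalogEntry(
        "orders_enriched_mart", "orders", "1.0.0",
        upstream=("order_history", "crm_customers"),
    )
)
```

A consumer could then call `catalog.lineage("orders_enriched_mart")` to see which data products feed the mart.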
Product or service owners oversee the creation of data products. For instance, when a customer places an order (a business event), the data product “order history” provides order data in line with the product specification, which requires strict constraints, consistent schema versioning, and reproducible data output. Such rules are critical to enable a semantic presentation and self-service re-use of a data product. The squads cover the development of the data product, which includes the transformation and storage of the data, the provision of an API, and the registration in the data catalog. The data products provide data for further processing by other products and services.
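As a rough sketch of how such a product specification could be enforced, the following turns an order event into an “order history” record. The field names, the version string, and the individual constraints are assumptions made for illustration; they are not taken from an actual specification.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime

# Hypothetical schema version that the "order history" product accepts.
SCHEMA_VERSION = "1.0.0"


@dataclass(frozen=True)
class OrderRecord:
    """One row of the (hypothetical) order history data product."""
    order_id: str
    customer_id: str
    total_cents: int
    placed_at: datetime


def to_order_record(event: dict) -> OrderRecord:
    """Validate an order event against the contract and build a record.

    The function is pure: the same event always yields an equal record,
    which is one simple way to keep the data output reproducible.
    """
    # Schema versioning: reject events written against another version.
    if event.get("schema_version") != SCHEMA_VERSION:
        raise ValueError(
            f"unsupported schema version: {event.get('schema_version')!r}"
        )

    # Strict constraint: the total must be a non-negative integer.
    total = event["total_cents"]
    if isinstance(total, bool) or not isinstance(total, int) or total < 0:
        raise ValueError("total_cents must be a non-negative integer")

    return OrderRecord(
        order_id=str(event["order_id"]),
        customer_id=str(event["customer_id"]),
        total_cents=total,
        placed_at=datetime.fromisoformat(event["placed_at"]),
    )


sample_event = {
    "schema_version": "1.0.0",
    "order_id": "o-1001",
    "customer_id": "c-42",
    "total_cents": 2599,
    "placed_at": "2024-01-15T10:30:00",
}
record = to_order_record(sample_event)
```

Rejecting unknown schema versions up front, rather than guessing at field meanings, is what lets consumers rely on the product without asking the owning squad.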
Data marts are consumers of one or more data products and directly enable self-service analytics. Unlike data products, which are comparatively raw, data marts sit in data platforms like Snowflake and are ready to use with reporting and analytics tools such as Tableau; they provide the user experience that business users need for self-service access. Data marts are built by the squads that created the business service and data product. When a data mart draws on multiple data products, the “nearest” squad is typically in charge (e.g., the orders squad covers a data mart offering order histories enriched with CRM data). This maintains scalability and domain expertise when building data marts.
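The order-history example can be sketched as a simple join of two data products. This is plain Python for illustration only; in practice such a mart would more likely be a table or view inside the data platform. The function name, field names, and sample rows are hypothetical.

```python
from __future__ import annotations


def build_order_mart(orders: list[dict], customers: list[dict]) -> list[dict]:
    """Enrich order history rows with CRM attributes (left join on customer_id)."""
    customers_by_id = {c["customer_id"]: c for c in customers}
    mart_rows = []
    for order in orders:
        # Orders without a CRM match are kept, with an empty segment.
        customer = customers_by_id.get(order["customer_id"], {})
        mart_rows.append(
            {
                "order_id": order["order_id"],
                "customer_id": order["customer_id"],
                "total_cents": order["total_cents"],
                "customer_segment": customer.get("segment"),
            }
        )
    return mart_rows


# Illustrative sample rows from the two upstream data products.
order_history = [
    {"order_id": "o-1001", "customer_id": "c-42", "total_cents": 2599},
    {"order_id": "o-1002", "customer_id": "c-99", "total_cents": 499},
]
crm_customers = [{"customer_id": "c-42", "segment": "loyalty"}]

mart = build_order_mart(order_history, crm_customers)
```

Keeping unmatched orders in the output is a design choice for this sketch: a report on order volume stays complete even when CRM data is missing.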
Organization and Roles
A pivotal aspect of the data mesh journey is creating a data platform squad focused on standards, governance, and infrastructure. This cross-functional team of experts, including CID professionals, collaborates closely with legal, compliance, and IT units to create a mesh built on robust governance.
In this decentralized approach, the onus of data product creation and testing rests with the squads aligned with the respective business services, thus promoting responsibility while maintaining scalability.
Retail Media: Leveraging Data as a Product
In the evolving retail landscape, chains, including supermarkets, are constantly looking for lucrative opportunities to spur growth and enhance profitability. One leading approach is to leverage customer data, not just as a tool for internal improvements but as a product with immense market potential.