Data Mesh on Azure: Domain Ownership & Data Products

Decentralized data architecture with domain-oriented ownership and a self-serve data platform

Data Mesh Architecture

Architecture Diagram
┌─────────────────────────────────────────────────────────────────────┐
│                   DATA MESH ARCHITECTURE ON AZURE                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  DOMAIN 1: SALES             DOMAIN 2: MARKETING                    │
│  ┌──────────────────┐        ┌──────────────────┐                   │
│  │ Data Product     │        │ Data Product     │                   │
│  │ ┌──────────────┐ │        │ ┌──────────────┐ │                   │
│  │ │ Raw: ADLS    │ │        │ │ Raw: ADLS    │ │                   │
│  │ │ Curated:Delta│ │        │ │ Curated:Delta│ │                   │
│  │ │ API: Synapse │ │        │ │ API: Synapse │ │                   │
│  │ └──────────────┘ │        │ └──────────────┘ │                   │
│  │ Owner: Sales Eng │        │ Owner: Mktg Eng  │                   │
│  │ SLA: 99.9%       │        │ SLA: 99.5%       │                   │
│  └────────┬─────────┘        └────────┬─────────┘                   │
│           │                           │                             │
│           └───────────┬───────────────┘                             │
│                       │                                             │
│                       ▼                                             │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                      SELF-SERVE PLATFORM                      │  │
│  │                                                               │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │  │
│  │  │ Data Lake    │  │ Compute      │  │ Governance   │         │  │
│  │  │ (ADLS Gen2)  │  │ (Synapse/    │  │ (Purview)    │         │  │
│  │  │              │  │  Databricks) │  │              │         │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘         │  │
│  │                                                               │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │  │
│  │  │ Identity     │  │ Monitoring   │  │ Discovery    │         │  │
│  │  │ (Azure AD)   │  │ (Monitor)    │  │ (Purview)    │         │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘         │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                                                     │
│  FEDERATED COMPUTATIONAL GOVERNANCE                                 │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ • Global data standards (naming, schema, quality)            │  │
│  │ • Cross-domain data contracts                                 │  │
│  │ • Interoperability protocols                                  │  │
│  │ • Data product certification                                  │  │
│  └───────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Data Product Template

{
  "dataProduct": {
    "name": "sales-transactions",
    "domain": "sales",
    "version": "2.1.0",
    "owner": "sales-data-team@company.com",
    "sla": {
      "availability": "99.9%",
      "freshness": "1 hour",
      "latency": "< 100ms"
    },
    "dataAssets": [
      {
        "name": "fact_sales",
        "type": "delta-table",
        "location": "abfss://sales@stdatalake001.dfs.core.windows.net/curated/fact_sales",
        "schema": "schema/fact_sales.json",
        "quality": {
          "completeness": 99.5,
          "accuracy": 99.9
        }
      }
    ],
    "interfaces": [
      {
        "type": "sql-endpoint",
        "connection": "syn-workspace.sql.azuresynapse.net",
        "database": "sales_analytics"
      },
      {
        "type": "rest-api",
        "endpoint": "https://api.company.com/sales/v2"
      }
    ],
    "discovery": {
      "purviewCollection": "SalesData",
      "tags": ["transactions", "revenue", "daily"]
    }
  }
}
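
A platform team might validate a descriptor like the one above before registering the product. The following is a minimal sketch, not a standard schema: the required-field list, the MAJOR.MINOR.PATCH version rule, and the helper name `validate_descriptor` are assumptions made for illustration.

```python
import re
from typing import List

# Fields this sketch treats as mandatory; the list is an assumption
# chosen to mirror the template above.
REQUIRED_FIELDS = ("name", "domain", "version", "owner", "sla", "dataAssets", "interfaces")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def validate_descriptor(descriptor: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    product = descriptor.get("dataProduct")
    if not isinstance(product, dict):
        return ["missing top-level 'dataProduct' object"]

    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in product]

    version = product.get("version")
    if version is not None and not SEMVER.match(str(version)):
        problems.append(f"version is not MAJOR.MINOR.PATCH: {version}")

    # Every asset needs a storage location so consumers can find it.
    for asset in product.get("dataAssets", []):
        if "location" not in asset:
            problems.append(f"asset {asset.get('name', '?')} has no location")

    return problems


# A deliberately incomplete descriptor to show the reported problems.
example = {
    "dataProduct": {
        "name": "sales-transactions",
        "domain": "sales",
        "version": "2.1",
        "owner": "sales-data-team@company.com",
        "sla": {"availability": "99.9%"},
        "dataAssets": [{"name": "fact_sales", "type": "delta-table"}],
    }
}

for problem in validate_descriptor(example):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets a deployment pipeline report every issue in a single run.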

Domain-Specific Data Pipelines

# Sales domain data product pipeline
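from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Register the Delta Lake extension and catalog. On Databricks or Synapse
# Spark pools these are typically preconfigured and can be omitted.
spark = SparkSession.builder \
    .appName("sales-data-product") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .getOrCreate()

# Ingest raw data
raw_df = spark.read \
    .format("parquet") \
    .load("abfss://raw@stdatalake001.dfs.core.windows.net/sales/")

# Apply domain transformations: drop non-positive amounts, derive revenue,
# then aggregate per day, product category, and region
curated_df = raw_df \
    .filter(F.col("amount") > 0) \
    .withColumn("revenue", F.col("quantity") * F.col("unit_price")) \
    .groupBy("sale_date", "product_category", "region") \
    .agg(
        F.sum("revenue").alias("total_revenue"),
        F.count("*").alias("transaction_count")
    )

# Write as data product (overwrite replaces the previous table contents)
curated_df.write \
    .format("delta") \
    .mode("overwrite") \
    .save("abfss://sales@stdatalake001.dfs.core.windows.net/curated/fact_sales")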
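
The data product template declares quality targets such as `completeness: 99.5`. One possible way to measure completeness before publishing is the share of rows in which every required column is non-null. The sketch below uses plain Python rows so it runs without a cluster; the function name, the made-up sample rows, and the choice of required columns are assumptions, and the same ratio could be computed in Spark.

```python
from typing import Iterable, Mapping, Sequence


def completeness_pct(rows: Iterable[Mapping[str, object]], required: Sequence[str]) -> float:
    """Percentage of rows in which every required column is present and non-null."""
    total = 0
    complete = 0
    for row in rows:
        total += 1
        if all(row.get(col) is not None for col in required):
            complete += 1
    # An empty input is treated as 0% complete rather than dividing by zero.
    return 100.0 * complete / total if total else 0.0


# Illustrative rows only; one of the four has a null region.
sample = [
    {"sale_date": "2024-01-01", "region": "EU", "total_revenue": 120.0},
    {"sale_date": "2024-01-01", "region": None, "total_revenue": 80.0},
    {"sale_date": "2024-01-02", "region": "US", "total_revenue": 45.5},
    {"sale_date": "2024-01-02", "region": "EU", "total_revenue": 10.0},
]

score = completeness_pct(sample, required=["sale_date", "region", "total_revenue"])
print(f"completeness: {score:.1f}%  meets 99.5 target: {score >= 99.5}")
```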

Pro Tip: Each data product should be independently deployable, discoverable via Purview, and have clear SLAs and data contracts.
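
To make an SLA checkable rather than aspirational, a freshness target such as `"1 hour"` can be compared against the time of the last successful write. This is a rough sketch under stated assumptions: the `"<number> <unit>"` string format, the supported units, and the helper names `parse_freshness` and `is_fresh` are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Units this sketch understands; anything else raises ValueError.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days"}


def parse_freshness(text: str) -> timedelta:
    """Turn a string such as '1 hour' or '30 minutes' into a timedelta."""
    amount, unit = text.split()
    unit = unit.rstrip("s")  # accept both singular and plural forms
    if unit not in _UNITS:
        raise ValueError(f"unsupported freshness unit: {unit}")
    return timedelta(**{_UNITS[unit]: int(amount)})


def is_fresh(last_updated: datetime, freshness: str, now: datetime) -> bool:
    """True if the product was refreshed within its declared freshness window."""
    return now - last_updated <= parse_freshness(freshness)


# Fixed timestamps keep the example deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_fresh(datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc), "1 hour", now))  # True
print(is_fresh(datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc), "1 hour", now))   # False
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the check easy to test.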

Interview Questions

Q1: How does Data Mesh differ from traditional data warehousing? A: In a traditional warehouse, a central team owns all data. In a Data Mesh, domain teams own their data and publish it as products. The benefits are scalability, closer domain expertise, and faster time-to-market; the main challenges are cross-domain governance and data consistency.

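Q2: What are the four principles of Data Mesh? A: 1) Domain ownership, 2) Data as a product, 3) Self-serve data platform, 4) Federated computational governance. Each addresses a specific challenge of decentralization: domain ownership removes the central-team bottleneck, product thinking makes data discoverable and trustworthy, the platform spares domains from rebuilding infrastructure, and federated governance keeps products interoperable.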

Q3: How do you implement cross-domain data sharing in Data Mesh? A: Define data contracts between domains, expose standardized interfaces (for example, Synapse SQL endpoints), make data products discoverable via Purview, and establish governance policies for data quality and access.
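
A data contract between domains can be as simple as the set of columns and types a consumer relies on. The sketch below is one hypothetical way to check a producer's published schema against a consumer's expectations; the function name, the type strings, and the marketing expectations (including `campaign_id`) are invented for illustration, while the published column names follow the `fact_sales` aggregation.

```python
from typing import Dict, List


def contract_violations(published: Dict[str, str], expected: Dict[str, str]) -> List[str]:
    """Compare a producer's published columns with what a consumer expects."""
    violations = []
    for column, expected_type in expected.items():
        actual_type = published.get(column)
        if actual_type is None:
            violations.append(f"missing column: {column}")
        elif actual_type != expected_type:
            violations.append(
                f"type mismatch on {column}: expected {expected_type}, got {actual_type}"
            )
    return violations


# Schema the sales domain publishes (type names are illustrative).
published_schema = {
    "sale_date": "date",
    "product_category": "string",
    "region": "string",
    "total_revenue": "double",
    "transaction_count": "long",
}

# What a hypothetical marketing consumer depends on.
marketing_expects = {
    "sale_date": "date",
    "region": "string",
    "total_revenue": "decimal(18,2)",
    "campaign_id": "string",
}

for violation in contract_violations(published_schema, marketing_expects):
    print(violation)
```

Extra columns on the producer side are deliberately ignored, so additive schema changes do not break consumers, while removed or retyped columns are reported.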
