Purview Deep Dive: Classification, Lineage & Power BI

Master Purview with advanced classification, lineage tracking, and Power BI integration

Purview Capabilities

Architecture Diagram
┌─────────────────────────────────────────────────────────────────────┐
│                        PURVIEW CAPABILITIES                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  AUTOMATED DISCOVERY & SCANNING                                     │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │ • Connect to 100+ data sources                               │   │
│  │ • Scheduled scanning (daily/weekly)                          │   │
│  │ • Incremental scanning (changed data only)                   │   │
│  │ • Custom scan rulesets                                       │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                                                     │
│  CLASSIFICATION                                                     │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │ • 100+ built-in classifiers (PII, financial)                 │   │
│  │ • Custom classifiers (regex, keyword list)                   │   │
│  │ • Auto-labeling with sensitivity labels                      │   │
│  │ • Column-level classification                                │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                                                     │
│  LINEAGE                                                            │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │ • End-to-end lineage across Azure services                   │   │
│  │ • ADF pipeline lineage                                       │   │
│  │ • Databricks notebook lineage                                │   │
│  │ • Synapse SQL lineage                                        │   │
│  │ • Power BI dataset lineage                                   │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                                                     │
│  DATA MAP & CATALOG                                                 │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │ • Unified metadata repository                                │   │
│  │ • Business glossary                                          │   │
│  │ • Search and discovery                                       │   │
│  │ • Impact analysis                                            │   │
│  └──────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

Custom Classification

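from azure.identity import DefaultAzureCredential
from azure.purview.datamap import DataMapClient
from azure.purview.scanning import PurviewScanningClient

# Endpoints below assume an account named "purview-prod".
# Verify endpoint URLs, method names, and payload fields against the
# SDK versions you have installed; both packages are still evolving.
credential = DefaultAzureCredential()
scanning = PurviewScanningClient(
    endpoint="https://purview-prod.scan.purview.azure.com", credential=credential
)
client = DataMapClient(
    endpoint="https://purview-prod.purview.azure.com", credential=credential
)

# Create a custom classification rule (rules belong to the scanning service)
scanning.classification_rules.create_or_update(
    "CustomerCodeClassifier",
    body={
        "kind": "Custom",
        "properties": {
            "description": "Detects internal customer codes",
            "classificationName": "CustomerCodeClassifier",
            "ruleStatus": "Enabled",
            # Data pattern: values such as CUST-123456
            "dataPatterns": [{"kind": "Regex", "pattern": r"CUST-\d{6}"}],
            # Column name patterns that should trigger the rule
            "columnPatterns": [
                {"kind": "Regex", "pattern": "customer_id"},
                {"kind": "Regex", "pattern": "cust_code"},
            ],
            "minimumPercentageMatch": 80,
        },
    },
)

# Apply the classification directly to one asset in the data map
client.entity.add_classifications(
    "asset-guid",
    body=[{"typeName": "CustomerCodeClassifier"}],
)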
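
Before registering a regex rule, it can help to dry-run the pattern against a sample of real column values. The sketch below does that locally and does not call Purview. `match_ratio` and the sample values are hypothetical, and the 80% cut-off simply mirrors the threshold used in the rule above.

```python
import re

# Same pattern as the custom classifier above
CUSTOMER_CODE = re.compile(r"CUST-\d{6}")


def match_ratio(values, pattern=CUSTOMER_CODE):
    """Return the fraction of non-empty values that fully match the pattern."""
    candidates = [v for v in values if v]
    if not candidates:
        return 0.0
    hits = sum(1 for v in candidates if pattern.fullmatch(v))
    return hits / len(candidates)


# Hypothetical sample taken from a customer_id column
sample = ["CUST-000123", "CUST-987654", "CUST-12345", "CUST-555555", "CUST-000001"]

ratio = match_ratio(sample)
print(f"{ratio:.0%} of sample values match")
print("would classify" if ratio >= 0.8 else "below threshold")
```

A sample that falls just under the threshold usually points to formatting variants (padding, lowercase prefixes) that the pattern should either accept or that should be cleaned upstream.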

Lineage Tracking

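# Get upstream and downstream lineage for a data asset.
# Method and attribute names follow the azure-purview-datamap package;
# verify them against your installed version.
lineage = client.lineage.get("asset-guid", direction="BOTH")

# Print the lineage graph as source --> target edges
entities = lineage.guid_entity_map or {}
for edge in lineage.relations or []:
    source = entities[edge.from_entity_id]
    target = entities[edge.to_entity_id]
    print(f"{source.display_text} --> {target.display_text}")
    print(f"  Relationship: {edge.relationship_id}")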

Power BI Integration

{
  "scanName": "powerbi-workspace-scan",
  "dataSource": {
    "type": "PowerBI",
    "properties": {
      "tenantId": "tenant-id",
      "workspaceIds": ["workspace-id-1", "workspace-id-2"]
    }
  },
  "scanRuleset": {
    "type": "System",
    "name": "PowerBI"
  }
}

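Pro Tip: Use Purview's lineage view to trace data from its source through to a Power BI report. When a source schema changes, the lineage graph shows which downstream datasets and reports are affected, and it lets report consumers check where the data came from before they rely on it.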
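
Impact analysis on a lineage graph comes down to a downstream traversal. The following is a minimal sketch, assuming the lineage has already been exported as (source, target) pairs; the asset names and the `downstream_assets` helper are hypothetical and not part of any Purview SDK.

```python
from collections import defaultdict, deque


def downstream_assets(edges, start):
    """Breadth-first walk over (source, target) lineage edges from `start`."""
    children = defaultdict(list)
    for source, target in edges:
        children[source].append(target)

    seen, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical lineage: lake file -> ADF copy -> SQL table -> dataset -> report
edges = [
    ("raw/customers.csv", "adf_copy_customers"),
    ("adf_copy_customers", "dbo.Customers"),
    ("dbo.Customers", "Sales dataset"),
    ("Sales dataset", "Sales report"),
    ("dbo.Orders", "Sales dataset"),
]

# Everything that could break if dbo.Customers changes
print(downstream_assets(edges, "dbo.Customers"))
```

Breadth-first order lists the nearest dependents first, which is usually the order in which owners need to be notified.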

Interview Questions

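Q1: How do you implement end-to-end lineage in Purview? A: Connect Purview to ADF, Databricks, Synapse, and Power BI so that lineage is captured from each service, and schedule scans so the assets those pipelines touch are registered. For processes with no built-in integration, register custom lineage through the Purview SDK or REST API. Link glossary terms to assets to add business context.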
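
Custom lineage is typically registered as an Apache Atlas style Process entity whose inputs and outputs point at existing assets. The sketch below only builds such a payload; the helper name, the qualified name scheme, and the GUIDs are hypothetical, and the submission call mentioned in the comment is an assumption to verify against your SDK version.

```python
def build_process_entity(name, qualified_name, input_guids, output_guids):
    """Build an Atlas-style Process entity that links inputs to outputs."""
    return {
        "typeName": "Process",
        "attributes": {
            "name": name,
            "qualifiedName": qualified_name,
            # Each reference points at an asset already registered in the catalog
            "inputs": [{"guid": g} for g in input_guids],
            "outputs": [{"guid": g} for g in output_guids],
        },
    }


entity = build_process_entity(
    name="nightly_customer_export",
    qualified_name="custom://jobs/nightly_customer_export",
    input_guids=["source-table-guid"],
    output_guids=["target-file-guid"],
)
payload = {"entity": entity}

# Assumption: with azure-purview-datamap this might be submitted as
# client.entity.create_or_update(body=payload); check your SDK version first.
print(payload["entity"]["attributes"]["qualifiedName"])
```

Keeping the qualified name stable across runs matters, because it is the key the catalog uses to decide whether to update an existing process or create a new one.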

Q2: What are the best practices for Purview scanning? A: 1) Scan by domain/collection, 2) Use appropriate scan rulesets, 3) Schedule incremental scans, 4) Monitor scan status, 5) Review and approve classification results, 6) Use custom classifiers for domain-specific data.

Q3: How does Purview support data governance? A: It provides automated discovery, classification, lineage tracking, a business glossary, access policies, and compliance reporting, which together help organizations understand, manage, and protect their data assets.
