CI/CD: DevOps, ARM/Bicep & ADF Git Integration

Automated deployment pipelines for Azure data engineering with DevOps, IaC, and Git integration

CI/CD Architecture

Architecture Diagram
┌─────────────────────────────────────────────────────────────────────┐
│                     CI/CD PIPELINE ARCHITECTURE                     │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  DEVELOPMENT           BUILD                   DEPLOYMENT           │
│  ┌────────────┐        ┌────────────────┐      ┌──────────────┐     │
│  │ Git        │───────>│ Azure DevOps   │─────>│ Dev          │     │
│  │ Repository │        │ Build Pipeline │      │ Environment  │     │
│  │ (ADF Git)  │        │                │      │              │     │
│  └────────────┘        │ • Validate     │      └──────┬───────┘     │
│                        │ • Package      │             │             │
│                        │ • Test         │             ▼             │
│                        └────────────────┘      ┌──────────────┐     │
│                                                │ QA           │     │
│  ┌──────────────┐                              │ Environment  │     │
│  │ ARM/Bicep    │─────────────────────────────>│              │     │
│  │ Templates    │                              └──────┬───────┘     │
│  └──────────────┘                                     │             │
│                                                       ▼             │
│  ┌──────────────┐                              ┌──────────────┐     │
│  │ Terraform    │─────────────────────────────>│ Production   │     │
│  │ (Optional)   │                              │ Environment  │     │
│  └──────────────┘                              └──────────────┘     │
│                                                                     │
│  ADF GIT INTEGRATION:                                               │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ Dev Branch ──> PR ──> Main Branch ──> Publish ──> Prod        │  │
│  │                                                               │  │
│  │ • Collaborative editing in ADF Studio                         │  │
│  │ • Branch-based development                                    │  │
│  │ • PR validation and code review                               │  │
│  │ • Publishing from the collaboration branch to live mode       │  │
│  └───────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Bicep Template Example

// Data engineering infrastructure
param location string = resourceGroup().location
param environment string = 'prod'

// Storage Account
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
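  // Storage account names must be 3-24 lowercase alphanumeric characters and globally unique,
  // so derive a short deterministic suffix instead of appending the region name
  name: 'stdl${environment}${uniqueString(resourceGroup().id)}'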
  location: location
  kind: 'StorageV2'
  sku: { name: 'Standard_LRS' }
  properties: {
    isHnsEnabled: true
    supportsHttpsTrafficOnly: true
    minimumTlsVersion: 'TLS1_2'
    encryption: {
      services: { blob: { enabled: true } }
      keySource: 'Microsoft.Storage'
    }
  }
}

// Synapse Workspace
resource synapseWorkspace 'Microsoft.Synapse/workspaces@2021-06-01' = {
  name: 'syn-${environment}-workspace'
  location: location
  identity: { type: 'SystemAssigned' }
  properties: {
    defaultDataLakeStorage: {
      // Use the endpoint reported by the storage account rather than a hard-coded cloud suffix
      accountUrl: storageAccount.properties.primaryEndpoints.dfs
      filesystem: 'synapsefs'
    }
    sqlAdministratorLogin: 'sqladmin'
  }
}

// ADF with Git integration
resource dataFactory 'Microsoft.DataFactory/factories@2018-06-01' = {
  name: 'adf-${environment}'
  location: location
  identity: { type: 'SystemAssigned' }
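  // Attach Git only to the development factory; other environments receive published artifacts
  properties: environment == 'dev' ? {
    repoConfiguration: {
      type: 'FactoryVSTSConfiguration' // required discriminator for Azure DevOps Git
      accountName: 'your-ado-org'
      projectName: 'data-engineering'
      repositoryName: 'adf-repo'
      collaborationBranch: 'main'
      rootFolder: '/'
    }
  } : {}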
}

Azure DevOps Pipeline

# azure-pipelines.yml
trigger:
  branches:
    include:
      - main
      - feature/*

stages:
  - stage: Build
    jobs:
      - job: ValidateTemplates
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - task: AzureCLI@2
            inputs:
              azureSubscription: 'dataengineering-subscription'
              scriptType: 'bash'
              scriptLocation: 'inlineScript'
              inlineScript: |
                az deployment group validate \
                  --resource-group rg-dataengineering-dev \
                  --template-file infra/main.bicep \
                  --parameters environment=dev

  - stage: DeployDev
    dependsOn: Build
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployToDev
        environment: 'dev'
        strategy:
          runOnce:
            deploy:
              steps:
                # Deployment jobs do not check out the repository automatically
                - checkout: self
                - task: AzureCLI@2
                  inputs:
                    azureSubscription: 'dataengineering-subscription'
                    scriptType: 'bash'
                    scriptLocation: 'inlineScript'
                    inlineScript: |
                      az deployment group create \
                        --resource-group rg-dataengineering-dev \
                        --template-file infra/main.bicep \
                        --parameters environment=dev

  - stage: DeployProd
    dependsOn: DeployDev
    condition: succeeded()
    jobs:
      - deployment: DeployToProd
        environment: 'prod'
        strategy:
          runOnce:
            deploy:
              steps:
                # Deployment jobs do not check out the repository automatically
                - checkout: self
                - task: AzureCLI@2
                  inputs:
                    azureSubscription: 'dataengineering-subscription'
                    scriptType: 'bash'
                    scriptLocation: 'inlineScript'
                    inlineScript: |
                      az deployment group create \
                        --resource-group rg-dataengineering-prod \
                        --template-file infra/main.bicep \
                        --parameters environment=prod
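
The architecture diagram places a QA environment between Dev and Production, while the pipeline above promotes straight from Dev to Prod. A minimal sketch of a QA stage that could sit between them follows. The stage name, the `qa` environment, and the `rg-dataengineering-qa` resource group are assumptions for illustration and do not come from the original pipeline. The stage previews the changes with `az deployment group what-if` before applying them.

```yaml
  # Hypothetical QA stage; names below are assumptions, adjust to your project
  - stage: DeployQA
    dependsOn: DeployDev
    condition: succeeded()
    jobs:
      - deployment: DeployToQA
        environment: 'qa'
        strategy:
          runOnce:
            deploy:
              steps:
                # Deployment jobs do not check out the repository automatically
                - checkout: self
                - task: AzureCLI@2
                  inputs:
                    azureSubscription: 'dataengineering-subscription'
                    scriptType: 'bash'
                    scriptLocation: 'inlineScript'
                    inlineScript: |
                      # Preview the resource changes, then apply them
                      az deployment group what-if \
                        --resource-group rg-dataengineering-qa \
                        --template-file infra/main.bicep \
                        --parameters environment=qa
                      az deployment group create \
                        --resource-group rg-dataengineering-qa \
                        --template-file infra/main.bicep \
                        --parameters environment=qa
```

To wire this in, `DeployProd` would depend on `DeployQA` instead of `DeployDev`. Adding an approval check on the `qa` and `prod` environments in Azure DevOps is one way to gate promotion.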

Pro Tip: Use ADF's Git integration for collaborative development. Publish changes through DevOps pipelines to ensure consistent deployments across environments.
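
As a rough illustration of publishing through a pipeline, the sketch below deploys the ARM template that ADF Studio generates on Publish. It assumes the development factory is named `adf-dev`, so the templates land in an `adf-dev/` folder on the `adf_publish` branch, and that this pipeline definition is stored on that branch. The file name, the `subscriptionId` variable, the region, and the target names are assumptions.

```yaml
# adf-release.yml (hypothetical file name, assumed to live on the adf_publish branch)
trigger:
  branches:
    include:
      - adf_publish   # branch ADF Studio writes the generated ARM templates to

pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: AzureResourceManagerTemplateDeployment@3
    displayName: 'Deploy published ADF template to the prod factory'
    inputs:
      deploymentScope: 'Resource Group'
      azureResourceManagerConnection: 'dataengineering-subscription'
      subscriptionId: '$(subscriptionId)'   # assumed pipeline variable
      action: 'Create Or Update Resource Group'
      resourceGroupName: 'rg-dataengineering-prod'
      location: 'eastus'                    # assumed region
      templateLocation: 'Linked artifact'
      csmFile: '$(Build.SourcesDirectory)/adf-dev/ARMTemplateForFactory.json'
      csmParametersFile: '$(Build.SourcesDirectory)/adf-dev/ARMTemplateParametersForFactory.json'
      # factoryName is a parameter of the generated template; point it at the target factory
      overrideParameters: '-factoryName adf-prod'
      deploymentMode: 'Incremental'
```

Incremental mode leaves resources that are not in the template untouched, so pipelines or triggers deleted in development are not removed from the target factory by this step alone.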

Interview Questions

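Q1: How do you handle parameterization across environments in ADF? A: Combine three mechanisms: ADF parameters (global and pipeline-level) for runtime values, linked services that read secrets from Key Vault, and ARM template parameters that are overridden per environment at deployment time. Keep environment-specific secrets in a Key Vault for each environment, so that only a small set of values, such as the factory name and the vault URL, changes between deployments.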
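
One way to express the per-environment overrides in Q1 is a pipeline that loops over a list of environments. This is a sketch under several assumptions: the linked service is hypothetically named `LS_KeyVault`, so the generated template is assumed to expose a parameter called `LS_KeyVault_properties_typeProperties_baseUrl`; the templates are assumed to sit in an `adf-dev/` folder of the checked-out branch; and every resource name, URL, and the `subscriptionId` variable are placeholders.

```yaml
# Hypothetical per-environment values; adjust names and URLs to your subscription
parameters:
  - name: environments
    type: object
    default:
      - name: qa
        resourceGroup: rg-dataengineering-qa
        factoryName: adf-qa
        keyVaultUrl: https://kv-dataeng-qa.vault.azure.net/
      - name: prod
        resourceGroup: rg-dataengineering-prod
        factoryName: adf-prod
        keyVaultUrl: https://kv-dataeng-prod.vault.azure.net/

pool:
  vmImage: 'ubuntu-latest'

stages:
  # Stages run in the listed order: each stage implicitly depends on the previous one
  - ${{ each env in parameters.environments }}:
      - stage: DeployADF_${{ env.name }}
        jobs:
          - deployment: DeployFactory
            environment: ${{ env.name }}
            strategy:
              runOnce:
                deploy:
                  steps:
                    - checkout: self   # deployment jobs do not check out sources by default
                    - task: AzureResourceManagerTemplateDeployment@3
                      inputs:
                        deploymentScope: 'Resource Group'
                        azureResourceManagerConnection: 'dataengineering-subscription'
                        subscriptionId: '$(subscriptionId)'   # assumed pipeline variable
                        action: 'Create Or Update Resource Group'
                        resourceGroupName: ${{ env.resourceGroup }}
                        location: 'eastus'                    # assumed region
                        templateLocation: 'Linked artifact'
                        csmFile: 'adf-dev/ARMTemplateForFactory.json'
                        csmParametersFile: 'adf-dev/ARMTemplateParametersForFactory.json'
                        # Only the environment-specific values are overridden
                        overrideParameters: >-
                          -factoryName ${{ env.factoryName }}
                          -LS_KeyVault_properties_typeProperties_baseUrl ${{ env.keyVaultUrl }}
                        deploymentMode: 'Incremental'
```

Because secrets stay in each environment's vault, the pipeline itself carries no credentials, only the vault address.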

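Q2: What is the difference between ADF collaboration mode and live mode? A: In collaboration (Git) mode, changes are saved to branches of the connected repository, which enables branching, pull requests, and version control. Live mode is the version published to the Data Factory service, and it is the version that triggers actually run. Changes merged into the collaboration branch take effect only after they are published.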

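Q3: How do you implement blue-green deployments for data pipelines? A: Deploy the new pipeline version alongside the existing one and validate it against test data. Then cut over by moving the schedule or event triggers to the new version, or by flipping a feature-flag parameter. ADF has no equivalent of App Service deployment slots, so the triggers are the switch. Monitor the first production runs, and roll back by pointing the triggers at the previous version if needed.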
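
The cutover step in Q3 might look like the sketch below, which stops the trigger for the current (blue) version and starts the one for the new (green) version. The trigger names `trg_daily_load_blue` and `trg_daily_load_green` and the resource names are hypothetical, and the sketch assumes both pipeline versions are already deployed to the same factory. It relies on the `datafactory` extension for the Azure CLI.

```yaml
steps:
  - task: AzureCLI@2
    displayName: 'Cut over triggers from blue to green'
    inputs:
      azureSubscription: 'dataengineering-subscription'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        set -euo pipefail
        # The trigger commands ship in the datafactory CLI extension
        az extension add --name datafactory --upgrade

        # Stop the trigger that drives the current (blue) pipeline version
        az datafactory trigger stop \
          --resource-group rg-dataengineering-prod \
          --factory-name adf-prod \
          --name trg_daily_load_blue

        # Start the trigger that drives the new (green) pipeline version
        az datafactory trigger start \
          --resource-group rg-dataengineering-prod \
          --factory-name adf-prod \
          --name trg_daily_load_green
```

Stopping blue before starting green avoids both versions running on the same schedule. Rolling back is the same two commands with the trigger names swapped.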
