Azure DevOps: Pipelines, Repos & Artifacts for Data
CI/CD automation with Azure DevOps for data engineering workloads
DevOps for Data Engineering
┌─────────────────────────────────────────────────────────────────────┐
│                     DEVOPS FOR DATA ENGINEERING                     │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│    REPOS                  BUILD                RELEASE              │
│ ┌──────────┐         ┌──────────────┐      ┌──────────────┐         │
│ │ ADF Git  │────────>│ Azure DevOps │─────>│ Dev          │         │
│ │ ARM/Bicep│         │ Build        │      │ Environment  │         │
│ │ Python   │         │ Pipeline     │      │              │         │
│ │ SQL      │         │              │      └──────┬───────┘         │
│ └──────────┘         │ • Validate   │             │                 │
│                      │ • Package    │             ▼                 │
│                      │ • Test       │      ┌──────────────┐         │
│                      └──────────────┘      │ QA           │         │
│                                            │ Environment  │         │
│ ┌──────────────┐                           └──────┬───────┘         │
│ │ Artifacts    │                                  │                 │
│ │ (Packages)   │──────────────────────────────────┘                 │
│ │              │                                                    │
│ │ • Python     │                           ┌──────────────┐         │
│ │ • NuGet      │──────────────────────────>│ Production   │         │
│ │ • Maven      │                           │ Environment  │         │
│ └──────────────┘                           └──────────────┘         │
│                                                                     │
│ TESTS                                                               │
│ ┌─────────────────────────────────────────────────────────────┐     │
│ │ • Unit Tests:         Validate transformation logic         │     │
│ │ • Integration Tests:  Test pipeline connectivity            │     │
│ │ • Data Quality Tests: Validate output data                  │     │
│ │ • Performance Tests:  Benchmark query performance           │     │
│ └─────────────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────────────┘
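The unit and data quality test categories in the diagram can be made concrete with a small pytest-style sketch. The transformation `normalize_order`, the check `check_quality`, and the order fields are hypothetical names chosen for illustration. A file like this placed under `tests/` would be collected by a `pytest tests/` step.

```python
from datetime import date


def normalize_order(row: dict) -> dict:
    """Hypothetical transformation: cast types and standardize the country code."""
    return {
        "order_id": int(row["order_id"]),
        "country": row["country"].strip().upper(),
        "amount": float(row["amount"]),
        "order_date": date.fromisoformat(row["order_date"]),
    }


def check_quality(rows: list) -> list:
    """Hypothetical data quality check: return a list of violations (empty if clean)."""
    problems = []
    ids = [r["order_id"] for r in rows]
    if len(ids) != len(set(ids)):
        problems.append("duplicate order_id")
    if any(r["amount"] < 0 for r in rows):
        problems.append("negative amount")
    return problems


def test_normalize_order_casts_and_cleans():
    # Unit test: validates transformation logic on a single raw record.
    raw = {"order_id": "7", "country": " de ", "amount": "19.5", "order_date": "2024-01-31"}
    assert normalize_order(raw) == {
        "order_id": 7,
        "country": "DE",
        "amount": 19.5,
        "order_date": date(2024, 1, 31),
    }


def test_check_quality_reports_violations():
    # Data quality test: validates properties of the output rows.
    clean = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 0.0}]
    dirty = [{"order_id": 1, "amount": 10.0}, {"order_id": 1, "amount": -5.0}]
    assert check_quality(clean) == []
    assert check_quality(dirty) == ["duplicate order_id", "negative amount"]
```

Keeping the transformation a pure function of its input row is what makes it testable without a live data source; integration and performance tests would need real connections and are not sketched here.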
YAML Pipeline Example
# azure-pipelines.yml
trigger:
  branches:
    include:
      - main
      - feature/*

variables:
  - group: dataengineering-variables

stages:
  - stage: Build
    jobs:
      - job: ValidateAndTest
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '3.9'
          - script: |
              pip install pytest
              pytest tests/ --junitxml=test-results.xml
            displayName: 'Run Unit Tests'
          - task: PublishTestResults@2
            # Publish results even when the test step fails
            condition: succeededOrFailed()
            inputs:
              testResultsFiles: 'test-results.xml'

  - stage: DeployDev
    dependsOn: Build
    # Deploy only builds of main; feature branches stop after Build
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployToDev
        environment: 'dev'
        strategy:
          runOnce:
            deploy:
              steps:
                # Deployment jobs do not check out the repo by default
                - checkout: self
                - task: AzureCLI@2
                  inputs:
                    azureSubscription: 'dataengineering-dev'
                    scriptType: 'bash'
                    scriptLocation: 'inlineScript'
                    inlineScript: |
                      az deployment group create \
                        --resource-group rg-dataengineering-dev \
                        --template-file infra/main.bicep \
                        --parameters environment=dev

  - stage: DeployProd
    dependsOn: DeployDev
    condition: succeeded()
    jobs:
      - deployment: DeployToProd
        environment: 'prod'
        strategy:
          runOnce:
            deploy:
              steps:
                # Deployment jobs do not check out the repo by default
                - checkout: self
                - task: AzureCLI@2
                  inputs:
                    azureSubscription: 'dataengineering-prod'
                    scriptType: 'bash'
                    scriptLocation: 'inlineScript'
                    inlineScript: |
                      az deployment group create \
                        --resource-group rg-dataengineering-prod \
                        --template-file infra/main.bicep \
                        --parameters environment=prod
Branch Strategy
┌─────────────────────────────────────────────────────────────────┐
│                       GIT BRANCH STRATEGY                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ main (Production)                                               │
│ ┌─────────────────────────────────────────────────────────┐     │
│ │ Stable, production-ready code                           │     │
│ │ Protected branch: PR required, builds must pass         │     │
│ └─────────────────────────────────────────────────────────┘     │
│                               ▲                                 │
│                               │ PR                              │
│ develop (Integration)         │                                 │
│ ┌─────────────────────────────────────────────────────────┐     │
│ │ Integration branch for feature merges                   │     │
│ │ Auto-deploy to Dev environment                          │     │
│ └─────────────────────────────────────────────────────────┘     │
│                               ▲                                 │
│                               │ PR                              │
│ feature/* (Features)          │                                 │
│ ┌─────────────────────────────────────────────────────────┐     │
│ │ Individual feature branches                             │     │
│ │ Developers work here                                    │     │
│ └─────────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────────┘
Pro Tip: Use branch policies to require PR reviews and successful builds before merging to main. This catches problems at review time, before they can reach a deployment.
Interview Questions
Q1: How do you implement CI/CD for ADF with Azure DevOps? A: 1) Enable ADF Git integration so factory resources are version-controlled, 2) create a build pipeline that validates them, 3) define infrastructure with ARM/Bicep templates, 4) create a release pipeline that promotes the same build through Dev, QA, and Production, 5) add approval gates before production.
Q2: What is the difference between build and release pipelines? A: Build pipelines compile, test, and package code into artifacts. Release pipelines deploy those artifacts to environments, with approval gates and deployment strategies. In a multi-stage YAML pipeline, both are expressed as stages in a single file.
Q3: How do you handle rollback in Azure DevOps? A: Keep previous version artifacts so a known-good version can be redeployed. Use deployment slots for blue-green deployments where the service supports them (for example, Azure App Service). Implement automated rollback triggers on failure. Use infrastructure as code so environments can be recreated.
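The "keep previous version artifacts" part of the rollback answer can be sketched as a lookup over deployment history. This is a minimal illustration, not an Azure DevOps API: the `Deployment` record and the `rollback_target` function are hypothetical, and the history is assumed to be an in-memory list ordered oldest to newest.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record of one deployment attempt."""
    version: str
    environment: str
    succeeded: bool


def rollback_target(history: List[Deployment], environment: str) -> Optional[str]:
    """Return the version to redeploy when the latest deployment must be rolled back.

    Assumes `history` is ordered oldest to newest. The latest deployment to the
    environment is the one being rolled back, so the search starts just before it.
    """
    env_history = [d for d in history if d.environment == environment]
    for deployment in reversed(env_history[:-1]):
        if deployment.succeeded:
            return deployment.version
    return None  # no earlier successful deployment to fall back to


history = [
    Deployment("1.0.0", "prod", True),
    Deployment("1.1.0", "prod", False),
    Deployment("1.2.0", "prod", True),
    Deployment("1.3.0", "prod", False),  # latest prod deployment, to be rolled back
    Deployment("2.0.0", "dev", True),
]
target = rollback_target(history, "prod")
```

Here `target` is the last version that deployed successfully to `prod` before the failing one. Returning `None` rather than raising leaves the decision to the caller, for example recreating the environment from infrastructure as code when there is nothing to roll back to.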