Compliance: HIPAA, GDPR, SOC2 & FedRAMP
Implementing HIPAA, GDPR, SOC 2, and FedRAMP regulatory compliance in Azure data engineering
Compliance Framework
COMPLIANCE FRAMEWORK

+----------------------+--------------------+------------------+
| REGULATION           | REQUIREMENTS       | AZURE SERVICES   |
+----------------------+--------------------+------------------+
| HIPAA (Healthcare)   | PHI Protection     | Azure Key Vault  |
|                      | Audit Logs         | Purview          |
|                      | BAAs               | Monitor          |
+----------------------+--------------------+------------------+
| GDPR (EU Privacy)    | Data Protection    | Azure Purview    |
|                      | Right to Erasure   | Key Vault        |
|                      | Consent            | RBAC             |
+----------------------+--------------------+------------------+
| SOC 2 (Trust)        | Security           | Azure Monitor    |
|                      | Availability       | Defender         |
|                      | Processing         | Sentinel         |
|                      | Confidentiality    |                  |
+----------------------+--------------------+------------------+
| FedRAMP (Gov Cloud)  | FedRAMP Moderate/  | Azure Government |
|                      | High Baseline      | Cloud            |
+----------------------+--------------------+------------------+
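The same mapping can be kept as data so that a pipeline or review script can look up which services back a given regulation. The following is a minimal sketch: the names `COMPLIANCE_FRAMEWORK` and `regulations_using` are illustrative, and the entries are transcribed from the diagram above, not taken from any official Microsoft mapping.

```python
# Regulation -> requirements and supporting Azure services,
# transcribed from the diagram (illustrative, not an official mapping).
COMPLIANCE_FRAMEWORK = {
    "HIPAA": {
        "requirements": ["PHI Protection", "Audit Logs", "BAAs"],
        "services": ["Key Vault", "Purview", "Monitor"],
    },
    "GDPR": {
        "requirements": ["Data Protection", "Right to Erasure", "Consent"],
        "services": ["Purview", "Key Vault", "RBAC"],
    },
    "SOC 2": {
        "requirements": ["Security", "Availability", "Processing", "Confidentiality"],
        "services": ["Monitor", "Defender", "Sentinel"],
    },
    "FedRAMP": {
        "requirements": ["FedRAMP Moderate/High Baseline"],
        "services": ["Azure Government Cloud"],
    },
}


def regulations_using(service):
    """Return the regulations whose row in the diagram lists this service."""
    return sorted(
        name for name, row in COMPLIANCE_FRAMEWORK.items()
        if service in row["services"]
    )


print(regulations_using("Key Vault"))  # ['GDPR', 'HIPAA']
print(regulations_using("Monitor"))    # ['HIPAA', 'SOC 2']
```

A reverse lookup like this makes shared dependencies visible: a change to Key Vault configuration touches both the HIPAA and GDPR rows.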
GDPR Implementation
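# Data subject access request (DSAR)
# Class and method names follow the azure-purview-datamap package;
# verify them against the SDK version you have installed.
from azure.identity import DefaultAzureCredential
from azure.purview.datamap import DataMapClient

credential = DefaultAzureCredential()
client = DataMapClient(
    endpoint="https://purview-prod.purview.azure.com",
    credential=credential,
)


def search_customer_assets(customer_id, search_filter=None):
    """Return the Purview assets whose catalog metadata matches customer_id."""
    body = {"keywords": customer_id}
    if search_filter is not None:
        body["filter"] = search_filter
    result = client.discovery.query(body=body)
    return list(result.value or [])


# Find all data for a customer (right to access)
def process_dsar(customer_id):
    # Search Purview for all data lake assets containing customer data
    assets = search_customer_assets(
        customer_id,
        search_filter={"and": [{"entityType": "azure_datalake_gen2_path"}]},
    )
    for asset in assets:
        # Export data for this customer (application-specific helper, defined elsewhere)
        export_customer_data(asset, customer_id)
    # Generate report (application-specific helper, defined elsewhere)
    generate_dsar_report(customer_id, assets)


# Right to erasure
def process_erasure(customer_id):
    # Find and delete/anonymize customer data
    for asset in search_customer_assets(customer_id):
        # Anonymize or delete customer records (application-specific helper, defined elsewhere)
        anonymize_customer_data(asset, customer_id)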
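The erasure routine calls `anonymize_customer_data`, which the source does not define. One possible shape for its record-level step is sketched below, assuming records are plain dictionaries. The field names (`customer_id`, `name`, `email`), the `PII_FIELDS` tuple, and `pseudonymize_record` are hypothetical. Note that a keyed hash (HMAC-SHA256) gives pseudonymization, not full anonymization, so whether it satisfies an erasure request is a question for legal review.

```python
import hashlib
import hmac

# Hypothetical direct identifiers to blank out; adjust to the real schema.
PII_FIELDS = ("name", "email")


def pseudonymize_record(record, secret_key, id_field="customer_id"):
    """Return a copy with the ID replaced by a keyed hash and PII fields cleared."""
    cleaned = dict(record)  # leave the caller's record untouched
    digest = hmac.new(
        secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned[id_field] = digest
    for field in PII_FIELDS:
        if field in cleaned:
            cleaned[field] = None
    return cleaned


record = {"customer_id": "C-1001", "name": "Ada", "email": "ada@example.com", "total": 42.5}
safe = pseudonymize_record(record, secret_key=b"demo-key-not-for-production")
print(safe["name"], safe["email"], safe["total"])  # None None 42.5
print(len(safe["customer_id"]))                     # 64
```

In a real pipeline the key would be fetched from Key Vault and never hard-coded; destroying that key later makes the hashed IDs unlinkable.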
HIPAA Compliance Checklist
{
  "hipaa_controls": {
    "access_control": {
      "azure_ad_mfa": true,
      "rbac_least_privilege": true,
      "conditional_access": true,
      "privileged_identity_management": true
    },
    "encryption": {
      "at_rest": "AES-256 (CMK in Key Vault)",
      "in_transit": "TLS 1.2",
      "key_management": "Azure Key Vault HSM"
    },
    "audit_logging": {
      "azure_monitor": true,
      "diagnostic_settings": true,
      "log_analytics_workspace": true,
      "log_retention_days": 2555
    },
    "data_protection": {
      "backup_enabled": true,
      "geo_redundancy": true,
      "soft_delete": true,
      "purview_classification": true
    }
  }
}
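A checklist in this form can be checked mechanically. The sketch below flags any control that is switched off and any log retention below a threshold. The trimmed JSON, the `MIN_RETENTION_DAYS` value, and the `find_gaps` helper are assumptions for illustration; `soft_delete` is deliberately set to `false` here so the check has something to report.

```python
import json

# Trimmed copy of the checklist; in practice it would be loaded from a file.
CHECKLIST_JSON = """
{
  "hipaa_controls": {
    "access_control": {"azure_ad_mfa": true, "rbac_least_privilege": true},
    "audit_logging": {"azure_monitor": true, "log_retention_days": 2555},
    "data_protection": {"backup_enabled": true, "soft_delete": false}
  }
}
"""

MIN_RETENTION_DAYS = 2555  # assumed policy threshold (about seven years)


def find_gaps(checklist):
    """List controls that are switched off or below the retention threshold."""
    gaps = []
    for group, controls in checklist["hipaa_controls"].items():
        for name, value in controls.items():
            if value is False:
                gaps.append(f"{group}.{name} is disabled")
            elif name == "log_retention_days" and value < MIN_RETENTION_DAYS:
                gaps.append(f"{group}.{name} is {value}, below {MIN_RETENTION_DAYS}")
    return gaps


print(find_gaps(json.loads(CHECKLIST_JSON)))
# ['data_protection.soft_delete is disabled']
```

Running a check like this in CI keeps the declared checklist from drifting silently; it does not verify that the Azure resources themselves match the declaration.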
Compliance Monitoring
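// HIPAA audit log query: successful secret reads from Key Vault
// Column names depend on the diagnostic schema in your workspace; verify before use.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT"
| where OperationName == "SecretGet"
| where ResultType == "Success"
| project TimeGenerated,
          CallerObjectId = identity_claim_http_schemas_microsoft_com_identity_claims_objectidentifier_g,
          CallerIPAddress,
          SecretUri = id_s,
          ResultType
| order by TimeGenerated desc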
// GDPR data access tracking: storage reads per caller per hour
// Blob storage resource logs land in StorageBlobLogs rather than AzureDiagnostics.
StorageBlobLogs
| where Category == "StorageRead"
| summarize AccessCount = count() by CallerIpAddress, bin(TimeGenerated, 1h)
| render timechart
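The `summarize AccessCount = count() by CallerIpAddress, bin(TimeGenerated, 1h)` step groups reads into one bucket per caller per hour. For readers less familiar with KQL, here is a plain-Python sketch of the same grouping on a few made-up log rows; `hourly_access_counts` and the sample data are illustrative only.

```python
from collections import Counter
from datetime import datetime

# Made-up access log rows: (timestamp, caller IP).
rows = [
    ("2024-05-01T09:05:00", "10.0.0.4"),
    ("2024-05-01T09:40:00", "10.0.0.4"),
    ("2024-05-01T09:59:59", "10.0.0.7"),
    ("2024-05-01T10:01:00", "10.0.0.4"),
]


def hourly_access_counts(rows):
    """Count rows per (caller IP, hour bucket), like count() by ..., bin(..., 1h)."""
    counts = Counter()
    for timestamp, caller_ip in rows:
        # Truncate the timestamp to the start of its hour, as bin(..., 1h) does.
        hour = datetime.fromisoformat(timestamp).replace(minute=0, second=0, microsecond=0)
        counts[(caller_ip, hour.isoformat())] += 1
    return dict(counts)


for key, count in sorted(hourly_access_counts(rows).items()):
    print(key, count)
# ('10.0.0.4', '2024-05-01T09:00:00') 2
# ('10.0.0.4', '2024-05-01T10:00:00') 1
# ('10.0.0.7', '2024-05-01T09:00:00') 1
```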
⚠️ Compliance Critical: Ensure all PHI/PII data is encrypted at rest with Customer-Managed Keys (CMK) stored in Key Vault. Implement audit logging for all data access operations.
Interview Questions
Q1: How do you implement GDPR compliance for a data lake?
A:
1) Classify PII with Purview
2) Implement data subject access requests (DSAR)
3) Enable right to erasure (anonymization/deletion)
4) Track consent
5) Implement data retention policies
6) Maintain audit logs
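Of the steps in this answer, data retention is one of the easiest to automate. The sketch below selects records that have outlived the retention period for their category. The categories, periods, record layout, and the `expired_records` helper are all hypothetical, and the periods are placeholders, not legal guidance.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category, in days.
RETENTION_DAYS = {"marketing": 365, "billing": 2555}


def expired_records(records, today):
    """Return IDs of records older than the retention period for their category."""
    expired = []
    for rec in records:
        limit = timedelta(days=RETENTION_DAYS[rec["category"]])
        if today - rec["created"] > limit:
            expired.append(rec["id"])
    return expired


records = [
    {"id": 1, "category": "marketing", "created": date(2022, 1, 10)},
    {"id": 2, "category": "marketing", "created": date(2024, 3, 1)},
    {"id": 3, "category": "billing", "created": date(2022, 1, 10)},
]
print(expired_records(records, today=date(2024, 6, 1)))  # [1]
```

The returned IDs would then feed the same anonymize-or-delete path used for erasure requests, so both flows share one audited code path.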
Q2: What is the difference between data encryption at rest and in transit?
A: At rest: data is encrypted while it is stored, typically with AES-256, using keys managed in Key Vault. In transit: data is encrypted while it moves across the network, using TLS 1.2 or later. Most regulations require both.
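On the client side, the in-transit requirement can be enforced in code as well as in service configuration. A minimal sketch using Python's standard `ssl` module follows; the helper name `make_tls_context` is illustrative.

```python
import ssl


def make_tls_context():
    """Build a client-side SSL context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # certificate and hostname checks stay on
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
print(context.minimum_version.name)  # TLSv1_2
print(context.check_hostname)        # True
```

The context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=context)`, so that a connection to a server offering only older protocol versions fails instead of silently downgrading.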
Q3: How do you audit data access for compliance?
A: Enable diagnostic settings on every service, send the logs to a Log Analytics workspace, write KQL queries for access patterns, set up alerts for suspicious activity, and keep the logs for the required retention period. HIPAA requires compliance documentation to be retained for six years; the checklist above sets 2555 days, about seven years.