Multi-Cloud Strategy: Abstraction, Portability, Cost
Difficulty: Staff Level | Companies: Netflix, Google, Microsoft, HashiCorp, Pulumi
Interview Question
"Design a multi-cloud strategy for a global enterprise using AWS, Azure, and GCP. How do you handle abstraction, portability, cost optimization, and governance?"
ℹ️ Key Concepts
This question tests your understanding of multi-cloud architecture, cloud abstraction, and enterprise governance.
Complete Multi-Cloud Architecture
Architecture Overview
Architecture Diagram
┌───────────────────────────────────────────────────────────────┐
│                   MULTI-CLOUD ARCHITECTURE                    │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│   ┌────────────────── ABSTRACTION LAYER ──────────────────┐   │
│   │                                                       │   │
│   │    ┌─────────────────────────────────────────────┐    │   │
│   │    │           Cloud Abstraction Layer           │    │   │
│   │    │      (Terraform / Pulumi / Crossplane)      │    │   │
│   │    └─────────────────────────────────────────────┘    │   │
│   │                                                       │   │
│   │    ┌─────────────────────────────────────────────┐    │   │
│   │    │          Kubernetes (EKS/AKS/GKE)           │    │   │
│   │    │    (Container Orchestration Abstraction)    │    │   │
│   │    └─────────────────────────────────────────────┘    │   │
│   │                                                       │   │
│   └───────────────────────────┬───────────────────────────┘   │
│                               │                               │
│   ┌─────────────────── CLOUD PROVIDERS ───────────────────┐   │
│   │                                                       │   │
│   │   ┌────────────┐   ┌────────────┐   ┌────────────┐    │   │
│   │   │    AWS     │   │   Azure    │   │    GCP     │    │   │
│   │   │            │   │            │   │            │    │   │
│   │   │  ┌──────┐  │   │  ┌──────┐  │   │  ┌──────┐  │    │   │
│   │   │  │EKS   │  │   │  │AKS   │  │   │  │GKE   │  │    │   │
│   │   │  │RDS   │  │   │  │SQL DB│  │   │  │Cloud │  │    │   │
│   │   │  │S3    │  │   │  │Blob  │  │   │  │SQL   │  │    │   │
│   │   │  └──────┘  │   │  └──────┘  │   │  └──────┘  │    │   │
│   │   └────────────┘   └────────────┘   └────────────┘    │   │
│   │                                                       │   │
│   └───────────────────────────┬───────────────────────────┘   │
│                               │                               │
│   ┌────────────────── GOVERNANCE LAYER ───────────────────┐   │
│   │     Cost Management │ Security │ Compliance │ IAM     │   │
│   └───────────────────────────────────────────────────────┘   │
│                                                               │
└───────────────────────────────────────────────────────────────┘
Mathematical Foundation: Multi-Cloud Metrics
Cloud Distribution:
- Total workloads: W = 100
- AWS workloads: A = 40%
- Azure workloads: Z = 35%
- GCP workloads: G = 25%
Cost Optimization:
- AWS monthly cost: C_aws = $500,000
- Azure monthly cost: C_azure = $450,000
- GCP monthly cost: C_gcp = $400,000
- Total monthly cost: C_total = $1,350,000
- Optimized distribution savings: S = 15% = $202,500/month
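The cost arithmetic above can be checked with a few lines of Python. This is only a sketch: the names `monthly_cost` and `savings` are hypothetical, and the 15% rate is the assumed figure from the list, not a measured result.

```python
# Monthly spend per provider in USD (figures from the list above).
monthly_cost = {"aws": 500_000, "azure": 450_000, "gcp": 400_000}


def savings(costs: dict[str, int], rate_percent: int) -> float:
    """Return the monthly savings for a given percentage reduction in total spend."""
    return sum(costs.values()) * rate_percent / 100


total = sum(monthly_cost.values())
print(total)                      # 1350000
print(savings(monthly_cost, 15))  # 202500.0
```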
Availability Calculation:
- AWS availability: A_aws = 99.99%
- Azure availability: A_azure = 99.99%
- GCP availability: A_gcp = 99.99%
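- Multi-cloud availability: A_multi = 1 - (1 - A_aws) × (1 - A_azure) × (1 - A_gcp)
- A_multi = 1 - (10^-4)^3 = 1 - 10^-12 ≈ 99.9999999999% (a theoretical upper bound: it assumes provider failures are independent and that every workload can fail over to any of the three clouds)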
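The composite availability formula can be evaluated directly. The sketch below assumes statistically independent provider failures and instant failover, which real deployments rarely achieve (shared DNS, shared control planes, and correlated outages all reduce the benefit); the function names are hypothetical.

```python
from math import prod


def composite_availability(availabilities: list[float]) -> float:
    """Availability of a system that is up while at least one provider is up.

    Assumes provider failures are statistically independent.
    """
    return 1.0 - prod(1.0 - a for a in availabilities)


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


single = 0.9999
multi = composite_availability([single, single, single])

print(round(downtime_minutes_per_year(single), 2))  # 52.56
print(1.0 - multi)  # unavailability on the order of 1e-12
```

For one provider at 99.99%, the expected downtime is about 52.56 minutes per year. For three independent providers the computed unavailability is roughly 10^-12, which is best read as an upper bound on what the architecture could deliver.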
Latency Optimization:
- AWS latency: L_aws = 50ms
- Azure latency: L_azure = 45ms
- GCP latency: L_gcp = 55ms
- Best latency: L_best = min(L_aws, L_azure, L_gcp) = 45ms
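Selecting the lowest-latency provider is a simple minimum over the measured values. The sketch below uses the figures from the list; in practice the inputs would be per-region measurements (often a percentile such as p95) rather than a single number per cloud. The names `latency_ms` and `best_provider` are hypothetical.

```python
# Observed latency per provider in milliseconds (figures from the list above).
latency_ms = {"aws": 50, "azure": 45, "gcp": 55}


def best_provider(latencies: dict[str, float]) -> tuple[str, float]:
    """Return the provider with the lowest latency and that latency."""
    provider = min(latencies, key=latencies.get)
    return provider, latencies[provider]


print(best_provider(latency_ms))  # ('azure', 45)
```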
Terraform Multi-Cloud Implementation
# Multi-cloud provider configuration
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

# Input variables referenced by the provider blocks below
variable "azure_subscription_id" { type = string }
variable "gcp_project_id" { type = string }

# Both AWS providers are aliased, so every module that creates AWS
# resources must be passed one of them explicitly.
provider "aws" {
  alias  = "us_east"
  region = "us-east-1"
}

provider "aws" {
  alias  = "eu_west"
  region = "eu-west-1"
}

provider "azurerm" {
  features {}
  subscription_id = var.azure_subscription_id
}

provider "google" {
  project = var.gcp_project_id
  region  = "us-central1"
}

# The aws_vpc, azure_vnet, and gcp_vpc network modules referenced below
# are assumed to be defined elsewhere in this configuration.

# AWS EKS Cluster
module "aws_eks" {
  source = "./modules/eks"
  providers = {
    aws = aws.us_east
  }
  cluster_name    = "multi-cloud-eks"
  cluster_version = "1.27"
  vpc_id          = module.aws_vpc.vpc_id
  subnet_ids      = module.aws_vpc.private_subnet_ids
}

# Azure AKS Cluster
module "azure_aks" {
  source = "./modules/aks"
  providers = {
    azurerm = azurerm
  }
  cluster_name       = "multi-cloud-aks"
  kubernetes_version = "1.27"
  vnet_id            = module.azure_vnet.vnet_id
  subnet_ids         = module.azure_vnet.subnet_ids
}

# GCP GKE Cluster
module "gcp_gke" {
  source = "./modules/gke"
  providers = {
    google = google
  }
  cluster_name       = "multi-cloud-gke"
  kubernetes_version = "1.27"
  network            = module.gcp_vpc.network_id
  subnetwork         = module.gcp_vpc.subnet_id
}

# Multi-cloud networking
module "multi_cloud_network" {
  source           = "./modules/multi-cloud-network"
  aws_vpc_id       = module.aws_vpc.vpc_id
  azure_vnet_id    = module.azure_vnet.vnet_id
  gcp_network_id   = module.gcp_vpc.network_id
  aws_subnet_ids   = module.aws_vpc.private_subnet_ids
  azure_subnet_ids = module.azure_vnet.subnet_ids
  gcp_subnet_id    = module.gcp_vpc.subnet_id
}

# Global load balancer
module "global_load_balancer" {
  source          = "./modules/global-lb"
  aws_endpoints   = module.aws_eks.endpoint
  azure_endpoints = module.azure_aks.endpoint
  gcp_endpoints   = module.gcp_gke.endpoint
}
Kubernetes Multi-Cloud Abstraction
# Multi-cloud Kubernetes deployment
# Kubernetes does not expand the ${...} placeholders itself; render the
# manifest per cloud first (for example with envsubst, Helm, or Kustomize).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: multi-cloud-app
  labels:
    app: multi-cloud-app
    cloud: ${CLOUD_PROVIDER}
spec:
  replicas: 3
  selector:
    matchLabels:
      app: multi-cloud-app
  template:
    metadata:
      labels:
        app: multi-cloud-app
        cloud: ${CLOUD_PROVIDER}
    spec:
      # Cloud-specific node selector (nodes must carry a matching "cloud" label)
      nodeSelector:
        cloud: ${CLOUD_PROVIDER}
      # Cloud-specific tolerations
      tolerations:
        - key: "cloud"
          operator: "Equal"
          value: "${CLOUD_PROVIDER}"
          effect: "NoSchedule"
      containers:
        - name: app
          # Pin a version tag or digest in production instead of :latest
          image: multi-cloud-app:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            # GPUs: on EKS, AKS, and GKE, NVIDIA GPUs are exposed by the
            # NVIDIA device plugin under the same resource name and are
            # requested under limits:
            # limits:
            #   nvidia.com/gpu: 1
          # Cloud-specific environment variables
          env:
            - name: CLOUD_PROVIDER
              value: "${CLOUD_PROVIDER}"
            - name: CLOUD_REGION
              value: "${CLOUD_REGION}"
            - name: CLOUD_ZONE
              value: "${CLOUD_ZONE}"
          # Cloud-specific volume mounts
          volumeMounts:
            - name: cloud-storage
              mountPath: /data
      volumes:
        - name: cloud-storage
          # Cloud-specific volume type. The in-tree volume types below are
          # deprecated in favor of CSI drivers; a PersistentVolumeClaim bound
          # to a cloud-specific StorageClass is the more portable choice.
          # AWS EBS
          # awsElasticBlockStore:
          #   volumeID: vol-12345
          #   fsType: ext4
          # Azure Disk
          # azureDisk:
          #   diskName: my-disk
          #   diskURI: /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Compute/disks/my-disk
          # GCP Persistent Disk
          # gcePersistentDisk:
          #   pdName: my-disk
          #   fsType: ext4
          emptyDir: {}
---
# Cloud-agnostic service
apiVersion: v1
kind: Service
metadata:
  name: multi-cloud-service
  annotations:
    # AWS Network Load Balancer
    # service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    # Azure internal Load Balancer
    # service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    # GCP Load Balancer (network endpoint groups)
    # cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: multi-cloud-app
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
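The manifest above relies on `${CLOUD_PROVIDER}`-style placeholders, which have to be filled in before the file is applied. One way to do that, sketched here with Python's standard `string.Template`, is to render one copy per cloud. The snippet `MANIFEST_SNIPPET` and the helper `render` are hypothetical and only illustrate the substitution step; tools such as envsubst, Helm, or Kustomize serve the same purpose.

```python
from string import Template

# A shortened stand-in for the manifest; the placeholders match the ones above.
MANIFEST_SNIPPET = """\
metadata:
  labels:
    cloud: ${CLOUD_PROVIDER}
env:
  - name: CLOUD_REGION
    value: "${CLOUD_REGION}"
"""


def render(template_text: str, values: dict[str, str]) -> str:
    """Fill ${NAME} placeholders; raises KeyError if a value is missing."""
    return Template(template_text).substitute(values)


rendered = render(
    MANIFEST_SNIPPET,
    {"CLOUD_PROVIDER": "aws", "CLOUD_REGION": "us-east-1"},
)
print(rendered)
```

Using `substitute` rather than `safe_substitute` is deliberate: a missing value fails loudly instead of shipping a manifest that still contains a literal `${...}`.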
Cloud Abstraction Layer
# Cloud abstraction layer
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Any, Dict

import boto3


class CloudProvider(Enum):
    AWS = "aws"
    AZURE = "azure"
    GCP = "gcp"


@dataclass
class CloudConfig:
    provider: CloudProvider
    region: str
    credentials: Dict[str, str]


class CloudProviderInterface(ABC):
    """Abstract interface for cloud providers"""

    @abstractmethod
    def create_vm(self, name: str, size: str, image: str) -> str:
        pass

    @abstractmethod
    def create_database(self, name: str, engine: str, size: str) -> str:
        pass

    @abstractmethod
    def create_storage(self, name: str) -> str:
        pass

    @abstractmethod
    def deploy_container(self, name: str, image: str, replicas: int) -> str:
        pass

    @abstractmethod
    def get_metrics(self, resource_id: str) -> Dict[str, Any]:
        pass


class AWSProvider(CloudProviderInterface):
    """AWS implementation"""

    def __init__(self, config: CloudConfig):
        self.config = config
        self.ec2 = boto3.client('ec2', region_name=config.region)
        self.rds = boto3.client('rds', region_name=config.region)
        self.s3 = boto3.client('s3', region_name=config.region)
        self.ecs = boto3.client('ecs', region_name=config.region)
        self.cloudwatch = boto3.client('cloudwatch', region_name=config.region)

    def create_vm(self, name: str, size: str, image: str) -> str:
        response = self.ec2.run_instances(
            ImageId=image,
            InstanceType=size,
            MinCount=1,
            MaxCount=1,
            TagSpecifications=[{
                'ResourceType': 'instance',
                'Tags': [{'Key': 'Name', 'Value': name}]
            }]
        )
        return response['Instances'][0]['InstanceId']

    def create_database(self, name: str, engine: str, size: str) -> str:
        response = self.rds.create_db_instance(
            DBInstanceIdentifier=name,
            DBInstanceClass=size,
            Engine=engine,
            AllocatedStorage=20,  # GiB; illustrative value, required by most engines
            MasterUsername='admin',
            # Let RDS generate the password and keep it in Secrets Manager
            # rather than hardcoding a credential in source.
            ManageMasterUserPassword=True
        )
        return response['DBInstance']['DBInstanceIdentifier']

    def create_storage(self, name: str) -> str:
        # Outside us-east-1, S3 requires an explicit LocationConstraint.
        if self.config.region == 'us-east-1':
            self.s3.create_bucket(Bucket=name)
        else:
            self.s3.create_bucket(
                Bucket=name,
                CreateBucketConfiguration={'LocationConstraint': self.config.region}
            )
        return name

    def deploy_container(self, name: str, image: str, replicas: int) -> str:
        # ECS expects a registered task definition (family or ARN) here, not a
        # raw image reference; `image` is passed through on that assumption.
        response = self.ecs.create_service(
            cluster='default',
            serviceName=name,
            taskDefinition=image,
            desiredCount=replicas
        )
        return response['service']['serviceName']

    def get_metrics(self, resource_id: str) -> Dict[str, Any]:
        # Average CPU utilization over the last hour, in 5-minute buckets
        now = datetime.now(timezone.utc)
        response = self.cloudwatch.get_metric_statistics(
            Namespace='AWS/EC2',
            MetricName='CPUUtilization',
            Dimensions=[{'Name': 'InstanceId', 'Value': resource_id}],
            StartTime=now - timedelta(hours=1),
            EndTime=now,
            Period=300,
            Statistics=['Average']
        )
        return response
class AzureProvider(CloudProviderInterface):
    """Azure implementation"""

    def __init__(self, config: CloudConfig):
        self.config = config
        # Azure SDK initialization

    def create_vm(self, name: str, size: str, image: str) -> str:
        # Azure VM creation
        return f"azure-vm-{name}"

    def create_database(self, name: str, engine: str, size: str) -> str:
        # Azure SQL Database creation
        return f"azure-db-{name}"

    def create_storage(self, name: str) -> str:
        # Azure Blob Storage creation
        return f"azure-storage-{name}"

    def deploy_container(self, name: str, image: str, replicas: int) -> str:
        # Azure Container Instances
        return f"azure-container-{name}"

    def get_metrics(self, resource_id: str) -> Dict[str, Any]:
        # Azure Monitor metrics
        return {}


class GCPProvider(CloudProviderInterface):
    """GCP implementation"""

    def __init__(self, config: CloudConfig):
        self.config = config
        # GCP SDK initialization

    def create_vm(self, name: str, size: str, image: str) -> str:
        # GCP Compute Engine
        return f"gcp-vm-{name}"

    def create_database(self, name: str, engine: str, size: str) -> str:
        # GCP Cloud SQL
        return f"gcp-db-{name}"

    def create_storage(self, name: str) -> str:
        # GCP Cloud Storage
        return f"gcp-storage-{name}"

    def deploy_container(self, name: str, image: str, replicas: int) -> str:
        # GCP GKE
        return f"gcp-container-{name}"

    def get_metrics(self, resource_id: str) -> Dict[str, Any]:
        # GCP Cloud Monitoring
        return {}
class MultiCloudManager:
    """Multi-cloud management"""

    def __init__(self):
        self.providers: Dict[CloudProvider, CloudProviderInterface] = {}

    def register_provider(self, provider: CloudProvider, interface: CloudProviderInterface):
        """Register cloud provider"""
        self.providers[provider] = interface

    def deploy_to_cloud(self, provider: CloudProvider,
                        resource_type: str, **kwargs) -> str:
        """Deploy resource to specific cloud"""
        interface = self.providers.get(provider)
        if not interface:
            raise ValueError(f"Provider {provider} not registered")
        if resource_type == 'vm':
            return interface.create_vm(**kwargs)
        elif resource_type == 'database':
            return interface.create_database(**kwargs)
        elif resource_type == 'storage':
            return interface.create_storage(**kwargs)
        elif resource_type == 'container':
            return interface.deploy_container(**kwargs)
        else:
            raise ValueError(f"Unknown resource type: {resource_type}")

    def deploy_multi_cloud(self, resource_type: str,
                           distribution: Dict[CloudProvider, int], **kwargs) -> Dict[str, str]:
        """Deploy across multiple clouds"""
        results = {}
        for provider, count in distribution.items():
            for i in range(count):
                resource_id = self.deploy_to_cloud(
                    provider,
                    resource_type,
                    **{**kwargs, 'name': f"{kwargs['name']}-{provider.value}-{i}"}
                )
                results[f"{provider.value}-{i}"] = resource_id
        return results
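One benefit of coding against a provider interface is that the fan-out logic can be exercised without touching any cloud API. The following is a minimal, standalone sketch of that idea: `ContainerDeployer`, `FakeProvider`, and `deploy_everywhere` are hypothetical names that mirror, but do not reuse, the classes above, and the fake simply records calls in memory.

```python
from typing import Dict, List, Protocol, Tuple


class ContainerDeployer(Protocol):
    """The single operation this sketch needs from a provider."""

    def deploy_container(self, name: str, image: str, replicas: int) -> str: ...


class FakeProvider:
    """In-memory stand-in for a cloud SDK; records calls instead of making them."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.calls: List[Tuple[str, str, int]] = []

    def deploy_container(self, name: str, image: str, replicas: int) -> str:
        self.calls.append((name, image, replicas))
        return f"{self.prefix}-container-{name}"


def deploy_everywhere(
    providers: Dict[str, ContainerDeployer],
    name: str,
    image: str,
    distribution: Dict[str, int],
) -> Dict[str, str]:
    """Deploy `count` copies per cloud, naming each <name>-<cloud>-<index>."""
    results: Dict[str, str] = {}
    for cloud, count in distribution.items():
        for i in range(count):
            results[f"{cloud}-{i}"] = providers[cloud].deploy_container(
                name=f"{name}-{cloud}-{i}", image=image, replicas=1
            )
    return results


providers = {"aws": FakeProvider("aws"), "gcp": FakeProvider("gcp")}
ids = deploy_everywhere(providers, "api", "example/api:1.0", {"aws": 2, "gcp": 1})
print(ids)
```

With two AWS copies and one GCP copy requested, the result maps `aws-0`, `aws-1`, and `gcp-0` to the identifiers returned by each fake, and `providers["aws"].calls` shows exactly what would have been sent to the SDK.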
Cost Optimization Across Clouds
# Multi-cloud cost optimization
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, Dict

import boto3


@dataclass
class CloudCostReport:
    provider: str
    service: str
    cost: float
    period: str
    trend: float


class MultiCloudCostOptimizer:
    """Multi-cloud cost optimization"""

    def __init__(self):
        self.providers = {
            'aws': self._get_aws_cost,
            'azure': self._get_azure_cost,
            'gcp': self._get_gcp_cost
        }

    def get_total_cost(self, days: int = 30) -> Dict[str, Any]:
        """Get total cost across all clouds"""
        costs = {}
        total = 0.0
        for provider, getter in self.providers.items():
            cost = getter(days)
            costs[provider] = cost
            total += cost
        return {
            'costs': costs,
            'total': total,
            'period': f'last_{days}_days'
        }

    def _get_aws_cost(self, days: int) -> float:
        """Get AWS cost"""
        ce = boto3.client('ce')
        today = datetime.now(timezone.utc).date()
        end_date = today.strftime('%Y-%m-%d')  # End is exclusive in Cost Explorer
        start_date = (today - timedelta(days=days)).strftime('%Y-%m-%d')
        response = ce.get_cost_and_usage(
            TimePeriod={'Start': start_date, 'End': end_date},
            Granularity='MONTHLY',
            Metrics=['UnblendedCost']
        )
        # A window that crosses a month boundary returns several buckets
        total = sum(
            float(result['Total']['UnblendedCost']['Amount'])
            for result in response['ResultsByTime']
        )
        return total

    def _get_azure_cost(self, days: int) -> float:
        """Get Azure cost"""
        # Placeholder: Azure Cost Management API
        return 0.0

    def _get_gcp_cost(self, days: int) -> float:
        """Get GCP cost"""
        # Placeholder: GCP Billing API
        return 0.0

    def optimize_cloud_distribution(self) -> Dict[str, Any]:
        """Optimize workload distribution across clouds"""
        costs = self.get_total_cost()['costs']
        # Analyze cost efficiency.
        # Assumes the same workload on every cloud; a cost of 0 is treated
        # as "no data" and gets no share.
        efficiencies = {
            provider: 1 / cost if cost > 0 else 0
            for provider, cost in costs.items()
        }
        # Normalize to fractions that sum to 1
        total_efficiency = sum(efficiencies.values())
        if total_efficiency == 0:
            # No cost data at all: keep the current split instead of dividing by zero
            optimal_distribution = self._get_current_distribution()
        else:
            optimal_distribution = {
                provider: efficiency / total_efficiency
                for provider, efficiency in efficiencies.items()
            }
        return {
            'current_distribution': self._get_current_distribution(),
            'optimal_distribution': optimal_distribution,
            'potential_savings': self._calculate_savings(optimal_distribution, costs)
        }

    def _get_current_distribution(self) -> Dict[str, float]:
        """Get current cloud distribution"""
        return {'aws': 0.4, 'azure': 0.35, 'gcp': 0.25}

    def _calculate_savings(self, optimal: Dict[str, float],
                           costs: Dict[str, float]) -> float:
        """Calculate potential savings from already-fetched per-provider costs"""
        current = self._get_current_distribution()
        current_cost = sum(
            costs.get(provider, 0) * share
            for provider, share in current.items()
        )
        optimal_cost = sum(
            costs.get(provider, 0) * share
            for provider, share in optimal.items()
        )
        return current_cost - optimal_cost
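To see what the inverse-cost weighting produces, it helps to run it on the monthly figures from the metrics section without calling any billing API. The sketch below is standalone; `inverse_cost_shares` and `expected_cost` are hypothetical helpers, and the model assumes each provider's figure is the cost of running the same workload there, ignoring egress fees, data gravity, and migration effort.

```python
def inverse_cost_shares(costs: dict[str, float]) -> dict[str, float]:
    """Weight each provider by 1/cost and normalize so the shares sum to 1."""
    weights = {p: 1.0 / c for p, c in costs.items() if c > 0}
    total = sum(weights.values())
    if total == 0:
        return {}  # no usable cost data
    return {p: w / total for p, w in weights.items()}


def expected_cost(costs: dict[str, float], shares: dict[str, float]) -> float:
    """Share-weighted monthly cost under the same-workload assumption."""
    return sum(costs[p] * s for p, s in shares.items())


costs = {"aws": 500_000.0, "azure": 450_000.0, "gcp": 400_000.0}
current = {"aws": 0.40, "azure": 0.35, "gcp": 0.25}

optimal = inverse_cost_shares(costs)
saving = expected_cost(costs, current) - expected_cost(costs, optimal)

for provider, share in optimal.items():
    print(f"{provider}: {share:.1%}")
print(f"estimated monthly saving: ${saving:,.0f}")
```

The heuristic shifts share toward the cheapest provider (GCP in these figures) but never moves everything there, so the estimated saving is modest. It is a starting point for discussion rather than a true optimizer, which would need per-workload pricing and constraints.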
⚠️ Multi-Cloud Considerations
Multi-cloud adds operational complexity: each provider brings its own IAM model, networking, tooling, and billing. Adopt it for specific needs such as compliance, performance, or reducing vendor lock-in. Start with a primary cloud and add others strategically.
Summary
| Strategy | Purpose | Implementation |
|---|---|---|
| Abstraction | Cloud-agnostic | Terraform, Kubernetes |
| Portability | Move workloads | Containers, microservices |
| Cost Optimization | Reduce spend | Multi-cloud analytics |
| Governance | Compliance | Policy-as-code |
| Disaster Recovery | High availability | Multi-region, multi-cloud |
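As an illustration of the policy-as-code row, a governance rule can be expressed as a small function that is run against an inventory of resources from every cloud. This is a sketch only: the tag names in `REQUIRED_TAGS`, the resource dictionary shape, and both helper functions are hypothetical, and production setups typically use a dedicated policy engine instead.

```python
# Tags every resource must carry, regardless of cloud (hypothetical policy).
REQUIRED_TAGS = {"owner", "cost-center", "environment"}


def missing_tags(resource: dict) -> set[str]:
    """Return the required tags that a resource does not have."""
    return REQUIRED_TAGS - set(resource.get("tags", {}))


def violations(resources: list[dict]) -> dict[str, list[str]]:
    """Map each non-compliant resource id to its sorted missing tags."""
    report = {}
    for resource in resources:
        missing = missing_tags(resource)
        if missing:
            report[resource["id"]] = sorted(missing)
    return report


inventory = [
    {"id": "aws:vm-1", "tags": {"owner": "team-a", "cost-center": "42", "environment": "prod"}},
    {"id": "azure:db-7", "tags": {"owner": "team-b"}},
    {"id": "gcp:bucket-3"},
]
print(violations(inventory))
```

Because the rule operates on a normalized inventory rather than a provider SDK, the same check applies unchanged to AWS, Azure, and GCP resources.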