Migration Options
Storage Transfer Service
from google.cloud import storage_transfer_v1

client = storage_transfer_v1.StorageTransferServiceClient()

# Create a daily transfer job from S3 to GCS.
# The job definition must be wrapped in the request's "transfer_job" field.
transfer_job = client.create_transfer_job(
    request={
        "transfer_job": {
            "project_id": "my-project",
            "transfer_spec": {
                "aws_s3_data_source": {
                    "bucket_name": "my-s3-bucket",
                    # Placeholder keys; load real credentials from a secret
                    # manager instead of hardcoding them.
                    "aws_access_key": {
                        "access_key_id": "AKIAIOSFODNN7EXAMPLE",
                        "secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
                    },
                },
                "gcs_data_sink": {
                    "bucket_name": "my-gcs-bucket",
                },
                "transfer_options": {
                    "overwrite_objects_already_existing_in_sink": True,
                    "delete_objects_from_source_after_transfer": False,
                },
            },
            "status": storage_transfer_v1.TransferJob.Status.ENABLED,
            "schedule": {
                "schedule_start_date": {"year": 2025, "month": 1, "day": 15},
                "repeat_interval": {"seconds": 86400},  # Daily
            },
        }
    }
)
print(f"Created transfer job: {transfer_job.name}")
Migration Best Practices
# Migration checklist
migration_checklist = {
    "pre_migration": [
        "Audit source data quality and volume",
        "Plan cutover window and downtime",
        "Set up GCP IAM permissions",
        "Configure networking (VPN/Interconnect)",
        "Test with subset of data first",
    ],
    "during_migration": [
        "Monitor transfer progress",
        "Validate data integrity",
        "Track migration metrics",
        "Document any issues",
    ],
    "post_migration": [
        "Validate row counts and checksums",
        "Run comparison queries",
        "Update applications to use new sources",
        "Decommission old systems",
        "Archive migration logs",
    ],
}
Best Practice: Always test migrations with a subset of data first. Use checksums to validate data integrity. Plan cutover windows during low-traffic periods. Implement rollback procedures. Monitor transfer progress and set up alerts for failures.
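The checksum advice above can be illustrated with a small sketch. It assumes both the source and the migrated copy are reachable as local directory trees (for example, a test subset downloaded for verification); the function names `file_sha256` and `find_mismatches` are hypothetical helpers, not part of any Google Cloud library.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir: Path, target_dir: Path) -> list[str]:
    """List relative paths that are missing from the target or differ in content."""
    mismatches = []
    for source_file in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source_file.relative_to(source_dir)
        target_file = target_dir / relative
        # A file fails validation if it is absent or its digest differs.
        if not target_file.is_file() or file_sha256(source_file) != file_sha256(target_file):
            mismatches.append(relative.as_posix())
    return mismatches
```

An empty result from `find_mismatches` means every source file has a byte-identical counterpart in the target. The sketch does not detect extra files that exist only in the target.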
Common Interview Questions
Q1: When would you use Transfer Appliance vs. Storage Transfer Service?
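Answer: Use Storage Transfer Service when the network can move the data in an acceptable time; as a rough rule of thumb, that covers datasets under about 10 TB on a good connection. Use Transfer Appliance for roughly 50 TB or more, or when bandwidth is limited. For sizes in between, estimate the online transfer time from the available bandwidth and choose accordingly. Transfer Appliance is a physical device that Google ships to your data center; you load your data onto it and ship it back to Google.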
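The size-versus-bandwidth trade-off can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the 80% link utilization and the 7-day cutoff for an acceptable online transfer are assumptions chosen for the example, and the function names are hypothetical.

```python
def estimate_transfer_days(data_tb: float, bandwidth_mbps: float,
                           utilization: float = 0.8) -> float:
    """Estimate days needed to move data_tb terabytes over a bandwidth_mbps link."""
    if data_tb < 0 or bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("expected data_tb >= 0, bandwidth_mbps > 0, 0 < utilization <= 1")
    data_bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bps = bandwidth_mbps * 1e6 * utilization  # usable bits per second
    return data_bits / effective_bps / 86_400           # seconds to days


def suggest_transfer_method(data_tb: float, bandwidth_mbps: float,
                            max_online_days: float = 7.0) -> str:
    """Pick online transfer if it finishes within max_online_days (assumed cutoff)."""
    days = estimate_transfer_days(data_tb, bandwidth_mbps)
    return "Storage Transfer Service" if days <= max_online_days else "Transfer Appliance"


if __name__ == "__main__":
    # 5 TB over 1 Gbps takes well under a day at 80% utilization.
    print(suggest_transfer_method(5, 1000))    # Storage Transfer Service
    # 100 TB over 100 Mbps would take more than 100 days.
    print(suggest_transfer_method(100, 100))   # Transfer Appliance
```

The useful output here is the estimated duration rather than the label: the same 50 TB dataset can be a comfortable online transfer on a 10 Gbps link and impractical on a 100 Mbps one.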
Q2: How do you validate data after migration?
Answer: 1) Compare row counts, 2) Run checksum comparisons, 3) Validate data types and schemas, 4) Sample random records for comparison, 5) Run business logic validation queries, 6) Check for data completeness.
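The first two checks in this answer (row counts and checksums) might look like the following sketch. It uses `sqlite3` connections purely as stand-ins for the source and target databases; `table_fingerprint` and `validate_table` are hypothetical helper names, and hashing `repr(row)` assumes both sides return identical Python types for each column, which may not hold across different database engines.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str, key_column: str):
    """Return (row_count, SHA-256 hex digest) over all rows ordered by key_column."""
    # Identifiers are interpolated into the SQL, so they must come from trusted config.
    cursor = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key_column}"')
    digest = hashlib.sha256()
    row_count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()


def validate_table(source: sqlite3.Connection, target: sqlite3.Connection,
                   table: str, key_column: str) -> dict:
    """Compare row counts and content checksums for one table."""
    src_count, src_hash = table_fingerprint(source, table, key_column)
    tgt_count, tgt_hash = table_fingerprint(target, table, key_column)
    return {
        "row_count_match": src_count == tgt_count,
        "checksum_match": src_hash == tgt_hash,
    }
```

Ordering by a key column makes the checksum independent of physical row order, so two tables with the same rows inserted in a different sequence still match.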
Q3: What is the Database Migration Service?
Answer: Database Migration Service (DMS) is a managed service for migrating databases to Cloud SQL. It supports MySQL, PostgreSQL, and SQL Server, and it minimizes downtime by continuously replicating changes from the source until cutover.
Q4: How do you minimize downtime during migration?
Answer: 1) Use continuous replication (DMS), 2) Perform initial bulk transfer during off-hours, 3) Sync changes incrementally, 4) Switch applications at cutover, 5) Validate before decommissioning old systems.
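The bulk-load-then-incremental-sync pattern in steps 2 and 3 can be sketched with a simple watermark. This is a simplified illustration, not how DMS works internally: it assumes every source row carries a monotonically increasing `updated_at` value, uses an in-memory dict as the target, and does not propagate deletes. The `Row` class and `sync_changes` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Row:
    key: int
    value: str
    updated_at: int  # assumed monotonically increasing change version


def sync_changes(source_rows: Iterable[Row], target: Dict[int, Row],
                 watermark: int) -> int:
    """Copy rows changed after watermark into target; return the new watermark."""
    new_watermark = watermark
    for row in source_rows:
        if row.updated_at > watermark:
            target[row.key] = row  # insert or overwrite by primary key
            new_watermark = max(new_watermark, row.updated_at)
    return new_watermark


if __name__ == "__main__":
    target: Dict[int, Row] = {}
    # Initial bulk transfer: a watermark of 0 copies everything.
    watermark = sync_changes([Row(1, "a", 10), Row(2, "b", 20)], target, 0)
    # Later pass: only rows changed since the last watermark are copied.
    watermark = sync_changes([Row(1, "a", 10), Row(2, "b2", 30)], target, watermark)
    print(watermark, target[2].value)  # 30 b2
```

Each pass moves only the rows changed since the previous one, so the final pass before cutover is small and the window during which applications must pause writes stays short.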
Q5: What are common migration pitfalls?
Answer: 1) Underestimating data volume, 2) Not testing with production data, 3) Ignoring schema compatibility, 4) Poor network planning, 5) Insufficient validation, 6) Not planning for rollback.