When I tell cloud architects that Fintropy — a platform whose core value proposition is helping companies control their multi-cloud costs — runs across AWS, GCP, and Azure simultaneously, I get one of two reactions.
The first: “That’s expensive and operationally painful.”
The second: “That’s actually the only way you can credibly do this.”
Both are right. Here’s the honest account of why we did it, what each cloud does, and what we’d do differently.
The Architecture
Each environment runs on a completely different cloud provider — deliberately:
Dev runs on GCP — Cloud Run in asia-south1. Scale-to-zero economics make it cheap to keep the environment alive when there's no traffic. Cloud Scheduler and Cloud Tasks handle background jobs. Secret Manager stores credentials.
QA runs on Azure — Container Apps in southindia. QA on Azure means every Azure-native feature (Azure scanner, Azure SLA claim filing, Azure subscription onboarding) gets tested against real Azure infrastructure before it ships to production.
Prod runs on AWS — ECS on Fargate in ap-south-1, RDS PostgreSQL, Route53, ACM. Most of our enterprise customers run significant AWS workloads, so AWS production gives us the closest parity to customer environments.
Why Each Environment Is on a Different Cloud
This wasn’t the result of a single design session. It evolved from three separate decisions:
GCP for Dev: Scale-to-Zero Economics
Dev environments sit idle most of the time. Cloud Run scales to zero — no traffic, no cost. An equivalent GKE or EC2 setup would cost $600–800/month in idle node charges. Cloud Run dev costs ~$30/month.
The engineering tradeoff: Cloud Run’s cold-start behaviour forced us to adopt lazy imports for all cloud SDKs (more on this in a separate post). That convention now applies everywhere in the codebase.
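The convention itself fits in a few lines. This is an illustrative sketch, not Fintropy's actual code: a stdlib module stands in for the heavy cloud SDKs, whose module-level imports are the real cold-start cost.

```python
def open_scan_store(db_path: str):
    """Open a scan-result store, importing the driver lazily."""
    # sqlite3 stands in here for a heavy cloud SDK (boto3, google-cloud-*,
    # azure-*): the import runs on first call, not at process start, so a
    # cold start on a code path that never touches this pays nothing.
    import sqlite3
    return sqlite3.connect(db_path)
```

The same shape applies to every SDK: import inside the function that needs it, and a request that only touches one provider loads only that provider's client.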
Azure for QA: Test What You Ship
We build Azure scanning, Azure SLA filing, and Azure subscription onboarding. If QA doesn’t run on Azure, we’re testing those features against mocks.
Container Apps mirrors the Cloud Run deployment model closely enough that backend code behaves the same way. But the Azure-specific integrations — managed identity, Key Vault references, Azure SQL connectivity — only work when QA actually runs on Azure.
When we built SLA breach auto-filing for Azure, we ran the first end-to-end test against our own QA Azure subscription. We found one bug in the support ticket classification lookup that only reproduced against the real Azure Support API.
AWS for Prod: Where Customers Live
Most enterprise cloud customers have significant AWS workloads. Running production on AWS means:
- Network proximity to customer AWS accounts for scanning
- Real-world testing of the AWS scanner against our own production account
- IAM patterns our customers recognise when reviewing our onboarding flow
- AWS Support access for our own SLA claims (yes, we file them)
The CI/CD Consequence
Three clouds means three deployment pipelines. GitHub Actions, four workflows:
- ci.yml — lint + test gate (runs on every PR)
- deploy-dev.yml — GCP Cloud Run deploy (on merge to main)
- deploy-qa.yml — Azure Container Apps deploy (on release tag)
- deploy-prod.yml — AWS ECS deploy (on release tag, manual approval)
Each workflow handles provider authentication differently:
- GCP: Workload Identity Federation (no long-lived keys)
- Azure: Service Principal with OIDC (no long-lived keys)
- AWS: OIDC role assumption (no long-lived keys)
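As one concrete example, the AWS leg of this pattern looks roughly like the following excerpt (the role ARN, role name, and job layout here are placeholders, not our actual workflow). `id-token: write` lets the job request a GitHub OIDC token, which AWS exchanges for short-lived credentials, so no access keys are stored in GitHub.

```yaml
# Sketch of an OIDC-authenticated deploy job (placeholder values).
permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/deploy-role  # placeholder
          aws-region: ap-south-1
```

The GCP and Azure workflows follow the same shape with their respective auth actions; in all three, the trust relationship is configured on the cloud side and scoped to the repository.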
Every engineer learns all three provider deployment patterns. Onboarding takes longer. Debugging an incident requires knowing which layer failed and how to query that provider’s logs. That’s the real cost.
What You Actually Learn
Running your own infrastructure on all three clouds teaches you things no amount of customer conversations will:
GCP’s Cloud Run cold-start behaviour bit us in development. We fixed it permanently. Every customer whose GCP scanner we build benefits from that fix.
Azure Container Apps networking has specific limitations around egress IPs and VNet integration that only showed up when we tried to connect QA to an Azure SQL instance with a firewall rule. We documented the workaround; it’s now in our Azure onboarding guide.
AWS ECS task definition versioning has a subtle behaviour with environment variable precedence that caused a 40-minute production incident in our second month. We now have a specific test for it in the deployment preflight checklist.
None of these would have surfaced if we ran everything on a single cloud and simulated the others.
The Cross-Scanning Topology: The Product Tests Itself
Here’s the detail that makes the tri-cloud setup more than just a deployment curiosity.
Each Fintropy environment scans the other two:
- GCP Dev scans Azure QA and AWS Prod
- Azure QA scans GCP Dev and AWS Prod
- AWS Prod scans GCP Dev and Azure QA
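The topology above is small enough to express, and sanity-check, in a few lines. The environment names come from this post; the checking code is illustrative, not from our codebase.

```python
# Every environment scans the other two, so each provider's scanner is
# exercised continuously from two foreign environments.
ENVIRONMENTS = {"gcp-dev", "azure-qa", "aws-prod"}

# env -> set of foreign environments it scans
SCAN_TARGETS = {env: ENVIRONMENTS - {env} for env in ENVIRONMENTS}

def coverage_ok(targets: dict) -> bool:
    """True if every environment is scanned by every other environment."""
    return all(
        env in targets[other]
        for env in ENVIRONMENTS
        for other in ENVIRONMENTS - {env}
    )
```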
This is not busywork. It’s continuous integration for the product itself.
What it validates on every scan cycle:
- Cross-cloud credential authentication actually works (IAM roles, service principals, service account keys)
- The cost scanner correctly ingests billing data from a foreign cloud
- SLA monitoring correctly detects and tracks uptime for services on other providers
- The evidence packaging and scan result storage work end-to-end
If our AWS scanner has a regression, GCP Dev and Azure QA will both start reporting scan failures for their AWS targets within the hour. No customer needs to tell us.
It also makes the confidence claim credible. When we tell a prospect “our Azure scanner works reliably,” we can point to the fact that our own GCP and AWS environments have been continuously scanning real Azure infrastructure in production for months. That’s not a demo environment — it’s our QA environment running real workloads.
And it catches cross-cloud blind spots. When we built the FOCUS 1.2 billing normalisation, we tested it against GCP billing data ingested by the Azure scanner and AWS billing data ingested by the GCP scanner. Subtle field mapping issues that would have looked correct in isolation surfaced immediately when viewed from a foreign environment’s perspective.
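To illustrate the kind of check involved: the same validation runs on a normalised row regardless of which scanner produced it, so a field a provider-specific mapping silently dropped shows up immediately. The column names below are a small subset of the FOCUS column set; the rows and the helper are made up for illustration.

```python
# A subset of FOCUS columns every normalised billing row should carry.
REQUIRED_FOCUS_COLUMNS = {"BilledCost", "BillingCurrency", "ProviderName",
                          "ChargePeriodStart", "ChargePeriodEnd"}

def validate_focus_row(row: dict) -> list[str]:
    """Return the required FOCUS columns missing from a normalised row."""
    return sorted(REQUIRED_FOCUS_COLUMNS - row.keys())

# Rows as two different scanners might emit them (values invented).
aws_row = {"BilledCost": 12.40, "BillingCurrency": "USD",
           "ProviderName": "AWS", "ChargePeriodStart": "2025-01-01",
           "ChargePeriodEnd": "2025-01-02"}
gcp_row = {"BilledCost": 3.10, "BillingCurrency": "USD",
           "ProviderName": "Google Cloud", "ChargePeriodStart": "2025-01-01"}
```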
This is the part of the architecture that we’d replicate on any future product: design your environments so the product monitors itself across environments. The cost is a few extra scan jobs. The benefit is a continuous, unignorable signal that your core engine works.
The Real Cost
Dev + QA + Prod across three clouds runs approximately:
- Dev (GCP): ~$60/month
- QA (Azure): ~$120/month
- Prod (AWS): scales with customer load
The operational overhead — three sets of secrets, three IAM configurations, three monitoring dashboards, three CLI toolchains — is real. We’ve accepted it because the business case (credibility, real-world testing, dogfooding) is sound.
But we’d tell any early-stage team: don’t start here. Start on one cloud, run it well, and expand to additional clouds when you have a specific reason to — not because multi-cloud sounds sophisticated.
We expanded to three clouds because we built a product that specifically needs to understand all three. That’s a reason. “Avoiding vendor lock-in” alone is not.
Fintropy is a multi-cloud FinOps platform in private beta — Dev on GCP, QA on Azure, Prod on AWS. Learn more at nuvikatech.com