# The 9 Commits That Completed Our Multi-Cloud SLA Claim Filing Module

A technical founder's honest account of shipping a complex multi-provider feature: what we designed, what we got wrong in code review, and what we'd do differently.

May 3, 2026 · 5 min · Amit Jethva

Software shipping stories usually skip the interesting parts. You see the before and after, not the false starts, the design decisions that required three conversations to resolve, or the code review that caught a bug that would have silently written wrong data to production.

Here’s the real story of shipping Fintropy’s SLA claim filing module: 9 commits, 3 providers, one Cloud Scheduler job, and the bugs that nearly shipped.

---

## What We Built

The claim filing module auto-files SLA credit claims with AWS, Azure, and GCP when a breach is detected. The full flow:

1. Breach resolves → auto-file check runs

2. Tenant settings checked (`claim_filing_enabled`, `auto_file_claims`)

3. Provider connector built with customer credentials

4. Claim filed via provider Support API

5. Cloud Scheduler polls daily for approval/denial

Nothing in that list sounds complicated. The complexity is in the details.

---

## The 9 Commits

### Commit 1: `ConnectorFactory`

**`feat(claim-filing): add ConnectorFactory for all three providers`**

Before this, connector construction was scattered across two files with different credential-handling logic for each provider. The factory unified it:

- AWS: reads credentials from `subscription.auth_metadata`

- Azure: reads `tenant_id` from `auth_metadata` and the service principal from env vars

- GCP: reads `service_account_key` JSON from `auth_metadata`

**Bug caught in code review:** The initial implementation called `strategy.authenticate(credentials)` but ignored the return value. If authentication failed (wrong credentials, SDK unavailable), we’d return a broken strategy object with `None` clients instead of raising `ConnectorBuildError`. Fixed before merge.

---

### Commit 2: `TenantClaimSettings`

**`feat(claim-filing): add TenantClaimSettings helper`**

A thin dataclass that reads `claim_filing_enabled`, `auto_file_claims`, and `claim_filing_providers` from `Tenant.settings` JSONB with safe defaults. 20 lines of code, 6 tests. Nothing surprising here — but critical to have as a well-tested primitive before building on top of it.
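A helper along these lines might look like the sketch below. The three field names come from the settings keys above; the `from_settings` constructor, the "everything off" defaults, and the lower-casing of provider names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class TenantClaimSettings:
    # Assumed safe defaults: nothing is filed unless explicitly enabled.
    claim_filing_enabled: bool = False
    auto_file_claims: bool = False
    claim_filing_providers: Tuple[str, ...] = ()

    @classmethod
    def from_settings(cls, settings: Optional[Any]) -> "TenantClaimSettings":
        # Tolerate a missing or malformed JSONB value.
        raw = settings if isinstance(settings, dict) else {}
        providers = raw.get("claim_filing_providers")
        if not isinstance(providers, (list, tuple)):
            providers = ()
        return cls(
            # Only a literal True enables a flag; "yes", 1, etc. do not.
            claim_filing_enabled=raw.get("claim_filing_enabled") is True,
            auto_file_claims=raw.get("auto_file_claims") is True,
            claim_filing_providers=tuple(str(p).lower() for p in providers),
        )
```

For example, `TenantClaimSettings.from_settings(None)` yields the all-defaults instance, so callers never have to guard against missing keys.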

---

### Commit 3: AWS service code expansion

**`feat(claim-filing): expand AWS service code map to 27 services`**

The AWS Support API requires specific service codes when creating cases. Our existing map had 8 entries. We expanded to 27: DynamoDB, ElastiCache, Redshift, SageMaker, API Gateway, SNS, SQS, Kinesis, Glue, Athena, EMR, OpenSearch, Batch, Step Functions, and aliases for common naming variants.

**Bug noted but deferred:** The map exists in two files (`providers/aws.py` and `claim_filing.py`), a DRY violation. We noted it, decided the fix was lower priority than shipping, and left a comment.

---

### Commit 4: GCP `get_claim_status()`

**`feat(claim-filing): implement GCP get_claim_status via Cloud Support API v2`**

Replaced the stub that returned `{"status": "unknown"}` with a real implementation using `google.cloud.support_v2.CaseServiceClient.get_case()`.

GCP state mapping:

- `SOLUTION_PROVIDED` → `approved`

- `CLOSED` → `denied`

- anything else → `pending`
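The mapping above can be sketched as a small pure function. It takes the state name as a string; how that name is read off the case object returned by `get_case()` is left out here, and `normalise_gcp_state` is a hypothetical name.

```python
# State names from the list above; every other state is treated as pending.
_GCP_STATE_TO_STATUS = {
    "SOLUTION_PROVIDED": "approved",
    "CLOSED": "denied",
}


def normalise_gcp_state(state_name: str) -> str:
    """Map a GCP support case state name to the module's status vocabulary."""
    return _GCP_STATE_TO_STATUS.get(state_name, "pending")
```

Keeping the mapping free of SDK types makes it testable without the Support SDK installed.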

**Bug caught in code review:** The credentials fallback path (when `_sa_credentials` is None, fall back to parsing `service_account_key` from `self.credentials`) had no test. The code was correct; the test was missing. Added before merge.

---

### Commit 5: Auto-file hook

**`feat(claim-filing): add auto-file hook to BreachLifecycleService.resolve_breach`**

The most complex commit. Added `_maybe_auto_file()` to `BreachLifecycleService` — called at the end of every breach resolution, checks tenant settings, builds connector, files claim.

**Design decision recorded here:** The hook uses lazy imports for all `sla_monitoring.*` modules, which prevents circular imports between `app.services` and `sla_monitoring`. Because the imports run inside the function, the hook's own module has no module-level name to patch, so all test patches target the source module (`sla_monitoring.connector_factory.build`) rather than a local name.
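To illustrate why patching the source module works with lazy imports, here is a self-contained sketch. The real `sla_monitoring` package is replaced by stub modules registered in `sys.modules` so the example runs on its own, and `maybe_auto_file` is a simplified stand-in for the hook, not its actual body.

```python
import sys
import types
from unittest.mock import patch

# Stub modules standing in for the real package (illustration only).
_pkg = types.ModuleType("sla_monitoring")
_factory = types.ModuleType("sla_monitoring.connector_factory")
_factory.build = lambda subscription: "real-connector"
_pkg.connector_factory = _factory
sys.modules["sla_monitoring"] = _pkg
sys.modules["sla_monitoring.connector_factory"] = _factory


def maybe_auto_file(subscription):
    # Lazy import: resolved at call time rather than when this module loads,
    # so no import cycle is created and the lookup sees any active patch.
    from sla_monitoring import connector_factory

    return connector_factory.build(subscription)


# Patch the attribute on the source module, not a name in the caller.
with patch("sla_monitoring.connector_factory.build", return_value="fake-connector"):
    patched_result = maybe_auto_file({"provider": "aws"})

unpatched_result = maybe_auto_file({"provider": "aws"})
```

Inside the `with` block the hook receives the mock; once the patch is undone, the same call reaches the original `build`.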

**Bug caught in code review:** `ClaimFilingService.file_claim()` legitimately returns `{"status": "assisted"}` for AWS Basic plan accounts and GCP without Support SDK. The initial implementation treated everything except `"submitted"` as an error, which hit the exception handler and logged a misleading “Auto-file failed” warning for completely normal assisted-filing outcomes. Added an explicit `elif result.get("status") == "assisted"` branch.
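The corrected branching can be sketched roughly as follows. The function name `handle_filing_result` and its return labels are hypothetical; the `"submitted"` and `"assisted"` statuses and the "Auto-file failed" warning are taken from the description above.

```python
import logging

logger = logging.getLogger("claim_filing")


def handle_filing_result(result: dict) -> str:
    """Classify a file_claim() result without mislabelling assisted filing."""
    if result.get("status") == "submitted":
        logger.info("Claim submitted to provider")
        return "filed"
    elif result.get("status") == "assisted":
        # Normal outcome for accounts without Support API access:
        # the customer completes the filing manually.
        logger.info("Assisted filing required; no claim submitted automatically")
        return "assisted"
    else:
        # Only genuinely unexpected statuses reach the failure path.
        logger.warning("Auto-file failed: unexpected status %r", result.get("status"))
        return "failed"
```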

---

### Commit 6: Poll endpoint

**`feat(claim-filing): add POST /api/sla/poll-claims endpoint for Cloud Scheduler`**

The endpoint is protected by an `X-CloudScheduler-Token` header, checked against the `CLOUD_SCHEDULER_SECRET` env var. It queries all “Filed” breaches across tenants with `claim_filing_enabled: true`, calls `get_claim_status()` per provider, and updates the breach only when the status changes.
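A framework-agnostic sketch of the token check is shown below. The header and env var names come from the endpoint description; the function name, the fail-closed behaviour when the secret is unset, and the constant-time comparison via `hmac.compare_digest` are assumptions rather than a description of the shipped code.

```python
import hmac
import os


def is_authorised_scheduler_call(headers: dict) -> bool:
    """Return True only if the request carries the expected scheduler token."""
    expected = os.environ.get("CLOUD_SCHEDULER_SECRET", "")
    supplied = headers.get("X-CloudScheduler-Token", "")
    if not expected:
        # Fail closed: an unset secret must not authorise an empty token.
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

A plain `dict` lookup is case-sensitive, whereas most web frameworks expose case-insensitive header mappings, so the lookup would need adapting to the framework in use.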

**Bug caught in code review (critical):** AWS and Azure `get_claim_status()` were returning raw provider status strings (`"resolved"`, `"closed"`), not our normalised vocabulary (`"approved"`, `"denied"`). The poll endpoint checked `if credit_status == "approved"`, so `"resolved"` from AWS hit the `else` branch and marked the claim as `"Credit Denied"`. Silent. Incorrect. Fixed in a follow-up commit.

---

### Commit 7: Refactor `_build_provider_connector`

**`refactor(claim-filing): replace _build_provider_connector with ConnectorFactory for all providers`**

The existing `_build_provider_connector` in `sla_monitoring.py` only built Azure connectors (returned `None` for AWS/GCP, letting `ClaimFilingService` handle them internally). After the factory was built, we replaced the 45-line Azure-only function with a 15-line wrapper that handles all three providers.

---

### Commit 8: Duplicate removal

**`fix(claim-filing): remove duplicate file_sla_claim from routes.py`**

A stale `file_sla_claim` endpoint existed in `routes.py` from an earlier iteration. It passed `connector=None` to `ClaimFilingService` — bypassing the factory entirely. We wrote a test that asserts the function no longer exists in `routes.py` (via AST parsing), deleted the duplicate, verified the test passes.
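An AST-based check of this kind could look like the following sketch. `defines_function` is a hypothetical helper; in a real test the source string would be read from `routes.py` rather than passed inline.

```python
import ast


def defines_function(source: str, name: str) -> bool:
    """Return True if `source` defines a (sync or async) function called `name`."""
    tree = ast.parse(source)
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name == name
        for node in ast.walk(tree)
    )


# Inline samples standing in for the file before and after the deletion.
before = "def file_sla_claim(breach_id):\n    return None\n"
after = "def list_claims():\n    return []\n"
```

Parsing the file instead of importing it means the test does not depend on the module's imports or side effects, and a commented-out copy of the function does not trigger a false positive.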

---

### Commit 9: Status normalisation fix

**`fix(claim-filing): normalize AWS/Azure get_claim_status to approved/denied/pending/unknown`**

The fix for the critical bug in commit 6. Both AWS and Azure `get_claim_status()` now map their raw vocabulary to `approved/denied/pending/unknown`. The poll endpoint also gained a guard: `if credit_status not in ("approved", "denied", "pending"): continue`.
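The shape of the fix can be sketched as a normalisation step plus the guard. The `"resolved"` to `"approved"` entry follows from the bug described in commit 6; the `"closed"` to `"denied"` entry, the `"Credit Approved"` label, and both function names are assumptions for illustration, and the real per-provider tables are likely larger.

```python
NORMALISED = ("approved", "denied", "pending", "unknown")

# Illustrative raw-to-normalised table (see assumptions above).
RAW_TO_NORMALISED = {
    "resolved": "approved",
    "closed": "denied",  # assumption, by analogy with the GCP mapping
}


def normalise_status(raw, table=RAW_TO_NORMALISED):
    """Map a raw provider status string to the normalised vocabulary."""
    if raw is None:
        return "unknown"
    key = raw.strip().lower()
    if key in NORMALISED:
        return key
    # Unrecognised strings become "unknown" rather than falling through
    # to a denial.
    return table.get(key, "unknown")


def apply_poll_result(credit_status):
    """Return the new breach label, or None when nothing should be written."""
    if credit_status not in ("approved", "denied", "pending"):
        return None  # corresponds to `continue` in the poll loop
    if credit_status == "approved":
        return "Credit Approved"
    if credit_status == "denied":
        return "Credit Denied"
    return None  # pending: leave the breach as it is
```

With the guard in place, an unexpected provider string results in no database write at all, which is the safe default for a job that runs unattended.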

---

## The Bugs That Didn’t Ship

Four bugs were caught in code review before merging:

1. `ConnectorFactory` ignoring `authenticate()` return value

2. GCP missing credentials fallback test

3. AWS/Azure `get_claim_status()` returning raw strings instead of normalised vocabulary

4. Auto-file hook logging legitimate `assisted` outcomes as failures

The third one is the one that mattered most. It was a silent data corruption bug: it would have marked legitimately approved credits as denied, with no error log, no exception, and no obvious signal. It was caught by a final code review that read the poll endpoint logic carefully and traced the flow from provider API to database write.

That’s the value of the two-stage review process. Not the obvious bugs — the ones that look correct at the function level but are wrong at the system level.

---

## What We’d Do Differently

**Extract the service code map earlier.** The DRY violation in the AWS service code map (two identical 27-entry dicts) would have cost nothing to avoid when we expanded the map. We deferred it, and now adding a new service means touching two files.

**Design the status vocabulary before the providers.** We designed the poll endpoint first, then built the providers, then discovered the vocabulary mismatch. Starting from the vocabulary contract and building the providers to match it would have caught this at design time.

---

_Fintropy is a multi-cloud FinOps platform in private beta. [Learn more at nuvikatech.com](https://www.nuvikatech.com)_
