LinkedIn Carousel: 433 Scan Rules Architecture
Post type: Technical · 10 slides
Blog post: https://www.nuvikatech.com/blog/posts/433-scan-rules-architecture
SLIDE 1 — Cover
Headline: 433 cloud scan rules. 6 provider categories. One shared interface. Here’s the architecture that makes them all work.
Sub-line: AWS (193), Azure (149), GCP (59), Kubernetes, on-prem, multi-cloud — discoverable, testable, and extendable by any engineer.
SLIDE 2
Label: THE FOUNDATION
Headline: Every rule — regardless of provider — implements the same BaseRule.
Body: rule_id, name, severity, tier, evaluate(). One interface, 433 implementations. The simplicity of the contract is what makes the scale possible.
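A minimal sketch of what such a contract could look like. Only the five member names (rule_id, name, severity, tier, evaluate) come from the slide; the types, the Severity and Finding helpers, the dict-shaped context, and the example rule id are assumptions made for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):  # hypothetical severity scale
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Finding:  # hypothetical result type
    rule_id: str
    resource_id: str
    message: str


class BaseRule(ABC):
    # Member names are from the slide; the types are assumed.
    rule_id: str
    name: str
    severity: Severity
    tier: int  # 1 = deterministic, 2 = AI-assisted

    @abstractmethod
    def evaluate(self, context: dict) -> list[Finding]:
        """Return findings for the pre-fetched data in `context`."""


class OrphanedVolumeRule(BaseRule):
    rule_id = "aws-ebs-001"  # hypothetical id
    name = "Orphaned EBS volume"
    severity = Severity.MEDIUM
    tier = 1

    def evaluate(self, context: dict) -> list[Finding]:
        # Flag every volume whose attachment list is empty or missing.
        return [
            Finding(self.rule_id, vol["id"], "volume has no instance attachment")
            for vol in context.get("ebs_volumes", [])
            if not vol.get("attachments")
        ]
```

Because evaluate() is abstract, a subclass that forgets to implement it cannot be instantiated, which catches incomplete rules early.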
SLIDE 3
Label: TIER 1
Headline: Deterministic rules: binary criteria, reproducible results, built for audits.
Body: EC2 instances with <5% average CPU over 14 days → idle. EBS volumes with no instance attachment → orphaned. These have objective answers. A CFO can report the number.
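The idle-instance criterion can be sketched as a single comparison. The 5% threshold and 14-day window are from the slide; the function name, the one-sample-per-day input shape, and the choice to skip incomplete windows are assumptions.

```python
from __future__ import annotations

from statistics import mean

IDLE_CPU_THRESHOLD_PCT = 5.0  # from the slide: <5% average CPU
LOOKBACK_DAYS = 14            # from the slide: over 14 days


def is_idle(daily_avg_cpu_pct: list[float]) -> bool:
    """True when the most recent 14 daily averages have a mean below the threshold."""
    if len(daily_avg_cpu_pct) < LOOKBACK_DAYS:
        return False  # assumption: an incomplete window is never flagged
    window = daily_avg_cpu_pct[-LOOKBACK_DAYS:]
    return mean(window) < IDLE_CPU_THRESHOLD_PCT
```

The same input always yields the same verdict, which is what makes the result reproducible for an audit.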
SLIDE 4
Label: TIER 2
Headline: AI-assisted rules: pattern discovery with confidence scores. Not for audits.
Body: Reserved Instance coverage gaps, bursty CPU workloads that could use a different instance family — these require understanding workload shape over time. Gemini handles the pattern recognition.
SLIDE 5
Label: AUTO-DISCOVERY
Headline: Registering 433 rules by hand doesn’t scale. The registry finds them automatically at startup.
Body: Drop a new rule file in the right provider directory, give it a unique rule_id, and implement evaluate(). The registry discovers it via importlib on the next boot. No central file to update.
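One way a registry like this could work, sketched with the standard library's importlib and pkgutil. The function name discover_rules, its parameters, and the duplicate-id check are assumptions, not the platform's actual code.

```python
from __future__ import annotations

import importlib
import inspect
import pkgutil


def discover_rules(package_name: str, base_cls: type) -> dict[str, type]:
    """Import every module under `package_name` and index subclasses of
    `base_cls` by their rule_id. Raises on duplicate ids."""
    package = importlib.import_module(package_name)
    registry: dict[str, type] = {}
    prefix = package.__name__ + "."
    for info in pkgutil.walk_packages(package.__path__, prefix=prefix):
        module = importlib.import_module(info.name)
        for _, cls in inspect.getmembers(module, inspect.isclass):
            if cls is base_cls or not issubclass(cls, base_cls):
                continue
            if cls.__module__ != module.__name__:
                continue  # skip classes merely imported into this module
            if cls.rule_id in registry:
                raise ValueError(f"duplicate rule_id: {cls.rule_id}")
            registry[cls.rule_id] = cls
    return registry
```

Failing loudly on a duplicate rule_id at startup keeps the "unique rule_id" requirement from depending on reviewer attention.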
SLIDE 6
Label: SCAN CONTEXT
Headline: Rules don’t call cloud APIs. They receive pre-fetched data.
Body: A scan job fetches all data once, then runs all applicable rules against a ScanContext. Rules are pure functions: same input, same output, every time. Testable without any cloud access.
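A rough sketch of the fetch-once, evaluate-many flow. ScanContext is named on the slide, but its fields, the rule function, and run_scan are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScanContext:
    # Hypothetical fields: everything a rule may read, fetched up front.
    account_id: str
    ebs_volumes: tuple = ()
    ec2_instances: tuple = ()


def orphaned_volumes(ctx: ScanContext) -> list[str]:
    """Pure rule: reads only the context, performs no I/O."""
    return [v["id"] for v in ctx.ebs_volumes if not v["attachments"]]


def run_scan(ctx: ScanContext, rules: list) -> dict[str, list[str]]:
    """Run every rule against the same pre-fetched context."""
    return {rule.__name__: rule(ctx) for rule in rules}


# A synthetic context stands in for real cloud data in tests.
synthetic = ScanContext(
    account_id="test-account",
    ebs_volumes=(
        {"id": "vol-1", "attachments": []},
        {"id": "vol-2", "attachments": ["i-1"]},
    ),
)
report = run_scan(synthetic, [orphaned_volumes])
```

Making the context frozen is one way to discourage rules from mutating shared data between evaluations.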
SLIDE 7
Label: WHY THIS MATTERS
Headline: Pure functions = fast, reliable, parallelisable scans.
Body: When rules are isolated from I/O, you can run hundreds of them in parallel, reproduce any finding against synthetic test data, and benchmark each rule independently.
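Since the rules share nothing but a read-only context, fanning them out is straightforward. The sketch below uses the standard library's ThreadPoolExecutor; run_rules_parallel and the worker count are assumptions. For CPU-bound rules in CPython, a process pool would likely be the better fit.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_rules_parallel(
    rules: list[Callable[[Any], Any]],
    context: Any,
    max_workers: int = 8,  # assumed default
) -> dict[str, Any]:
    """Evaluate each rule against the same context on a worker pool.

    Executor.map yields results in input order, so they can be zipped
    back onto the rule names.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda rule: rule(context), rules))
    return {rule.__name__: result for rule, result in zip(rules, results)}
```

Because each rule is a pure function, the parallel result should match a plain sequential loop over the same rules.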
SLIDE 8
Label: LESSON 1
Headline: Rules must be lightweight. A rule that makes API calls becomes a bottleneck.
Body: The context-first design prevents this by construction. If a rule needs data that isn’t in the context, you add it to the context fetch, not to the rule itself.
SLIDE 9
Label: LESSON 2
Headline: False positives are worse than missed findings.
Body: A missed idle instance = a customer pays $50 extra. A false positive = an ops team investigates a phantom issue and loses trust in the platform. We tune aggressively for precision over recall.
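For reference, the two metrics being traded off, using their standard definitions. The counts in the example are made up purely to illustrate the arithmetic.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of reported findings that are real."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Share of real issues that were reported."""
    return true_pos / (true_pos + false_neg)


# Illustrative counts only: a strict threshold reports 10 findings,
# 9 of them real, while missing 6 real issues.
strict_precision = precision(true_pos=9, false_pos=1)
strict_recall = recall(true_pos=9, false_neg=6)
```

Tuning for precision means accepting a lower recall figure in exchange for fewer phantom findings.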
SLIDE 10 — CTA
Headline: Want the full story?
Body: The full rule engine architecture: Tier 1 vs Tier 2, the auto-discovery registry, versioning for audit trails, and the two lessons 433 rules taught us.
Link: https://www.nuvikatech.com/blog/posts/433-scan-rules-architecture