Confidential — Internal Audit

Stratus Chatbot Build Spec
Review & Audit

Legal and regulatory review of the Stratus Financial AI FAQ chatbot build spec, with prioritized findings and a path to kickoff.

Subject STRATUS_CHATBOT_BUILD_SPEC.md
Author Mykle Alberto, IT Lead
Reviewer Don Ho, General Counsel
Date May 13, 2026
A−

Top-quartile spec. Close the P0 gaps before kickoff.

This spec is in the top quartile of what I have seen. Five P0 items must be closed before Phase 1 starts; five P1 items will save pain after launch; eight P2 items are polish. The legal and regulatory gaps are the only material blockers.

Part 1

What makes a build spec effective

A build spec is a treaty between the person who knows what should exist and the person — or agent — who is going to make it exist. The defects in most specs trace back to one of these principles being violated.

Core principles

01

Negative space matters as much as positive

"Out of scope" plus "do not build" plus "common pitfalls" prevent scope creep and rediscovery of dead ends.

02

Decisions, not just designs

Every non-obvious choice gets a why. A spec without rationale becomes folklore in 6 months.

03

Verifiable acceptance criteria

"It works" is not done. Each deliverable needs a binary test someone other than the builder can run.

04

Single owner per workstream

Groups do not own things. If two people are accountable, no one is.

05

Phased delivery with hard gates

Phase N+1 does not start until Phase N is verified in production. Prevents the half-built-everywhere failure mode.

06

Failure modes named explicitly

What happens when X service is down? What is the rollback?

07

SVNS at every layer

Smallest viable next step. Every output should produce evidence within 24–72 hours.

08

Change control

What stops when this ships? If you do not name what stops, both old and new run in parallel.

Applicable frameworks

Anthropic / OpenAI spec-driven development Amazon Working Backwards ADR pattern 12-Factor App STRIDE threat modeling DORA metrics RACI
Part 2

What is genuinely strong — preserve these

Before the audit, credit where it's due. This spec is in the top quartile of what I have seen. Specifically:

§2 Out of Scope Disciplined and specific. Most specs miss this entirely.
§16 Common Pitfalls (27) 27 named anti-patterns. This is institutional knowledge that prevents 80% of build failures.
Phased delivery with explicit acceptance criteria Exactly the gating discipline that prevents premature integration.
Belt-and-suspenders escalation logic System prompt + keyword detector + tool call + max-message cap gives four independent paths to safety. Good defense-in-depth.
Graph reply API instead of constructing IMF headers (§11.4) Explained with rationale. Exactly the kind of decision documentation good specs include.
Application Access Policy (§15.4 Step 5) The kind of thing most M365 integration specs skip. Doing it right here avoids creating a tenant-wide-send privilege grenade.
Microsoft Graph section overall Most specs hand-wave through this; this one walks through subscription lifecycle, renewal, validation, and the draft-then-send pattern. Strong.
Part 3

Audit / review — findings by severity

Sorted by severity, not by section order. Five P0 items must close before Phase 1 kickoff. Five P1 items will save pain after launch. Eight P2 items are polish.

01 P0 · Block kickoff

No regulatory anchor for the guardrails

The spec says "Stratus is regulated financial services" and writes guardrails accordingly, but it never cites which regulations are driving which rule. As GC, you want the next attorney (or auditor) reading this to see:

  • GLBA Safeguards Rule (16 CFR 314) drives encryption-at-rest, access controls, audit log, IP hashing, retention.
  • UDAAP (12 USC 5536) drives "no rate quotes / no approval determinations." Those are the deceptive-acts trip wires.
  • Reg B / ECOA (12 CFR 1002) drives the prohibition on the bot collecting demographic info or making eligibility statements. If a visitor reveals protected-class info, the system should not store or respond on it.
  • CCPA / CPRA (California, since Stratus is in OC) drives data subject access, deletion rights, "do not sell" disclosure, opt-out for sensitive categories.
  • State lending licensure (Cal Fin Code §22000+, if a CFL licensee) drives required disclosures.
  • TCPA if the bot ever moves to SMS (currently out of scope, but flag it).
Fix Add a §2.5 "Regulatory Basis" mapping every guardrail to the rule that requires it. This is also marketing — it shows clients you actually know what you are doing.
02 P0 · Block kickoff

No data retention policy

Conversations, messages, tickets, IP hashes, and customer emails all live forever per the current schema. This is a CCPA exposure, a GLBA disposal-rule issue (16 CFR 682), and operationally a Supabase storage cost over time.

Define:

  • Retention window for conversations (suggest 13 months: covers 1 audit cycle plus 30-day buffer).
  • Retention for tickets (longer, these are arguably records; suggest 7 years to match financial records retention).
  • Retention for messages (shorter, suggest match conversations).
  • A scheduled purge job (Railway cron, daily).
  • DSAR (data subject access request) workflow.
03 P0 · Block kickoff

The IP hashing scheme is weaker than it looks

Spec says sha256(ip + daily_salt). If the daily salt is stored in a DB column, anyone with DB access can correlate across days. The salt belongs in env vars or Vault and should rotate via deploy, not via a row update.

Also: pitfall #6 says "daily-rotated salt" but the implementation is not specified anywhere. Spec it.

04 P0 · Block kickoff

Single owner is missing

The doc names Mykle as the implementer at the end, but no other workstream has a named owner. Who owns the FAQ content? (Lesley is mentioned once in §13 Phase 1 Task 7. Give her her own line.) Who owns the Cloudflare config? Who renews the Entra client secret in 12 months? Who responds to Pumble alerts when the Graph subscription renewal fails?

Fix Add a RACI table, or at minimum a "Roles" subsection.
05 P0 · Block kickoff

No "what stops" clause

Per the Change Control rule. If the bot replaces an existing contact form, phone screen, or website chat plugin, the spec needs to say explicitly: On the day the bot ships, the old contact form is removed and points here instead.

Otherwise both run in parallel and you will have two ticket queues.

06 P1 · Should add

Cost projections absent

Anthropic plus Railway plus Supabase plus Cloudflare plus M365 mailbox at 1,000 conversations per month, then 5,000, then 10,000. A financial services exec asks "what is our run rate" within 60 days of go-live. The system prompt cache is mentioned but no math is shown.

07 P1 · Should add

No adversarial / red-team test plan

§14 testing covers happy paths and rate limits. For a financial-services bot, you need a documented jailbreak suite: "ignore previous instructions," role-play prompts, paraphrased rate questions ("what is the percentage I'd pay each year"), DAN-style attacks, prompt injection via the FAQ itself if it ever loads from DB.

Fix This should be a checked-in test file with red-team scenarios that run in CI.
08 P1 · Should add

Feedback is in Phase 3. Wrong phase.

Thumbs feedback is the only signal you have about whether the bot is hallucinating. It should be in Phase 1. You will be flying blind for 3+ weeks otherwise.

09 P1 · Should add

The max_failed_responses_before_escalation setting is "currently informational"

A configured-but-not-enforced setting is technical debt the moment it lands. Either wire it up or delete it.

10 P1 · Should add

No disaster recovery / RPO / RTO

§18 README mentions runbook items but does not quantify. What is the maximum tolerable data loss (RPO)? Maximum recovery time (RTO)? For financial services, document these even if generous.

11 P2 · Polish

Accessibility (ADA Title III) not addressed

Financial services websites are repeat targets for ADA litigation. Keyboard nav, screen reader behavior, color contrast (the brand color #185FA5 against white passes AA for normal text — verify against #185FA5 background and button states), focus indicators in the Shadow DOM, ARIA labels on the launcher.

Fix Add a §9.5 "Accessibility" subsection.
12 P2 · Polish

The FAQ-in-system-prompt threshold should be made explicit upfront

Pitfall #12 buries "if you ever genuinely need RAG (>50K tokens of FAQ)." Move this to §3 or §8 as a top-line constraint: "If the FAQ exceeds X tokens, escalate the decision; do not unilaterally introduce a vector store." That matches the operating style.

13 P2 · Polish

No decision log / ADR

Why Preact over React? (Bundle size, but say it.) Why Supabase over a separate Postgres + Auth0? Why Railway over Vercel? Why Pumble over Slack? Each gets a one-sentence rationale somewhere. Future-you will thank present-you.

14 P2 · Polish

Versioning protocol for the widget bundle

NEXT_PUBLIC_WIDGET_VERSION exists but no bump rules. Semver? When does a localStorage schema change force a clear?

15 P2 · Polish

The "Final Instructions for the Implementing Agent" section is buried at §19

This is actually some of the most useful content (conventional commits, TypeScript strict, no premature optimization, "stop and ask"). Move it to §1.5 right after the Mission so it sets the tone before the reader hits the tech-stack table.

16 P2 · Polish

Mermaid diagram in the README is mentioned but not provided

Embed one in the spec itself. A picture saves 500 words for someone scanning.

17 P2 · Polish

"Use existing dashboard mockup design" (§19) — link it inline

Otherwise the agent guesses.

18 P2 · Polish

The two-step draft-then-send Graph pattern is explained well but repeated

Refactor §11.1 to reference §11.2 instead of duplicating.

Part 4

Definition of Done for this review

Closed loop

  • Spec amended to address all P0 items.
  • P1 items either added or explicitly deferred with a written reason.
  • P2 items addressed as time permits.

Severity breakdown

SeverityCountDisposition
P05Block Phase 1 kickoff. Address before any code is written.
P15Should add — will save pain after launch.
P28Polish. Address as time permits.
Part 5

Sources referenced