בס״ד

FME — Failure Grid (LOCKED)

docs/FME_FAILURE_GRID.md · last changed (pre-VM history) · rendered from GitHub master

Message/asset loss = system failure. Every failure path degrades to "captured raw," never to "lost." Locked 2026-06-05 (ZW-ENGINE-V9).

The #1 rule — capture never blocks

Order is ALWAYS: Capture → Save → Acknowledge, THEN (async) Classify → File → Confirm.
- Nothing is ever rejected. A bad classifier can NEVER break intake.
- Worst case, an item's chapter = Unfiled.
- NEVER REQUIRE FILING — the human never picks a chapter. One message / voice / photo → done.
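
A minimal sketch of the ordering rule above, assuming an in-memory store and a pluggable classifier. The names `Intake`, `Item`, `capture`, and `process_pending` are hypothetical and are not taken from the FME codebase. The point it illustrates is that the synchronous path never calls the classifier, so a classifier failure can only leave an item in Unfiled. The Confirm step is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

UNFILED = "Unfiled"  # fallback chapter named in the rule above


@dataclass
class Item:
    item_id: int
    raw: str                    # raw text, or a pointer to the raw file
    chapter: str = UNFILED      # stays Unfiled until classification succeeds
    acknowledged: bool = False


@dataclass
class Intake:
    # Hypothetical in-memory store and work queue, for illustration only.
    store: Dict[int, Item] = field(default_factory=dict)
    pending: List[int] = field(default_factory=list)

    def capture(self, item_id: int, raw: str) -> Item:
        """Synchronous path: Capture -> Save -> Acknowledge. No classifier here."""
        item = Item(item_id=item_id, raw=raw)   # capture
        self.store[item_id] = item              # save
        item.acknowledged = True                # acknowledge
        self.pending.append(item_id)            # classify later, off the hot path
        return item

    def process_pending(self, classify: Callable[[str], str]) -> None:
        """Deferred path: Classify -> File. Any failure leaves the item Unfiled."""
        while self.pending:
            item = self.store[self.pending.pop(0)]
            try:
                chapter = classify(item.raw)
            except Exception:
                chapter = UNFILED               # a bad classifier never breaks intake
            item.chapter = chapter or UNFILED
```

With a classifier that always raises, `capture` still returns an acknowledged item, and `process_pending` leaves its chapter as `Unfiled` with the raw content intact.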

Failure decision grid

| Failure | Behavior |
|---|---|
| Telegram down | save to thread, retry (queue the outbound) |
| Cloudflare KV unavailable | queue locally, replay when back; never drop |
| Claude (summary) down | store raw transcript / raw text only; skip the summary, mark AI_UNREVIEWED |
| Whisper (transcribe) down | forward the raw audio file; keep original_uri |
| Drive down | store a pending-attachment marker + retry; keep the Telegram file_id |
| Bot token revoked / deploy fail | alert Sam (tg.ps1 -Source System) |
| Worker restart mid-request | idempotent writes (dedup by update_id / FLB-id); replay |
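
Two rows of the grid work together: "queue locally, replay when back" and "idempotent writes (dedup by update_id)". The sketch below shows one way they could combine, assuming writes are keyed by `update_id`, so that replaying the same item twice is harmless. `IdempotentWriter` and `remote_put` are hypothetical names, and `remote_put` only stands in for a key-value write; it is not the Cloudflare KV API. The `seen` set is held in memory here, so surviving a real worker restart would need it in durable storage.

```python
from typing import Callable, List, Set, Tuple


class IdempotentWriter:
    """Writes keyed by update_id; failed writes are queued locally and replayed."""

    def __init__(self, remote_put: Callable[[str, str], None]) -> None:
        self.remote_put = remote_put                  # hypothetical KV stand-in
        self.seen: Set[int] = set()                   # update_ids already written
        self.local_queue: List[Tuple[int, str]] = []  # writes awaiting replay

    def write(self, update_id: int, payload: str) -> str:
        if update_id in self.seen:
            return "duplicate"                        # dedup: already stored
        if any(uid == update_id for uid, _ in self.local_queue):
            return "queued"                           # already waiting for replay
        try:
            self.remote_put(str(update_id), payload)
        except ConnectionError:
            self.local_queue.append((update_id, payload))  # never drop
            return "queued"
        self.seen.add(update_id)
        return "written"

    def replay(self) -> int:
        """Retry queued writes in order; stop at the first failure, keep the rest."""
        replayed = 0
        while self.local_queue:
            update_id, payload = self.local_queue[0]
            if update_id not in self.seen:
                try:
                    self.remote_put(str(update_id), payload)
                except ConnectionError:
                    break                             # store still down; retry later
                self.seen.add(update_id)
                replayed += 1
            self.local_queue.pop(0)
        return replayed
```

Because the key is derived from `update_id`, a replay that repeats a write which had already landed overwrites the same key with the same payload rather than creating a second record.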

Invariants

Source trail · docs/FME_FAILURE_GRID.md @ master · rendered 2026-07-02 7:23 PM EDT by scripts/build-docs.py · the .md in the repo is the truth; this page is the phone-readable view