Codex/whe 15 bootstrap workspace (#1)

* feat(WHE-15): bootstrap bun workspace with app and package scaffolds
* chore(WHE-17): switch workspace typecheck to tsgo
* chore(WHE-16): configure oxlint and oxfmt no-semicolon style
* chore: address CodeRabbit review feedback
* chore: apply coderabbit fixes and add review script
* docs: add ADR decision metadata
2026-03-31 12:04:02 +00:00 · 2026-03-05 00:56:24 +03:00
parent 768400214e
commit 4a26ac81d6
48 changed files with 1057 additions and 1 deletions
--- a/docs/specs/HOUSEBOT-003-purchase-parser.md
+++ b/docs/specs/HOUSEBOT-003-purchase-parser.md
@@ -1,22 +1,27 @@
 # HOUSEBOT-003: Purchase Parser (Hybrid Rules + LLM Fallback)

 ## Summary
+
 Parse free-form purchase messages (primarily Russian) from the Telegram topic `Общие покупки` into structured ledger entries.

 ## Goals
+
 - High precision amount extraction with deterministic rules first.
 - Fallback to LLM for ambiguous or irregular message formats.
 - Persist raw input, parsed output, and confidence score.

 ## Non-goals
+
 - Receipt image OCR.
 - Full conversational NLP.

 ## Scope
+
 - In: parsing pipeline, confidence policy, parser contracts.
 - Out: bot listener wiring (separate ticket).

 ## Interfaces and Contracts
+
 - `parsePurchase(input): ParsedPurchaseResult`
 - `ParsedPurchaseResult`:
  - `amountMinor`
@@ -27,11 +32,13 @@ Parse free-form purchase messages (primarily Russian) from the Telegram topic `
  - `needsReview`

 ## Domain Rules
+
 - GEL is default currency when omitted.
 - Confidence threshold determines auto-accept vs review flag.
 - Never mutate original message text.

 ## Data Model Changes
+
 - `purchase_entries` fields:
  - `raw_text`
  - `parsed_amount_minor`
@@ -42,21 +49,25 @@ Parse free-form purchase messages (primarily Russian) from the Telegram topic `
  - `needs_review`

 ## Security and Privacy
+
 - Sanitize prompt inputs for LLM adapter.
 - Do not send unnecessary metadata to LLM provider.

 ## Observability
+
 - Parser mode distribution metrics.
 - Confidence histogram.
 - Error log for parse failures.

 ## Edge Cases and Failure Modes
+
 - Missing amount.
 - Multiple possible amounts in one message.
 - Non-GEL currencies mentioned.
 - Typos and slang variants.

 ## Test Plan
+
 - Unit:
  - regex extraction fixtures in RU/EN mixed text
  - confidence scoring behavior
@@ -65,10 +76,12 @@ Parse free-form purchase messages (primarily Russian) from the Telegram topic `
 - E2E: consumed in bot ingestion ticket.

 ## Acceptance Criteria
+
 - [ ] Rules parser handles common RU message patterns.
 - [ ] LLM fallback adapter invoked only when rules are insufficient.
 - [ ] Confidence and parser mode stored in result.
 - [ ] Tests include ambiguous message fixtures.

 ## Rollout Plan
+
 - Start with conservative threshold and monitor review rate.
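
Note on the spec above: the rules-first pipeline, the GEL default, and the confidence threshold could be sketched roughly as follows. This is a minimal illustration, not the project's implementation. Only `parsePurchase`, `amountMinor`, and `needsReview` appear in the visible part of the spec; the field names `currency`, `confidence`, and `parserMode`, the `LlmFallback` type, the optional second parameter, the threshold of 0.8, and the confidence values are all assumptions made for the example. A real LLM adapter would most likely be asynchronous; a synchronous stub keeps the sketch short.

```typescript
// Sketch only. `parsePurchase`, `amountMinor`, and `needsReview` come from the spec;
// every other name and all numeric values below are assumptions for illustration.
type ParserMode = "rules" | "llm"

interface ParsedPurchaseResult {
  amountMinor: number | null // minor units, e.g. 1250 for 12.50
  currency: string // assumed field name
  confidence: number // assumed field name, range 0..1
  parserMode: ParserMode // assumed field name
  needsReview: boolean
}

// Hypothetical synchronous stand-in for the LLM adapter.
type LlmFallback = (
  rawText: string,
) => Pick<ParsedPurchaseResult, "amountMinor" | "currency" | "confidence">

const DEFAULT_CURRENCY = "GEL" // spec: GEL is the default when currency is omitted
const REVIEW_THRESHOLD = 0.8 // assumed value

const CURRENCY_CODES: Record<string, string> = {
  "₾": "GEL",
  "gel": "GEL",
  "лари": "GEL",
  "usd": "USD",
  "$": "USD",
  "eur": "EUR",
  "€": "EUR",
}

// Convert "12", "12.5" or "12,50" to minor units without float arithmetic.
function toMinor(raw: string): number {
  const parts = raw.replace(",", ".").split(".")
  const whole = parts[0] ?? "0"
  const fraction = (parts[1] ?? "").padEnd(2, "0")
  return Number(whole) * 100 + Number(fraction)
}

// Deterministic pass: collect every number, optionally followed by a currency token.
function parseWithRules(rawText: string): ParsedPurchaseResult {
  const re = /(\d+(?:[.,]\d{1,2})?)\s*(₾|gel|лари|usd|\$|eur|€)?/gi
  const candidates: { amountMinor: number; currency: string }[] = []
  let match: RegExpExecArray | null
  while ((match = re.exec(rawText)) !== null) {
    const token = match[2]?.toLowerCase()
    candidates.push({
      amountMinor: toMinor(match[1] ?? "0"),
      currency: token ? (CURRENCY_CODES[token] ?? DEFAULT_CURRENCY) : DEFAULT_CURRENCY,
    })
  }

  const first = candidates[0]
  // No amount: confidence 0. One amount: high. Several amounts: ambiguous, so low.
  const confidence = first === undefined ? 0 : candidates.length === 1 ? 0.95 : 0.4
  return {
    amountMinor: first ? first.amountMinor : null,
    currency: first ? first.currency : DEFAULT_CURRENCY,
    confidence,
    parserMode: "rules",
    needsReview: confidence < REVIEW_THRESHOLD,
  }
}

// Rules first; the fallback runs only when the rules result is below the threshold.
// The input string is only read, never modified.
function parsePurchase(input: string, llmFallback?: LlmFallback): ParsedPurchaseResult {
  const byRules = parseWithRules(input)
  if (byRules.confidence >= REVIEW_THRESHOLD || !llmFallback) return byRules

  const fromLlm = llmFallback(input)
  return {
    amountMinor: fromLlm.amountMinor,
    currency: fromLlm.currency,
    confidence: fromLlm.confidence,
    parserMode: "llm",
    needsReview: fromLlm.amountMinor === null || fromLlm.confidence < REVIEW_THRESHOLD,
  }
}
```

Under these assumptions, `parsePurchase("хлеб 12,5 лари")` stays on the rules path and yields `amountMinor: 1250` with `currency: "GEL"`, while a message such as `"2 пива за 15 лари"` contains two candidate amounts, falls below the threshold, and is either handed to the fallback or returned with `needsReview: true` when no fallback is supplied.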