# Codex Handoff — Supermarket Price Comparison System

**Date:** 2026-03-08
**From:** Claude (Opus 4.6)
**To:** Codex
**Resume with Claude:** Tuesday Mar 10 after rate limit reset

---

## Project Overview

A supermarket price comparison + virtual cart system for Buenos Aires. Compares prices across 7 Argentine supermarket chains using government data (SEPA) enriched with direct store APIs (Coto Constructor.io, VTEX).

**End goal:** Virtual carts that calculate the cheapest combination across all stores for a given shopping list.

**User:** Ignacio/Chicho (username: ubuntu, agent name: Pipo)
**Location:** Céspedes 2491, CABA (-34.5700, -58.4490)

---

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                    Data Sources                              │
│  SEPA (gov) ──→ SQLite DB ←── Coto CIO ←── VTEX APIs       │
│  Daily ZIPs      sepa-prices.db   products-latest.json      │
│  16:30 UTC       (1.2GB)          product-promos-latest.json │
└────────────────────┬────────────────────────────────────────┘
                     │
         ┌───────────▼───────────────┐
         │  product-search-server.js │  ← port 3005
         │  Unified Search + Cart API│
         └───────────┬───────────────┘
                     │
         ┌───────────▼───────────────┐
         │  carrito.html (frontend)  │  ← gateway static
         │  + descuentos.html        │
         └───────────────────────────┘
```

---

## Key Files

### Scripts (all in `~/.openclaw/workspace/scripts/`)

| File | Purpose | Status |
|------|---------|--------|
| `sepa-ingest.js` | SEPA daily ingestion (downloads ZIPs, parses CSVs, loads SQLite) | Working, 2-tier (nearby+online) |
| `product-search-server.js` | Unified API: search, product detail, virtual cart | Working, systemd enabled |
| `coto_fetch_products.js` | Coto catalog via Constructor.io (43K products) | Working, timer at 03:15 UTC |
| `coto_fetch_promos.js` | Coto bank/day promos via ATG API | Working, timer at 03:15 UTC |
| `fetch_product_promos.js` | VTEX promo scraper (Jumbo/Disco/Carrefour/Changomas/DIA) | Partially broken — last 2 runs returned 0 items, needs debugging |

### Data (all in `~/.openclaw/workspace/data/`)

| Path | Content |
|------|---------|
| `sepa/db/sepa-prices.db` | Main SQLite DB: prices, products, branches, dollar_rates, ingest_log |
| `coto/products-latest.json` | Coto CIO catalog (43K products, 25MB) |
| `discounts/product-promos-latest.json` | VTEX promo data (currently stale — points to Mar 7 good data) |

### Frontend (in `/usr/lib/node_modules/openclaw/dist/control-ui/`)

| File | Purpose |
|------|---------|
| `carrito.html` | **NEW** — Virtual cart + unified product search |
| `descuentos.html` | Existing — Bank promos + per-store product browsing |

### Systemd Services

| Service | Timer | What |
|---------|-------|------|
| `sepa-ingest` | 16:30 UTC | SEPA daily prices |
| `product-promos-fetch` | 03:15 UTC | VTEX store promos |
| `coto-fetch-products` | 03:15 UTC | Coto CIO catalog |
| `coto-fetch-promos` | 03:15 UTC | Coto ATG promos |
| `product-search-server` | always-on | API server on port 3005 |

---

## Database Schema (sepa-prices.db)

```sql
-- Prices: one row per date/ean/store/branch
CREATE TABLE prices (
  date TEXT, ean TEXT, store TEXT, branch_id TEXT DEFAULT 'online',
  name TEXT, brand TEXT, unit_qty REAL, unit_measure TEXT,
  price_lista REAL, price_promo REAL, promo_text TEXT,
  tier TEXT DEFAULT 'online',  -- 'nearby' or 'online'
  PRIMARY KEY (date, ean, store, branch_id)
);

-- Products master: best name per EAN
CREATE TABLE products (
  ean TEXT PRIMARY KEY, best_name TEXT, brand TEXT,
  unit_qty REAL, unit_measure TEXT, last_seen TEXT
);

-- Branches: store locations
CREATE TABLE branches (
  store TEXT, branch_id TEXT, name TEXT, type TEXT, address TEXT,
  lat REAL, lng REAL, distance_km REAL, province TEXT, locality TEXT,
  is_nearby INTEGER DEFAULT 0,
  PRIMARY KEY (store, branch_id)
);
```

**Store keys in DB:** `coto`, `dia`, `carrefour` (online rep), `carrefour-express`, `carrefour-hiper`, `carrefour-market`, `changomas`, `jumbo`, `disco`, `vea`

**Chain grouping:** carrefour-* → "Carrefour", others 1:1

**Two-tier model:**
- `nearby` = branches within 3km of user's location (per-branch prices)
- `online` = one representative branch per chain (deduplicated by EAN, for price comparison)

---

## API Endpoints (port 3005, proxied via `/preview/products-api/`)

```
GET  /stores                    → store list with product counts + latest dates
GET  /products?q=&store=&page=&size=&promo=  → search (uses per-store latest date)
GET  /product/:ean              → single product with price history + all store prices
POST /cart/cheapest             → body: {items:[{ean,qty}]}
                                  returns: multiStore (best combo) + singleStore (ranking)
GET  /chips?store=&limit=       → VTEX promo label chips (backward compat)
GET  /meta                      → product counts (backward compat)
```

---

## What Needs Refinement

### Priority 1: VTEX Fetch Script is Broken
`scripts/fetch_product_promos.js` — last 2 runs (Mar 7 evening, Mar 8 morning) returned 0 items for all stores. The Mar 7 morning run worked fine (94K items). Likely an API change or rate limiting. Debug and fix.

### Priority 2: Search Quality
- The search is basic LIKE matching on product name/brand/EAN
- Consider adding FTS5 (SQLite full-text search) for better results
- Product names from SEPA are ALL CAPS and inconsistent across stores
- Name normalization / fuzzy matching would help

### Priority 3: Coto EAN Mapping
- Coto CIO data (43K products) does NOT have EAN codes
- SEPA has EAN for Coto (12.6K products)
- We built a price+fuzzy-name matcher (7.9K matches at 0.75 similarity) but it's not integrated into the server yet
- The `findCotoMatch()` function in product-search-server.js uses exact normalized name matching — works but misses many products
- Better approach: pre-build a mapping table (coto_sku → ean) and load it at startup

### Priority 4: Frontend Polish
`carrito.html` is functional but could use:
- Product images (Coto CIO has `imageUrl` for matched products)
- Promo labels/badges from VTEX data
- Price history sparklines in product detail
- "Share cart" functionality (URL-encoded cart state)
- Better empty state / onboarding
- Link to online store for each product (Coto CIO has URL, VTEX has links)

### Priority 5: Missing Data Gaps
- Cencosud (Jumbo/Disco/Vea) stopped appearing in SEPA ZIPs after Mar 5
- This is a SEPA data source issue, not a bug — the gov sometimes lags
- The server handles this with per-store latest date logic
- Carrefour only has data through Mar 5 in online tier (viernes/sabado/domingo ZIPs didn't include comercio 10)

### Priority 6: Descuentos Page Update
- `descuentos.html` still uses the old API format (VTEX-only product search)
- Should be updated to use the new unified API
- Or just add a prominent link to carrito.html

---

## Important Constraints

1. **NEVER restart the OpenClaw gateway** (port 18789) — it's managed by the Gemini CLI process
2. **Tailscale serve** routes are how the frontend accesses the API (`/preview/products-api/` → port 3005)
3. **CSP blocks external scripts** — all JS must be inline in HTML files
4. **Coto WAF blocks web pages** from this VPS — use Constructor.io API or ATG REST API only
5. **Static files** go in `/usr/lib/node_modules/openclaw/dist/control-ui/` (needs sudo)
6. **sqlite3 binary** is at `/home/linuxbrew/.linuxbrew/bin/sqlite3` (not in default PATH for systemd)

---

## Useful Commands

```bash
# Check all timers
systemctl list-timers --all | grep -E 'sepa|product|coto'

# Check API server
systemctl status product-search-server
curl -s http://127.0.0.1:3005/stores | python3 -m json.tool

# Query the DB directly
sqlite3 ~/.openclaw/workspace/data/sepa/db/sepa-prices.db "SELECT store, tier, COUNT(DISTINCT ean) FROM prices WHERE date=(SELECT MAX(date) FROM prices WHERE store=prices.store) GROUP BY store, tier;"

# Re-run SEPA backfill
cd ~/.openclaw/workspace && node scripts/sepa-ingest.js --backfill

# Re-run Coto fetch
cd ~/.openclaw/workspace && node scripts/coto_fetch_products.js

# Restart API server after code changes
sudo systemctl restart product-search-server

# Check Tailscale routes
sudo tailscale serve status

# Access the cart page
# https://miopenclaw-vnic.tail9799d2.ts.net/carrito.html
```

---

## Memory Files

- `~/.openclaw/workspace/MEMORIES.md` — agent long-term memory (canonical)
- `~/.claude/projects/-home-ubuntu/memory/MEMORY.md` — Claude Code auto-memory
- Both contain detailed notes on data sources, API keys, store codes, etc.

---

## SEPA Store Map (comercio_id → banderas)

```
9:  Cencosud → 1:vea, 2:disco, 3:jumbo
10: Carrefour → 1:carrefour-hiper, 2:carrefour-market, 3:carrefour-express, 4:carrefour-maxi
11: Changomas → 1-5:changomas
12: Coto → 1:coto
15: DIA → 1:dia
```

---

## Coto Constructor.io API

- Key: `key_r6xzz4IAoTWcipni`
- Browse endpoint: `https://ac.cnstrc.com/browse/{facet}/{value}?key={key}&section=Products`
- Store 060 = Coto Digital (online prices, usually cheaper)
- Store 133 = Mayorista (skip)
- 10K pagination cap per query — bypass by querying subcategories or sale_types individually
- 21 sale_type values, all fetched separately

---

Good luck! The system is functional end-to-end. The main opportunities are search quality, Coto EAN mapping enrichment, and fixing the VTEX fetch script.
