# Carrefour Argentina Integration Strategy

**Date:** 2026-03-05
**Context:** Follows the successful DIA and COTO autonomous ranking systems, which use the HTTP cookie-replay pattern.

---

## Platform Identification: VTEX

Carrefour Argentina (carrefour.com.ar) runs on **VTEX IO**, confirmed by public Carrefour-VTEX Postman workspaces and VTEX case studies citing Carrefour as a flagship customer. This fundamentally shapes the integration approach.

---

## 1. Authentication Mechanism

**Hybrid: Session Cookie + Token-Based**

VTEX uses a dual-layer auth model:

- **Browser sessions:** Anchored by `VtexIdclientAutCookie` — a signed, httpOnly JWT-like cookie issued at login. This is the primary session identifier for storefront requests.
- **API layer:** VTEX Core Commerce APIs support AppKey/AppToken pairs (`X-VTEX-API-AppKey` / `X-VTEX-API-AppToken` headers) for back-office access, but these require merchant credentials we don't hold.
- **User token path:** VTEX ID tokens (short-lived Bearer tokens) can be obtained by exchanging the session cookie via `https://{account}.vtexid.com.br/api/vtexid/pub/authenticated/user`.

This is **more complex than DIA** (simple session cookie) but more structured than raw HTML scraping.
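
As a rough illustration of the browser-session layer, the sketch below turns a captured cookie jar into a replayable `Cookie` header. It assumes the jar is a list of records in the shape Playwright's `context.cookies()` returns (`name`, `value`, `domain`). The function name `cookie_header_for` and the sample values are hypothetical, and Python is used only for illustration.

```python
from typing import Dict, List


def cookie_header_for(jar: List[Dict], host: str) -> str:
    """Join every cookie whose domain matches `host` into one Cookie header value."""
    pairs = []
    for cookie in jar:
        domain = cookie["domain"].lstrip(".")
        # A cookie applies to its own domain and to any subdomain of it.
        if host == domain or host.endswith("." + domain):
            pairs.append(f'{cookie["name"]}={cookie["value"]}')
    return "; ".join(pairs)


# Hypothetical jar; real values would come from the captured login session.
jar = [
    {"name": "VtexIdclientAutCookie", "value": "token-abc", "domain": ".carrefour.com.ar"},
    {"name": "other", "value": "x", "domain": "example.com"},
]
header = cookie_header_for(jar, "www.carrefour.com.ar")
```

Filtering by domain keeps unrelated cookies out of the replayed request, which matters if the same jar file ends up holding cookies for more than one retailer.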

---

## 2. API vs HTML Scraping

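Carrefour Argentina exposes **VTEX REST endpoints** that return JSON. They are semi-public in the sense that the storefront calls them directly from the browser, though some respond only to an authenticated session: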

- Product/catalog: `https://www.carrefour.com.ar/api/catalog_system/pub/...`
- Order history: `https://www.carrefour.com.ar/api/oms/user/orders` (requires authenticated session cookie)
- Cart/checkout: `https://www.carrefour.com.ar/api/checkout/pub/...`

**No HTML scraping needed.** VTEX storefronts make XHR calls to these JSON APIs, which can be replayed with authenticated cookies. This is cleaner than COTO's HTML parsing and aligns with the DIA pattern.
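
A minimal sketch of what replaying one of these XHR calls might look like, using only the Python standard library. The helper name `build_api_request`, the `Accept` header, and the placeholder cookie value are assumptions for illustration; the request is prepared but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://www.carrefour.com.ar"


def build_api_request(path: str, params: dict, cookie_header: str) -> Request:
    """Prepare (but do not send) a GET request that mimics the storefront's XHR call."""
    query = f"?{urlencode(params)}" if params else ""
    return Request(
        BASE_URL + path + query,
        headers={"Cookie": cookie_header, "Accept": "application/json"},
        method="GET",
    )


req = build_api_request(
    "/api/oms/user/orders",
    {"page": 1, "per_page": 1},
    "VtexIdclientAutCookie=token-abc",  # placeholder, not a real token
)
# Passing `req` to urllib.request.urlopen would perform the actual call.
```

Keeping request construction separate from sending makes the replay logic easy to unit-test without touching the live site.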

---

## 3. Session Lifetime and Refresh

- **VtexIdclientAutCookie:** Typically **30 days** (configurable by merchant), far longer than DIA sessions. Confirm the actual lifetime from the `expires` attribute of the captured cookie.
- **Refresh mechanism:** VTEX auto-renews on activity. Periodic background keep-alive requests to any authenticated endpoint (e.g., profile or cart API) extend the session.
- **Token exchange:** If short-lived Bearer tokens are needed for specific endpoints, they can be refreshed by POSTing the persistent cookie to the VTEX ID endpoint — no re-login required.

Session management is **simpler and more durable** than DIA here.
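
The keep-alive decision can be sketched as a small pure function. This assumes the cookie's expiry is stored as a Unix timestamp, as in a Playwright cookie export, and uses the 24-hour margin from the implementation plan as its default. The name `needs_keepalive` and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def needs_keepalive(expires_epoch: float, now: datetime,
                    margin: timedelta = timedelta(hours=24)) -> bool:
    """True when the cookie expires within `margin` of `now` (or already has)."""
    if expires_epoch < 0:
        # Playwright reports -1 for session cookies, which carry no expiry to track.
        return False
    expires_at = datetime.fromtimestamp(expires_epoch, tz=timezone.utc)
    return expires_at - now <= margin


now = datetime(2026, 3, 5, tzinfo=timezone.utc)
far = (now + timedelta(days=20)).timestamp()   # well inside the session lifetime
near = (now + timedelta(hours=6)).timestamp()  # inside the 24-hour margin
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic under test.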

---

## 4. CAPTCHA / 2FA Concerns

- **Login flow:** Uses DNI (national ID) + password. No confirmed 2FA on the consumer app.
- **CAPTCHA risk:** Google reCAPTCHA is likely present on the login form (standard for VTEX deployments). Cloudflare Bot Management may also be active.
- **Mitigation:** Since session cookies are long-lived (30 days), logins are infrequent and CAPTCHA challenges should be rare. A single manual login to capture the cookie jar is the practical solution.
- **Proxy risk:** Use a residential IP or the VPS's own clean IP. Datacenter proxies tend to trigger Cloudflare challenges.

---

## 5. Recommended Approach: Cookie-Jar Replay (Primary) + Playwright Fallback

**Recommendation: Cookie-jar replay, identical to the DIA pattern.**

### Implementation Plan

1. **One-time manual login** via Playwright driving headed (non-headless) Chromium (profile `openclaw`), so that a person can solve any CAPTCHA at login. Export the resulting `VtexIdclientAutCookie` and supporting cookies to a JSON cookie jar.

2. **All subsequent requests** use HTTP cookie-jar replay (axios/fetch with cookie headers). Target the VTEX JSON APIs directly — no HTML parsing.

3. **Session health check:** Before each run, call `GET /api/oms/user/orders?page=1&per_page=1`. HTTP 200 means the session is valid; 401 or 403 means the Playwright login must be re-run.

4. **Cookie refresh:** When the session is within 24 hours of expiry (per the cookie's `expires` field), make a keep-alive request.
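
The health-check rule in step 3 can be sketched as a status-to-action mapping. The enum and function names are hypothetical, and the handling of statuses other than 200, 401, and 403 is an assumption that the plan does not specify.

```python
from enum import Enum


class SessionAction(Enum):
    PROCEED = "proceed"
    RELOGIN = "relogin"
    RETRY_LATER = "retry_later"


def action_for_health_check(status: int) -> SessionAction:
    """Map the health-check HTTP status to the next step of the run."""
    if status == 200:
        return SessionAction.PROCEED
    if status in (401, 403):
        return SessionAction.RELOGIN
    # Anything else (5xx, 429, ...) says nothing about the session itself;
    # treating it as "retry later" is an assumption, not part of the plan.
    return SessionAction.RETRY_LATER
```

Separating "session is invalid" from "request failed for another reason" avoids triggering an unnecessary login, and therefore an unnecessary CAPTCHA, on a transient server error.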

### Key API Endpoints for Orders/Ranking

```
GET /api/oms/user/orders                    # order history
GET /api/oms/user/orders/{orderId}          # order detail
GET /api/catalog_system/pub/products/search # product search
GET /api/checkout/pub/orderForm             # current cart
```

### Why Not Playwright-Only?

Playwright is slower and more resource-intensive than plain HTTP. VTEX's JSON APIs are clean and stable, so cookie replay is fast, deterministic, and runs without a browser process. Reserve Playwright for the initial login and for re-login after a failed health check.

---

## Summary

| Dimension | Assessment |
|-----------|-----------|
| Auth type | VTEX session cookie (`VtexIdclientAutCookie`) — long-lived, JSON-API accessible |
| API availability | Yes — VTEX JSON REST APIs, no HTML parsing needed |
| Session lifetime | ~30 days, auto-refreshable |
| CAPTCHA risk | Low; reCAPTCHA is likely confined to the login form |
| **Strategy** | **Cookie-jar replay (DIA pattern) + one-time Playwright login** |

Carrefour is the easiest of the three to integrate at the API layer, thanks to VTEX's structured JSON APIs. Implement immediately, following the DIA cookie-replay pattern with the one-time Playwright bootstrap for login.
