# Security Review — Health Bridge `/ingest_signed`

**Scope:** `server.py` (`/ingest_signed` endpoint, HMAC verification, replay protection, token handling) and `ingest_public_proxy.py`
**Type:** Read-only static analysis. No live testing.
**Date:** 2026-06-02

---

## Summary Table

| Severity | Location | Finding |
|---|---|---|
| CRITICAL | `HealthConnectSync.kt:36` | HMAC signing secret hardcoded in APK source |
| CRITICAL | `server.py:330` | Nonce cleanup runs only after INSERT succeeds; cleanup failure leaves no retry |
| HIGH | `server.py:307` | Device header check is redundant and misleading |
| HIGH | `server.py:313` | 10-minute timestamp window creates a bounded replay window |
| HIGH | `server.py:297-330` | No rate limiting on `/ingest_signed` |
| HIGH | `server.py:325-330` | Nonce table grows unbounded if cleanup fails |
| MEDIUM | `server.py:29` | `SIGNED_INGEST_SECRET` has no minimum-strength validation |
| MEDIUM | `server.py:292` | Bearer token check uses `!=` not `hmac.compare_digest` |
| LOW | `ingest_public_proxy.py:42` | Proxy passes headers without validation and no rate limit of its own |
| LOW | `server.py:291` | Token comparison is timing-safe anyway due to HMAC internals, but fragile |
| INFO | `server.py:330` | Nonce table cleanup is best-effort; stale entries accumulate if requests are sparse |
| INFO | `server.py:400` | No TLS client-cert verification (OK given Tailscale Funnel network model) |

---

## Finding 1 — CRITICAL

**File/Line:** `android-app/app/src/main/java/com/healthbridge/app/HealthConnectSync.kt:36`

**Finding:** HMAC signing secret hardcoded in APK source code.

```kotlin
private const val SIGNING_SECRET = "QwCARhYS0feJe0ZQZlELtNt03Icb-ZtK8KpTiN1KkEGBT7e6bIJr1dVQqBwHTL9L"
private const val DEVICE_ID = "pixel9a-chicho"
```

**Description:** The HMAC secret used to sign all ingest requests is a plain-text string constant in the Kotlin source. Android APKs are trivially decompilable with tools like `jadx` or `apktool`. An attacker who decompiles the APK (e.g., from the download endpoint or a MITM proxy) extracts the secret in seconds.

**Exploit scenario:**
1. Attacker decompiles `health-bridge.apk` (available at `/download`).
2. Extracts `SIGNING_SECRET` and `DEVICE_ID` constants.
3. Crafts any JSON payload, computes correct HMAC signature locally.
4. Sends signed requests directly to `http://127.0.0.1:3007/ingest_signed` (bypasses proxy entirely) or via Tailscale Funnel.
5. Writes arbitrary health records to Chicho's database, corrupting daily summaries.

**Suggested fix:** Never store secrets in APK source. Use Android Keystore (`AndroidKeyStore` + AES-GCM or the `SecurityCrypto` library) to generate/store a per-device key on first launch. Alternatively, use a signed JWT issued by a real auth server. The secret in the source is the core credential; everything else (HMAC, nonce, timestamp) is defense-in-depth that collapses if this is exposed.

---

## Finding 2 — CRITICAL

**File/Line:** `server.py:328-330`

```python
except sqlite3.IntegrityError:
    raise HTTPException(status_code=409, detail="Replay nonce")
db.execute("DELETE FROM ingest_nonces WHERE received_at < datetime('now', '-1 day')")
```

**Finding:** Nonce cleanup runs only after the successful INSERT. If the DELETE fails (e.g., WAL lock, disk full, permission error), there is no retry and no fallback. The cleanup is also only triggered on requests that passed full HMAC verification — not on background GC.

**Exploit scenario (low probability but high impact):**
- The `DELETE` fails once (e.g., transient SQLite WAL contention).
- Old nonces accumulate. The table grows without bound until disk fills or SQLite performance degrades.
- A second `DELETE` failure on the same condition means cleanup never runs again.

**Suggested fix:** Move DELETE to a separate call with its own try/except and log. Add a periodic cleanup task (e.g., FastAPI lifespan startup + periodic async task) so cleanup happens even if no valid requests arrive. Consider adding a cleanup threshold (e.g., DELETE when >10K rows exist).

---

## Finding 3 — HIGH

**File/Line:** `server.py:307`

```python
if device != SIGNED_INGEST_DEVICE or not nonce or not timestamp_raw or not signature:
    raise HTTPException(status_code=401, detail="Missing or invalid signed ingest headers")
```

**Finding:** The `X-HB-Device` header is validated with a simple string equality check *before* HMAC verification. Since the device is already included in the signed payload (`device\ntimestamp\nnonce\nbody_hash`), this check is redundant — the HMAC already binds the device.

The check is not a vulnerability by itself (HMAC verification is correct), but it creates a false sense of security: a developer could mistakenly change `SIGNED_INGEST_DEVICE` to the wrong value and believe the system is still protected, when in fact any device value matching the signature would work.

**Exploit scenario:** None directly, since HMAC verification covers device. Risk is misconfiguration.

**Suggested fix:** Remove the pre-HMAC device check. The signature already authenticates the device. Keep the check only if you want to fail-fast for obvious misconfigurations (in which case, log a warning rather than silently ignoring it).

---

## Finding 4 — HIGH

**File/Line:** `server.py:313-314`

```python
if abs(int(time.time()) - timestamp) > SIGNED_INGEST_MAX_SKEW_SECONDS:
    raise HTTPException(status_code=401, detail="Timestamp outside allowed window")
```

**Finding:** The default `SIGNED_INGEST_MAX_SKEW_SECONDS = 300` (5 minutes each direction) means the window is **up to 600 seconds wide**. A valid signed request captured by an attacker can be replayed for up to 10 minutes.

**Exploit scenario:**
1. Legitimate app sends a signed request at `t=0`.
2. Attacker intercepts the request (on the local network, via a compromised router, or from server logs).
3. Attacker replays the same request at `t=400` (6.6 minutes later). Server time is `t=800`. `|800 - 500| = 300` — within window. Request accepted.
4. Even with a fresh nonce, the attacker can inject the same records multiple times, causing duplicate writes (overwriting with the same data is idempotent but still wasteful and confusing).

**Suggested fix:** Reduce `SIGNED_INGEST_MAX_SKEW_SECONDS` to 60 seconds or less. The app uses `System.currentTimeMillis() / 1000` which should be within a few seconds of the server. A 60-second window accommodates normal clock drift while limiting replay exposure. Better: add a server-side monotonic counter (e.g., `ts:nonce:seq`) or use a short-lived JWT instead of HMAC.

---

## Finding 5 — HIGH

**File/Line:** `server.py:297-330` (`_verify_signed_ingest` function, no rate limiter)

**Finding:** `/ingest_signed` has no rate limiting whatsoever. Any party with a valid HMAC (e.g., from the decompiled APK) can send unlimited requests.

**Exploit scenario:**
1. Attacker extracts secret from APK.
2. Sends 1 request/second with 2MB payloads.
3. `raw_data` table grows ~80MB/day. `daily_summary` rebuilds run constantly.
4. SQLite WAL file grows unbounded.
5. Disk eventually fills.

**Suggested fix:** Add per-device rate limiting (e.g., a simple in-memory counter: max 10 requests/minute per device). Use a library like `slowapi` attached to the FastAPI app, keyed on `X-HB-Device`. Alternatively, use a sliding-window nonce table with TTL.

---

## Finding 6 — HIGH

**File/Line:** `server.py:325-330` (nonce table growth)

**Finding:** Nonce cleanup (`DELETE FROM ingest_nonces WHERE received_at < datetime('now', '-1 day')`) runs only on successful requests. If the server receives no valid signed requests for an extended period, stale nonces accumulate indefinitely.

**Exploit scenario:**
1. App stops syncing (e.g., user goes on vacation).
2. Nonces from past days accumulate.
3. Table size grows linearly with usage lifetime.
4. Eventually: WAL bloat, query slowdown on nonce INSERT (PK lookup scans growing index).

**Suggested fix:** See Finding 2. Move cleanup to a background task. Also add an explicit max-rows guard: `DELETE ... LIMIT 1000` per request to avoid long-running DELETEs on huge tables.

---

## Finding 7 — MEDIUM

**File/Line:** `server.py:29`

```python
SIGNED_INGEST_SECRET = os.environ.get("HEALTH_BRIDGE_SIGNED_INGEST_SECRET", "")
```

**Finding:** The secret defaults to empty string `""`. If the operator never sets the env var, the endpoint returns 503 (correct). However, if the operator sets a short or guessable secret (e.g., `"secret"`, `"hunter2"`, the server starts and accepts requests with weak HMACs.

**Exploit scenario:**
1. Operator sets `HEALTH_BRIDGE_SIGNED_INGEST_SECRET=secret` for testing and forgets to rotate it.
2. Attacker guesses the secret (or finds it in a leaked `.env` file).
3. Attacker can now forge arbitrary signed requests.

**Suggested fix:** Add startup validation: if the secret is set but shorter than 32 bytes, log a warning and refuse to start. Document the minimum secret requirements clearly.

---

## Finding 8 — MEDIUM

**File/Line:** `server.py:292`

```python
if token != AUTH_TOKEN:
```

**Finding:** The bearer token comparison uses `!=` (string inequality) rather than `hmac.compare_digest`. While the token is a random-looking string from the service file, direct string comparison leaks the equality check timing via response latency (though the delta is nanoseconds — likely not exploitable over a network).

**Exploit scenario:** Theoretically, a network-level attacker could measure response times for `401` vs `200` to narrow down the token. In practice, for a 40-character random token, this is not realistically exploitable.

**Suggested fix:** Use `hmac.compare_digest(token, AUTH_TOKEN)` instead. This is a one-line fix with no downside.

---

## Finding 9 — LOW

**File/Line:** `ingest_public_proxy.py:24-42`

**Finding:** The proxy passes `X-HB-Signature`, `X-HB-Timestamp`, `X-HB-Nonce`, and `X-HB-Device` without validation and adds no rate limiting of its own. Any attacker who captures a valid signed request can replay it through the proxy.

**Note:** This is not an additional vulnerability because the backend at `127.0.0.1:3007` is reachable directly from the proxy host. However, the proxy's lack of rate limiting means it cannot serve as a choke point to catch or throttle abuse.

**Suggested fix:** If the proxy is meant to be a security gate (not just a routing layer), add per-IP rate limiting. If it's just routing, document clearly that the backend must not be directly accessible from untrusted networks.

---

## Finding 10 — LOW / INFO

**File/Line:** `server.py:330`

**Finding:** The `DELETE` cleanup uses `datetime('now', '-1 day')` which is evaluated at statement execution time. If multiple requests arrive within the same second, they all trigger the same DELETE. This is fine but slightly inefficient.

**Note:** The proxy exposes `/health` and static file serving alongside `/ingest_signed`. The `/fba/` and `/static/` paths serve files from `/home/ubuntu/baby-checklist/public/`. A path traversal (e.g., `/static/../../etc/passwd`) is not possible because FastAPI resolves the path literally after the prefix, but if `baby-checklist/public/` symlinks to sensitive directories, that could be a concern. Not exploitable given the current code, but worth noting.

---

## What I Did Not Check

- **Network-level exposure:** I did not verify whether `127.0.0.1:3007` is truly unreachable from outside the server (i.e., whether there's a firewall rule preventing direct access). The `server.py` binds to `0.0.0.0:3007` which means it's listening on all interfaces. If the Tailscale Funnel or any other mechanism accidentally exposes port 3007, the HMAC check is the only barrier.
- **Tailscale Funnel ACLs:** I did not audit Tailscale ACL policy for port 8443 exposure scope (tailnet-only vs. internet).
- **Android APK binary analysis:** I did not decompile the actual built APK to verify the secret is present as-is. I read the source, which is sufficient to declare it hardcoded.
- **SQLite WAL growth under concurrent writes:** The nonce table uses `PRAGMA journal_mode=WAL`. Under high concurrent load, WAL files can grow large. I did not stress-test this.
- **Replay via clock drift:** I did not check whether server clock drift combined with max-skew can extend the replay window beyond 10 minutes.
- **Data integrity of `_ingest_body`:** I did not verify whether duplicate records (same `record_key`) truly produce idempotent `ON CONFLICT` updates or whether there are edge cases where the summary rebuild produces incorrect results.
- **Input validation on `record_key`:** The `record_key` is derived from client-supplied fields (`subject`, `id`, etc.). Malformed `record_key` values could potentially cause issues in downstream processing.
- **Dependency vulnerabilities:** I did not audit the Python dependencies (`fastapi`, `uvicorn`, `httpx`, etc.) for known CVEs.
- **TLS certificate validation in Android app:** The app uses `HttpURLConnection` with default certificate validation. I did not verify whether it properly validates the Tailscale Funnel TLS cert.
- **Code review of `ingest_public_proxy` error handling:** The proxy uses `httpx.AsyncClient` with a 20s timeout. I did not verify what happens on upstream timeout or whether error responses leak information.
- **Audit of `/health-ingest` Tailscale serve route:** There is a `tailscale serve` route `/health-ingest` → `127.0.0.1:3007`. I did not verify whether this path bypasses the proxy and what its exposure is.