PassLane “Lane” — Third Draft (Run 3)
Adversarial evaluation + refinement · code-verified · plan only · 2026-06-19

Principal / Head of Product + Engineering · 2026-06-19 · Refinement of Run-2, not a rewrite

Code-verified against the single 287 KB passlane/app/index.html, questions.json (323 Qs), voice-sandbox/harness.js, the vendored speech fork at passlane-ios/local-plugins/speech-recognition, and app/privacy.html. Every line citation below was re-checked this session.


1. Verdict on the Fundamentals

Run-2's mechanical skeleton holds. Run-2's product thesis for Tier-1 does not. Two launch-blockers were missed by both prior runs, and one of them is live in the store build right now, independent of Lane.

What the code verification VALIDATES (build on these, do not relitigate):

What MUST change — bluntly:

  1. Tier-1 as specced is net-negative, and the data is worse than run-2 thought. Run-2's premise ("the one 1–2 sentence already on screen") is false. Verified: explanations have a 37-word median (mean 37.1), a 23-word minimum, and a 93-word maximum; 300 of 323 are ≥30 words; zero are terse. They are teaching paragraphs, not stubs. So "Put it simply / Show an example / When does this apply?" chips can only reprint (forbidden) or restate a substantial paragraph (lipstick). Shipping echo chips burns trust on the highest-intent beat. The clarifier chips come out of Phase 1.
  2. The 2-second autoadvance default makes the whole feedback card vanish before it's read. Verified: advanceSeconds = 2 (L1735, slider value=2 L1423, PACING_DEFAULTS.advance L4476). In silent/text mode the bar shows, then setTimeout(advanceNow, 2000) (L3037) wipes it. Run-2's "passive source chip does not freeze the timer" rule means Lane's entire affordance flashes and dies in 2s for the default-config majority. Neither prior run stated the default. This single fact governs whether Tier-1 is worth anything.
  3. The "Report this" control that both runs assume is shipping does not exist. Grep confirms the only "flag" is az_flagged (L1853/L1881), which is a replay bookmark, not a content-error report. There is no az_reports buffer, no report button, no sink, anywhere. Google Play's generative-AI policy requires an in-app report path on any AI feature. Both runs lean on it as if built. It is vaporware and a launch-blocker for any AI tier.
  4. privacy.html L41 already makes a false claim, in the live build, before any Lane work. It reads: "PassLane itself does not record, store, or transmit any audio." The shipping answer mic round-trips audio to Apple/Google. This is a present compliance defect with zero dependency on Lane and must be hotfixed in the imminent launch, not filed under Phase 1.5.

Net: keep the wiring, gut the Tier-1 content thesis, fix two things that have nothing to do with AI, and stop building the voice narrative on a privacy claim that is per-device-false.


2. Changelog vs Run-2

KEPT

CHANGED

DROPPED

ADDED


3. Refinements & Upgrades (prioritized, with effort)

P0 — do before any cx- code, foundational, cheap:

  1. Hotfix privacy.html L41 copy — ship immediately, zero review. Replace the false audio claim with the truthful "voice answering uses your device or platform speech service (Apple/Google), which may convert your speech to text on their servers; PassLane does not itself store or keep any audio." Decoupled from Lane; ship in the current launch. Effort: S.
  2. Reconcile the Apple privacy label + Google Data-safety form — start now, but do NOT block the copy fix or the launch on it. These are store-console edits that can trigger re-review and delay an imminent launch; sequence them in parallel, not as a gate on P0-1. Effort: S (work) + variable (review clock).

P1 — Phase-1 Lane (the integration spike), each concrete:

  1. Ship Tier-1 as source-chip-ONLY + the local-state teaching strip. Provenance chip (ti-shield-check, "AZ question bank · Disability income," scoped to STATE_FILE) + a your-data line (weakCategories() + progress[id]: "Insurable Interest — you're at 40% here") + one "Drill this topic" button routing to the existing startStudy('weak')/buildQueue filtered to q.category. Zero authored prose, all local, $0, dodges the no-reprint trap entirely. Effort: M.
  2. Build the az_reports buffer + "Report this" control, distinct from az_flagged, writing {qid, shownText, ts} locally, surfaced via a hidden Settings copy-to-clipboard export. Effort: S.
  3. Resolve the autoadvance contract: source chip lives as a hairline footer inside the existing 32vh explanation scroller (≈16px, not a new row); a one-shot first-session beat (reuse the az_seen_hint idiom) pauses the 2s timer once to teach provenance; thereafter passive. Effort: M.
  4. Adaptive vertical budget: on engage, collapse #feedback-expl's 32vh cap (it's been read) + the redundant voice-bar; give reclaimed space to cx-lane capped at min(30vh, remaining); feedback-bar is the single scroll container, .feedback-next stays flex-shrink:0. Prototype at 667pt before calling Phase 1 done. Effort: M.
  5. Mandatory displayQuestion teardown of the cx-lane sibling at L2755, beside the existing voice-transcript clear. Nothing in the current reset path touches a sibling. Effort: S.
  6. Dedicated cx- focus/input CSS in green→cyan (+ light-theme olive remap) — reuse the #fb-text HTML idiom, never its blue :focus. Effort: S.
  7. First-run Lane hint for the free text-only majority + fire the AI-disclosure on first TEXT appearance (not first voice turn — the majority never speaks). Effort: S.
  8. Adopt the cx- placement convention (one-line note, not a project): put all cx- JS inside the existing app <script> block (the one containing qSilenceGen). harness.js already hard-fails loudly at L50-51 if the extracted block lacks the voice engine, so a stray <script> fails, it does not silently false-green; the only residual risk is cx- voice code landing in a different block, which this convention closes. Effort: trivial.
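
The az_reports buffer in item 2 can be sketched as below. This is a minimal sketch, not the shipped design: only the az_reports key and the {qid, shownText, ts} record shape come from the plan. The function names, the 200-entry cap, and the injected store (any object with localStorage-style getItem/setItem) are assumptions made so the logic can run outside the WebView.

```javascript
// Sketch only. The az_reports key and the {qid, shownText, ts} record shape
// come from the plan; every function name and the cap are hypothetical.
const REPORTS_KEY = 'az_reports';
const MAX_REPORTS = 200; // assumed cap so the local buffer stays bounded

// Read the buffer, tolerating a missing or corrupt value.
function readReports(store) {
  try {
    const parsed = JSON.parse(store.getItem(REPORTS_KEY) || '[]');
    return Array.isArray(parsed) ? parsed : [];
  } catch (err) {
    return [];
  }
}

// Append one report and return how many are now stored.
function addReport(store, qid, shownText, now = Date.now()) {
  const kept = readReports(store)
    .concat({ qid, shownText, ts: now })
    .slice(-MAX_REPORTS); // keep only the newest entries
  store.setItem(REPORTS_KEY, JSON.stringify(kept));
  return kept.length;
}

// Text for the hidden Settings copy-to-clipboard export.
function exportReports(store) {
  return JSON.stringify(readReports(store), null, 2);
}

// In-memory stand-in for localStorage so the sketch runs outside a browser.
function memoryStore() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}
```

In the app the same calls would receive window.localStorage instead of memoryStore(). The buffer never reads or writes az_flagged, which keeps the report path distinct from the replay bookmark.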

P2 — conversion + privacy correctness (copy/config, high-leverage):

  1. Add PW_CONTEXT.ask selling coaching ("You've got the why. Want to ask follow-ups out loud and talk any question through? That's Plus.") and route Lane's voice tiers through it. Effort: minutes.
  2. Re-headline Pro around READINESS, not conversation — "Know you're ready before you book the test"; voice drops from headline to a bullet. Reuses shipped computeReadiness()/weakCategories()/exam scoring. Effort: hours (copy) + M (the Opus diagnostic later).
  3. Reconcile to ONE price set before any Lane copy: the live IAP truth is $8.99/mo (L1529) + $39.99 lifetime (L1525), not the brief's $15.99/$59.99 nor the referral memo's $7.99/$29.99. No annual plan exists (product-id comments L2307-2308). Effort: hours.

P3 — voice ladder (spike-gated):

  1. Separate startCoachListening() with own CX_HARD_CAP_MS≈30000, own ~2.0–2.5s adaptive endpoint, own cue, no 900ms answer-fallback; branch all three answer-clock touchers on cxListening. Effort: M.
  2. Server-verified receipt → per-install signed token gating any transmitting/billable call (consent + spend both bind to it). Effort: M.

4. Implementation Reality

How it actually works, tier by tier

Tier 1 (TEXT, free, $0, ships first). On feedback, the grounded sentence renders as today in #feedback-expl. Lane inserts a cx-lane sibling between L1633 and L1634 containing: the provenance chip (hairline footer inside the explanation scroller), the local-state teaching line, and "Drill this topic." It calls clearAdvanceTimer() only on explicit engagement; the passive render does not freeze the 2s timer (but the one-shot first-session beat does, once). displayQuestion tears it down. It transmits nothing — it ships no audio and makes no network call. This is the only tier in the first release, and it is the integration spike that proves the whole additive seam (sibling insert, teardown, brand CSS, placement convention, autoadvance timing) before any mic or authored content exists.
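
The timer rule in the Tier 1 paragraph (passive render leaves the 2s timer running, the first-ever appearance pauses it once, explicit engagement always pauses) reduces to one small decision. A hedged sketch: the az_seen_lane key is hypothetical and only modeled on the az_seen_hint idiom, and the store is any localStorage-style object.

```javascript
// Hypothetical key, modeled on the az_seen_hint idiom; not an existing key.
const SEEN_LANE_KEY = 'az_seen_lane';

// Decide what a feedback render does with the autoadvance timer.
// Returns 'pause' (the caller clears the timer) or 'run' (leave it armed).
function autoadvanceAction(store, engaged) {
  if (engaged) return 'pause'; // an explicit tap on Lane always pauses
  if (store.getItem(SEEN_LANE_KEY) !== '1') {
    store.setItem(SEEN_LANE_KEY, '1'); // record the one-shot beat
    return 'pause'; // first passive appearance teaches provenance
  }
  return 'run'; // passive thereafter
}
```

A caller would run clearAdvanceTimer() only when the result is 'pause', so the default 2s path is untouched after the first appearance.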

Tier 2 (PUSH-TO-TALK, Pro + consent, spike-gated). Press-and-hold the same control; a separate startCoachListening() window streams interim text into a Lane-owned transcript; release endpoints, finalized transcript shows, then matches the grounded bank. Inherits nothing from startListening — re-implements the isExam guard, the modal-refusal guard (L3505), and explicitly does NOT inherit the 900ms fallback.

Tier 3 (HANDS-FREE, Pro + consent, the only transmitting tier). Half-duplex, entered from Home. Loop: read → answer → clearAdvanceTimer() → Lane speaks the grounded "why" → Lane-owned "your turn" cue (never speechReady) → cx-listen → adaptive endpoint → think → answer → re-arm. Sends text only, never audio, after the consent gate. Barge-in over Lane's speech is tap-only by construction (no mic is open while Lane talks, per the .playAndRecord/duckOthers session) — "voice stop" applies only inside the user's own listen window. Do not imply a voice interrupt the architecture cannot deliver.
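
The half-duplex loop and its barge-in limit can be written down as a transition table. This is an illustrative sketch: the state names are labels for the steps in the paragraph above, not identifiers from index.html, and states outside Lane's speech and the listen window are not modeled for stop inputs.

```javascript
// State names are illustrative labels for the loop steps described above;
// none of them is an identifier from index.html.
const NEXT_STATE = {
  read: 'answer',     // question is read aloud, then the user answers
  answer: 'speakWhy', // autoadvance is cleared before Lane speaks
  speakWhy: 'cue',    // Lane speaks the grounded "why"
  cue: 'listen',      // Lane-owned "your turn" cue
  listen: 'think',    // coach listen window, closed by the adaptive endpoint
  think: 'reply',
  reply: 'read',      // re-arm for the next question
};
const LANE_SPEAKING = new Set(['speakWhy', 'reply']);

function nextState(state) {
  if (!Object.prototype.hasOwnProperty.call(NEXT_STATE, state)) {
    throw new Error('unknown state: ' + state);
  }
  return NEXT_STATE[state];
}

// The coach mic is open only inside the user's own listen window.
function coachMicOpen(state) {
  return state === 'listen';
}

// Inputs that can stop the current state: voice needs an open coach mic.
function stopInputs(state) {
  if (coachMicOpen(state)) return ['tap', 'voice'];
  if (LANE_SPEAKING.has(state)) return ['tap'];
  return []; // other states are not modeled in this sketch
}
```

Because stopInputs never returns 'voice' for a state in LANE_SPEAKING, any UI copy generated from this table cannot promise a voice interrupt while Lane is talking.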

The real landmines (from the verification ledger) + how each is handled

Landmine: textContent wipes children every Q (L2927/L3003)
Handling: Insert as sibling, never child. Verified survives.

Landmine: displayQuestion does NOT touch feedback-bar siblings (L2755 hides bar only)
Handling: Lane adds its own explicit teardown at L2755. Mandatory.

Landmine: Shared single-bind handlers (partialResults L3373, speechReady L3386)
Handling: Branch on cxListening at the top of each; when false, byte-for-byte current behavior → harness stays green.

Landmine: THIRD answer-clock toucher: 900ms speechReady fallback (L3542-3549, fires haptic + noteVoiceActivity())
Handling: Separate startCoachListening() with its own cue; never inherit the fallback.

Landmine: startListening isExam guard is local (L3501)
Handling: startCoachListening() re-implements if (isExam) return + the L3505 modal guard at its own entry.

Landmine: advanceSeconds=2 default (L1735/L4476) wipes the card in 2s
Handling: One-shot first-session pause; passive thereafter; chips (if ever) only in Instant/manual pacing.

Landmine: #feedback-expl already aria-live=polite (L1633)
Handling: Announce only on explicit engagement; passive chip is not a second live region; streaming target aria-hidden.

Landmine: .fb-text:focus is var(--blue) (L895)
Handling: Dedicated cx- focus CSS in green→cyan.

Landmine: Harness lastIndexOf('<script>') over 3 tags (L44)
Handling: Already hard-fails loudly (L50-51) if the grabbed block lacks the voice engine; a stray <script> fails, not false-greens. Residual rule: keep cx- JS inside the app block (the one with qSilenceGen).

Landmine: No az_reports / report control exists (only az_flagged bookmark, L1853/L1881/L2652)
Handling: Build it in Phase 1, distinct buffer, offline queue.

Landmine: isPro() spoofable localStorage (L2148) gates all voice (canUseVoice L2174)
Handling: Any transmitting/billable tier gates on server-verified receipt, not the client flag.

Landmine: privacy.html L41 false today
Handling: Hotfix in current launch, Lane-independent.
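
The "branch on cxListening" handling for the shared handlers and for the 900ms fallback can be sketched as two small wrappers. Assumptions: state.cxListening stands in for the flag named above, and both helper names are hypothetical. The point of the second helper is that the fallback is a bare timer, so the flag is checked when the timer fires rather than when it is armed.

```javascript
// state.cxListening mirrors the flag named above; the two helpers below are
// hypothetical wrappers, not functions from index.html.

// Wrap a single-bind plugin handler (partialResults, speechReady).
function routeByCoachFlag(state, answerHandler, coachHandler) {
  return function routed(event) {
    if (state.cxListening) return coachHandler(event);
    return answerHandler(event); // flag false: the existing path runs unchanged
  };
}

// Guard the 900ms fallback, which is a bare timer rather than a plugin event.
// The flag is checked when the timer fires, not when it is armed.
function armAnswerFallback(state, onFallback, schedule = setTimeout) {
  return schedule(() => {
    if (!state.cxListening) onFallback();
  }, 900);
}
```

With the flag false, routed(event) is a plain call to the original handler, which is what lets the existing harness scenarios stay green.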

Feasible-now vs spike-gated

What would actually break, and the guard

Spike reality (de-risked, not free)

The plugin is already a vendored fork at passlane-ios/local-plugins/speech-recognition — hand-edited Swift + Java, compiling. The spike is "add a property to code we own," not "fork from zero." But: iOS on-device needs a first-run model download (silent cloud fallback during that window — the worst moment for an honesty product), iOS 26's new DictationTranscriber drops the contextualStrings biasing the fork uses for A/B/C/D letters, and on-device accuracy is lower. So the spike's pass bar splits into two gates: (1) on-device PHRASE transcription offline (near-certain), and (2) single-LETTER A/B/C/D offline (the genuine maybe). Recast success as: on-device is the default and the code asserts it (no silent cloud fallback — if unavailable, voice is disabled, not quietly transmitted); a cold-start "preparing offline voice…" UX; a vocab-specific accuracy floor.
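
The recast success bar can be stated as code that has no cloud branch at all. A minimal sketch under stated assumptions: the capability flags (onDeviceSupported, modelReady) are hypothetical values the native fork would have to report, and the accuracy floors are parameters because the plan does not set numbers.

```javascript
// Capability flags are hypothetical inputs the native fork would report.
// There is deliberately no 'cloud' outcome: unavailable means disabled.
function voicePolicy({ onDeviceSupported, modelReady }) {
  if (!onDeviceSupported) return { voice: 'disabled' };
  if (!modelReady) {
    return { voice: 'preparing', message: 'preparing offline voice…' };
  }
  return { voice: 'on-device' };
}

// Two-gate pass bar. Floors are parameters because the plan sets no numbers.
function spikeGates(measured, floors) {
  return {
    phraseGate: measured.phraseAccuracy >= floors.phrase,
    letterGate: measured.letterAccuracy >= floors.letter,
  };
}
```

Keeping the two gates as separate booleans lets the phrase gate pass while the letter gate is still open, which matches the split described above.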


5. The Tier-1 Question, Settled

Is "a source chip + capped chips that can only paraphrase the one sentence on screen" worth shipping? No — and the verification proves it harder than run-2 admitted. The explanations are not stubs: 37-word median, mean 37.1, 300/323 ≥30 words. A "Put it simply" chip on a paragraph this substantial is either a reprint (forbidden) or a reword (lipstick). One empty tap on the highest-intent beat teaches "Lane is filler," permanently — the worst outcome for a trust-is-the-moat product.
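
The corpus figures quoted above can be re-derived with a short script. A minimal sketch, assuming the {questions: [...]} shape with an explanation field and assuming words are split on whitespace; the tokenizer behind the quoted numbers is not stated, so counts may differ slightly. The function names are hypothetical.

```javascript
// Count whitespace-separated words; an assumed tokenizer, see the note above.
function wordCount(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Summarize explanation lengths for an array of question objects.
function explanationStats(questions) {
  const counts = questions
    .map((q) => wordCount(q.explanation || ''))
    .sort((a, b) => a - b);
  const n = counts.length;
  if (n === 0) return null;
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? counts[mid] : (counts[mid - 1] + counts[mid]) / 2;
  const mean = counts.reduce((sum, c) => sum + c, 0) / n;
  return {
    n,
    min: counts[0],
    median,
    max: counts[n - 1],
    mean,
    atLeast30: counts.filter((c) => c >= 30).length,
    empty: counts.filter((c) => c === 0).length,
  };
}
```

Run against the parsed questions.json as explanationStats(bank.questions), this yields the min/median/max, mean, and ≥30-word count in one pass.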

A note on distractor discrimination (corrected from run-2's draft): the corpus discriminates distractors far less than first claimed. A strict detector — explanations that explicitly contrast the right answer against a named wrong concept (e.g. "This differs from life insurance, where…") — finds only ~2–8%. A loose detector that counts any contrastive/negation token ("not," "while," "rather," "however") reaches ~30%, but most of those are ordinary exposition using the word "not," not distractor teaching. The honest figure is therefore ~2–8% genuine discrimination, ~30% merely contrastive prose. This strengthens the case below: an authored "Why not B?" layer is less pre-empted by the existing corpus than a "33% already do it" claim would imply.
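
The gap between the two detectors is easy to reproduce in miniature. In this sketch the loose token list is the one quoted above; the strict patterns are an assumption for illustration only (the actual strict detector is not specified in this document), so the percentages it produces will not match the quoted figures exactly.

```javascript
// Loose tokens are the ones quoted in the text above.
const LOOSE = /\b(not|while|rather|however)\b/i;
// Assumed stand-in patterns for "explicitly contrasts a named wrong concept".
const STRICT = /\b(differs? from|unlike|as opposed to|in contrast to)\b/i;

// Share of explanations each detector fires on, as percentages.
function detectorRates(questions) {
  const n = questions.length;
  if (n === 0) return { strictPct: 0, loosePct: 0 };
  let strict = 0;
  let loose = 0;
  for (const q of questions) {
    const text = q.explanation || '';
    if (STRICT.test(text)) strict += 1;
    if (LOOSE.test(text)) loose += 1;
  }
  return { strictPct: (100 * strict) / n, loosePct: (100 * loose) / n };
}
```

The loose detector fires on any sentence containing "not", which is why it overstates distractor teaching.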

The minimum that makes Tier-1 genuinely worth shipping — and it needs no authored prose corpus:

  1. The provenance chip — genuinely new information (which vetted corpus this came from, scoped to the real STATE_FILE, with the AZ+5-state fallback). Trust, not paraphrase. Zero content cost.
  2. A local-state teaching strip — the user's own miss pattern (weakCategories(), progress[id]), a misconception pointer driven by the wrong letter they actually picked (submitAnswer already knows it), and one-tap "Drill this topic" into the existing weak-drill machinery. This is information the flat explanation cannot contain — the learner's behavior — and it's what makes a named companion feel like "a teacher who's paying attention."
  3. The az_reports report control — Play-required, and the day-one escape hatch the moment Lane frames any content.
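
The your-data line in item 2 is plain arithmetic over local progress. A hedged sketch: the per-question progress shape ({attempts, correct}) is an assumption, since the real progress[id] structure is not shown in this document, and both function names are hypothetical. The wrong-letter misconception pointer is not modeled here.

```javascript
// Assumed progress shape: progress[id] = { attempts: number, correct: number }.
// The real progress[id] structure in index.html may differ.
function categoryAccuracy(questions, progress, category) {
  let attempts = 0;
  let correct = 0;
  for (const q of questions) {
    if (q.category !== category) continue;
    const p = progress[q.id];
    if (!p) continue; // never attempted
    attempts += p.attempts;
    correct += p.correct;
  }
  return attempts === 0 ? null : Math.round((100 * correct) / attempts);
}

// Build the strip text, or null when there is no local data to show.
function teachingLine(questions, progress, category) {
  const pct = categoryAccuracy(questions, progress, category);
  return pct === null ? null : category + ": you're at " + pct + '% here';
}
```

Returning null when the learner has no attempts in the category keeps the strip from showing a made-up figure.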

The honest alternative (if even that strip is descoped): ship the provenance chip alone. It is honestly useful, offline, $0, and the front door for the voice ladder. Never ship echo chips, and never ship a disabled "Ask Lane" with a waitlist.

On the authored second corpus (the deferred, larger truth): if a future authored layer is funded, it is not a blanket 323×3 fill and not a mechanical "Why not B?" sprint across every distractor in all six live banks. The right mechanism: run the existing generate→judge pipeline (master-plan §4.4) with the judge question inverted to novelty ("does this add information NOT already in q.explanation?"). Keep only cells that pass containment AND novelty; emit null otherwise — null is the expected majority state. Store it in a separate coach-az.json (id-keyed, src_hash-stamped) so it never bloats the frozen bank, stays kill-switch-suppressible per id, and edits without a bank rebuild. This settles run-2's OQ3 empirically for single-digit dollars instead of eyeballing 2-3 samples — but it is a fast-follow, not a Phase-1 dependency.
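
The keep-or-null rule for coach-az.json can be sketched as follows. Assumptions are flagged in the code: the verdict object ({containment, novelty}) is a hypothetical shape for the judge output, the function names are invented, and FNV-1a is only a stand-in because the real src_hash scheme is not specified here.

```javascript
// Stand-in hash (32-bit FNV-1a over UTF-16 code units). The real src_hash
// scheme is not specified in this document.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

// Keep a candidate only if it passes containment AND novelty; null otherwise.
// `verdict` is a hypothetical shape for the judge output.
function coachCell(question, candidate, verdict) {
  if (!candidate || !verdict.containment || !verdict.novelty) return null;
  return { text: candidate, src_hash: fnv1a(question.explanation) };
}

// Build the id-keyed file contents; null is the expected majority state.
function buildCoachFile(questions, candidates, verdicts) {
  const out = {};
  for (const q of questions) {
    out[q.id] = coachCell(q, candidates[q.id], verdicts[q.id] || {});
  }
  return out;
}

// A cell is stale when the bank explanation changed after it was authored.
function isStale(cell, question) {
  return cell !== null && cell.src_hash !== fnv1a(question.explanation);
}
```

Stamping each cell with a hash of the explanation it was judged against means an edited bank entry can suppress its coach cell without a rebuild.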


6. New Considerations (what runs 1–2 missed)

  1. The 2-second autoadvance default — neither run stated advanceSeconds=2. It makes the entire feedback card, and therefore all of Tier-1, vanish before the default-config majority can read it. This is the most important miss; it reframes the whole "highest-intent beat" premise.
  2. The report control is vaporware — both runs treat az_reports/a report button as shipping; grep proves only az_flagged (a bookmark) exists. Play-required, so a launch-blocker.
  3. privacy.html L41 is already false — a live, Lane-independent compliance defect, not a future Plane-B task.
  4. The aria-live collision: #feedback-expl is already aria-live=polite; a second polite region is an accessibility regression (doubled spoken verbosity per question) for an older, partly low-vision candidate pool, not a feature.
  5. The 900ms speechReady fallback is a third answer-clock toucher — a bare setTimeout, not a plugin event, so a "branch the listeners" reading misses it.
  6. The explanation corpus shape — 37-word median, 300/323 ≥30 words, and (corrected) only ~2–8% genuinely discriminate a named distractor while ~30% merely contain a contrastive word. The Tier-1 problem is too much on-screen teaching prose for chips to add to, not too little.
  7. The paywall contexts sell the wrong thing: voice="drive," listen="read-aloud"; reusing them for Lane's "ask" wall produces the "pay to talk" read mechanically. Neither run read the copy it proposed reusing.
  8. Voice may be the wrong Pro hook — test-prep buyers pay for rationales, adaptive drilling, scored mocks, and readiness analytics; conversation is absent from that list. Duolingo moved its AI explanation to FREE in Jan 2026 (gating didn't convert) and found making it free didn't accelerate growth — free AI teaching is necessary for trust but not a conversion lever. Put the dollar on readiness.
  9. The plugin is already forked — the spike is an edit to owned code, materially de-risking the voice ladder (neither run noticed); but iOS-26's DictationTranscriber drops contextualStrings, and on-device has a cold-start cloud-fallback window — so "audio never leaves the device" is per-device-conditional, not binary.
  10. Half-duplex barge-in over Lane's speech is architecturally impossible (no mic open while Lane talks) — an eyes-busy gap the commuter hits immediately.
  11. No onboarding for the free text-only majority — run-2's disclosure is voice-turn-centric, so the largest segment (and the one that decides the store rating) gets no Lane intro and the AUP disclosure can be missed entirely.
  12. i18n hard wall: <html lang="en">, all-English corpus + copy, English-only on-device STT. A Spanish-first commuter is exactly the voice persona, silently excluded. Externalize Lane's new strings now (cheap insurance); defer the Spanish fork explicitly.

7. Updated Build Plan

Phase 0 — Launch hygiene (decoupled from Lane, in the current release):

Gate: harness green; privacy copy truthful against the shipped answer mic; store-label work in flight but not gating the copy fix.

Phase 1a — CRAWL: provenance + report (integration spike, free, $0): the cx-lane sibling, source chip (hairline footer in the 32vh scroller), displayQuestion teardown, dedicated cx- brand CSS, adaptive vertical budget, the one-shot first-session autoadvance pause, the az_reports report control, the first-run hint + AI-disclosure-on-first-text. Prereq: P0a/P0c done. Gate: voice-sandbox/harness.js green before/after (asserts zero diff to mode_select→…→advancing); prototype at 667pt keeps .feedback-next pinned; airplane-mode smoke green; zero new /api/ calls. This proves the additive seam with zero voice and zero authored content.

Phase 1b — the local-state teaching strip: miss-pattern line + wrong-letter misconception pointer + "Drill this topic" on existing weakCategories()/startStudy('weak'). Gate: routes correctly per category; transmits nothing.

Phase 1.5 — GATE: on-device-STT spike on real iOS + Android against IOS-VOICE-TEST-PLAN.md, split into the phrase gate and the letter gate, with the no-silent-fallback + cold-start + vocab-floor success bar. Until green: no "sends nothing" copy, PTT out of release. Content prerequisite (parallel, off the critical path): run the generate→judge novelty pass to size coach-az.json non-null cells; AZ first.

Phase 2 — WALK: hold-to-ask + the conversational wedge: separate startCoachListening(), the all-events cxListening router (incl. the 900ms fallback), PW_CONTEXT.ask, the readiness re-headline. Gate: new harness scenarios — cxListening window leaves the 5s clock + answer haptic untouched; 400ms cooldown honored on every coach stop→start; a coach utterance never calls processVoiceMatches. Mock-coaching fires 0× mid-exam.

Phase 3 — RUN: hands-free + live Pro tier: the half-duplex loop, Lane-owned cue, tap-only barge-in, the key-holding proxy, the §5.7 remote denylist, and the readiness diagnostic as the paid centerpiece. Gate: server-verified receipt → per-install token live (client flag gates UI only, cannot authorize spend); consent record + spend bind to the token; budgets + kill-switch tested; every privacy surface — in-app L1226, privacy.html, terms, paywall, Apple label, Play Data-safety, and the marketing site — flips in one coupled, gated release; first-token ≤1.5s p50 / 8s→fail-closed.


8. Reality-Check Risks

Risk: Default user (2s autoadvance) never sees Lane; the card flashes and dies
Mitigation: One-shot first-session pause teaches the affordance; provenance as hairline footer in the already-read scroller; chips (if ever) only in Instant/manual pacing

Risk: Echo chip on a 37-word paragraph teaches "Lane is filler" (permanent trust loss)
Mitigation: Drop generic chips; ship provenance + local-state strip (new info, not paraphrase); any authored layer gated on the novelty judge, null-by-default

Risk: AI feature rejected for missing report path
Mitigation: Build az_reports + "Report this" in Phase 1, offline queue, distinct from az_flagged

Risk: Live privacy claim false today (privacy.html L41)
Mitigation: P0a copy hotfix in current launch, decoupled from Lane; P0b store-label reconcile in parallel, not gating launch

Risk: cx- voice handler lands in the wrong <script> block
Mitigation: Placement convention (cx- JS inside the app block with qSilenceGen); harness already hard-fails loudly on an empty grab (L50-51), so the dangerous case is narrow

Risk: Answer haptic + 5s clock leak into a coach listen
Mitigation: Separate startCoachListening(); branch all THREE touchers (incl. 900ms fallback); harness scenario asserts isolation

Risk: Stale cx-lane bleeds a wrong lesson onto the next card
Mitigation: Explicit displayQuestion teardown at L2755

Risk: Spoofed az_pro → unmetered cloud STT → OS throttles voice for everyone
Mitigation: Server receipt before any billable/transmitting call; on-device STT removes the cost vector entirely

Risk: 30vh Lane pushes Next off a 667pt screen
Mitigation: Adaptive budget: collapse read explanation + voice-bar on engage; min(30vh, remaining); Next stays flex-shrink:0; prototype at 667pt

Risk: Fixed 1.2–1.5s endpoint cuts off a thinking commuter in road noise
Mitigation: User-owned turn boundary: 2.0–2.5s adaptive + a "tap when done" affordance; pilot on real road audio before Phase 3

Risk: Screen-reader hears explanation + Lane on every feedback
Mitigation: Announce only on explicit engagement; passive chip not a live region; streaming target aria-hidden

Risk: .fb-text:focus paints Lane blue
Mitigation: Dedicated cx- focus CSS, green→cyan + olive light-theme

Risk: Paywall reads "pay to talk"
Mitigation: PW_CONTEXT.ask sells coaching; re-headline Pro on readiness; voice → a bullet

Risk: iOS on-device cold-start silently falls back to cloud during first session
Mitigation: Spike asserts on-device-or-voice-off; "preparing offline voice…" UX; never a static global "sends nothing" string

Risk: Spanish-first commuter excluded
Mitigation: Externalize Lane strings now; name the Spanish STT/grounding fork as explicit post-launch work
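
The adaptive-budget mitigation above, min(30vh, remaining), is worth pinning down numerically before the 667pt prototype. A minimal sketch in CSS pixels: the function name and the idea of passing a reserved height for the pinned Next row are assumptions, and how usedPx and reservedPx get measured is left open.

```javascript
// All inputs are CSS pixels. reservedPx is whatever must stay visible
// (for example the pinned .feedback-next row); measuring it is left open.
function laneCapPx(viewportHeightPx, usedPx, reservedPx) {
  const thirtyVh = 0.3 * viewportHeightPx;
  const remaining = Math.max(0, viewportHeightPx - usedPx - reservedPx);
  return Math.floor(Math.min(thirtyVh, remaining));
}
```

On a 667px-tall viewport with 300px used and 60px reserved, the 30vh cap (200px) wins; with 560px used, only the 47px that remain are granted, so Next is never pushed off screen.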

9. Open Questions for the Owner (genuine only)

  1. Price reconciliation (blocks all Lane copy). Live IAP is $8.99/mo (L1529) + $39.99 lifetime (L1525); there is no annual plan (product-id comments L2307-2308), and the brief ($15.99/$59.99) and referral memo ($7.99/$29.99) both disagree with the store. Confirm we plan and write all Lane/paywall copy against the live numbers, and update the referral memo's payout math (computed off phantom prices today).
  2. Pro headline: readiness vs voice. The evidence says exam-prep buyers pay for certainty (readiness verdict + coached mock debrief + targeted drills), and the category panned "premium AI chat." Do we re-headline Pro as "know you're ready before you book," with hands-free conversation as a supporting bullet, or keep voice as the marquee? This reshapes the paywall and what we build first in Phase 3.
  3. Authored clarifier layer: fund, defer, or skip. Source-chip-only + the local-state strip ships now with zero authored content. The novelty-judged coach-az.json layer is single-digit dollars to size but carries an SME-review + light-human-pass cost per non-null cell, per state. Fund it (AZ first), defer to a fast-follow, or skip and let provenance + local data carry Tier-1? (Recommend: size it via the judge pass, then decide with real counts.)
  4. STT spike scope before we invest. The plugin fork is owned, so the edit is cheap, but the letter gate (offline A/B/C/D) is the genuine risk, and iOS-26's new API drops the biasing we currently rely on (the fork leaves on-device at the system default rather than forcing it). Do we (a) fund a two-recognizer path (legacy SFSpeechRecognizer+contextualStrings for letters, DictationTranscriber for conversation), or (b) accept letter-answering stays cloud-disclosed and only conversation goes on-device? Both are defensible; the choice sets the privacy copy.
  5. Population outcome signal (run-1 OQ4, still open). computeReadiness() already yields a local before/after mastery delta, so we can prove "Lane works" to each user with zero analytics. Do you want even a single separately-consented, aggregates-only readiness-delta beacon for population signal, or hold the absolute no-tracking line? (Default: hold the line.)

Files referenced (all absolute):

  1. /Users/arizona/CLAUDE CODE/passlane/app/index.html (lines cited verified this session)
  2. /Users/arizona/CLAUDE CODE/passlane/app/questions.json (top-level shape is {questions: [323 objects]}, each object keyed category, choices, correct, difficulty, explanation, id, mode, question; explanation 23/37/93 word min/median/max, mean 37.1, 300 ≥30 words, 0 empty; ~2–8% genuinely discriminate a named distractor, ~30% merely contrastive)
  3. /Users/arizona/CLAUDE CODE/passlane/app/privacy.html (L41 audio claim)
  4. /Users/arizona/CLAUDE CODE/passlane/voice-sandbox/harness.js (extraction L42-54, hard-fail guard L50-51)
  5. /Users/arizona/CLAUDE CODE/passlane-ios/local-plugins/speech-recognition/ (vendored fork; Swift L115-122 leaves requiresOnDeviceRecognition commented out, Android L50/L226 online recognizer)
  6. /Users/arizona/CLAUDE CODE/docs/passlane-ai-companion-MASTER-PLAN.md (run-1)
  7. /Users/arizona/CLAUDE CODE/docs/passlane-coach-interaction-wiring-SPEC.md (run-2)

Private working document — unlisted, not indexed. PassLane / Somos LLC.