Quiz Guy
An iOS quiz answer assistant that lives on the Dynamic Island. Start a screen broadcast once, tap a Dynamic Island button to capture the current quiz screen, and the answer letter shows up on the island without leaving the quiz app.
Three iOS processes share an App Group: the main app, a ReplayKit Broadcast Upload Extension (the worker that owns screen frames and the SQLite handle), and a Live Activity widget. They talk to each other through Darwin notifications and the shared database.
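As a rough illustration of the signalling side, Darwin notifications can be posted and observed through Core Foundation. This is a minimal sketch, not the project's actual code: the notification name and the helper names `postDarwin` and `observeDarwin` are assumptions. Darwin notifications carry no payload, which fits the design above: the notification only says "something changed", and the receiver reads the details from the shared database.

```swift
import Foundation

// Hypothetical notification name; the real project may use a different one.
let captureRequested = "com.example.quizguy.capture-requested"

#if canImport(Darwin)
/// Post a payload-free Darwin notification that other processes can observe.
func postDarwin(_ name: String) {
    CFNotificationCenterPostNotification(
        CFNotificationCenterGetDarwinNotifyCenter(),
        CFNotificationName(name as CFString),
        nil,    // object: ignored by the Darwin notify center
        nil,    // userInfo: ignored by the Darwin notify center
        true)   // deliver immediately
}

/// Register a C callback for a Darwin notification.
/// The closure must not capture context, so shared state has to come from elsewhere.
func observeDarwin(_ name: String) {
    CFNotificationCenterAddObserver(
        CFNotificationCenterGetDarwinNotifyCenter(),
        nil,
        { _, _, _, _, _ in
            // No payload arrives here; a real handler would read the shared DB.
            print("capture requested")
        },
        name as CFString,
        nil,
        .deliverImmediately)
}
#endif
```

In this sketch the widget side would call `postDarwin(captureRequested)` and the broadcast extension would call `observeDarwin(captureRequested)` once at startup.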
The interesting part: how the hot path got fast
The first version called a vision-language model on every capture. It worked. It was also slow. I wrote a 38-screenshot benchmark that times three pipelines on the same images:
| Pipeline | Median latency |
|---|---|
| Apple Vision OCR + bank match | 189 ms |
| Cloud OCR (GLM-OCR) + bank match | 3,533 ms |
| Direct VLM answer, no bank | 7,695 ms |
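A minimal sketch of how per-pipeline medians like these could be computed, assuming each pipeline is wrapped as a closure that processes one screenshot. The helper names `median` and `latenciesMs` are illustrative and not taken from the benchmark itself.

```swift
import Foundation

/// Median of a list of samples; nil for an empty list.
/// For an even count (such as 38 screenshots) this averages the two middle values.
func median(_ samples: [Double]) -> Double? {
    guard !samples.isEmpty else { return nil }
    let sorted = samples.sorted()
    let mid = sorted.count / 2
    return sorted.count % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

/// Run `pipeline` once per input and return each run's wall-clock latency in milliseconds.
func latenciesMs<Input>(_ inputs: [Input], _ pipeline: (Input) -> Void) -> [Double] {
    inputs.map { input -> Double in
        let start = DispatchTime.now().uptimeNanoseconds
        pipeline(input)
        let elapsed = DispatchTime.now().uptimeNanoseconds - start
        return Double(elapsed) / 1_000_000
    }
}
```

Reporting the median rather than the mean keeps one slow network round trip from dominating the cloud rows.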
Apple Vision is roughly 19x faster than cloud OCR (3,533 ms vs 189 ms) and 41x faster than the direct VLM call (7,695 ms vs 189 ms). With a pre-seeded bank covering the exam, the hot path no longer needs a model call at all. The VLM is now a fallback for cache misses.
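The bank-first, model-on-miss flow can be sketched as below. This is an illustration under assumptions: `resolveAnswer`, `minScore`, and the closure shapes are hypothetical names, and the real app's types will differ. The VLM call is modelled as a closure so the caller can capture the screenshot it needs.

```swift
/// Where an answer came from, so misses can be counted later.
enum AnswerSource { case bank, vlmFallback }

struct Answer {
    let letter: String
    let source: AnswerSource
}

/// Try the local bank first; only call the model when the match is missing or too weak.
func resolveAnswer(
    ocrText: String,
    minScore: Double,                                        // hypothetical matcher threshold
    lookup: (String) -> (letter: String, score: Double)?,    // local bank match
    askVLM: () -> String?                                    // slow fallback, e.g. a network call
) -> Answer? {
    if let hit = lookup(ocrText), hit.score >= minScore {
        return Answer(letter: hit.letter, source: .bank)
    }
    return askVLM().map { Answer(letter: $0, source: .vlmFallback) }
}
```

Tagging each answer with its source makes it possible to measure how often the fallback actually fires.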
The other interesting part: the search method
I started with SQLite FTS5's default tokenizer. It scored about 11% top-1 on my eval. The reason: the default tokenizer treats an unbroken run of Chinese characters as one giant token, so token-level matching is useless. The FTS5 trigram tokenizer did better, but it still separated the right answer from the runner-up by only a thin margin.
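The tokenization problem is easy to reproduce outside SQLite. The sketch below is an approximation, not FTS5's actual tokenizer: `wordTokens` simply splits on anything that is not a letter or digit, which is enough to show why a single misread character wipes out a word-level match while character trigrams still overlap. The sample strings are made up for illustration.

```swift
/// Rough stand-in for a word tokenizer: split wherever a character is not a letter or digit.
/// This only approximates the behaviour described above; it is not FTS5 code.
func wordTokens(_ text: String) -> [String] {
    text.split(whereSeparator: { !($0.isLetter || $0.isNumber) }).map(String.init)
}

/// Overlapping 3-character windows over the text.
func trigrams(_ text: String) -> [String] {
    let chars = Array(text)
    guard chars.count >= 3 else { return [] }
    return (0...(chars.count - 3)).map { String(chars[$0..<($0 + 3)]) }
}

let bankQuestion = "下列哪一项是正确的"
let ocrOutput    = "下列哪一顶是正确的"   // one character misread by OCR

// Word-level: each string is a single token, and the two tokens differ.
let sharedWords = Set(wordTokens(bankQuestion)).intersection(wordTokens(ocrOutput))

// Trigram-level: only the windows touching the misread character differ.
let sharedTrigrams = Set(trigrams(bankQuestion)).intersection(trigrams(ocrOutput))

print("shared word tokens:", sharedWords.count)
print("shared trigrams:", sharedTrigrams.count, "of", trigrams(bankQuestion).count)
```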
So I swept n-gram size and scoring function on the cached OCR outputs:
- n=3 character grams with Jaccard similarity hit 100% top-1 on the eval
- n<=2 under-discriminates
- n>=4 over-fragments on OCR errors
The production matcher is therefore a brute-force nearest-neighbor search using Jaccard similarity over character trigrams. A linear scan over a few thousand rows runs in under 10 ms. The matcher returns both the top-1 score and the margin between top-1 and top-2, so a future "report wrong answer" button can feed real failures back and let an offline analysis separate "not in bank" misses from "mis-match" failures.
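A minimal sketch of such a matcher follows. The names `BankEntry`, `Match`, `charGrams`, and `bestMatch` are illustrative, and dropping whitespace before building grams is my assumption rather than something the production code is known to do. Jaccard similarity here is the size of the intersection of two gram sets divided by the size of their union.

```swift
/// Set of character n-grams, ignoring whitespace (an assumption of this sketch).
/// Text shorter than n yields the whole text as a single gram.
func charGrams(_ text: String, n: Int = 3) -> Set<String> {
    let chars = Array(text.filter { !$0.isWhitespace })
    guard chars.count >= n else { return chars.isEmpty ? [] : [String(chars)] }
    var grams = Set<String>()
    for i in 0...(chars.count - n) {
        grams.insert(String(chars[i..<(i + n)]))
    }
    return grams
}

/// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 when both sets are empty.
func jaccard(_ a: Set<String>, _ b: Set<String>) -> Double {
    let shared = a.intersection(b).count
    let union = a.count + b.count - shared
    return union == 0 ? 0 : Double(shared) / Double(union)
}

struct BankEntry { let question: String; let answer: String }   // hypothetical row shape
struct Match { let entry: BankEntry; let score: Double; let margin: Double }

/// Linear scan over the bank, tracking the best and second-best scores.
func bestMatch(for ocrText: String, in bank: [BankEntry]) -> Match? {
    let query = charGrams(ocrText)
    var best: (entry: BankEntry, score: Double)? = nil
    var runnerUp = 0.0
    for entry in bank {
        let score = jaccard(query, charGrams(entry.question))
        if let current = best {
            if score > current.score {
                runnerUp = current.score
                best = (entry, score)
            } else if score > runnerUp {
                runnerUp = score
            }
        } else {
            best = (entry, score)
        }
    }
    guard let winner = best else { return nil }
    // With a single-row bank the margin equals the score, since there is no runner-up.
    return Match(entry: winner.entry, score: winner.score, margin: winner.score - runnerUp)
}
```

A real implementation would likely precompute each row's gram set once rather than rebuilding it per query; the sketch recomputes them to stay short.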
What's under the hood
- ReplayKit Broadcast Upload Extension caches frames at 3 fps and persists captures into the App Group container as JPEGs
- Live Activity state machine covers awaiting-broadcast, idle, collecting, identifying (with sub-phases OCR, matching, querying), answered, correcting, corrected, failed
- App Group SQLite holds the bank; first launch copies a prebuilt seed DB from the app bundle (built offline by `swift run bench buildbank` against the source PDF) so there's no on-device PDF parse at install time
- VLM client is OpenAI-compatible: Zhipu GLM-4.6V by default, OpenAI and Qwen DashScope swappable via env vars or in-app settings
- Per-step latency and confidence are logged to a JSONL file so the matcher threshold and bank coverage can be tuned offline from real usage data
- Vercel edge endpoint serves as a lightweight kill-switch the app pings on launch
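The JSONL logging item above could look roughly like this. It is a sketch under assumptions: the `StepLog` field names are hypothetical and not the project's schema, and the helper names `jsonLine` and `append` are mine. Each record is encoded as one line of JSON so offline tools can stream the file line by line.

```swift
import Foundation

/// Hypothetical log record; field names are assumptions, not the project's schema.
struct StepLog: Codable {
    let step: String          // e.g. "ocr", "matching", "querying"
    let latencyMs: Double
    let confidence: Double?   // nil for steps that produce no score
}

/// Encode one record as a single line of JSON.
func jsonLine(_ entry: StepLog) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]   // stable key order, no pretty-printing
    return String(decoding: try encoder.encode(entry), as: UTF8.self)
}

/// Append one line to a JSONL file, creating the file on first write.
func append(_ line: String, to url: URL) throws {
    let data = Data((line + "\n").utf8)
    if let handle = try? FileHandle(forWritingTo: url) {
        defer { try? handle.close() }
        handle.seekToEndOfFile()
        handle.write(data)
    } else {
        // File does not exist yet (or cannot be opened): write it fresh.
        try data.write(to: url)
    }
}
```

In the app the file URL would presumably live in the App Group container; the sketch leaves that choice to the caller.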
Status
MVP scaffold complete, compiles on Xcode 26.5 and iOS Simulator 26.2. Waiting on paid Apple Developer Program activation for on-device validation on a real iPhone 14 Pro or later (Dynamic Island button taps require that hardware).