ios_app

Quiz Guy

An iOS quiz answer assistant that lives on the Dynamic Island. Start a screen broadcast once, tap a Dynamic Island button to capture the current quiz screen, and the answer letter shows up on the island without leaving t…

Three iOS processes share an App Group: the main app, a ReplayKit Broadcast Upload Extension (the worker that owns screen frames and the SQLite handle), and a Live Activity widget. They talk to each other through Darwin notifications and the shared database.

The interesting part: how the hot path got fast

The first version called a vision-language model on every capture. It worked. It was also slow. I wrote a 38-screenshot benchmark that times three pipelines on the same images:

Pipeline	Median latency
Apple Vision OCR + bank match	189 ms
Cloud OCR (GLM-OCR) + bank match	3,533 ms
Direct VLM answer, no bank	7,695 ms

By median latency, the Apple Vision pipeline is roughly 19x faster than the cloud OCR pipeline and about 41x faster than the direct VLM call. With a pre-seeded bank covering the exam, the hot path no longer needs a model call at all. The VLM is now a fallback for questions the bank does not match.
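The speedup figures follow directly from the table. A small Python check, using only the three medians reported above (the variable names are mine, not from the benchmark code):

```python
# Median latencies in milliseconds, copied from the benchmark table above.
median_ms = {
    "Apple Vision OCR + bank match": 189,
    "Cloud OCR (GLM-OCR) + bank match": 3533,
    "Direct VLM answer, no bank": 7695,
}

baseline = median_ms["Apple Vision OCR + bank match"]

# Ratio of each pipeline's median to the on-device OCR pipeline's median.
speedup = {name: ms / baseline for name, ms in median_ms.items()}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.1f}x the Apple Vision median")
```

This gives about 18.7x for cloud OCR and about 40.7x for the direct VLM call.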

The other interesting part: the search method

I started with the default SQLite FTS5 tokenizer. It scored about 11% top-1 on my eval. The reason: the default tokenizer treats a run of Chinese characters as one giant token, so token-level matching is useless. The FTS5 trigram tokenizer did better, but the right answer still beat the runner-up by only a thin margin.

So I swept n-gram size and scoring function on the cached OCR outputs:

n=3 character grams with Jaccard similarity hit 100% top-1 on the eval
n<=2 under-discriminates
n>=4 over-fragments on OCR errors

The production matcher is therefore a brute-force char-trigram Jaccard nearest neighbor. A linear scan over a few thousand rows runs in under 10 ms. The matcher returns both the score and the margin between top-1 and top-2, so a future "report wrong answer" button can feed real failures back and let an offline analysis separate "not in bank" misses from "mis-match" failures.
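A minimal sketch of that idea, written in Python for brevity rather than the app's Swift. The function names, the whitespace stripping, and the three-row toy bank are assumptions for illustration, not the production code or real bank data:

```python
from typing import Optional


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of overlapping character n-grams, ignoring whitespace."""
    compact = "".join(text.split())
    if len(compact) < n:
        return {compact} if compact else set()
    return {compact[i:i + n] for i in range(len(compact) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """|A intersect B| / |A union B|, defined as 0.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def best_match(query: str, bank: list[tuple[str, str]]) -> tuple[Optional[str], float, float]:
    """Linear scan over the bank: return (answer, top-1 score, margin over top-2)."""
    q = char_ngrams(query)
    scored = sorted(
        ((jaccard(q, char_ngrams(question)), answer) for question, answer in bank),
        reverse=True,
    )
    if not scored:
        return None, 0.0, 0.0
    top_score, top_answer = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    return top_answer, top_score, top_score - runner_up


# Hypothetical bank rows: (question text, answer letter).
bank = [
    ("下列哪一项是中国的首都？", "B"),
    ("下列哪一项是日本的首都？", "C"),
    ("水的化学式是什么？", "A"),
]

# Simulated OCR output that dropped the trailing question mark.
answer, score, margin = best_match("下列哪一项是中国的首都", bank)
print(answer, round(score, 3), round(margin, 3))
```

In this toy case the query shares 9 of 10 trigrams with the first row (score 0.9) and only 5 of 14 with the near-duplicate second row, so the margin is about 0.54. A low score suggests the question is not in the bank, while a high score with a small margin suggests two bank rows are too similar to tell apart.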

What's under the hood

ReplayKit Broadcast Upload Extension caches frames at 3 fps and persists captures into the App Group container as JPEGs
Live Activity state machine covers awaiting-broadcast, idle, collecting, identifying (with sub-phases OCR, matching, querying), answered, correcting, corrected, failed
App Group SQLite holds the bank; first launch copies a prebuilt seed DB from the app bundle (built offline by swift run bench buildbank against the source PDF) so there's no on-device PDF parse at install time
VLM client is OpenAI-compatible: Zhipu GLM-4.6V by default, OpenAI and Qwen DashScope swappable via env vars or in-app settings
Per-step latency and confidence are logged to a JSONL file so the matcher threshold and bank coverage can be tuned offline from real usage data
Vercel edge endpoint serves as a lightweight kill-switch the app pings on launch
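The JSONL logging item above can be sketched in a few lines of Python. The record fields (step, latency_ms, confidence) and the sample values are hypothetical placeholders; the app's actual log schema is not shown here:

```python
import io
import json
from collections import defaultdict
from statistics import median
from typing import Optional


def log_step(stream, step: str, latency_ms: float, confidence: Optional[float] = None) -> None:
    """Append one JSON object per line (the JSONL convention)."""
    record = {"step": step, "latency_ms": latency_ms, "confidence": confidence}
    stream.write(json.dumps(record) + "\n")


def median_latency_by_step(stream) -> dict[str, float]:
    """Group records by step name and report the median latency of each."""
    by_step = defaultdict(list)
    for line in stream:
        if line.strip():
            record = json.loads(line)
            by_step[record["step"]].append(record["latency_ms"])
    return {step: median(values) for step, values in by_step.items()}


# An in-memory buffer stands in for the log file in the App Group container.
log = io.StringIO()
log_step(log, "ocr", 150.0)                      # made-up sample values
log_step(log, "matching", 8.0, confidence=0.9)
log_step(log, "ocr", 170.0)
log_step(log, "matching", 6.0, confidence=0.4)

log.seek(0)
summary = median_latency_by_step(log)
print(summary)
```

One record per line keeps appends cheap on the device and lets an offline script stream the file without loading it whole. The same loop could bucket records by confidence to pick a matcher threshold.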

Status

MVP scaffold complete, compiles on Xcode 26.5 and iOS Simulator 26.2. Waiting on paid Apple Developer Program activation for on-device validation on a real iPhone 14 Pro or later (Dynamic Island button taps require that hardware).