GlassKit is an open-source toolkit for building smart-glasses AI apps. Your AI coding agent can use the skill, docs, and runnable examples to build apps that understand what wearers see and hear, then guide them in real time.
Today GlassKit starts with Rokid Glasses. The long-term goal is a developer platform for building, hosting, and shipping smart-glasses apps across more devices, making it easier for anyone to create useful AI apps for glasses.
GlassKit is used by developers building glasses apps for real-world tasks, from manufacturing workflows to field support.
https://glasskit.ai · https://x.com/GlassKit_ai · https://discord.gg/v5ayGKhPNP
These demos cover the core GlassKit building blocks for Rokid Glasses: camera/mic capture, WebRTC streaming, a monochrome on-lens display (HUD), touchpad and offline voice controls, OpenAI Realtime, Overshoot, and object detection.
| Drink-making coach | Sushi speedrun timer | IKEA assembly assistant |
|---|---|---|
| demo.webm | demo.webm | demo.webm |
| Code | Code | Code / Code with RF-DETR |
| Proactive drink-making coach that watches ingredients, picks a recipe, and guides each step. Combines Overshoot video inference, recipe state, OpenAI Realtime, and HUD guidance. | Real-world speedrun timer for physical tasks, shown with sushi. Uses RF-DETR to detect configured objects and advance HUD splits after confirmation. | Voice-first assembly assistant for an IKEA wooden box. Streams mic/camera input to OpenAI Realtime, with an RF-DETR variant for object-aware guidance. |

| Searchable life recording | Real-time privacy filter | Live scene reader |
|---|---|---|
| demo.webm | demo.mp4 | demo.webm |
| | Code | Code |
| Full-day smart-glasses recording demo. Makes long first-person recordings browsable and searchable. Read the build write-up | Real-time privacy layer between a camera and an app. Anonymizes video locally and tracks spoken consent. | Simple real-time scene reader that keeps describing what the wearer is looking at. Sends live camera context to Overshoot and displays inference text on the HUD. |

| Rokid feature demo |
|---|
| demo.mp4 |
| Code |
| Device-feature reference app for Rokid Glasses and phone/emulator testing. Covers touchpad navigation, offline Vosk voice commands, camera, mic, audio, and reusable screen controllers. |

There are three ways to start, depending on how you like to build.
Use this when you want Codex, Claude Code, Cursor, or another coding agent to understand smart-glasses app development while it builds your app.
Smart-glasses apps involve concerns that coding agents rarely handle: vision AI pipelines; small HUDs; camera, microphone, and sensor access; touchpad and voice inputs; battery use; and wearer-facing UX. The GlassKit agent skill packages that context with reference patterns and a starter template, so agents can build more realistic glasses apps from the first pass.
Install it with the Agent Skills CLI:
npx skills add RealComputer/GlassKit
Update it later with:
npx skills update glasskit
Then ask your coding agent with prompts like: "create a starter rokid glasses app", "add a camera preview to the first screen using the glasskit skill", or "create a rokid glasses app that connects to openai realtime and talks about what it sees".
Use this when you want a small app scaffold with Rokid HUD layout and navigation patterns. You can copy it manually or run:
git clone https://github.com/RealComputer/GlassKit.git
mkdir rokid-starter
git -C GlassKit archive HEAD:skills/glasskit/assets/rokid-hello-world | tar -x -C rokid-starter
Then follow the README.
Use this when a demo is close to the app you want to build. For example, to copy examples/rokid-feature-demo:
git clone https://github.com/RealComputer/GlassKit.git
mkdir my-glasses-app
git -C GlassKit archive HEAD:examples/rokid-feature-demo | tar -x -C my-glasses-app
Then follow that example's README.
| Path | What it contains |
|---|---|
| skills/glasskit/ | Agent skill, Rokid Glasses starter, and smart-glasses app references for coding agents and human developers. |
| docs/ | Hardware setup, Rokid Glasses device notes, and demo-recording workflow. |
| examples/ | Runnable Rokid Glasses examples you can copy or adapt. |
A typical app in this repo has four pieces:
- A Rokid Glasses app (Android) captures camera/microphone input, handles touchpad gestures, and renders a HUD.
- WebRTC carries live media between the glasses, your backend, and AI services.
- A backend coordinates session setup, workflow state, model calls, tool calls, and app-specific decisions.
- The wearer gets real-time feedback via display and audio.
The exact architecture varies by example. Some pieces can run offline, including local voice commands, device controls, and local vision/privacy processing.
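To make the backend's "workflow state" role more concrete, here is a minimal sketch of per-session step tracking. It is not taken from any GlassKit example: the `WorkflowSession` class, its fields, and the rule of advancing after a few consecutive positive vision results are all assumptions chosen for illustration. A real backend would feed it results from whichever vision model the example uses and send `hud_text()` to the glasses.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowSession:
    """Hypothetical per-wearer workflow state held by a backend."""

    steps: list[str]
    confirm_frames: int = 3  # assumed: consecutive positive results needed to advance
    index: int = 0
    _streak: int = field(default=0, repr=False)

    @property
    def done(self) -> bool:
        return self.index >= len(self.steps)

    def hud_text(self) -> str:
        """Return one short line suited to a small HUD."""
        if self.done:
            return "Done"
        return f"{self.index + 1}/{len(self.steps)} {self.steps[self.index]}"

    def observe(self, step_seen: bool) -> bool:
        """Feed one vision result; return True when the workflow advances."""
        if self.done:
            return False
        # A single negative result resets the streak, which filters out flicker.
        self._streak = self._streak + 1 if step_seen else 0
        if self._streak < self.confirm_frames:
            return False
        self.index += 1
        self._streak = 0
        return True


session = WorkflowSession(steps=["Add ice", "Pour juice"], confirm_frames=2)
for seen in [True, False, True, True]:
    if session.observe(seen):
        print(session.hud_text())  # prints "2/2 Pour juice" once
```

Requiring several consecutive confirmations trades a little latency for fewer accidental step changes; the right threshold depends on the model's frame rate and noise.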
Many examples need:
- Rokid Glasses and a development cable
- Android Studio or adb
- uv for Python backends or node for TypeScript backends
- API keys depending on the example, such as OPENAI_API_KEY, OVERSHOOT_API_KEY, or ROBOFLOW_API_KEY
Each example README has the exact setup steps and environment variables.
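Before running an example, a quick preflight check can catch missing tools or keys. The sketch below is an illustration, not part of the repo: the `missing_prerequisites` helper is hypothetical, and the tool and variable names passed to it are only a sample of those listed above. Check the example's README for the ones it actually needs.

```python
import os
import shutil


def missing_prerequisites(tools, env_vars, environ=None, which=shutil.which):
    """Return (tools not found on PATH, env vars that are unset or empty)."""
    environ = os.environ if environ is None else environ
    missing_tools = [name for name in tools if which(name) is None]
    missing_env = [name for name in env_vars if not environ.get(name)]
    return missing_tools, missing_env


if __name__ == "__main__":
    # Sample selection only; adjust to the example you are running.
    tools, env_vars = missing_prerequisites(["adb", "uv"], ["OPENAI_API_KEY"])
    for name in tools:
        print(f"missing tool: {name}")
    for name in env_vars:
        print(f"missing env var: {name}")
```

The `environ` and `which` parameters exist only so the check can be exercised without touching the real environment.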
Contributions are welcome.
By submitting a pull request, you agree that your contribution is licensed under the MIT License of this project (see LICENSE), and you confirm that you have the right to submit it under those terms.