RealComputer/GlassKit

GlassKit

GlassKit is an open-source toolkit for building smart-glasses AI apps. Your AI coding agent can use the skill, docs, and runnable examples to build apps that understand what wearers see and hear, then guide them in real time.

Today GlassKit starts with Rokid Glasses. The long-term goal is a developer platform for building, hosting, and shipping smart-glasses apps across more devices, making it easier for anyone to create useful AI apps for glasses.

GlassKit is used by developers building glasses apps for real-world tasks, from manufacturing workflows to field support.

https://glasskit.ai · https://x.com/GlassKit_ai · https://discord.gg/v5ayGKhPNP

Demos

These demos cover the core GlassKit building blocks for Rokid Glasses: camera/mic capture, WebRTC streaming, a monochrome on-lens display (HUD), touchpad and offline voice controls, OpenAI Realtime, Overshoot, and object detection.

  • Drink-making coach [demo video, code]
    Proactive drink-making coach that watches ingredients, picks a recipe, and guides each step. Combines Overshoot video inference, recipe state, OpenAI Realtime, and HUD guidance.

  • Sushi speedrun timer [demo video, code]
    Real-world speedrun timer for physical tasks, shown with sushi. Uses RF-DETR to detect configured objects and advance HUD splits after confirmation.

  • IKEA assembly assistant [demo video, code, code with RF-DETR]
    Voice-first assembly assistant for an IKEA wooden box. Streams mic/camera input to OpenAI Realtime, with an RF-DETR variant for object-aware guidance.

  • Searchable life recording [demo video, build write-up]
    Full-day smart-glasses recording demo. Makes long first-person recordings browsable and searchable.

  • Real-time privacy filter [demo video, code]
    Real-time privacy layer between a camera and an app. Anonymizes video locally and tracks spoken consent.

  • Live scene reader [demo video, code]
    Simple real-time scene reader that keeps describing what the wearer is looking at. Sends live camera context to Overshoot and displays inference text on the HUD.

  • Rokid feature demo [demo video, code]
    Device-feature reference app for Rokid Glasses and phone/emulator testing. Covers touchpad navigation, offline Vosk voice commands, camera, mic, audio, and reusable screen controllers.
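
As a rough illustration of the "advance HUD splits after confirmation" idea in the sushi speedrun timer, here is a minimal Python sketch. It is not taken from the example's code: the `SplitTimer` class, the label names, and the rule "confirmed means seen in N consecutive frames" are all assumptions made for illustration, and the real demo may confirm detections differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SplitTimer:
    """Hypothetical split timer: advance once the expected object is confirmed."""

    targets: list[str]        # assumed object labels, one per split
    confirm_frames: int = 3   # consecutive frames needed to confirm a detection
    index: int = 0            # split currently being timed
    streak: int = 0           # consecutive frames that showed the current target
    split_times: list[float] = field(default_factory=list)

    @property
    def finished(self) -> bool:
        return self.index >= len(self.targets)

    def update(self, labels: set[str], now: float) -> bool:
        """Feed the labels detected in one frame; return True if a split completed."""
        if self.finished:
            return False
        if self.targets[self.index] in labels:
            self.streak += 1
        else:
            self.streak = 0  # a single miss resets the confirmation window
        if self.streak < self.confirm_frames:
            return False
        self.split_times.append(now)
        self.index += 1
        self.streak = 0
        return True


# Usage: per-frame label sets as a detector might report them (made-up data).
timer = SplitTimer(targets=["rice", "nori"], confirm_frames=2)
frames = [{"rice"}, set(), {"rice"}, {"rice"}, {"nori"}, {"nori"}]
for t, labels in enumerate(frames):
    if timer.update(labels, now=float(t)):
        print(f"split {timer.index} at t={t}")
# prints "split 1 at t=3" and then "split 2 at t=5"
```

Requiring several consecutive frames is one simple way to keep a single false positive from advancing the timer; a longer window trades responsiveness for stability.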

Quick Start

There are three ways to start, depending on how you like to build.

1. Install the GlassKit agent skill

Use this when you want Codex, Claude Code, Cursor, or another coding agent to understand smart-glasses app development while it builds your app.

Smart-glasses apps raise concerns that coding agents are not used to handling: vision AI pipelines; small HUDs; camera, microphone, and sensor access; touchpad and voice input; battery use; and wearer-facing UX. The GlassKit agent skill packages that context with reference patterns and a starter template, so agents can build more realistic glasses apps from the first pass.

Install it with the Agent Skills CLI:

npx skills add RealComputer/GlassKit

Update it later with:

npx skills update glasskit

Then ask your coding agent with prompts like:

  • "create a starter rokid glasses app"
  • "add a camera preview to the first screen using the glasskit skill"
  • "create a rokid glasses app that connects to openai realtime and talks about what it sees"

2. Copy the Rokid starter app

Use this when you want a small app scaffold with Rokid HUD layout and navigation patterns. You can copy it manually or run:

git clone https://github.com/RealComputer/GlassKit.git
mkdir rokid-starter
git -C GlassKit archive HEAD:skills/glasskit/assets/rokid-hello-world | tar -x -C rokid-starter

Then follow the README.

3. Copy a complete example

Use this when a demo is close to the app you want to build. For example, to copy examples/rokid-feature-demo:

git clone https://github.com/RealComputer/GlassKit.git
mkdir my-glasses-app
git -C GlassKit archive HEAD:examples/rokid-feature-demo | tar -x -C my-glasses-app

Then follow that example's README.

Repository Map

Path              What it contains
skills/glasskit/  Agent skill, Rokid Glasses starter, and smart-glasses app references for coding agents and human developers.
docs/             Hardware setup, Rokid Glasses device notes, and demo-recording workflow.
examples/         Runnable Rokid Glasses examples you can copy or adapt.

How Apps Work

A typical app in this repo has four pieces:

  1. A Rokid Glasses app (Android) captures camera/microphone input, handles touchpad gestures, and renders a HUD.
  2. WebRTC carries live media between the glasses, your backend, and AI services.
  3. A backend coordinates session setup, workflow state, model calls, tool calls, and app-specific decisions.
  4. The wearer gets real-time feedback via display and audio.

The exact architecture varies by example. Some pieces can run offline, including local voice commands, device controls, and local vision/privacy processing.
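
To make the backend's role more concrete, the sketch below shows one way a backend could hold workflow state and turn wearer events into HUD and audio feedback. It is a simplified assumption, not code from this repo: `GuideSession`, `Feedback`, the event names, and the character budget for the HUD are all hypothetical, and WebRTC transport and model calls are left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Feedback:
    hud_text: str  # short text intended for the on-lens display
    speak: bool    # whether the glasses should also play audio


class GuideSession:
    """Hypothetical per-wearer session that tracks progress through steps."""

    def __init__(self, steps: list[str], hud_chars: int = 32) -> None:
        if not steps:
            raise ValueError("a session needs at least one step")
        self.steps = steps
        self.hud_chars = hud_chars  # assumed character budget for the HUD
        self.position = 0

    def _fit(self, text: str) -> str:
        # Truncate long instructions so they fit the assumed HUD budget.
        if len(text) <= self.hud_chars:
            return text
        return text[: self.hud_chars - 3] + "..."

    def on_event(self, event: str) -> Feedback:
        """Handle one event from the glasses: "next", "back", or "repeat"."""
        moved = False
        if event == "next" and self.position < len(self.steps) - 1:
            self.position += 1
            moved = True
        elif event == "back" and self.position > 0:
            self.position -= 1
            moved = True
        text = f"{self.position + 1}/{len(self.steps)}: {self.steps[self.position]}"
        # Speak when the step changed or the wearer asked for a repeat.
        return Feedback(self._fit(text), speak=moved or event == "repeat")


# Usage with made-up steps and a deliberately small HUD budget.
session = GuideSession(
    ["Attach side panel A to the base", "Insert dowels"], hud_chars=24
)
first = session.on_event("repeat")
second = session.on_event("next")
print(first.hud_text)
print(second.hud_text)
```

In a real app the events would come from touchpad gestures, voice commands, or model output, and the returned feedback would be sent back to the glasses; keeping the state machine separate from transport makes it easy to test without hardware.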

Requirements

Many examples need:

  • Rokid Glasses and a development cable
  • Android Studio or adb
  • uv for Python backends or node for TypeScript backends
  • API keys depending on the example, such as OPENAI_API_KEY, OVERSHOOT_API_KEY, or ROBOFLOW_API_KEY

Each example README has the exact setup steps and environment variables.
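
Because the required keys differ per example, a backend can fail fast when one is missing. The following is a minimal sketch under assumptions: the `missing_keys` helper and the `REQUIRED` list are hypothetical, and the actual variables for a given example are the ones listed in its README.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def missing_keys(required: list[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


# Hypothetical requirement list; check the example's README for the real one.
REQUIRED = ["OPENAI_API_KEY", "OVERSHOOT_API_KEY"]

# Usage with a fake environment so the check can be shown without real keys.
fake_env = {"OPENAI_API_KEY": "sk-example", "OVERSHOOT_API_KEY": " "}
print(missing_keys(REQUIRED, fake_env))  # prints ['OVERSHOOT_API_KEY']
```

Calling `missing_keys(REQUIRED)` with no second argument checks the real process environment; a backend could exit with a clear message if the returned list is not empty.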

Contributing

Contributions are welcome.

By submitting a pull request, you agree that your contribution is licensed under the MIT License of this project (see LICENSE), and you confirm that you have the right to submit it under those terms.
