Claude Skill
Intent-Lab/VisionClaw
VisionClaw turns Meta Ray-Ban smart glasses into a real-time AI assistant that combines voice, vision, and agentic actions through the Gemini Live API and OpenClaw.
Install this Skill
git clone https://github.com/sseanliu/VisionClaw.git
Summary
VisionClaw is a real-time AI assistant for Meta Ray-Ban smart glasses. It integrates voice, vision, and agentic actions using the Gemini Live API and, optionally, OpenClaw.
Key features
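- Real-time AI assistant for Meta Ray-Ban smart glasses, on iOS and Android
- Voice and vision integration: bidirectional audio plus camera frames streamed at ~1fps
- Agentic actions through Gemini Live tool calls
- Optional OpenClaw gateway for web search, messaging, and other connected apps
- Hands-free interaction: tap the AI button on the glasses and talk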
Use cases
- Hands-free smart glasses assistance
- Real-time visual information processing
- Voice-controlled AI interactions
- Agent-driven task automation
- Enhanced wearable AI experiences
README excerpt
# VisionClaw

A real-time AI assistant for Meta Ray-Ban smart glasses. See what you see, hear what you say, and take actions on your behalf -- all through voice.

Built on [Meta Wearables DAT SDK](https://github.com/facebook/meta-wearables-dat-ios) (iOS) / [DAT Android SDK](https://github.com/nichochar/openclaw) (Android) + [Gemini Live API](https://ai.google.dev/gemini-api/docs/live) + [OpenClaw](https://github.com/nichochar/openclaw) (optional).

**Supported platforms:** iOS (iPhone) and Android (Pixel, Samsung, etc.)

## What It Does

Put on your glasses, tap the AI button, and talk:

- **"What am I looking at?"** -- Gemini sees through your glasses camera and describes the scene
- **"Add milk to my shopping list"** -- delegates to OpenClaw, which adds it via your connected apps
- **"Send a message to John saying I'll be late"** -- routes through OpenClaw to WhatsApp/Telegram/iMessage
- **"Search for the best coffee shops nearby"** -- web search via OpenClaw, results spoken back

The glasses camera streams at ~1fps to Gemini for visual context, while audio flows bidirectionally in real-time.

## How It Works

```
Meta Ray-Ban Glasses (or phone camera)
       |
       | video frames + mic audio
       v
iOS / Android App (this project)
       |
       | JPEG frames (~1fps) + PCM audio (16kHz)
       v
Gemini Live API (WebSocket)
       |
       |-- Audio response (PCM 24kHz) --> App --> Speaker
       |-- Tool calls (execute) -------> App --> OpenClaw Gateway
       |                                              |
       |                                              v
       |                                      56+ skills: web search,
       |
```
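The data flow described in the README excerpt can be illustrated with a small Python sketch. It is not taken from the VisionClaw codebase, which targets iOS and Android. It only shows three ideas from the diagram: throttling camera frames to roughly one per second, sizing PCM audio chunks at the stated sample rates, and routing a tool call to a handler. The names `FrameThrottle`, `pcm_chunk_bytes`, and `route_tool_call` are hypothetical, the 16-bit mono sample format is an assumption (the excerpt gives only sample rates), and the `{"name": ..., "args": ...}` message shape is illustrative rather than the Gemini Live wire format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class FrameThrottle:
    """Forward at most one frame per `interval_s` seconds (~1 fps by default)."""

    interval_s: float = 1.0
    _last_sent: Optional[float] = None

    def should_send(self, now_s: float) -> bool:
        # The first frame goes out immediately; later frames wait out the interval.
        if self._last_sent is None or now_s - self._last_sent >= self.interval_s:
            self._last_sent = now_s
            return True
        return False


def pcm_chunk_bytes(sample_rate_hz: int, chunk_ms: int,
                    bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Size in bytes of one PCM chunk. 16-bit mono is an assumption."""
    return sample_rate_hz * chunk_ms // 1000 * bytes_per_sample * channels


def route_tool_call(call: dict,
                    handlers: Dict[str, Callable[[dict], str]]) -> dict:
    """Dispatch a tool call by name and wrap the outcome for the model.

    The call shape used here is hypothetical, not the real wire format.
    """
    handler = handlers.get(call["name"])
    if handler is None:
        return {"name": call["name"], "error": "unknown tool"}
    return {"name": call["name"], "result": handler(call.get("args", {}))}


if __name__ == "__main__":
    throttle = FrameThrottle()
    # A camera delivering a frame every 0.25 s for 3 s.
    sent = [t / 4 for t in range(12) if throttle.should_send(t / 4)]
    print(sent)                          # [0.0, 1.0, 2.0]
    print(pcm_chunk_bytes(16_000, 100))  # 3200 bytes per 100 ms of mic audio
    print(pcm_chunk_bytes(24_000, 100))  # 4800 bytes per 100 ms of response audio

    # Stand-in for the OpenClaw gateway; a real handler would make a network call.
    handlers = {"execute": lambda args: "queued: " + args["task"]}
    print(route_tool_call({"name": "execute",
                           "args": {"task": "add milk to shopping list"}},
                          handlers))
```

Keeping the throttle driven by an explicit timestamp rather than a wall-clock call makes the ~1fps behavior easy to test without waiting in real time.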