Claude Skill

Intent-Lab/VisionClaw

VisionClaw transforms Meta Ray-Ban smart glasses with real-time AI assistance, combining voice, vision, and agentic actions through Gemini Live and OpenClaw technologies.

Overview

Stars: 2,429
Forks: 463
Language: Unknown
Last pushed: 2026-05-06
Last synced: 2026-07-02

Repository

Owner: Intent-Lab
Repository: VisionClaw
Full name: Intent-Lab/VisionClaw
Repo ID: 1,151,130,619

Install this Skill

git clone https://github.com/sseanliu/VisionClaw.git

Registry

Type: openclaw_skill
Quality score: 75/100
Verification: readme_parsed
Last verified: 2026-05-31
Platforms: OpenClaw
Capabilities: browser, pdf, search, image, video, terminal
Detected files: README.md
Config keys: YOUR_GITHUB_TOKEN, URL

Summary

VisionClaw is a real-time AI assistant for Meta Ray-Ban smart glasses. It streams camera frames and microphone audio to the Gemini Live API for voice and vision, and can optionally delegate agentic actions to OpenClaw.

Chinese description (translated)

A real-time AI assistant for Meta Ray-Ban smart glasses, integrating voice, vision, and agentic actions carried out through Gemini Live and OpenClaw.

Key features

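  • Real-time AI assistant for Meta Ray-Ban glasses, on iOS and Android
  • Voice and vision integration: glasses camera frames (~1fps) plus bidirectional real-time audio
  • Agentic actions triggered by Gemini Live tool calls
  • Actions delegated to the OpenClaw gateway (optional)
  • Hands-free interaction: tap the AI button on the glasses and talk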

Use cases

  • Hands-free smart glasses assistance
  • Real-time visual information processing
  • Voice-controlled AI interactions
  • Agent-driven task automation
  • Enhanced wearable AI experiences

README excerpt

# VisionClaw

![VisionClaw](assets/teaserimage.png)

A real-time AI assistant for Meta Ray-Ban smart glasses. See what you see, hear what you say, and take actions on your behalf -- all through voice.

![Cover](assets/cover.png)

Built on [Meta Wearables DAT SDK](https://github.com/facebook/meta-wearables-dat-ios) (iOS) / [DAT Android SDK](https://github.com/nichochar/openclaw) (Android) + [Gemini Live API](https://ai.google.dev/gemini-api/docs/live) + [OpenClaw](https://github.com/nichochar/openclaw) (optional).

**Supported platforms:** iOS (iPhone) and Android (Pixel, Samsung, etc.)

## What It Does

Put on your glasses, tap the AI button, and talk:

- **"What am I looking at?"** -- Gemini sees through your glasses camera and describes the scene
- **"Add milk to my shopping list"** -- delegates to OpenClaw, which adds it via your connected apps
- **"Send a message to John saying I'll be late"** -- routes through OpenClaw to WhatsApp/Telegram/iMessage
- **"Search for the best coffee shops nearby"** -- web search via OpenClaw, results spoken back

The glasses camera streams at ~1fps to Gemini for visual context, while audio flows bidirectionally in real-time.

## How It Works

![How It Works](assets/how.png)

```
Meta Ray-Ban Glasses (or phone camera)
       |
       |  video frames + mic audio
       v
iOS / Android App (this project)
       |
       |  JPEG frames (~1fps) + PCM audio (16kHz)
       v
Gemini Live API (WebSocket)
       |
       |-- Audio response (PCM 24kHz) --> App --> Speaker
       |-- Tool calls (execute) -------> App --> OpenClaw Gateway
       |                                              |
       |                                              v
       |                                   56+ skills: web search,
```
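The excerpt gives two concrete numbers for the uplink: video frames at roughly 1fps and microphone audio as 16kHz PCM. The sketch below illustrates what those figures imply; it is not taken from the VisionClaw code. `FrameThrottle` and `pcm_chunk_bytes` are hypothetical names, and the 16-bit mono sample format is an assumption, since the excerpt states only the sample rate.

```python
from dataclasses import dataclass
from typing import Optional

SAMPLE_RATE_HZ = 16_000  # from the README excerpt: mic audio is PCM at 16kHz
BYTES_PER_SAMPLE = 2     # assumption: 16-bit samples (not stated in the excerpt)
CHANNELS = 1             # assumption: mono (not stated in the excerpt)


def pcm_chunk_bytes(duration_ms: int) -> int:
    """Return the size in bytes of a PCM chunk covering duration_ms of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * duration_ms // 1000


@dataclass
class FrameThrottle:
    """Hypothetical helper: pass at most one frame per min_interval_ms."""

    min_interval_ms: int = 1000  # ~1fps, as described in the README excerpt
    last_sent_ms: Optional[int] = None

    def should_send(self, now_ms: int) -> bool:
        """Return True (and record the time) if this frame should be forwarded."""
        if self.last_sent_ms is None or now_ms - self.last_sent_ms >= self.min_interval_ms:
            self.last_sent_ms = now_ms
            return True
        return False


if __name__ == "__main__":
    # Simulate three seconds of a 30fps camera feed using integer timestamps.
    throttle = FrameThrottle()
    forwarded = [i for i in range(90) if throttle.should_send(i * 1000 // 30)]
    print("frames forwarded:", len(forwarded))
    print("bytes per 100 ms of audio:", pcm_chunk_bytes(100))
```

Under these assumptions, a 30fps camera feed is reduced to one forwarded frame per second, and each 100 ms of microphone audio occupies 3,200 bytes before any transport framing.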

Data from GitHub. Synced on 2026-07-02