Claude Skill

Intent-Lab/VisionClaw

VisionClaw provides a real-time AI assistant for Meta Ray-Ban smart glasses that combines voice, vision, and agentic actions, built on Gemini Live and OpenClaw.

Overview

Stars: 2,429
Forks: 463
Language: Unknown
Last updated: 2026-05-06
Last synced: 2026-07-03

Repository information

Owner: Intent-Lab
Repository: VisionClaw
Full name: Intent-Lab/VisionClaw
Repo ID: 1,151,130,619

Install this Skill

git clone https://github.com/sseanliu/VisionClaw.git

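Registry information

Type: openclaw_skill
Quality score: 75/100
Verification status: readme_parsed
Last verified: 2026-05-31
Platforms: OpenClaw
Capabilities: browser, pdf, search, image, video, terminal
Detected files: README.md
Config keys: YOUR_GITHUB_TOKEN, URL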

Project introduction

VisionClaw is a real-time AI assistant designed for Meta Ray-Ban smart glasses. It uses Gemini Live and OpenClaw to combine voice, vision, and agentic actions.

English description

Real-time AI assistant for Meta Ray-Ban smart glasses -- voice + vision + agentic actions via Gemini Live and OpenClaw

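Highlights

  • Real-time AI assistant designed for Meta Ray-Ban glasses
  • Integrated voice and vision
  • Agentic actions via Gemini Live
  • Built on the OpenClaw framework
  • Seamless smart-glasses interaction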

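Use cases

  • Hands-free assistance on smart glasses
  • Real-time processing of visual information
  • Voice-controlled AI interaction
  • Agent-driven task automation
  • Enhanced wearable AI experience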

README excerpt

# VisionClaw

![VisionClaw](assets/teaserimage.png)

A real-time AI assistant for Meta Ray-Ban smart glasses. See what you see, hear what you say, and take actions on your behalf -- all through voice.

![Cover](assets/cover.png)

Built on [Meta Wearables DAT SDK](https://github.com/facebook/meta-wearables-dat-ios) (iOS) / [DAT Android SDK](https://github.com/nichochar/openclaw) (Android) + [Gemini Live API](https://ai.google.dev/gemini-api/docs/live) + [OpenClaw](https://github.com/nichochar/openclaw) (optional).

**Supported platforms:** iOS (iPhone) and Android (Pixel, Samsung, etc.)

## What It Does

Put on your glasses, tap the AI button, and talk:

- **"What am I looking at?"** -- Gemini sees through your glasses camera and describes the scene
- **"Add milk to my shopping list"** -- delegates to OpenClaw, which adds it via your connected apps
- **"Send a message to John saying I'll be late"** -- routes through OpenClaw to WhatsApp/Telegram/iMessage
- **"Search for the best coffee shops nearby"** -- web search via OpenClaw, results spoken back

The glasses camera streams at ~1fps to Gemini for visual context, while audio flows bidirectionally in real-time.

## How It Works

![How It Works](assets/how.png)

```
Meta Ray-Ban Glasses (or phone camera)
       |
       | video frames + mic audio
       v
iOS / Android App (this project)
       |
       | JPEG frames (~1fps) + PCM audio (16kHz)
       v
Gemini Live API (WebSocket)
       |
       |-- Audio response (PCM 24kHz) --> App --> Speaker
       |-- Tool calls (execute) -------> App --> OpenClaw Gateway
       |                                              |
       |                                              v
       |                                      56+ skills: web search,
       |
```
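The data flow described in the README excerpt (frames throttled to roughly 1 fps, 16 kHz PCM audio sent up, 24 kHz PCM audio and tool calls coming back) can be illustrated with a small Python sketch. This is not VisionClaw's code: `FrameThrottle`, `pcm_chunk_bytes`, `route_response`, and the `{"kind": ...}` message shape are hypothetical names introduced here, and the 16-bit mono sample format is an assumption, since the excerpt gives only the sample rates.

```python
from typing import Optional


class FrameThrottle:
    """Hypothetical helper: lets through at most one frame per interval."""

    def __init__(self, min_interval_s: float = 1.0) -> None:
        # 1.0 s between frames approximates the "~1fps" figure in the README.
        self.min_interval_s = min_interval_s
        self._last_sent_s: Optional[float] = None

    def should_send(self, now_s: float) -> bool:
        """Return True if a frame captured at `now_s` should be forwarded."""
        first_frame = self._last_sent_s is None
        if first_frame or now_s - self._last_sent_s >= self.min_interval_s:
            self._last_sent_s = now_s
            return True
        return False


def pcm_chunk_bytes(sample_rate_hz: int, duration_ms: int,
                    bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Size in bytes of a raw PCM chunk.

    Assumption: 16-bit (2-byte) mono samples; the README excerpt states
    only the sample rates (16 kHz upstream, 24 kHz downstream).
    """
    samples = sample_rate_hz * duration_ms // 1000
    return samples * bytes_per_sample * channels


def route_response(message: dict) -> str:
    """Pick a destination for a server message.

    The {"kind": ...} shape is invented for illustration; it is not the
    Gemini Live API wire format.
    """
    kind = message.get("kind")
    if kind == "audio":
        return "speaker"
    if kind == "tool_call":
        return "openclaw_gateway"
    return "ignored"


if __name__ == "__main__":
    throttle = FrameThrottle()
    capture_times_s = [0.0, 0.3, 0.9, 1.0, 1.5, 2.1]
    print([throttle.should_send(t) for t in capture_times_s])
    print(pcm_chunk_bytes(16_000, 100), pcm_chunk_bytes(24_000, 100))
    print(route_response({"kind": "tool_call"}))
```

Under these assumptions, a 100 ms chunk is 3,200 bytes upstream and 4,800 bytes downstream. Passing the timestamp into `should_send` rather than reading the clock inside it keeps the throttle easy to test.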

Topics

No topics yet


Data from GitHub; synced 2026-07-03