What programming language does wwbin2017/bailing use?

wwbin2017/bailing is primarily written in Python.

How to install wwbin2017/bailing?

Run: openclaw install wwbin2017/bailing

Claude Skill

wwbin2017/bailing

Q: What is wwbin2017/bailing?

Bailing is a GPT-4o-style voice conversation robot built on an ASR+LLM+TTS pipeline. It integrates models such as DeepSeek R1 and connects to openClaw, so it can act as a personal voice assistant. Its README reports an end-to-end latency of 800 ms. It runs on low-configuration devices such as a Mac and supports voice interruption.

Overview

Stars: 1,728

Forks: 304

Language: Python

Last pushed: 2026-04-06

Last synced: 2026-07-03

Repository

Owner: wwbin2017

Repository: bailing

Full name: wwbin2017/bailing

Repo ID: 847,241,140

GitHub URL: https://github.com/wwbin2017/bailing

Install this Skill

git clone https://github.com/wwbin2017/bailing.git

Registry

Type: openclaw_skill

Quality score: 70/100

Verification: readme_parsed

Last verified: 2026-06-02

Platforms

OpenClaw

Capabilities

code-review, memory, search, ai, asr, chatgpt, chattts, deepseek, funasr, gpt-4o

Detected files

README.md, requirements.txt

Install methods

git clone https://github.com/wwbin2017/bailing.git
cd bailing
pip install -r requirements.txt
pip install -r third_party/OpenManus/requirements.txt

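Chinese description (translated)

Bailing is a GPT-4o-style voice conversation robot built on an ASR (automatic speech recognition) + LLM (large language model) + TTS (text-to-speech) architecture. It integrates strong large models such as DeepSeek R1 and connects to openClaw, making it a genuine personal voice assistant. Response latency is as low as 800 ms, it runs smoothly even on low-configuration devices such as a Mac, and it supports voice interruption.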

Key features

ASR+LLM+TTS architecture for voice conversations
Integrates models like DeepSeek R1
Connects to openClaw functionality
Low latency (800 ms end to end, per the README)
Runs on low-configuration devices (e.g., Mac)
Supports voice interruption
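
The ASR+LLM+TTS architecture and latency figure listed above can be illustrated with a small sketch. This is not Bailing's code: the `ASR`, `LLM`, and `TTS` protocols, the stub classes, and `run_turn` are hypothetical names, and the 0.8 s default budget simply mirrors the 800 ms figure quoted in the listing. The stubs stand in for real models so that the example runs without any dependencies.

```python
import time
from typing import Protocol


class ASR(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


# Hypothetical stub stages; real deployments would wrap actual models here.
class EchoASR:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


def run_turn(asr: ASR, llm: LLM, tts: TTS, audio: bytes, budget_s: float = 0.8):
    """Run one conversation turn and report whether it met the latency budget."""
    start = time.perf_counter()
    text = asr.transcribe(audio)      # speech -> text
    answer = llm.reply(text)          # text -> response text
    speech = tts.synthesize(answer)   # response text -> audio
    elapsed = time.perf_counter() - start
    return speech, elapsed, elapsed <= budget_s


speech, elapsed, within_budget = run_turn(EchoASR(), UpperLLM(), BytesTTS(), b"hello")
```

Because each stage is only required to satisfy a small protocol, any one of them can be swapped without touching the others, which is the property a modular pipeline of this kind relies on.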

Use cases

Personal voice assistant for daily tasks
Low-latency interactive voice applications
Voice interfaces on resource-constrained devices
Experiments with integrated ASR/LLM/TTS pipelines
Voice-controlled automation via openClaw
Accessible AI voice interaction

README excerpt

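# 百聆 (Bailing) <span>[ 中文 | <a href="README_en.md">English</a> ]</span>

**Bailing** is an open-source voice conversation assistant designed to hold natural spoken conversations with users. The project combines automatic speech recognition (ASR), voice activity detection (VAD), a large language model (LLM), and text-to-speech (TTS). It is a GPT-4o-like voice conversation robot implemented as an ASR+LLM+TTS pipeline, offering a high-quality voice conversation experience with an end-to-end latency of 800 ms. Bailing aims to deliver GPT-4o-like conversation without a GPU and is suited to edge devices and low-resource environments.

![logo](assets/logo.png)

## Features

- 🚀 **Fluid conversation**: Low latency and no stutter, almost as natural as talking with a person. Bailing uses several open-source models to keep voice conversations efficient and reliable.
- 🖥 **Lightweight deployment**: No high-end hardware is needed, not even a GPU. With optimization it can be deployed locally and still deliver GPT-4-like performance.
- 🔧 **Modular design**: The ASR, VAD, LLM, and TTS modules are independent of one another and can be replaced or upgraded as needed.
- 🧠 **Smart memory**: Learns continuously and remembers user preferences and conversation history to provide personalized interaction.
- 🛠 **Tool calling**: Integrates external tools flexibly, so users can request information or trigger actions directly by voice, making the assistant more practical.
- 📅 **Task management**: Manages user tasks, tracks progress, sets reminders, and provides live updates so that users do not miss anything important.
- 🌐 **Extensible ecosystem**: Beyond OpenClaw, more external tools and Agent capabilities can be integrated over time.

## Why OpenClaw is a focus

Bailing is not only an assistant that can talk but also an assistant that can get things done. OpenClaw serves as one of its core tool-calling engines, handling complex tasks, external tool orchestration, and higher-level Agent capabilities. Through OpenClaw, Bailing can:

- Turn a user's natural-language request into an executable task
- Call external tools during a conversation to search, analyze, or perform actions
- Handle more complex multi-step tasks
- Upgrade the voice assistant from a "chatbot" to an "action-oriented assistant"

In other words, OpenClaw is an important layer in making Bailing more like JARVIS.

## Thanks to the open-source community

Bailing would not exist without the generous contributions of the open-source community. Thanks to DeepSeek, FunASR, Silero-VAD, ChatTTS, openclaw, and other excellent open-source projects for giving us the chance to build a truly open, powerful, and accessible voice AI assistant. If you share the idea of making AI available to everyone, you are welcome to contribute code and optimize models so that Bailing becomes stronger, smarter, and a true JARVIS.

📢 Stars and PRs are welcome.

## Project overview

Bailing implements voice conversation with the following components:

- 🎙 **ASR**: Uses [FunASR](https://github.com/modelscope/FunASR) for automatic speech recognition, converting the user's speech to text.
- 🎚 **VAD**: Uses [silero-vad](https://github.com/snakers4/silero-vad) for voice activity detection, so that only valid speech segments are processed.
- 🧠 **LLM**: Uses [deepseek](https://github.com/deepseek-ai/DeepSeek-LLM) as the large language model that processes user input and generates responses; it is highly cost-effective.
- 🔊 **TTS**: Uses [edge-tts](https://github.com/rany2/edge-tts), [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M), [ChatTTS](https://github.com/2noise/ChatTTS), or macOS `say` for text-to-speech, converting the generated text response into natural, fluent speech.

## Framework

![Bailing flowchart](assets/bailing_flowchart_a.png)

The Robot component is responsible for task management and memory management. It handles user interruption requests and coordinates and connects the modules so that interaction stays smooth.

| Player state | Speaking | Description |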
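
The README excerpt says the Robot component handles user interruptions, and its truncated table relates the player state to whether the user is speaking. One common way to get that behavior is to check an interruption flag between audio chunks, as in the sketch below. It is an illustration under stated assumptions, not Bailing's implementation: the `Player` class, its methods, and `fake_vad` are hypothetical, and the VAD is simulated by a callback rather than a real detector.

```python
import threading
from typing import Callable, Iterable, List, Optional


class Player:
    """Hypothetical playback loop that can be interrupted between chunks."""

    def __init__(self) -> None:
        # An Event is used so a separate VAD thread could set it safely.
        self.interrupted = threading.Event()
        self.played: List[str] = []

    def play(self, chunks: Iterable[str],
             on_chunk: Optional[Callable[[str], None]] = None) -> bool:
        """Play chunks in order; return False if playback was cut short."""
        for chunk in chunks:
            if self.interrupted.is_set():
                return False  # user started speaking: stop talking
            self.played.append(chunk)  # stand-in for writing audio to a device
            if on_chunk is not None:
                on_chunk(chunk)
        return True

    def interrupt(self) -> None:
        self.interrupted.set()


player = Player()


def fake_vad(chunk: str) -> None:
    # Simulated VAD: pretend the user starts speaking during the second chunk.
    if chunk == "chunk-2":
        player.interrupt()


finished = player.play(["chunk-1", "chunk-2", "chunk-3"], on_chunk=fake_vad)
```

Splitting the synthesized reply into short chunks keeps the reaction time to an interruption bounded by the length of one chunk.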

Topics

ai asr chatgpt chattts deepseek funasr gpt-4o llm openai openclaw tts voice-assistant

Explore more

Related skills

NousResearch/hermes-agent

infiniflow/ragflow

zhayujie/CowAgent

HKUDS/nanobot

liyupi/ai-guide

BlockRunAI/ClawRouter