Claude Skill

atopos31/llmio

llmio is a lightweight LLM API load-balancing gateway written in Go, with an admin web UI built in React and TypeScript. It supports OpenAI, Claude, Gemini, DeepSeek, and other providers, and offers weighted routing, failover, and rate limiting.

Overview

Stars: 311
Forks: 35
Language: TypeScript
Last pushed: 2026-05-29
Last synced: 2026-06-17

Repository

Owner: atopos31
Repository: llmio
Full name: atopos31/llmio
Repo ID: 1,032,927,082

Install this Skill

docker run -d \

Registry

Type: openclaw_skill
Quality score: 80/100
Verification: readme_parsed
Last verified: 2026-06-15
Platforms: Claude, OpenClaw, Codex
Capabilities: search, image, terminal, ai, ai-gateway, claude, claude-ai, claude-code, codex, deepseek
Detected files: README.md, docker-compose.yml, docs
Config keys: TOKEN, YOUR_TOKEN, OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY
Install methods
  • docker run -d \
  • git clone https://github.com/atopos31/llmio.git

Summary

llmio is a lightweight, high-performance LLM API load-balancing gateway written in Go, with an admin web UI built in React and TypeScript. It supports multiple AI providers, including OpenAI, Claude, Gemini, and DeepSeek, and offers weighted request routing, fallback, and rate limiting for production AI applications.

Chinese description (translated)

LLM API load-balancing gateway.

Key features

  • Multi-provider support: OpenAI, Claude, Gemini, DeepSeek, and more
  • Weighted load balancing (random by weight or priority by weight) with automatic failover
  • Rate-limit fallback and provider connectivity checks for fault isolation
  • Lightweight and easy to deploy
  • Go-based gateway with a React + TypeScript admin web UI
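
To make the load-balancing and failover features concrete, here is a minimal Python sketch of weighted random selection with failover. It is an illustration only, not llmio's own code (the README excerpt says the real strategies live in the Go `balancers/` package). The `Provider` type, the `send` callback, and both function names are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Provider:
    """Hypothetical upstream entry; not llmio's actual data model."""
    name: str
    weight: int  # relative share of traffic, assumed to be > 0


def pick_weighted(providers: Sequence[Provider], rng: random.Random) -> Provider:
    """Pick one provider with probability proportional to its weight."""
    weights = [p.weight for p in providers]
    return rng.choices(providers, weights=weights, k=1)[0]


def call_with_failover(
    providers: Sequence[Provider],
    send: Callable[[Provider], str],
    rng: Optional[random.Random] = None,
) -> Tuple[str, str]:
    """Try providers in weighted-random order until one succeeds.

    A provider that raises is dropped from the candidate pool and the
    draw is repeated among the remaining ones.
    """
    rng = rng or random.Random()
    remaining = list(providers)
    errors = []
    while remaining:
        chosen = pick_weighted(remaining, rng)
        try:
            return chosen.name, send(chosen)
        except Exception as exc:  # assumption: any exception means "try the next one"
            errors.append(f"{chosen.name}: {exc}")
            remaining.remove(chosen)
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With two providers weighted 3 and 1, the first receives roughly three quarters of the traffic while both are healthy. If one starts raising errors, every request falls through to the other.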

Use cases

  • Distributing API calls across multiple LLM providers
  • Building resilient AI applications with automatic fallback
  • Managing API rate limits and costs in production
  • Centralizing AI gateway configuration for teams
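
For the cost-management use case, the README excerpt says per-request cost is calculated from configurable per-million-token prices. The Python sketch below shows one plausible form of that arithmetic. The `Pricing` type and `request_cost` function are hypothetical, and the sketch assumes cached tokens are counted separately from regular input tokens, which the excerpt does not confirm.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token prices in one currency (e.g. USD or CNY)."""
    input_per_million: Decimal
    cached_per_million: Decimal
    output_per_million: Decimal


def request_cost(
    pricing: Pricing,
    input_tokens: int,
    cached_tokens: int,
    output_tokens: int,
) -> Decimal:
    """Return the cost of one request.

    Assumption: input_tokens excludes cached_tokens, so each token is
    billed in exactly one of the three buckets.
    """
    million = Decimal(1_000_000)
    total = (
        pricing.input_per_million * input_tokens
        + pricing.cached_per_million * cached_tokens
        + pricing.output_per_million * output_tokens
    )
    return total / million
```

For example, at prices of 2, 0.5, and 8 per million tokens, a request with 1,000 input, 2,000 cached, and 500 output tokens costs (2,000 + 1,000 + 4,000) / 1,000,000 = 0.007 in the configured currency. `Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift.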

README excerpt

# LLMIO

English | [中文](README_cn.md)

LLMIO is a Go-based LLM load‑balancing gateway that provides a unified REST API, weighted scheduling, observability, and a modern admin UI for LLM clients (openclaw / claude code / codex / gemini cli / cherry studio / open webui). It helps you integrate OpenAI, Anthropic, Gemini, and other model capabilities in a single service.

**QQ group: 1083599685**

## Architecture

![LLMIO Architecture](./docs/llmio.svg)

## Features

- **Unified API**: Compatible with OpenAI Chat Completions, OpenAI Responses, Gemini Native, and Anthropic Messages. Supports both streaming and non‑streaming passthrough.
- **Weighted scheduling**: `balancers/` provides two strategies (random by weight / priority by weight). You can route based on tool calling, structured output, and multimodal capability.
- **Admin Web UI**: React + TypeScript + Tailwind + Vite console for providers, models, associations, logs, and metrics.
- **Rate limiting & failure handling**: Built‑in rate‑limit fallback and provider connectivity checks for fault isolation.
- **Local persistence**: Pure Go SQLite (`db/llmio.db`) for config and request logs, ready to use out of the box.
- **Session tracking**: Pass `session_id` in any request body (works with `extra_body` in OpenAI SDK) to tag logs with a session identifier. Filter and search by `session_id` in the admin UI or via `GET /api/logs?session_id=`.
- **Observability**: Every request is recorded with TraceID, latency breakdown (proxy / first-chunk / completion time), TPS, token usage (input / cached / output), and optional full IO logging. Per-request cost is calculated from configurable per-million-token prices (CNY / USD) and shown in the log detail view alongside provider and model metadata.

## Deployment

### Docker Compose (

Data from GitHub. Synced on 2026-06-17