Claude Skill
vllm-project/semantic-router
Install this Skill
git clone https://github.com/vllm-project/semantic-router.git
Summary
Semantic Router is a system-level intelligent router designed for mixture-of-models architectures, enabling efficient routing of AI workloads across cloud, data center, and edge environments. It is a key component of the vLLM ecosystem, built primarily in Go.
System-level intelligent router: a mixture-of-models architecture for cloud, data center, and edge
Key features
- System-level intelligent routing for AI models
- Supports mixture-of-models (MoM) architectures
- Deployment across cloud, data center, and edge
- Built with Go for performance and scalability
- Integration with vLLM and Hugging Face ecosystems
- Kubernetes-native deployment support
Use cases
- AI gateway for routing LLM requests
- Dynamic model selection based on semantic context
- Load balancing across multiple AI model instances
- Edge AI inference with intelligent routing
- Multi-tenant AI service platforms
- PII detection and content filtering routing
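Two of the use cases above, dynamic model selection and PII-aware routing, can be illustrated with a minimal Go sketch. This is not the project's actual API or algorithm: the `Signals` type, the `extractSignals` and `selectModel` functions, the keyword heuristics, and the model names are all hypothetical, and the keyword matching is only a stand-in for a real classifier.

```go
package main

import (
	"fmt"
	"strings"
)

// Signals is a hypothetical summary of what a router might extract from a request.
type Signals struct {
	Category string // coarse topic label: "code", "math", or "general"
	HasPII   bool   // true when the prompt appears to contain personal data
}

// extractSignals is a keyword-based stand-in for a real semantic classifier.
// The keywords are illustrative assumptions, not rules used by the project.
func extractSignals(prompt string) Signals {
	p := strings.ToLower(prompt)
	s := Signals{Category: "general"}
	switch {
	case strings.Contains(p, "func ") || strings.Contains(p, "compile"):
		s.Category = "code"
	case strings.Contains(p, "integral") || strings.Contains(p, "prove"):
		s.Category = "math"
	}
	// Treat an "@" or the token "ssn" as a crude hint that PII is present.
	s.HasPII = strings.Contains(p, "@") || strings.Contains(p, "ssn")
	return s
}

// selectModel maps signals to a hypothetical model name.
// The PII check runs first so sensitive prompts stay on a private model.
func selectModel(s Signals) string {
	if s.HasPII {
		return "local-private-model"
	}
	switch s.Category {
	case "code":
		return "code-model"
	case "math":
		return "reasoning-model"
	default:
		return "small-general-model"
	}
}

func main() {
	prompts := []string{
		"Why does this func fail to compile?",
		"Prove that sqrt(2) is irrational",
		"My email is a@b.com, summarize my file",
		"What is the capital of France?",
	}
	for _, prompt := range prompts {
		fmt.Printf("%q -> %s\n", prompt, selectModel(extractSignals(prompt)))
	}
}
```

Checking the privacy signal before the topic signal is a deliberate choice in this sketch: a prompt that is both code-related and sensitive is sent to the private model rather than to the more capable one.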
README excerpt
<div align="center">

<img src="website/static/img/artworks/vllm-sr-logo.dark.png" alt="vLLM Semantic Router" width="50%"/>

<p><strong>System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge</strong></p>

<p>
<a href="https://vllm-semantic-router.com">Documentation</a> |
<a href="https://play.vllm-semantic-router.com">Playground</a> |
<a href="https://vllm-semantic-router.com/blog/">Blog</a> |
<a href="https://vllm-semantic-router.com/publications/">Publications</a> |
<a href="https://huggingface.co/LLM-Semantic-Router">Hugging Face</a>
</p>

</div>

---

## About

In the LLM era, the number of models is exploding. Different models vary across capability, scale, cost, and privacy boundaries. Choosing and connecting the right models to build semantic AI infrastructure is a system problem.

**vLLM Semantic Router** is a **signal-driven** intelligent router for that problem. It helps teams build model systems that are more **efficient**, **safer**, and more **adaptive** across cloud, data center, and edge environments.

It delivers three core values:

- **Token economics**: reduce wasted tokens, increase effective output, and maximize the value of every token.
- **LLM safety**: detect jailbreaks, sensitive leakage, and hallucinations so agents remain controllable, trustworthy, and auditable.
- **Fullmesh intelligence**: build personal AI at the edge and intelligent MaaS in the cloud by coordinating local, private, and frontier models across cost, privacy, and capability boundaries.

## Getting Started

### Install

```bash
curl -fsSL https://vllm-semantic-router.com/install.sh | bash
```

For platform notes, detailed setup options, and troubleshooting, see the **[Installation Guide](https://v