What are the key features of jjang-ai/vmlx?

Continuous batch processing for efficient inference; Prefix caching and paged attention optimization; KV cache quantization for reduced memory usage; Vision-language (VL) support for multimodal tasks; Image generation and editing capabilities; Compatible with OpenAI and Anthropic APIs

What are the use cases of jjang-ai/vmlx?

Running large language models locally on MacBook with MLX; Building multimodal applications with vision-language models; Optimizing inference with KV cache reuse and compression; Deploying MCP servers for AI agent workflows; Generating and editing images via MLX Studio

What programming language does jjang-ai/vmlx use?

jjang-ai/vmlx is primarily written in Python.

How to install jjang-ai/vmlx?

Run: openclaw install jjang-ai/vmlx

Claude Skill

jjang-ai/vmlx

vMLX is an MLX-based framework powering MLX Studio with continuous batch, prefix cache, paged attention, KV cache quantization, and vision-language support. Compatible with OpenAI & Anthropic APIs...

Language

Overview

Stars687

Forks71

LanguagePython

Last pushed2026-06-17

Last synced2026-06-17

View on GitHub

Repository

Ownerjjang-ai

Repositoryvmlx

Full namejjang-ai/vmlx

Repo ID1,160,596,966

GitHub URLhttps://github.com/jjang-ai/vmlx

Install this Skill

uv tool install vmlx

GitHub

Registry

Typemcp_server

Quality score85/100

Verificationreadme_parsed

Last verified2026-06-08

Platforms

ClaudeMCPOpenClaw

Capabilities

pdfmemoryimagevideoterminalanthropic-apikvcache-compressionkvcache-optimizationkvcache-reusellm

Detected files

README.mddocspyproject.tomltests

Config keys

URL

Install methods

uv tool install vmlx
pip install vmlx
pip install vmlx[image]
git clone https://github.com/jjang-ai/vmlx.git
npm install && npm run build

Summary

vMLX is an advanced MLX-based framework that powers MLX Studio with features like continuous batch processing, prefix caching, paged attention, KV cache quantization, and vision-language support. It also provides image generation/editing capabilities and compatibility with OpenAI and Anthropic APIs, making it a versatile tool for local LLM deployment on Apple Silicon.

Chinese description

vMLX - JANG_Q的基地 - 连续批处理、前缀、分页、KV缓存量化、视觉语言 - 驱动MLX Studio。图像生成/编辑、OpenAI/Anth

Key features

Continuous batch processing for efficient inference
Prefix caching and paged attention optimization
KV cache quantization for reduced memory usage
Vision-language (VL) support for multimodal tasks
Image generation and editing capabilities
Compatible with OpenAI and Anthropic APIs

Use cases

Running large language models locally on MacBook with MLX
Building multimodal applications with vision-language models
Optimizing inference with KV cache reuse and compression
Deploying MCP servers for AI agent workflows
Generating and editing images via MLX Studio

README excerpt

<picture> <source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/jjang-ai/vmlx/main/assets/logo-wide-dark.png"> <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/jjang-ai/vmlx/main/assets/logo-wide-light.png"> <img alt="vMLX" src="https://raw.githubusercontent.com/jjang-ai/vmlx/main/assets/logo-wide-light.png" width="400"> </picture> <h3 align="center">MLX Inference Server for Apple Silicon</h3> Self-hosted inference server for LLMs, VLMs, and image generation on Apple Silicon. OpenAI + Anthropic + Ollama compatible HTTP API. Self-hosted; no third-party API keys required. Native MTP artifact detection and family-specific cache policy gates keep speculative/cache settings explicit and model-safe. Looking for a native Swift macOS app or Swift inference engine? See <a href="https://osaurus.ai">osaurus.ai</a>. <a href="https://pypi.org/project/vmlx/"><img src="https://img.shields.io/pypi/v/vmlx?color=%234B8BBE&label=PyPI&logo=python&logoColor=white" alt="PyPI" /></a> <a href="https://github.com/jjang-ai/vmlx/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-Apache_2.0-green?logo=apache" alt="License" /></a> <a href="https://github.com/jjang-ai/vmlx"><img src="https://img.shields.io/github/stars/jjang-ai/vmlx?style=social" alt="Stars" /></a> <img src="https://img.shields.io/badge/Apple_Silicon-M1%2FM2%2FM3%2FM4-black?logo=apple" alt="Apple Silicon" /> <img src="https://img.shields.io/badge/Python-3.10+-3776AB?logo=python&logoColor=white" alt="Python" /> <img src="https://img.shields.io/badge/Electron-28-47848F?logo=electron&logoColor=white" alt=

Topics

anthropic-api kvcache-compression kvcache-optimization kvcache-reuse llm lmstudio macbook mcp-server mlx mlxllm mlxstudio omlx omlx-alternative openai-api openclaw openclaw-agent persistent-memory prefix-cache vmlx

jjang-ai/vmlx

Overview

Repository

Install this Skill

Registry

Summary

Key features

Use cases

README excerpt

Topics

Explore more

Related skills

NousResearch/hermes-agent

infiniflow/ragflow

zhayujie/CowAgent

HKUDS/nanobot

volcengine/OpenViking

cft0808/edict