Claude Skill

WeianMao/triattention

TriAttention uses trigonometric KV cache compression to enable efficient long reasoning and local deployment of OpenClaw on memory-constrained GPUs.

Overview

Stars: 724
Forks: 62
Language: Python
Last pushed: 2026-04-23
Last synced: 2026-05-14

Repository

Owner: WeianMao
Repository: triattention
Full name: WeianMao/triattention
Repo ID: 1200960609

🚀 Install this Skill

openclaw install WeianMao/triattention

Summary

TriAttention is an efficient long-reasoning technique that uses trigonometric KV cache compression to reduce memory usage, enabling local deployment of large models like OpenClaw on memory-constrained GPUs.

Description (translated from Chinese)

TriAttention — efficient long reasoning via trigonometric key-value cache compression. Supports local deployment of OpenClaw on memory-constrained GPUs.

Key features

  • Trigonometric KV cache compression for reduced memory footprint
  • Enables long-context reasoning on memory-constrained GPUs
  • Supports local deployment of OpenClaw models
  • Optimized for efficient inference with limited hardware
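The repository does not document its compression scheme here, but the idea of a trigonometric KV cache compression can be illustrated with a minimal, hypothetical sketch: project each cached key/value channel onto a truncated orthonormal cosine (DCT-II) basis along the sequence axis, storing k coefficients instead of n cache entries. The function names and the choice of basis below are illustrative assumptions, not TriAttention's actual implementation.

```python
import numpy as np

def cos_basis(n: int, k: int) -> np.ndarray:
    # First k rows of an orthonormal DCT-II basis over sequence length n.
    # (Illustrative choice of trigonometric basis; not from the repo.)
    i = np.arange(n)
    B = np.cos(np.pi * (i + 0.5)[None, :] * np.arange(k)[:, None] / n)
    B[0] *= 1.0 / np.sqrt(2.0)          # normalize the DC row
    return B * np.sqrt(2.0 / n)

def compress(kv: np.ndarray, k: int) -> np.ndarray:
    # kv: (n, d) cache of n positions with d feature channels.
    # Returns (k, d) cosine coefficients -- a k/n memory footprint.
    n = kv.shape[0]
    return cos_basis(n, k) @ kv

def decompress(coeffs: np.ndarray, n: int) -> np.ndarray:
    # Reconstruct an (n, d) approximation from the stored coefficients.
    k = coeffs.shape[0]
    return cos_basis(n, k).T @ coeffs

rng = np.random.default_rng(0)
kv = rng.standard_normal((128, 64))     # toy KV cache
coeffs = compress(kv, 32)               # 4x fewer cached rows
approx = decompress(coeffs, 128)
```

Truncating to the low-frequency coefficients trades reconstruction fidelity for memory; the actual trade-off TriAttention achieves would depend on its specific basis and truncation strategy.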

Use cases

  • Running large language models locally on consumer GPUs
  • Long-document analysis and summarization
  • Memory-efficient AI reasoning for edge devices

Topics

No topics yet.

Data from GitHub. Synced on 2026-05-14