Claude Skill
ZJU-REAL/SkillZero
Official code for SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization. Features curriculum learning, OpenClaw integration, and reproducible Python experiments.
Install this Skill
pip install vllm==0.10.0
pip install flash-attn==2.7.4.post1 --no-build-isolation --no-cache-dir
pip install -e .
pip install gym==0.26.2
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
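Because the commands above pin exact versions, it can help to confirm afterwards that the environment matches them. The following is a minimal sketch, not part of the SkillZero repository: the `PINNED` table and the `check_pins` helper are hypothetical names introduced here, and the sketch assumes that wheels from the CUDA index may report a local suffix such as `+cu124`, which it ignores when comparing.

```python
from importlib import metadata

# Pins copied from the install commands above (hypothetical helper table,
# not something the repository itself provides).
PINNED = {
    "vllm": "0.10.0",
    "flash-attn": "2.7.4.post1",
    "gym": "0.26.2",
    "torch": "2.6.0",
    "torchvision": "0.21.0",
    "torchaudio": "2.6.0",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Assumption: CUDA builds may carry a local suffix ("2.6.0+cu124");
        # strip it before comparing against the pinned version.
        base = found.split("+", 1)[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script inside the activated environment prints one line per package whose installed version differs from the pin, and prints nothing when every pin is satisfied.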
Summary
SkillZero (SKILL0) is the official codebase for the paper 'SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization.' It introduces a novel in-context reinforcement learning framework that enables agents to internalize skills through curriculum learning, leveraging the OpenClaw platform for robotic manipulation tasks.
Official code for "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization"
Key features
- In-context reinforcement learning for skill internalization
- Curriculum learning strategy for progressive skill acquisition
- Integration with OpenClaw robotic platform
- Agentic learning without explicit reward engineering
- Open-source Python implementation with reproducible experiments
Use cases
- Robotic skill acquisition in manipulation tasks
- Research on in-context reinforcement learning
- Curriculum learning for complex agent behaviors
- Benchmarking agentic RL algorithms on OpenClaw
README excerpt
<h1 align="center"> SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization </h1>

<div align='center' style="font-size:18px;">
<p>
<a href="https://arxiv.org/abs/2604.02268">
<img src="https://img.shields.io/badge/Paper-arxiv%3A2604.02268-blue" alt="Paper"/>
</a>
<a href="https://huggingface.co/papers/2604.02268">
<img src="https://img.shields.io/badge/Daily%20Paper-huggingface-yellow" alt="HF Paper"/>
</a>
</p>
</div>

## 🔥 Overview

We introduce **SKILL0**, an in-context reinforcement learning framework designed for *skill internalization*.

<div align="center" style="display:flex; justify-content:center; gap:20px; align-items:flex-start;">
<img src="docs/skillzero/motivation.png" alt="motivation" style="width:40%;">
<img src="docs/skillzero/method.png" alt="method" style="width:58%;">
</div>

SKILL0 achieves substantial improvements over the standard RL baseline on ALFWorld and Search-QA.

<div align="center">
<img src="docs/skillzero/metric.png" alt="Logo" style="width:80%;">
</div>

## 🗞️ News

- **`2026-5-15`**: 🔥🔥 Our new work was released: [SDAR](https://github.com/ZJU-REAL/SDAR), which introduces Self-Distilled Agentic Reinforcement Learning.
- **`2026-5-07`**: 🔥 Our new work was released: [SKILL1](https://github.com/AlphaLab-USTC/Skill1), which evolves skill-augmented agents in **one** unified policy.
- **`2026-4-03`**: We release our paper and code.

## 🛠️ Installation

### Python environment

```bash
conda create -n skillzero python=3.12 -y
conda activate skillzero
pip install vllm==0.10.0
pip install flash-attn==2.7.4.post1 --no-build-isolation --no-cache-dir
pip install -e .
```

Log in to Weights & Biases if you use WandB logging (scripts pass `trainer.logger=['console','wandb']` in many cases):