Claude Skill

ZJU-REAL/SkillZero

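Official code for SKILL0: in-context agentic reinforcement learning for skill internalization. Includes curriculum learning, OpenClaw integration, and reproducible Python experiments.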

Overview

Stars: 343
Forks: 14
Language: Python
Last updated: 2026-05-20
Last synced: 2026-07-03

Repository information

Owner: ZJU-REAL
Repository: SkillZero
Full name: ZJU-REAL/SkillZero
Repo ID: 1,199,334,958

Install this Skill

pip install vllm==0.10.0

Registry information

Type: openclaw_skill
Quality score: 85/100
Verification status: readme_parsed
Last verified: 2026-06-15
Platform: OpenClaw
Capabilities: search, image, terminal, agent, curriculum-learning, in-context-reinforcement-learning, openclaw, openclaw-skills, rl, skill
Detected files: README.md, docs, examples, pyproject.toml, requirements.txt, tests
Config keys: WANDB_API_KEY
Installation methods:
  • pip install vllm==0.10.0
  • pip install flash-attn==2.7.4.post1 --no-build-isolation --no-cache-dir
  • pip install -e .
  • pip install gym==0.26.2
  • pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
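The pinned versions in the commands above can be checked after installation. The following is a minimal sketch, not part of the repository: `PINNED`, `installed_version`, and `check_pins` are hypothetical names introduced for illustration, and the pins are copied from the install commands listed here. It assumes that wheels from the `cu124` index may report a local version suffix (such as `+cu124`), which is stripped before comparison.

```python
from importlib import metadata

# Pins copied from the install commands listed above (illustrative table).
PINNED = {
    "vllm": "0.10.0",
    "flash-attn": "2.7.4.post1",
    "gym": "0.26.2",
    "torch": "2.6.0",
    "torchvision": "0.21.0",
    "torchaudio": "2.6.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins=PINNED, lookup=installed_version):
    """Return {package: (expected, found)} for every package that misses its pin."""
    problems = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Assumption: a local suffix such as "+cu124" should not count as a mismatch,
        # so only the public part of the version is compared.
        public = found.split("+", 1)[0] if found else None
        if public != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script in the activated environment prints one line per mismatched or missing package and prints nothing when every pin is satisfied.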

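Project summary

SkillZero (SKILL0) is the official codebase for the paper "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization". It proposes an in-context reinforcement learning framework in which agents internalize skills through curriculum learning, and it uses the OpenClaw platform for robotic manipulation tasks.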

English description

Official code for "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization"

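Key points

  • In-context reinforcement learning for skill internalization
  • Curriculum learning strategy for progressive skill acquisition
  • Integration with the OpenClaw robotics platform
  • Agent learning without explicit reward engineering
  • Open-source Python implementation with reproducible experiments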

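Use cases

  • Skill acquisition in robotic manipulation tasks
  • In-context reinforcement learning research
  • Curriculum learning for complex agent behaviors
  • Benchmarking agentic reinforcement learning algorithms on OpenClaw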

README excerpt

<h1 align="center"> SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization </h1>

<div align='center' style="font-size:18px;">
<p>
<a href="https://arxiv.org/abs/2604.02268"> <img src="https://img.shields.io/badge/Paper-arxiv%3A2604.02268-blue" alt="Paper"/> </a>
<a href="https://huggingface.co/papers/2604.02268"> <img src="https://img.shields.io/badge/Daily%20Paper-huggingface-yellow" alt="HF Paper"/> </a>
</p>
</div>

## 🔥 Overview

We introduce **SKILL0**, an in-context reinforcement learning framework designed for *skill internalization*.

<div align="center" style="display:flex; justify-content:center; gap:20px; align-items:flex-start;">
<img src="docs/skillzero/motivation.png" alt="motivation" style="width:40%;">
<img src="docs/skillzero/method.png" alt="method" style="width:58%;">
</div>

SKILL0 achieves substantial improvements over the standard RL baseline on ALFWorld and Search-QA.

<div align="center">
<img src="docs/skillzero/metric.png" alt="Logo" style="width:80%;">
</div>

## 🗞️ News

- **`2026-5-15`**: 🔥🔥 Our new work was released: [SDAR](https://github.com/ZJU-REAL/SDAR), which introduces Self-Distilled Agentic Reinforcement Learning.
- **`2026-5-07`**: 🔥 Our new work was released: [SKILL1](https://github.com/AlphaLab-USTC/Skill1), which evolves skill-augmented agents in **one** unified policy.
- **`2026-4-03`**: We release our paper and code.

## 🛠️ Installation

### Python environment

```bash
conda create -n skillzero python=3.12 -y
conda activate skillzero

pip install vllm==0.10.0
pip install flash-attn==2.7.4.post1 --no-build-isolation --no-cache-dir
pip install -e .
```

Log in to Weights & Biases if you use WandB logging (scripts pass `trainer.logger=['console','wandb']` in many cases):
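A minimal sketch, not taken from the repository, of how a launcher script might choose the logger override depending on whether the `WANDB_API_KEY` configuration key is set. The function name `logger_override` is hypothetical, and it is an assumption that the training scripts accept a console-only list; only the `trainer.logger=['console','wandb']` form appears in the README.

```python
import os


def logger_override(env=None):
    """Build a `trainer.logger=[...]` override string from the environment."""
    env = os.environ if env is None else env
    loggers = ["console"]
    # Only add WandB when an API key is available.
    # Assumption: a console-only logger list is a valid value for the scripts.
    if env.get("WANDB_API_KEY"):
        loggers.append("wandb")
    quoted = ",".join(f"'{name}'" for name in loggers)
    return f"trainer.logger=[{quoted}]"


if __name__ == "__main__":
    print(logger_override())
```

The returned string can be appended to a training command line, so a machine without a WandB key falls back to console logging.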


Data sourced from GitHub; synced 2026-07-03.