Claude Skill

darkrishabh/agent-skills-eval

A TypeScript CLI tool for evaluating AI agent skills in agentskills.io format. Supports JSONL/YAML tests and OpenAI-compatible LLM evals.

Overview

Stars: 490
Forks: 17
Language: TypeScript
Last pushed: 2026-05-13
Last synced: 2026-05-15

Repository

Owner: darkrishabh
Repository: agent-skills-eval
Full name: darkrishabh/agent-skills-eval
Repo ID: 1,230,541,272

🚀 Install this Skill

openclaw install darkrishabh/agent-skills-eval

Summary

A TypeScript-based test runner for evaluating AI agent skills in the agentskills.io format. It supports CLI usage, JSONL and YAML test definitions, and OpenAI-compatible LLM evaluations.

Chinese description (translated)

A test runner for agentskills.io-style AI agent skills

Key features

  • Runs agent skill evaluations from agentskills.io-style definitions
  • Supports JSONL and YAML test file formats
  • OpenAI-compatible LLM evaluation integration
  • Command-line interface (CLI) for easy automation
  • Built with TypeScript for type safety and reliability
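
To make the JSONL test format mentioned above more concrete, here is a minimal sketch of how a runner might load JSONL test definitions in TypeScript. The field names (`id`, `prompt`, `expected`) and the `parseJsonl` helper are hypothetical and chosen for illustration only; they are not taken from agent-skills-eval, whose actual schema may differ.

```typescript
// Hypothetical shape of one test case. This is an assumption for illustration;
// the real agent-skills-eval schema may use different fields.
interface SkillTestCase {
  id: string;
  prompt: string;
  expected: string;
}

// Parse JSONL text: one JSON object per non-empty line.
function parseJsonl(text: string): SkillTestCase[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line, index) => {
      const value = JSON.parse(line) as Partial<SkillTestCase>;
      // Reject entries that lack any of the assumed string fields.
      if (
        typeof value.id !== "string" ||
        typeof value.prompt !== "string" ||
        typeof value.expected !== "string"
      ) {
        throw new Error(`Invalid test case at entry ${index + 1}`);
      }
      return { id: value.id, prompt: value.prompt, expected: value.expected };
    });
}

// Two sample cases separated by a blank line, which the parser skips.
const sample = [
  '{"id": "greet-1", "prompt": "Say hello", "expected": "hello"}',
  "",
  '{"id": "sum-1", "prompt": "Add 2 and 3", "expected": "5"}',
].join("\n");

const cases = parseJsonl(sample);
console.log(cases.map((c) => c.id).join(", "));
```

Keeping one test case per line means a file can be appended to or streamed without rewriting it, which is one common reason to choose JSONL for test suites.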

Use cases

  • Evaluating AI agent performance on standardized skill tests
  • Automating LLM evaluation pipelines in CI/CD workflows
  • Benchmarking different AI agents against agentskills.io tasks
  • Developing and testing new agent skills with reproducible evals

Data from GitHub. Synced on 2026-05-15