Open Source First·Updated Hourly·5 Booster Types

ImAiFox

Explore AI Boosters

76 boosters for "eval" — open source, verified from GitHub, ready to install

1181 Skills · 617 Plugins · 662 Agents · 878 MCP Servers · 1442 Prompts


MCP Server

Claude Historian Mcp MCP Server

An MCP server for conversation history search and retrieval in Claude Code

by Vvkmnn

claude · claude-code

216



Skill

swr

SWR is a React hook library for efficient data fetching with built-in caching, revalidation, and real-time updates. Developers building API-driven applications benefit from simplified server state management and automatic synchronization.

Prism Mcp Server MCP Server

"name": "prism-mcp-server", "mcpName": "io.github.dcostenco/prism-mcp", "description": "The Mind Palace for AI Agents — persistent memory (SQLite/Supabase), behavioral learning & IDE rules sync, multimodal VLM image captioning, pluggable LLM providers (OpenAI/Anthropic/Gemini/Ollama), OpenTelemetry

stash

This skill enables developers to save and retrieve Git changes across Claude Code sessions by linking stash entries to session IDs, maintaining continuity when resuming work. It benefits developers who work on code iteratively across multiple conversations and need to preserve work-in-progress state.

double-shot-latte

"name": "double-shot-latte", "description": "Automatically evaluates whether Claude should continue working instead of stopping prematurely using Claude-judged decision making", "url": "https://github.com/anthropics"

brain-in-the-fish

Brain in the Fish evaluates documents (essays, policies, contracts, clinical reports, surveys) against evaluation criteria using a panel of AI agents. Each agent's mental state exists as OWL ontology. Scoring is grounded in an Evidence Density Scorer (EDS) that makes hallucination mathematically det

Semantic Knowledge Retrieval

You are tasked with retrieving relevant knowledge from the Obsidian vault using multi-layer semantic search. First layer: initial search. Second layer: direct associations.

by Abilityai

7819

Skill

audit-context

Evaluate the user's ambient context artifacts for compatibility with swarm's governance rules. You are a read-only diagnostic — never modify any files. 1. CLAUDE.md files. Read the project's (working directory root). If exists, read that too. Also check (global config) — it loads into every sessi

by DheerG

agent-swarms · agent-teams

776

Skill

skill-optimizer

is an eval workbench for agent skills. It runs a model in an isolated Docker directory, provides skills/references as normal workspace files, captures an agent trace, and grades deterministic local outcomes. Use this skill as the source of truth for authoring eval suites in this repo. Detailed sche

buyer-eval

Use AskUserQuestion to ask the buyer: Tell the user the version was updated, then re-read the EVALUATION.md file from the updated directory and proceed with the skill. After the preamble, read the full evaluation methodology:

clawmem

Two tiers: hooks handle automatic context flow (surfacing, extraction, compaction survival). MCP tools handle explicit recall, write, and lifecycle operations. Three instances for neural inference. The wrapper defaults to . All three models auto-download via if no server is running (Metal on Appl

by yoloshii

ai-agent-memory · ai-agents

515

Skill

social-media-paper-triage

Turn social media paper recommendations into actionable research items. Use platform-specific tools to fetch the full content: From the extracted content, identify all referenced papers:

by jxtse

academic-research · agent-skill

484

Skill

local-vault

Turn a folder of raw files into a Markdown vault that an LLM can grep, and then answer questions over that vault responsibly. source file, carrying retrieval frontmatter (abstract / tags / synonyms) + a

by genli-ai

agent-skills · claude-code

445

Skill

architecture-copilot

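1. Architecture is not "drawn"; it is "forced out" of the constraints. If you draw diagrams before understanding the constraints, whatever you draw is guesswork. 2. There is no silver bullet, only trade-offs. Every decision is essentially "trading A for B". A solution with "no drawbacks" is not perfect; it has simply not been thought through. 3. There is no "best architecture", only "the architecture best suited to this set of constraints". Both are chat applications, yet the answers for an internal tool and for WeChat are worlds apart.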

phdtaketaketake

"name": "phdtaketaketake", "description": "Connection-first PhD advisor matcher — finds the right advisor by network strength, not h-index. Evidence-first: every signal traces to a real source the agent fetched. Best-supported for physics / MSE; extensible to other STEM with field-specific caveats."

by powerofjinbo

academic-network · advisor-matching

283

Agent

evaluator

An agent designed to evaluate other agents and tasks, with library-first constraints and multi-tool integration across Claude platforms. Useful for teams building quality assurance workflows into their Claude-based systems.

rag-code-mcp — Windsurf Rules

RagCode MCP is a semantic code navigation tool that integrates RAG-powered code search into Windsurf and other IDEs, enabling developers to intelligently query and understand multi-language codebases using local LLMs. It's ideal for developers working with Laravel, Go, Python, and PHP who need fast, context-aware code exploration without leaving their IDE.

by doITmagic

claude-desktop · code-navigation

201

Plugin

warden

"description": "Smart command safety filter for Claude Code — parses shell pipelines and evaluates per-command safety rules to auto-approve safe commands and block dangerous ones",

AgentAsJudge — System Prompt

AgentAsJudge is an agentic evaluation framework that enables AI systems to critically review educational introductions by validating them against specified quality metrics and providing constructive feedback. It benefits educators, instructional designers, and developers building AI-assisted learning platforms who need reliable, fair assessment of educational content.

AgentAsJudge — System Prompt

AgentAsJudge is an agentic evaluation framework that enables AI to systematically assess and compare the quality of multiple-choice questions across educational value, clarity, and answerability. It benefits educators, content creators, and assessment teams looking to automate quality control of exam and quiz questions.

ai-iq

"version": "5.10.0", "description": "Memory → Evaluation → Credential → Access Control for AI agents. Persistent memory with W3C Verifiable Credentials, capability-based access control, drift detection, and FSRS-6 spaced repetition.", "name": "kobie3717",

cortex

"description": "Persistent memory for Claude Code — remembers across sessions automatically. Install and forget. Scientific retrieval backed by 41 published papers.", "name": "Clement Deust", "email": "admin@ai-architect.tools"

by cdeust

agent-memory-system · agent-skills

122

Plugin

vibe-science

"name": "vibe-science", "description": "Scientific research plugin with tracked claim/review/seed lifecycle, citation verification gates, strict integrity, benchmark recording, and retrieval closure.", "name": "Vibe Science Contributors",

by th3vib3coder

adversarial-review · ai-agents

Plugin

open-academic-paper-machine

"name": "open-academic-paper-machine", "description": "Open Academic Paper Machine — Autonomous academic paper production system with idea evaluation gate and paper-vs-code audit. NEW in v6.4: /audit-paper command and audit-engine skill — static audit of a paper's empirical claims (datasets, models,

by TobiasBlask

academic-writing · arxiv