5 boosters for "benchmarking" — AI-graded, open source, ready to install
A system prompt for benchmarking AI agents on realistic task execution: it simulates a user managing job applications in Notion. Best suited for evaluating agentic capabilities across multiple platforms (Claude, ChatGPT, Cursor, Windsurf).
ArmBench-LLM is a system prompt for benchmarking large language models using Armenian character-to-numeric matching tasks. It's designed for developers evaluating LLM performance across multiple coding platforms.
ArmBench-LLM is a system prompt framework for evaluating large language models on Armenian language tasks through structured multiple-choice questions. It's designed for developers and AI researchers who need standardized benchmarking tools across popular coding assistants and chat platforms.
PrismBench lets developers define specialized LLM agents in YAML configuration and uses Monte Carlo Tree Search to systematically evaluate model capabilities. Useful for ML engineers, researchers, and teams building production LLM systems who need a comprehensive benchmarking and evaluation framework.
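The YAML schema and agent definitions are PrismBench's own, but the MCTS idea behind it is easy to illustrate: treat each capability or difficulty setting as a tree node and use a UCB-style rule to balance exploring untested settings against revisiting ones that already expose failures. A minimal Python sketch under that assumption (the names `EvalNode` and `select_child` are hypothetical, not PrismBench's API):

```python
import math
from dataclasses import dataclass, field

@dataclass
class EvalNode:
    """One capability/difficulty configuration in the search tree (hypothetical)."""
    name: str
    visits: int = 0
    failures: int = 0          # how often the model failed tasks generated here
    children: list["EvalNode"] = field(default_factory=list)

    def ucb1(self, parent_visits: int, c: float = 1.4) -> float:
        # Unvisited nodes are explored first; otherwise balance the observed
        # failure rate (exploitation) against an exploration bonus.
        if self.visits == 0:
            return float("inf")
        return self.failures / self.visits + c * math.sqrt(
            math.log(parent_visits) / self.visits
        )

def select_child(node: EvalNode) -> EvalNode:
    """Pick the next configuration to evaluate via UCB1."""
    return max(node.children, key=lambda child: child.ucb1(node.visits))

# Usage: after running a batch of generated tasks against the model,
# update visits/failures on the chosen node and re-select.
root = EvalNode("math-reasoning", visits=10,
                children=[EvalNode("easy", 6, 1), EvalNode("hard", 4, 3)])
print(select_child(root).name)  # "hard": higher failure rate and still under-explored
```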
AlgoClash is a competitive platform where developers build and deploy autonomous AI trading agents that battle in simulated stock markets with live leaderboards and backtesting. It's useful for ML/AI engineers interested in algorithmic trading, agent design, and competitive benchmarking.
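AlgoClash's real agent interface isn't documented here, so the sketch below only shows the general shape of a backtestable trading agent: a small class with a per-tick decision hook that a simulated market can call. All names (`Tick`, `MomentumAgent`, `decide`) are illustrative assumptions, not the platform's API.

```python
from dataclasses import dataclass

@dataclass
class Tick:
    symbol: str
    price: float

class MomentumAgent:
    """Toy agent: buy when price rises above a moving average, sell when it drops below.
    Purely illustrative; not AlgoClash's real interface."""

    def __init__(self, window: int = 5):
        self.window = window
        self.history: list[float] = []
        self.position = 0  # shares held

    def decide(self, tick: Tick) -> str:
        self.history.append(tick.price)
        if len(self.history) < self.window:
            return "hold"
        avg = sum(self.history[-self.window:]) / self.window
        if tick.price > avg and self.position == 0:
            self.position = 1
            return "buy"
        if tick.price < avg and self.position == 1:
            self.position = 0
            return "sell"
        return "hold"

# Minimal backtest over a synthetic price series.
agent = MomentumAgent(window=3)
for p in [100, 101, 102, 104, 103, 99, 98]:
    print(p, agent.decide(Tick("ACME", p)))
```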