8 boosters for "multimodal" — open source, verified from GitHub, ready to install
Transformers.js enables running state-of-the-art machine learning models directly in JavaScript, both in browsers and Node.js environments, with no server required. The pipeline API is the easiest way to use models. It groups together preprocessing, model inference,
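To make the pipeline idea concrete, here is a toy sketch of what "grouping stages behind one call" can look like. It is not Transformers.js code and does not use its API: every name below (`preprocess`, `infer`, `postprocess`, `makePipeline`, `LEXICON`) is hypothetical, the "model" is a four-word lexicon rather than a neural network, and the postprocessing stage is an assumption about what typically follows inference.

```typescript
// Toy illustration only: none of these names come from Transformers.js.
type Tokens = string[];
type Logits = [number, number]; // [negative, positive]
interface Prediction {
  label: "NEGATIVE" | "POSITIVE";
  score: number;
}

// Stage 1: preprocessing turns raw text into model inputs (lowercased word tokens).
function preprocess(text: string): Tokens {
  return text.toLowerCase().split(/[^a-z]+/).filter((t) => t.length > 0);
}

// Stage 2: "inference" with a stand-in model (a tiny word lexicon, assumed for the demo).
const LEXICON = new Map<string, number>([
  ["love", 2],
  ["great", 1],
  ["bad", -1],
  ["hate", -2],
]);

function infer(tokens: Tokens): Logits {
  const sum = tokens.reduce((acc, t) => acc + (LEXICON.get(t) ?? 0), 0);
  return [-sum, sum];
}

// Stage 3: postprocessing turns raw logits into a readable prediction via a two-class softmax.
function postprocess([neg, pos]: Logits): Prediction {
  const max = Math.max(neg, pos); // subtract the max for numerical stability
  const eNeg = Math.exp(neg - max);
  const ePos = Math.exp(pos - max);
  const pPos = ePos / (eNeg + ePos);
  return pPos >= 0.5
    ? { label: "POSITIVE", score: pPos }
    : { label: "NEGATIVE", score: 1 - pPos };
}

// The "pipeline" bundles the three stages so the caller makes a single call.
function makePipeline(): (text: string) => Prediction {
  return (text) => postprocess(infer(preprocess(text)));
}

const classify = makePipeline();
const result = classify("I love this great library");
console.log(result.label, result.score.toFixed(3));
```

The caller only ever sees `classify(text)`; tokenization and score conversion stay hidden inside the pipeline, which is the convenience the description above is pointing at.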
Human MCP enables Claude coding agents to leverage human-like capabilities including vision, debugging, and multimodal interactions through the Model Context Protocol. Developers building AI coding assistants and agents will benefit from enhanced human-centered debugging and visual analysis features.
VT.ai provides Copilot-specific coding instructions for a multimodal AI chat application, establishing standards for Python development including naming conventions, style guides, and testing practices. Developers building AI-powered features with language models will benefit from these standardized guidelines.
frontend-dev (hemangjoshi37a): Automatic closed-loop frontend development with visual testing, browser automation, and iterative refinement using multimodal AI.
Helps project managers and team leads quickly assess repository health by identifying blockers in issues/PRs, syncing Google Drive documents, and managing team assignments. Useful for teams coordinating structural biology projects with mixed version control and cloud storage workflows.
Gemini CLI bridge for multimodal tasks: lets Claude Code delegate vision, summarization, and code analysis to Gemini. Keywords: gemini, multimodal, vision, bridge.
An expert AI engineer agent that helps developers build production-ready LLM applications, RAG systems, and intelligent agents with deep knowledge of modern AI stacks. Ideal for teams building chatbots, AI-powered features, and enterprise AI integrations.