Skip to content
Skill

zai-vision

by Dwsy

AI Summary

zai-vision enables Claude Code to dynamically access vision capabilities through an MCP server, allowing developers to convert UI screenshots into code/specs and extract text via OCR. It benefits developers building AI-powered tools that need to analyze visual content without loading all tools upfront.

Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "zai-vision" skill in my project.

Please run this command in my terminal:
# Install skill into the correct directory
mkdir -p .claude/skills/zai-vision-skill && curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/zai-vision-skill/SKILL.md "https://raw.githubusercontent.com/Dwsy/zai-vision-skill/main/SKILL.md"

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.

Description

Dynamic access to zai-vision MCP server (8 tools, transport: stdio)

Available Tools

ui_to_artifact - Convert UI screenshots into various artifacts: code, prompts, design specifications, or descriptions. extract_text_from_screenshot - Extract and recognize text from screenshots using advanced OCR capabilities. diagnose_error_screenshot - Diagnose and analyze error messages, stack traces, and exception screenshots. understand_technical_diagram - Analyze and explain technical diagrams including architecture diagrams, flowcharts, UML, ER diagrams, and system design diagrams. analyze_data_visualization - Analyze data visualizations, charts, graphs, and dashboards to extract insights and trends. ui_diff_check - Compare two UI screenshots to identify visual differences and implementation discrepancies. analyze_image - General-purpose image analysis for scenarios not covered by specialized tools. analyze_video - Analyze video content using advanced AI vision models.

Usage Pattern

When the user's request matches this skill's capabilities: Step 1: Identify the right tool from the list above Step 2: Generate a tool call in this JSON format: `json { "tool": "tool_name", "arguments": { "param1": "value1", "param2": "value2" } } ` Step 3: Execute via bash: `bash cd $SKILL_DIR python3 executor.py --call 'YOUR_JSON_HERE' ` ⚠️ 重要: Replace $SKILL_DIR with the actual discovered path of this skill directory.

Example 1: List all tools

`bash cd $SKILL_DIR python3 executor.py --list `

zai-vision Skill

This skill provides dynamic access to the zai-vision MCP server with progressive disclosure loading.

Discussion

0/2000
Loading comments...

Health Signals

MaintenanceCommitted 2mo ago
Active
AdoptionUnder 100 stars
0 ★ · Niche
DocsMissing or thin
Undocumented

GitHub Signals

Issues0
Updated2mo ago
View on GitHub
MIT License

My Fox Den

Community Rating

Sign in to rate this booster

Works With

Claude Code