Skill

web-content-fetcher

by shirenchuang


Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "web-content-fetcher" skill in my project.

Please run this command in my terminal:
# Install skill into your project
mkdir -p .claude/skills/web-content-fetcher && curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/web-content-fetcher/SKILL.md "https://raw.githubusercontent.com/shirenchuang/web-content-fetcher/main/SKILL.md"

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.

Description

Extract article content from any URL as clean Markdown. Uses the Scrapling script as the primary method (with automatic fast→stealth fallback) and Jina Reader as an alternative for simple pages. Preserves headings, links, images, lists, and code blocks. Use this skill whenever the user wants to fetch, read, extract, scrape, or summarize content from a URL, including blog posts, news articles, WeChat articles (微信公众号), documentation pages, or any web page. Also trigger when the user says things like "帮我读一下这篇文章" ("read this article for me"), "抓取这个网页" ("scrape this web page"), "提取正文" ("extract the main text"), or "read this page for me".

Web Content Fetcher

Given a URL, return its main content as clean Markdown — headings, links, images, lists, code blocks all preserved.

Extraction Strategy

Always try one method per URL — don't cascade blindly. Pick the right one upfront.

```
URL
├─ 1. Scrapling script (preferred)
│     Run fetch.py — check the domain routing table to decide fast vs --stealth.
│     Works for most sites. Returns clean Markdown directly.
│
└─ 2. Jina Reader (fallback — only if Scrapling fails or dependencies are not installed)
      web_fetch("https://r.jina.ai/<url>")
      Free tier: 200 req/day. Fast (~1-2s), good Markdown output.
      Does NOT work for: WeChat (403), some Chinese platforms.
```
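The one-method-per-URL policy above can be sketched as a small dispatcher. This is an illustrative sketch, not part of the skill itself: `fetch_markdown` and `JINA_BLOCKED` are hypothetical names, and the real agent calls `web_fetch` rather than `urllib`.

```python
import subprocess
import urllib.request

# Jina Reader returns 403 for WeChat; treat it as blocked (hypothetical list).
JINA_BLOCKED = ("mp.weixin.qq.com",)

def fetch_markdown(url: str, skill_dir: str) -> str:
    """Pick one extraction method up front; don't cascade blindly."""
    try:
        # 1. Preferred: the Scrapling script (auto fast -> stealth fallback).
        result = subprocess.run(
            ["python3", f"{skill_dir}/scripts/fetch.py", url],
            capture_output=True, text=True, check=True,
        )
        return result.stdout
    except (subprocess.CalledProcessError, FileNotFoundError):
        # 2. Fallback: Jina Reader, only if Scrapling fails or is missing.
        if any(host in url for host in JINA_BLOCKED):
            raise RuntimeError("Jina Reader cannot fetch this domain")
        with urllib.request.urlopen(f"https://r.jina.ai/{url}") as resp:
            return resp.read().decode("utf-8")
```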

Scrapling script

```bash
python3 <SKILL_DIR>/scripts/fetch.py "<url>" [max_chars] [--stealth]
```

`<SKILL_DIR>` is the directory where this SKILL.md lives. Resolve it before calling the script.

The script has two modes built in:

- Default (fast): HTTP fetch, ~1-3s, works for most sites
- `--stealth`: headless browser, ~5-15s, for JS-rendered or anti-scraping sites

When run without `--stealth`, the script automatically falls back to stealth if the fast result has too little content. So you rarely need to specify `--stealth` manually; the only reason to force it is when you already know the site needs it (see the routing table), which saves the initial fast attempt.
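The automatic fast→stealth fallback described above amounts to retrying with the heavier mode only when the fast result is too short. A minimal sketch, assuming a hypothetical `MIN_CHARS` threshold and injected fetcher functions (the real script chooses its own threshold internally):

```python
MIN_CHARS = 200  # hypothetical threshold; the real script picks its own

def fetch_with_fallback(url, fast_fetch, stealth_fetch):
    """Try the fast HTTP mode first; use the headless browser only if needed."""
    markdown = fast_fetch(url)              # ~1-3s, plain HTTP fetch
    if len(markdown.strip()) < MIN_CHARS:   # too little content: likely JS-rendered
        markdown = stealth_fetch(url)       # slower (~5-15s) but renders JS
    return markdown
```

Passing the fetchers as parameters keeps the sketch self-contained; in the real script both modes live inside `fetch.py`.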

Domain Routing

Use this table to pick the right mode on the first call:

| Domain | Command | Why |
|--------|---------|-----|
| mp.weixin.qq.com | `fetch.py <url> --stealth` | JS-rendered content |
| zhuanlan.zhihu.com | `fetch.py <url> --stealth` | Anti-scraping + JS |
| juejin.cn | `fetch.py <url> --stealth` | JS-rendered SPA |
| sspai.com | `fetch.py <url>` | Static HTML |
| blog.csdn.net | `fetch.py <url>` | Static HTML |
| ruanyifeng.com | `fetch.py <url>` | Static blog |
| openai.com | `fetch.py <url>` | Static HTML |
| blog.google | `fetch.py <url>` | Static HTML |
| Everything else | `fetch.py <url>` | Auto-fallback handles it |
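The routing table above reduces to a hostname lookup before building the command. A sketch under the assumption that the stealth set mirrors the table (`build_command` and `STEALTH_DOMAINS` are illustrative names, not part of the skill):

```python
from urllib.parse import urlparse

# Domains the routing table marks as needing the headless-browser mode.
STEALTH_DOMAINS = {"mp.weixin.qq.com", "zhuanlan.zhihu.com", "juejin.cn"}

def build_command(url: str, skill_dir: str) -> list:
    """Return the fetch.py invocation, forcing --stealth for known-hard domains."""
    cmd = ["python3", f"{skill_dir}/scripts/fetch.py", url]
    if urlparse(url).hostname in STEALTH_DOMAINS:
        cmd.append("--stealth")  # skip the doomed fast attempt
    return cmd
```

Everything else runs without the flag and relies on the script's auto-fallback.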


Health Signals

- Maintenance: committed 7d ago (Active)
- Adoption: 492 ★ on GitHub (Growing)
- Docs: README + description (Well-documented)

GitHub Signals

- Stars: 492
- Forks: 60
- Issues: 2
- Updated: 7d ago
- License: none

Works With

Claude Code