Install
Copy this and paste it into Claude Code, Cursor, or any AI assistant:
I want to install the "Paper Full-text Harvest" skill in my project. Please run this command in my terminal:

    # Install skill into your project
    mkdir -p .claude/skills/paper-fulltext-harvest && curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/paper-fulltext-harvest/SKILL.md "https://raw.githubusercontent.com/jxtse/scientific-research-skills/main/skills/paper-fulltext-harvest/SKILL.md"

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
Description
Skills and tools for scientific research with LLM agents such as Claude Code and Openclaw.
Paper Full-text Harvest
Pipeline for downloading academic paper full-text at scale. Handles the three classes of sources that exist in 2026:

• Publisher TDM APIs (Elsevier / Wiley / Springer): for paywalled content where the institution has a subscription
• OA aggregators (Unpaywall / OpenAlex / Crossref): for Open Access copies regardless of publisher
• Browser fallback (logged-in user profile): for paywalled publishers without a TDM API (ACS / RSC / IEEE / AIP / IOP / APS / T&F / many CN journals)

The publisher router (auto_paper_download/publishers.py) recognises **25 DOI prefixes** across 19 families, each annotated with the right downstream path (TDM client / OA aggregator / browser fallback) and a support tier. The router is shared with the standalone auto-paper-harvester CLI; see SUPPORTED_PUBLISHERS.md there for the full per-publisher table.
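To make the OA aggregator path concrete, here is a minimal sketch of an Unpaywall lookup. It is not taken from the skill's code: the helper names `unpaywall_url` and `pick_pdf_url` are hypothetical. It assumes the public Unpaywall v2 REST endpoint and its `best_oa_location`, `oa_locations` and `url_for_pdf` response fields, and it leaves the HTTP request itself out.

```python
from typing import Optional
from urllib.parse import quote

# Public Unpaywall v2 endpoint (assumed here; the skill may wrap it differently).
UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL for one DOI. Unpaywall requires a contact email."""
    return f"{UNPAYWALL_BASE}{quote(doi.strip(), safe='/')}?email={quote(email)}"


def pick_pdf_url(record: dict) -> Optional[str]:
    """Return the first direct PDF link in an Unpaywall record, or None.

    Prefers best_oa_location, then scans the remaining oa_locations.
    """
    best = record.get("best_oa_location") or {}
    if best.get("url_for_pdf"):
        return best["url_for_pdf"]
    for location in record.get("oa_locations") or []:
        if location.get("url_for_pdf"):
            return location["url_for_pdf"]
    return None
```

Fetch `unpaywall_url(...)` with any HTTP client and pass the decoded JSON to `pick_pdf_url`. A `None` result means no OA PDF was found, which is the point at which a pipeline like this one would fall through to the next source.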
Decision tree
```
Have a DOI list?
├── DOIs from Elsevier (10.1016, 10.1006, 10.1011)
│   └── Use ElsevierClient (TDM XML API) → §1
├── DOIs from Wiley (10.1002, 10.1111)
│   └── Use WileyClient (TDM PDF API) → §1
├── DOIs from Springer/Nature (10.1007, 10.1038, 10.1186, 10.1147)
│   ├── OA papers → SpringerClient OA API → §1
│   └── Subscription papers → fall through to OA/browser
├── Browser-only publishers without TDM API
│   (10.1021 ACS, 10.1039 RSC, 10.1126 Science, 10.1109 IEEE,
│    10.1063 AIP, 10.1088/10.1143 IOP, 10.1103 APS, 10.1146 Annual Reviews,
│    10.1080 T&F, 10.1116 AVS, 10.1149 ECS, 10.1364 Optica, 10.3938 KPS)
│   ├── Try OA first via Unpaywall/OpenAlex → §2
│   └── Last resort: browser fallback → §3
├── OA-leaning publishers (10.1073 PNAS, 10.3762 Beilstein)
│   └── OpenAlex/Unpaywall usually works → §2
└── Mixed list (typical case)
    └── Use the orchestrated CLI (handles all of the above) → §0
```
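The decision tree above can be read as a prefix lookup. The sketch below is illustrative only and is not the contents of auto_paper_download/publishers.py: the route labels, the function names `doi_prefix`, `route` and `plan`, and the default for unknown prefixes are all assumptions. Only the prefixes themselves are transcribed from the tree.

```python
from collections import defaultdict

# Route label -> DOI prefixes, transcribed from the decision tree.
# The labels are hypothetical names for this example.
_GROUPS = {
    "elsevier_tdm": ["10.1016", "10.1006", "10.1011"],
    "wiley_tdm": ["10.1002", "10.1111"],
    "springer_oa": ["10.1007", "10.1038", "10.1186", "10.1147"],
    "oa_then_browser": [
        "10.1021", "10.1039", "10.1126", "10.1109", "10.1063", "10.1088",
        "10.1143", "10.1103", "10.1146", "10.1080", "10.1116", "10.1149",
        "10.1364", "10.3938",
    ],
    "oa_aggregator": ["10.1073", "10.3762"],
}
ROUTES = {prefix: label for label, prefixes in _GROUPS.items() for prefix in prefixes}

# Assumption: unknown publishers get the most general path.
DEFAULT_ROUTE = "oa_then_browser"


def doi_prefix(doi: str) -> str:
    """Return the registrant prefix (the part before the first '/')."""
    doi = doi.strip().lower()
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(lead):
            doi = doi[len(lead):]
    return doi.split("/", 1)[0]


def route(doi: str) -> str:
    """Map one DOI to a route label."""
    return ROUTES.get(doi_prefix(doi), DEFAULT_ROUTE)


def plan(dois: list[str]) -> dict[str, list[str]]:
    """Group a mixed DOI list by route so each batch goes to one client."""
    groups: dict[str, list[str]] = defaultdict(list)
    for doi in dois:
        groups[route(doi)].append(doi.strip())
    return dict(groups)
```

Grouping before downloading keeps each batch on a single client. Springer subscription papers would still need a second pass through the OA or browser path, as the tree notes.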
§0. Quick start (orchestrated CLI)
For a typical mixed list of DOIs from a Web of Science / Scopus export:

```bash
# Setup once
cp scripts/.env.example .env
```