doc-cleaner

by notoriouslab

AI Summary

Convert documents (PDF, DOCX, XLSX, TXT) to clean, structured Markdown. Supported input formats: PDF (native, scanned, encrypted), DOCX, XLSX, XLS, CSV, TXT, MD. A flag (not named in this summary) prints a JSON summary to stdout after processing.

Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "doc-cleaner" skill in my project.

Please run this command in my terminal:
# Install skill into your project
mkdir -p .claude/skills/doc-cleaner && curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/doc-cleaner/SKILL.md "https://raw.githubusercontent.com/notoriouslab/doc-cleaner/main/SKILL.md"

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.

Description

Convert PDF, DOCX, XLSX, and text files to clean, structured Markdown. CJK-friendly, table-friendly, privacy-first.

When to use

• User asks to convert a document to Markdown
• User wants to extract text or tables from PDF/DOCX/XLSX files
• User wants to clean up bank statements or financial documents
• User asks to process a batch of documents in a directory

Convert a single file (no AI, fastest)

`python3 {baseDir}/cleaner.py --input "{{file_path}}" --ai none`

Convert a single file with AI structuring

`python3 {baseDir}/cleaner.py --input "{{file_path}}" --ai gemini`
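
The two commands above differ only in the `--ai` value, so a batch run over a directory can be sketched as a loop around the single-file command. This is a minimal sketch, not part of doc-cleaner itself: the function names (`build_command`, `find_documents`, `convert_all`) are hypothetical, `base_dir` stands in for `{baseDir}`, the extension list is taken from the formats named in the summary, and only the two `--ai` values shown above are accepted. Whether `cleaner.py` can take a directory directly is not stated here, so the sketch calls it once per file.

```python
import subprocess
from pathlib import Path

# Assumption: extensions derived from the formats listed in the summary.
SUPPORTED = {".pdf", ".docx", ".xlsx", ".xls", ".csv", ".txt", ".md"}


def build_command(base_dir, file_path, ai="none"):
    """Return the argument list for one cleaner.py run, using the flags shown above."""
    if ai not in ("none", "gemini"):
        # Only the two values that appear in this document are allowed here.
        raise ValueError(f"unsupported --ai value in this sketch: {ai!r}")
    return [
        "python3",
        str(Path(base_dir) / "cleaner.py"),
        "--input", str(file_path),
        "--ai", ai,
    ]


def find_documents(directory):
    """List files directly inside `directory` with a supported extension, sorted."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )


def convert_all(base_dir, directory, ai="none"):
    """Run cleaner.py once per document and return {path: exit code}."""
    results = {}
    for doc in find_documents(directory):
        completed = subprocess.run(build_command(base_dir, doc, ai))
        results[doc] = completed.returncode
    return results
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. A call such as `convert_all("/path/to/skill", "./statements", ai="none")` would then process every supported file in `./statements`; both paths are placeholders.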

GitHub Signals

Stars: 160
Forks: 16
Issues: 1
Updated: 2d ago
MIT License

Works With

Claude Code