How do I install speech?

speech is a Skill hosted on GitHub at https://github.com/openai/skills. Visit the ImAiFox page at https://imaifox.com/boosters/openai-skills-speech for the AI-ready install prompt you can copy directly into Claude Code, Cursor, or Windsurf.

How popular is speech?

speech has 22,593 GitHub stars and 1,535 forks. It is actively maintained; the most recent commit was about a month ago.

Is speech free to use?

Yes, speech is open source and free to use. The source code is publicly available on GitHub at https://github.com/openai/skills.

Skill

speech

Name: speech
Author: openai

AI Summary

Generates spoken audio from text using OpenAI's TTS API, supporting single clips, batch operations, and accessibility reads. Developers building voiceovers, IVR systems, or accessible content will find this directly useful.

Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "speech" skill in my project.

Please run this command in my terminal:
# Install skill into your project (16 files)
mkdir -p .claude/skills/speech/agents .claude/skills/speech/assets .claude/skills/speech/references .claude/skills/speech/scripts && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/SKILL.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/SKILL.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/LICENSE.txt "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/LICENSE.txt" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/agents/openai.yaml "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/agents/openai.yaml" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/assets/speech-small.svg "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/assets/speech-small.svg" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/assets/speech.png "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/assets/speech.png" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/accessibility.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/accessibility.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/audio-api.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/audio-api.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/cli.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/cli.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/codex-network.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/codex-network.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/ivr.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/ivr.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/narration.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/narration.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/prompting.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/prompting.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/sample-prompts.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/sample-prompts.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/voice-directions.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/voice-directions.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/references/voiceover.md "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/references/voiceover.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/speech/scripts/text_to_speech.py "https://raw.githubusercontent.com/openai/skills/main/skills/.curated/speech/scripts/text_to_speech.py"

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
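After running the command, it can help to confirm that all 16 files actually arrived, since a failed download can leave a file missing or empty. The following is a minimal sketch in Python, assuming the skill was installed under .claude/skills/speech as in the install command above. The helper name missing_skill_files is hypothetical and not part of the skill.

```python
from pathlib import Path

# Relative paths taken from the install command (16 files).
EXPECTED_FILES = [
    "SKILL.md",
    "LICENSE.txt",
    "agents/openai.yaml",
    "assets/speech-small.svg",
    "assets/speech.png",
    "references/accessibility.md",
    "references/audio-api.md",
    "references/cli.md",
    "references/codex-network.md",
    "references/ivr.md",
    "references/narration.md",
    "references/prompting.md",
    "references/sample-prompts.md",
    "references/voice-directions.md",
    "references/voiceover.md",
    "scripts/text_to_speech.py",
]


def missing_skill_files(root=".claude/skills/speech"):
    """Return the expected files that are absent or empty under root."""
    base = Path(root)
    missing = []
    for rel in EXPECTED_FILES:
        path = base / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    gaps = missing_skill_files()
    if gaps:
        print("Missing or empty files:")
        for rel in gaps:
            print("  " + rel)
    else:
        print("All", len(EXPECTED_FILES), "files are present.")
```

Run it from the project root. Any path it prints can be re-fetched by rerunning the matching curl line.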

Tags: api, ai, prompt

Description

Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.

Speech Generation Skill

Generate spoken audio for the current project (narration, product demo voiceover, IVR prompts, accessibility reads). Defaults to gpt-4o-mini-tts-2025-12-15 and built-in voices, and prefers the bundled CLI for deterministic, reproducible runs.
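For orientation, here is a rough sketch of what a single-clip request to the OpenAI Audio API speech endpoint might look like using only the Python standard library. It is not the bundled CLI and does not reflect how scripts/text_to_speech.py is written. The function names, the alloy voice, and the mp3 output format are assumptions chosen for illustration; the model name is the default quoted above. The example only prints the request body, and the live call in synthesize requires OPENAI_API_KEY.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/audio/speech"
DEFAULT_MODEL = "gpt-4o-mini-tts-2025-12-15"  # default named by this skill


def build_speech_payload(text, voice="alloy", instructions=None,
                         response_format="mp3", model=DEFAULT_MODEL):
    """Build the JSON request body; the input text is passed through verbatim."""
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }
    if instructions:
        payload["instructions"] = instructions
    return payload


def synthesize(payload, out_path):
    """POST the payload and write the returned audio bytes to out_path."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required for live calls")
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())


if __name__ == "__main__":
    # Dry run: show the request body without contacting the API.
    body = build_speech_payload("Welcome to the demo.",
                                instructions="Warm, steady pace.")
    print(json.dumps(body, indent=2))
```

Separating payload construction from the network call keeps the first part testable offline and makes it easy to record the exact text and instructions used for a clip.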

When to use

• Generate a single spoken clip from text
• Generate a batch of prompts (many lines, many files)

Decision tree (single vs batch)

• If the user provides multiple lines/prompts or wants many outputs -> batch
• Else -> single
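The decision tree above can be sketched as a small function. This is an illustrative sketch, not code from the skill: the name choose_mode is hypothetical, and treating each non-empty line of a multi-line string as its own prompt is an assumption (a multi-line narration meant as one clip would need to be passed as a single-item list instead).

```python
def choose_mode(prompts):
    """Return "batch" for more than one non-empty prompt, else "single"."""
    if isinstance(prompts, str):
        # Assumption: a multi-line string holds one prompt per line.
        prompts = prompts.splitlines()
    non_empty = [p for p in prompts if p.strip()]
    if not non_empty:
        raise ValueError("at least one non-empty prompt is required")
    return "batch" if len(non_empty) > 1 else "single"


if __name__ == "__main__":
    print(choose_mode("Welcome to the demo."))
    print(choose_mode(["Press 1 for sales.", "Press 2 for support."]))
```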

Workflow

• Decide intent: single vs batch (see decision tree above).
• Collect inputs up front: exact text (verbatim), desired voice, delivery style, format, and any constraints.
• If batch: write a temporary JSONL under tmp/ (one job per line), run once, then delete the JSONL.
• Augment instructions into a short labeled spec without rewriting the input text.
• Run the bundled CLI (scripts/text_to_speech.py) with sensible defaults (see references/cli.md).
• For important clips, validate: intelligibility, pacing, pronunciation, and adherence to constraints.
• Iterate with a single targeted change (voice, speed, or instructions), then re-check.
• Save/return final outputs and note the final text + instructions + flags used.
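The batch and labeled-spec steps of the workflow above might look roughly like the sketch below. It writes one job per line to a temporary JSONL under tmp/, hands the file to a runner once, and deletes it afterwards even if the run fails. The JSON field names (input, instructions, out), the spec labels, and the function names are all hypothetical; the real job schema and CLI flags are documented in references/cli.md and are not reproduced here. The runner is passed in as a callable so that no CLI flags need to be guessed.

```python
import json
import tempfile
from pathlib import Path


def build_instructions(voice_affect, tone, pacing):
    """Join delivery directions into a short labeled spec (labels are hypothetical)."""
    return "Voice Affect: {}. Tone: {}. Pacing: {}.".format(voice_affect, tone, pacing)


def run_batch(texts, instructions, run_job_file, tmp_dir="tmp"):
    """Write one job per line to a temporary JSONL, run once, then delete it."""
    Path(tmp_dir).mkdir(parents=True, exist_ok=True)
    handle = tempfile.NamedTemporaryFile(
        "w", suffix=".jsonl", dir=tmp_dir, delete=False, encoding="utf-8"
    )
    job_path = Path(handle.name)
    try:
        with handle:
            for index, text in enumerate(texts, start=1):
                # Hypothetical job fields; the input text is stored verbatim.
                job = {
                    "input": text,
                    "instructions": instructions,
                    "out": "clip_{:03d}.mp3".format(index),
                }
                handle.write(json.dumps(job, ensure_ascii=False) + "\n")
        # The file is closed here, so the runner can read it on any platform.
        return run_job_file(job_path)
    finally:
        job_path.unlink(missing_ok=True)  # requires Python 3.8+


if __name__ == "__main__":
    def show_jobs(path):
        # Stand-in runner: prints the jobs instead of invoking the CLI.
        print(path.read_text(encoding="utf-8"), end="")

    spec = build_instructions("calm", "friendly", "steady")
    run_batch(["Press 1 for sales.", "Press 2 for support."], spec, show_jobs)
```

In real use, run_job_file would invoke scripts/text_to_speech.py with whatever batch options references/cli.md describes.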

Health Signals

Maintenance: Committed 1mo ago (Active)
Adoption: 1K+ stars on GitHub (22.6k ★, Popular)
Docs: README + description (Well-documented)

GitHub Signals

Stars: 22.6k
Forks: 1.5k
Issues: 277
Updated: 1mo ago

No License

Works With

Claude Code

Related Skills

• Openclaw MCP Server (MCP Server)
• receiving-code-review (Skill)
• dispatching-parallel-agents (Skill)
• using-git-worktrees (Skill)