AI Summary
A multi-tenant voice assistant platform that handles speech-to-text, RAG-based retrieval, and text-to-speech synthesis to answer company-specific questions. Useful for developers building customer-facing IVR systems or voice interfaces.
Install
Copy this and paste it into Claude Code, Cursor, or any AI assistant:
I want to install the "voice-assistant-platform" skill in my project. Please run this command in my terminal:
# Install skill into the correct directory
mkdir -p .claude/skills/customer-service-assistant && curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/customer-service-assistant/SKILL.md "https://raw.githubusercontent.com/papdawin/customer-service-assistant/master/skill.md"
Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
Description
Multi-tenant, callable voice assistant platform for company-specific information.
Purpose
Build and operate a callable, company-facing voice assistant that answers questions about a business (location, opening hours, contact methods, reachability, services, and policies). The platform is designed for multi-tenant deployments: each company maintains its own knowledge base, while core speech and language services are shared to keep operations efficient.
What This System Does
• Accepts live or recorded audio from callers.
• Detects speech segments to avoid sending silence and noise downstream.
• Transcribes speech into text with STT.
• Retrieves company-specific knowledge with RAG.
• Generates a concise, accurate answer with the shared LLM.
• Synthesizes spoken replies with TTS.
• Returns audio and text results with timing data for monitoring and UX feedback.
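The stages listed above can be sketched as a single function that chains them and records per-stage timing. This is only an illustration, not the platform's actual backend: `run_pipeline`, `PipelineResult`, and the stage callables are hypothetical names, and each stage is passed in as a plain function so that real STT, RAG, LLM, and TTS clients could be swapped in.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineResult:
    """Text and audio results plus per-stage timings (hypothetical shape)."""
    transcript: str
    answer: str
    audio: bytes
    timings_ms: Dict[str, float] = field(default_factory=dict)


def run_pipeline(
    audio: bytes,
    detect_speech: Callable[[bytes], bytes],
    transcribe: Callable[[bytes], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    synthesize: Callable[[str], bytes],
) -> PipelineResult:
    """Run VAD -> STT -> RAG -> LLM -> TTS and time each stage."""
    timings: Dict[str, float] = {}

    def timed(name, fn, *args):
        # Measure wall-clock duration of one stage in milliseconds.
        start = time.perf_counter()
        out = fn(*args)
        timings[name] = (time.perf_counter() - start) * 1000.0
        return out

    speech = timed("vad", detect_speech, audio)
    transcript = timed("stt", transcribe, speech)
    passages = timed("rag", retrieve, transcript)
    answer = timed("llm", generate, transcript, passages)
    reply_audio = timed("tts", synthesize, answer)
    return PipelineResult(transcript, answer, reply_audio, timings)
```

Keeping the stages as injected callables means the same function can be exercised with stubs in tests and with real service clients in deployment.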
Multi-Tenant Model
• Each company is a tenant with its own data and retrieval index.
• The RAG service is instantiated per tenant and points at tenant data sources.
• STT, TTS, VAD, and the LLM are shared services across all tenants.
• The backend gateway routes requests to the correct tenant RAG based on deployment config.
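One way to picture the routing rule above is a lookup table built from deployment config: shared services have a single endpoint each, while the RAG endpoint is resolved per tenant. The sketch below rests on assumptions throughout: `TenantConfig`, `TenantRouter`, `SHARED_SERVICES`, and every URL are hypothetical and do not come from the project.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class TenantConfig:
    tenant_id: str
    rag_url: str  # hypothetical per-tenant RAG endpoint


# Shared services: one endpoint each, used by every tenant (hypothetical URLs).
SHARED_SERVICES: Dict[str, str] = {
    "stt": "http://stt:8000",
    "tts": "http://tts:8000",
    "vad": "http://vad:8000",
    "llm": "http://llm:8000",
}


class TenantRouter:
    """Resolve the RAG endpoint for a tenant from deployment config."""

    def __init__(self, tenants: Iterable[TenantConfig]) -> None:
        self._tenants = {t.tenant_id: t for t in tenants}

    def rag_url_for(self, tenant_id: str) -> str:
        try:
            return self._tenants[tenant_id].rag_url
        except KeyError:
            # Fail loudly rather than fall back to another tenant's data.
            raise LookupError(f"unknown tenant: {tenant_id!r}") from None
```

Raising on an unknown tenant, instead of falling back to a default, keeps one company's knowledge base from ever answering another company's callers.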
Core Services
• RAG: Per-tenant retrieval service. Companies can update their own information without changing core services.
• STT: Shared speech-to-text service (Whisper). Converts audio to text.
• TTS: Shared text-to-speech service (Piper). Converts responses into audio.
• VAD: Shared voice activity detection. Identifies speech segments to improve accuracy and efficiency.
• Backend: Orchestrates the pipeline and exposes HTTP + WebSocket APIs.
• Frontend: Serves the UI for testing or operational use.
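To make the VAD step concrete, here is a minimal energy-threshold detector. It is a stand-in technique chosen for illustration, not the VAD the platform actually ships, and `detect_speech_segments`, `frame_size`, and `threshold` are hypothetical names and defaults. It assumes audio samples normalized to the range -1.0 to 1.0.

```python
from typing import List, Sequence, Tuple


def detect_speech_segments(
    samples: Sequence[float],
    frame_size: int = 160,
    threshold: float = 0.01,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges whose mean frame energy exceeds threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for offset in range(0, len(samples), frame_size):
        frame = samples[offset:offset + frame_size]
        energy = sum(s * s for s in frame) / len(frame)
        if energy > threshold:
            if start is None:
                start = offset  # a speech segment begins at this frame
        elif start is not None:
            segments.append((start, offset))  # segment ends before this quiet frame
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # audio ended mid-speech
    return segments
```

Only the returned ranges would be forwarded to STT, which is how a VAD stage avoids sending silence and noise downstream.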