Initial commit: workspace setup with skills, memory, config
104
skills/kimi-tts-custom/SKILL.md
Normal file
@@ -0,0 +1,104 @@
---
name: kimi-tts-custom
description: Custom TTS handler for Kimi that generates voice messages with custom filenames (Kimi-XXX.ogg) and optionally suppresses text output. Use when user wants voice-only responses with branded filenames instead of default OpenClaw TTS behavior.
---

# Kimi TTS Custom

## Overview

Custom TTS wrapper for a local Kokoro server that:
- Generates voice files with custom filenames (Kimi-XXX.ogg)
- Can send voice only (no text transcript)
- Uses the local Kokoro TTS endpoint at 10.0.0.228:8880

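As a minimal sketch of the naming scheme, the snippet below builds a `Kimi-YYYYMMDD-HHMMSS.ogg` name the same way the scripts in this commit do (a timestamp formatted with `strftime`). The helper name `kimi_filename` and its `prefix` parameter are hypothetical and exist only for illustration.

```python
from datetime import datetime


def kimi_filename(now: datetime, prefix: str = "Kimi") -> str:
    """Build a name of the form Kimi-YYYYMMDD-HHMMSS.ogg (hypothetical helper)."""
    return f"{prefix}-{now.strftime('%Y%m%d-%H%M%S')}.ogg"


# A fixed timestamp keeps the example deterministic.
print(kimi_filename(datetime(2026, 2, 5, 18, 50, 16)))  # Kimi-20260205-185016.ogg
```

The timestamp has one-second resolution, so two files generated within the same second would get the same name.
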
## When to Use

- User wants voice responses with "Kimi-" prefixed filenames
- User wants voice-only mode (no text displayed)
- Default TTS behavior needs customization

## Voice-Only Mode

**⚠️ CRITICAL: Generation ≠ Delivery**

Generating a voice file does NOT send it. You must also deliver it with a proper delivery method:

### Correct Way: Use voice_reply.py
```bash
python3 /root/.openclaw/workspace/skills/kimi-tts-custom/scripts/voice_reply.py "1544075739" "Your message here"
```

This script:
1. Generates a voice file named Kimi-XXX.ogg
2. Sends it to Telegram immediately via the OpenClaw CLI
3. Deletes the temp file after a successful send

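The three steps can be sketched as a single function. This is an illustration of the control flow only, not the script itself: `generate` and `send` are hypothetical callables standing in for the Kokoro request and the OpenClaw CLI call, so the ordering can be read (and tested) without a TTS server or a Telegram chat.

```python
import os
from typing import Callable


def voice_reply(chat_id: str, text: str,
                generate: Callable[[str], str],
                send: Callable[[str, str], bool],
                keep_file: bool = False) -> bool:
    """Generate audio, deliver it, then remove the temp file on success."""
    path = generate(text)            # step 1: create the .ogg file, get its path
    delivered = send(chat_id, path)  # step 2: hand the file to the sender
    if delivered and not keep_file:
        os.remove(path)              # step 3: clean up only after delivery
    return delivered
```

Deleting the file only after a successful send leaves it on disk for a retry when delivery fails.
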
### Wrong Way: Text Reference
❌ Do NOT do this:
```
[Voice message attached: Kimi-20260205-185016.ogg]
```
This does not attach the audio file, so the user receives no voice message.

### Alternative: Manual Send (if needed)
If you already generated the file:
```bash
# Use OpenClaw CLI
openclaw message send --channel telegram --target 1544075739 --media /path/to/Kimi-XXX.ogg
```

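The same manual send can be driven from Python. The sketch below only assembles the argument list shown in the command above and runs it with `subprocess`. The helper names `build_send_command` and `send_media` are hypothetical, and treating a zero exit status as success is an assumption about the CLI.

```python
import subprocess
from typing import List


def build_send_command(chat_id: str, media_path: str) -> List[str]:
    """Return the argv for the manual send shown above (hypothetical helper)."""
    return [
        "openclaw", "message", "send",
        "--channel", "telegram",
        "--target", str(chat_id),
        "--media", media_path,
    ]


def send_media(chat_id: str, media_path: str) -> bool:
    """Run the command; True if it exits with status 0 (assumed to mean success)."""
    try:
        result = subprocess.run(build_send_command(chat_id, media_path),
                                capture_output=True, text=True)
    except OSError:
        # The openclaw executable is missing or cannot be started.
        return False
    return result.returncode == 0
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the file path contains spaces.
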
## Configuration

Set in `messages.tts.custom`:
```json
{
  "messages": {
    "tts": {
      "custom": {
        "enabled": true,
        "voiceOnly": true,
        "filenamePrefix": "Kimi",
        "kokoroUrl": "http://10.0.0.228:8880/v1/audio/speech",
        "voice": "af_bella"
      }
    }
  }
}
```

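The scripts in this commit hard-code the Kokoro URL and voice rather than reading this block, so the following is only a sketch of how a caller might look the settings up. It assumes the configuration has already been parsed into a Python dict. The helper `tts_custom_settings` and the fallback values in `DEFAULTS` are assumptions for illustration, not OpenClaw behavior.

```python
from typing import Any, Dict

# Fallback values are assumptions for this sketch, mirroring the example above.
DEFAULTS: Dict[str, Any] = {
    "enabled": False,
    "voiceOnly": False,
    "filenamePrefix": "Kimi",
    "kokoroUrl": "http://10.0.0.228:8880/v1/audio/speech",
    "voice": "af_bella",
}


def tts_custom_settings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return messages.tts.custom merged over DEFAULTS (hypothetical helper)."""
    custom = config.get("messages", {}).get("tts", {}).get("custom", {})
    return {**DEFAULTS, **custom}


example = {"messages": {"tts": {"custom": {"enabled": True, "voiceOnly": True}}}}
settings = tts_custom_settings(example)
print(settings["enabled"], settings["voice"])  # True af_bella
```
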
## Scripts

### scripts/generate_voice.py
Generates a voice file with a custom filename and prints its path for sending.

**⚠️ Note**: This only creates the file. It does NOT send it to Telegram.

Usage:
```bash
python3 generate_voice.py "Text to speak" [--voice af_bella] [--output-dir /tmp] [--model tts-1] [--speed 1.3]
```

Returns: JSON on stdout with `filepath`, `filename`, `size_bytes`, `estimated_duration_seconds`, `voice`, `speed`, and `text`. On failure, it prints a JSON object with an `error` field to stderr and exits with status 1.

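A caller that runs `generate_voice.py` needs to read that JSON back. The sketch below parses the success output using the field names the script in this commit prints (`filepath`, `filename`, `estimated_duration_seconds`, and so on). The helper `parse_generate_output` is hypothetical, and the sample values are made up for illustration.

```python
import json
from typing import Any, Dict


def parse_generate_output(stdout: str) -> Dict[str, Any]:
    """Parse the JSON line printed by generate_voice.py on success."""
    result = json.loads(stdout)  # raises json.JSONDecodeError on empty or invalid output
    if not result.get("filepath"):
        raise ValueError("generate_voice.py did not report a filepath")
    return result


# Illustrative sample only; real values come from the script's stdout.
sample = json.dumps({
    "filepath": "/tmp/Kimi-20260205-185016.ogg",
    "filename": "Kimi-20260205-185016.ogg",
    "size_bytes": 20480,
    "estimated_duration_seconds": 2.5,
    "voice": "af_bella",
    "speed": 1.3,
    "text": "Hello",
})
info = parse_generate_output(sample)
print(info["filename"])  # Kimi-20260205-185016.ogg
```

Because the script writes its error JSON to stderr, stdout is empty on failure and `json.loads` raises, so a caller should also check the exit status.
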
### scripts/voice_reply.py (RECOMMENDED)
Combined script: generates the voice file and sends it via Telegram in one command.

**This is the correct way to send voice replies.**

Usage:
```bash
python3 voice_reply.py "1544075739" "Your message here" [--voice af_bella] [--speed 1.3] [--bot-token TOKEN] [--keep-file]
```

This generates the voice file and sends it immediately (voice-only, no text).

## Key Rule

| Task | Use |
|------|-----|
| Generate voice file only | `generate_voice.py` |
| Send voice reply to user | `voice_reply.py` |
| Text reference to file | ❌ Does NOT work |

**Remember**: Generation and delivery are separate steps. Use `voice_reply.py` for the complete voice reply workflow.
86
skills/kimi-tts-custom/scripts/generate_voice.py
Executable file
@@ -0,0 +1,86 @@
#!/usr/bin/env python3
"""
Generate voice with custom Kimi-XXX filename using local Kokoro TTS
Usage: generate_voice.py "Text to speak" [--voice af_bella] [--output-dir /tmp] [--model tts-1] [--speed 1.3]
"""

import argparse
import json
import os
import sys
import urllib.request
from datetime import datetime

def generate_voice(text, voice="af_bella", output_dir="/tmp", model="tts-1", speed=1.3):
    """Generate voice file with Kimi-XXX filename"""

    # Build a timestamped filename: Kimi-YYYYMMDD-HHMMSS.ogg
    # (one-second resolution, so two calls in the same second share a name)
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    filename = f"Kimi-{timestamp}.ogg"
    filepath = os.path.join(output_dir, filename)

    # Call local Kokoro TTS
    tts_url = "http://10.0.0.228:8880/v1/audio/speech"

    data = json.dumps({
        "model": model,
        "input": text,
        "voice": voice,
        "speed": speed
    }).encode()

    req = urllib.request.Request(
        tts_url,
        data=data,
        headers={"Content-Type": "application/json"}
    )

    try:
        with urllib.request.urlopen(req) as response:
            audio_data = response.read()

        # Save to file
        with open(filepath, "wb") as f:
            f.write(audio_data)

        # Estimate duration (rough: ~150 words per minute at normal speed, adjusted for speed)
        estimated_duration = max(1, len(text.split()) / 150 * 60 / speed)

        result = {
            "filepath": filepath,
            "filename": filename,
            "size_bytes": len(audio_data),
            "estimated_duration_seconds": round(estimated_duration, 1),
            "voice": voice,
            "speed": speed,
            "text": text
        }

        print(json.dumps(result))
        return result

    except Exception as e:
        # Report the failure as JSON on stderr and exit non-zero
        error_result = {
            "error": str(e),
            "filepath": None,
            "filename": None
        }
        print(json.dumps(error_result), file=sys.stderr)
        sys.exit(1)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generate voice with Kimi-XXX filename")
    parser.add_argument("text", help="Text to convert to speech")
    parser.add_argument("--voice", default="af_bella",
                        help="Voice ID (default: af_bella)")
    parser.add_argument("--output-dir", default="/tmp",
                        help="Output directory (default: /tmp)")
    parser.add_argument("--model", default="tts-1",
                        help="TTS model (default: tts-1)")
    parser.add_argument("--speed", type=float, default=1.3,
                        help="Speech speed multiplier (default: 1.3)")

    args = parser.parse_args()

    generate_voice(args.text, args.voice, args.output_dir, args.model, args.speed)
119
skills/kimi-tts-custom/scripts/voice_reply.py
Executable file
@@ -0,0 +1,119 @@
#!/usr/bin/env python3
"""
Generate voice with Kimi-XXX filename and send via Telegram (voice-only, no text)
Usage: voice_reply.py <chat_id> "Text to speak" [--voice af_bella] [--speed 1.3] [--bot-token TOKEN] [--keep-file]
"""

import argparse
import json
import os
import sys
import subprocess
import urllib.request
from datetime import datetime

def generate_voice(text, voice="af_bella", output_dir="/tmp", model="tts-1", speed=1.3):
    """Generate voice file with Kimi-XXX filename"""

    # Generate unique filename: Kimi-YYYYMMDD-HHMMSS.ogg
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    filename = f"Kimi-{timestamp}.ogg"
    filepath = os.path.join(output_dir, filename)

    # Call local Kokoro TTS
    tts_url = "http://10.0.0.228:8880/v1/audio/speech"

    data = json.dumps({
        "model": model,
        "input": text,
        "voice": voice,
        "speed": speed
    }).encode()

    req = urllib.request.Request(
        tts_url,
        data=data,
        headers={"Content-Type": "application/json"}
    )

    try:
        with urllib.request.urlopen(req) as response:
            audio_data = response.read()

        with open(filepath, "wb") as f:
            f.write(audio_data)

        return filepath, filename

    except Exception as e:
        print(f"Error generating voice: {e}", file=sys.stderr)
        sys.exit(1)

def send_voice_telegram(chat_id, audio_path, bot_token=None):
    """Send voice message via Telegram"""

    # Get bot token from env or config
    if not bot_token:
        bot_token = os.environ.get("TELEGRAM_BOT_TOKEN")

    if not bot_token:
        # Try to get from openclaw config
        try:
            result = subprocess.run(
                ["openclaw", "config", "get", "channels.telegram.botToken"],
                capture_output=True, text=True
            )
            # Only trust the output if the command succeeded
            if result.returncode == 0:
                bot_token = result.stdout.strip()
        except Exception:
            # openclaw missing or not runnable; fall through to the check below
            pass

    if not bot_token:
        print("Error: No bot token found. Set TELEGRAM_BOT_TOKEN or provide --bot-token", file=sys.stderr)
        sys.exit(1)

    # Use openclaw CLI to send (the token above is only checked for presence;
    # it is not passed to this command)
    cmd = [
        "openclaw", "message", "send",
        "--channel", "telegram",
        "--target", chat_id,
        "--media", audio_path
    ]

    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            print(f"✅ Voice sent successfully to {chat_id}")
            return True
        else:
            print(f"Error sending voice: {result.stderr}", file=sys.stderr)
            return False
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        return False

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generate and send voice-only reply")
    parser.add_argument("chat_id", help="Telegram chat ID to send to")
    parser.add_argument("text", help="Text to convert to speech")
    parser.add_argument("--voice", default="af_bella", help="Voice ID (default: af_bella)")
    parser.add_argument("--speed", type=float, default=1.3, help="Speech speed multiplier (default: 1.3)")
    parser.add_argument("--bot-token", help="Telegram bot token (or set TELEGRAM_BOT_TOKEN)")
    parser.add_argument("--keep-file", action="store_true", help="Don't delete temp file after sending")

    args = parser.parse_args()

    print(f"Generating voice for: {args.text[:50]}...")
    filepath, filename = generate_voice(args.text, args.voice, speed=args.speed)
    print(f"Generated: {filename}")

    print(f"Sending to {args.chat_id}...")
    success = send_voice_telegram(args.chat_id, filepath, args.bot_token)

    # The temp file is removed only after a successful send
    if success and not args.keep_file:
        os.remove(filepath)
        print("Cleaned up temp file")
    elif success:
        print(f"Kept file at: {filepath}")

    sys.exit(0 if success else 1)