Local STT (Qwen3-ASR), VLM (Gemma 4 26B-A4B), and TTS (Spark-TTS) running on Apple Silicon via MLX, with a bracket-tag action system for nod, shake, wiggle, dance, photo, and pre-recorded emotions.
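As a rough illustration of the bracket-tag idea, the sketch below separates action tags from the text that should be spoken. The tag syntax (`[nod]`, `[photo]`, and so on) and the helper names `extract_actions` and `strip_actions` are assumptions made for this example; only the action names come from the description above, and tags for pre-recorded emotions are not covered.

```shell
#!/bin/bash
# Hypothetical helpers: the "[tag]" syntax is an assumption, not the
# project's confirmed format. Action names are taken from the description.

# Print each recognized action tag on its own line, in order of appearance.
extract_actions() {
  printf '%s\n' "$1" | grep -oE '\[(nod|shake|wiggle|dance|photo)\]' | tr -d '[]'
}

# Print the reply with every bracket tag removed and whitespace collapsed,
# leaving only the text that should be sent to TTS.
strip_actions() {
  printf '%s\n' "$1" | sed -E 's/\[[^]]*\]//g; s/[[:space:]]+/ /g; s/^ //; s/ $//'
}
```

With a reply such as `Sure! [nod] Let me take a picture. [photo]`, the spoken part could then be produced with something like `./speak.sh "$(strip_actions "$reply")"`, while the output of `extract_actions` would drive the robot's movements separately.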
27 lines
926 B
Bash
Executable File
#!/bin/bash
# Make Reachy Mini speak a sentence via local Spark-TTS on the Mac,
# then play it back through the robot's speaker over SSH.
#
# Usage: ./speak.sh "Hello world"

# Stop on the first failed step so a failed TTS run does not replay a stale clip.
set -euo pipefail

TEXT="${1:-Hello, I am Reachy Mini.}"
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

# Generate audio with Spark-TTS (mlx-audio) on the Mac.
"$SCRIPT_DIR/.venv/bin/python" -m mlx_audio.tts.generate \
    --model mlx-community/Spark-TTS-0.5B-bf16 \
    --text "$TEXT" \
    --file_prefix /tmp/reachy_speech

# Copy to the robot and play it through its speaker.
sshpass -p 'root' scp -o StrictHostKeyChecking=no /tmp/reachy_speech_000.wav pollen@reachy-mini.local:/tmp/speech.wav
sshpass -p 'root' ssh -o StrictHostKeyChecking=no pollen@reachy-mini.local "/venvs/mini_daemon/bin/python -c \"
import time
from reachy_mini import ReachyMini
with ReachyMini() as mini:
    mini.media.play_sound('/tmp/speech.wav')
    # Fixed 10 s wait inside the session while the clip plays.
    time.sleep(10)
\""

echo "Spoke: $TEXT"
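The script waits a fixed 10 seconds after starting playback, which is too long for short sentences and may be too short for long ones. One possible refinement, sketched below, is to estimate the clip length from the WAV file before copying it. The helper name `wav_seconds` is hypothetical, and the sketch assumes a canonical 44-byte RIFF header and a little-endian host (true for Apple Silicon); whether mlx-audio writes exactly that header layout is not confirmed here. Extra header chunks would only make the estimate slightly longer.

```shell
#!/bin/bash
# Hypothetical helper: estimate the playback time of a PCM WAV file in
# whole seconds. Assumes a canonical 44-byte header and a little-endian host.

wav_seconds() {
  local file="$1" size byte_rate
  size=$(wc -c < "$file" | tr -d '[:space:]')
  # Bytes 28-31 of the header hold the byte rate (audio bytes per second).
  byte_rate=$(od -An -tu4 -j28 -N4 "$file" | tr -d '[:space:]')
  # Round up so the wait never cuts the clip short.
  echo $(( (size - 44 + byte_rate - 1) / byte_rate ))
}
```

The result of `wav_seconds /tmp/reachy_speech_000.wav` could be computed on the Mac and substituted for the literal `10` in the remote `time.sleep(...)` call, ideally with a second or so of margin added.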