home.social

#qwen3_tts — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #qwen3_tts, aggregated by home.social.

  1.

    🎛️ Tool — Voicebox

    Voicebox is an open-source, local-first desktop studio for voice cloning and speech synthesis. The project centers on providing on-device voice model downloads, rapid voice cloning from a few seconds of audio, and a digital-audio-workstation style editor for composing multi-voice projects. Key implementation details disclosed by the release: the primary synthesis backend is Qwen3-TTS, the application is built with Tauri (Rust) rather than Electron, and an MLX backend provides native Metal acceleration to speed inference on Apple Silicon.

    Architecture and capabilities
    • The application exposes an API-first surface for integration with other projects while retaining a full desktop UI for studio-style workflows.
    • Voice modeling: Qwen3-TTS is used for high-fidelity cloning; the project notes planned support for additional models such as XTTS and Bark.
    • Editing features: multi-track timeline, audio trimming, conversation mixing, and per-voice profiles derived from short audio samples.
    • Platform builds: packaged releases were provided for macOS (Apple Silicon and Intel) and Windows; Linux builds are noted as forthcoming but blocked by CI runner disk space constraints.

    Performance and privacy
    • Local-first design keeps models and audio data on-device, avoiding cloud-based storage or inference by default.
    • On Apple Silicon, the MLX backend with Metal acceleration is reported to run inference several times faster than the generic (non-MLX) path, which makes generation and cloning more responsive.
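
    The platform-dependent backend choice described above can be sketched roughly as follows. This is an illustration only, not Voicebox's actual code: the function name `pick_backend` and the backend labels "mlx" and "generic" are hypothetical, and the sketch assumes the accelerated path is chosen purely from the OS and CPU architecture reported by Python's standard `platform` module.

```python
import platform
from typing import Optional


def pick_backend(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Return a hypothetical backend label for the given (or current) platform.

    "mlx" and "generic" are illustrative names, not identifiers from Voicebox.
    """
    # Fall back to the values reported for the running interpreter.
    system = system or platform.system()
    machine = machine or platform.machine()

    # A native Python process on an Apple Silicon Mac reports Darwin / arm64.
    if system == "Darwin" and machine == "arm64":
        return "mlx"

    # Intel Macs, Windows, and Linux take the non-accelerated path in this sketch.
    return "generic"


if __name__ == "__main__":
    print(pick_backend())
```

    A Python process running under Rosetta 2 reports `x86_64` even on Apple Silicon hardware, so a check like this would select the generic path there; a real application would likely need a more robust probe.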

    Limitations and scope
    • Current model support centers on Qwen3-TTS; multi-model support is listed as planned but not yet available.
    • Linux availability is pending; CI limitations were explicitly cited as the blocker for those builds.
    • No cloud collaboration or managed hosting is part of the announced feature set; the project emphasizes offline, local operation.

    This release documents a desktop-focused, privacy-oriented approach to voice cloning with clear statements on model choice, runtime acceleration, and editing capabilities. #voicebox #qwen3_tts #TTS #voice_cloning

    🔗 Source: github.com/jamiepine/voicebox?

  2. "Voicebox", a free tool that reads text aloud in a voice cloned from just a few seconds of recorded sample audio / An easy way to try Alibaba's "Qwen3 TTS" on Windows [Review]
    forest.watch.impress.co.jp/doc

    #forest_watch_impress #音声生成AI #Voicebox #Alibaba #Qwen3_TTS #genai #その他
