#diffusiontransformer — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #diffusiontransformer, aggregated by home.social.

  1. 🔊 #F5TTS: New non-autoregressive #TextToSpeech system

    • Uses flow matching with #DiffusionTransformer (#DiT)
    • Uses #ConvNeXt to refine the text representation
    • Introduces a Sway Sampling strategy that improves performance and efficiency
    • Achieves a Real-Time Factor (#RTF) of 0.15, faster than state-of-the-art diffusion-based TTS models
    • Trained on a 100K-hour multilingual dataset
    • Demonstrates zero-shot ability, code-switching capability, and speed control
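
For readers new to the RTF figure above: Real-Time Factor is commonly defined as synthesis time divided by the duration of the audio produced, so values below 1 mean faster than real time. A minimal sketch of that arithmetic follows; the function name and the timings are illustrative assumptions, not measurements or code from the F5-TTS project.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return RTF = time spent synthesizing / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


# Hypothetical timings: 1.5 s of compute to generate a 10 s clip.
rtf = real_time_factor(synthesis_seconds=1.5, audio_seconds=10.0)
print(f"RTF = {rtf:.2f}")
```

At an RTF of 0.15, generating one minute of speech would take roughly 9 seconds of compute on the hardware used for the measurement.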

    Key features:
    📊 Faster training
    🌐 Multilingual support
    🔄 Seamless code-switching
    ⏩ Efficient speed control

    Demo, code, and checkpoints available at: swivid.github.io/F5-TTS

    #AI #MachineLearning #Speech #NLP