#deepspeech — Public Fediverse posts on home.social

[email protected] @[email protected] · 2025-06-27 · 18:23 UTC

#Mozilla discontinues #DeepSpeech.

DeepSpeech is an open-source, local speech-to-text (#Sprache-zu-Text) engine built with #Machine #Learning using the #TensorFlow framework. Combined with the Common Voice corpus, a voice dataset also assembled by #Mozilla, it can run in real time even on a Raspberry Pi 4.

https://linuxnews.de/mozilla-stellt-deepspeech-ein/

#mozilla #deepspeech #sprache #machine #learning #tensorflow

Benjamin Carr, Ph.D. 👨🏻‍💻🧬 @[email protected] · 2025-06-27 · 10:04 UTC

#Mozilla Formally Discontinues Its #DeepSpeech Project
#MozillaDeepSpeech was a #speechtotext engine with great performance for real-time communication even when running on #RaspberryPi and other low-power systems.
Mozilla discontinuing DeepSpeech sadly doesn't come as a surprise. The last tagged release was 0.9.3 back in December 2020, and there hadn't been any Git activity since 2021.
Even in 2020 DeepSpeech was considered at risk of ceasing development following Mozilla layoffs.
https://www.phoronix.com/news/Mozilla-DeepSpeech-Discontinued

#mozilla #deepspeech #mozilladeepspeech #speechtotext #raspberrypi

Paolo Redaelli @[email protected] · 2025-06-26 · 05:42 UTC

#Mozilla Formally Discontinues Its #DeepSpeech speech-to-text Project - Slashdot 😱 😭

https://tech.slashdot.org/story/25/06/25/1851201/mozilla-formally-discontinues-its-deepspeech-project

#mozilla #deepspeech

Marco Giannini :tux: @[email protected] · 2025-06-25 · 20:23 UTC

Mozilla discontinues development of DeepSpeech

#deepspeech #mozilla

https://www.marcosbox.com/2025/06/25/mozilla-interrompe-lo-sviluppo-di-deepspeech/

#deepspeech #mozilla

N-gated Hacker News @[email protected] · 2025-06-25 · 19:09 UTC

🌟✨BREAKING NEWS✨🌟 Mozilla's #DeepSpeech is so cutting-edge that it's been cut entirely! 😂 Now you can enjoy the sound of silence on your Raspberry Pi 4 without the distraction of real-time speech-to-text. Maybe next time they'll invent something that doesn't get #discontinued faster than you can say "GitHub Copilot"! 🚀
https://github.com/mozilla/DeepSpeech #Mozilla #RaspberryPi #TechNews #SpeechToText #HackerNews #ngated

#deepspeech #discontinued #mozilla #raspberrypi #technews #speechtotext

Hacker News @[email protected] · 2025-06-25 · 19:09 UTC

DeepSpeech Is Discontinued

https://github.com/mozilla/DeepSpeech

#HackerNews #DeepSpeech #Discontinued #Mozilla #AI #SpeechRecognition #MachineLearning

#hackernews #deepspeech #discontinued #mozilla #ai #speechrecognition

Strypey @[email protected] · 2024-11-22 · 12:44 UTC

"Coqui, a conversational AI startup, on Wednesday (January 3, 2023), announced that it is shutting down its operation ...

[It] specialises in building open source models and applications in the area of quick voice cloning, text-to-voice, etc. The former employees of Mozilla, left the company after it stopped developing their own Speech-to-text engine, DeepSpeech to begin Coqui."

#KLKrithika, 2024

https://analyticsindiamag.com/industry-insights/ai-startups/conversational-ai-startup-coqui-shuts-down

#translation #MachineTranslation #ASR #TTS #Coqui #Mozilla #DeepSpeech

#klkrithika #translation #machinetranslation #asr #tts #coqui

unfa🇺🇦 @[email protected] · 2023-10-23 · 20:36 UTC

Common Voice is a project by Mozilla to build an extensive, ethically sourced dataset of spoken word in various languages to help push forward open-source voice recognition technology like DeepSpeech (also by Mozilla).

I just recorded a dozen or so sentences :)

https://commonvoice.mozilla.org/

#OpenSource #SpeechRecognition #SpeechToText #Mozilla #DeepSpeech

#speechrecognition #speechtotext #mozilla #deepspeech #opensource

Valvin (framapiaf) @[email protected] · 2023-06-25 · 10:28 UTC

I'm resuming my investigations into speech-to-text and text-to-speech. Oddly enough, it's something that keeps coming back. Does anyone know what became of #DeepSpeech, and whether #MozillaVoice is still maintained?
On my side I'm building on #mycroft, but I have the feeling the project is somewhat stalled; I may be wrong, though.

#mycroft #mozillavoice #deepspeech

mkiol @[email protected] · 2023-06-18 · 18:20 UTC

If you need to do Speech-to-Text and Text-to-Speech tasks and don't want to send your data to the Internet, I recommend trying Speech Note (Linux desktop app).

It is easy to use, works offline and supports 57 languages!

Speech Note works thanks to powerful #STT and #TTS engines underneath: #DeepSpeech #Coqui #Vosk #Whisper #Piper #eSpeak #MBROLA #RHVoice

You can download #SpeechNote from #Flathub: https://flathub.org/apps/net.mkiol.SpeechNote

Video demo: https://youtu.be/EhUPvaHvssw

#stt #tts #deepspeech #coqui #vosk #whisper

Fabio Manganiello @[email protected] · 2023-05-08 · 12:05 UTC

It's good that other people are also bringing up the elephant in the room: why do you need to pay money for one more electronic gadget that listens to you 24/7, when voice assistants aren't supposed to be rocket science in 2023 anymore? https://news.ycombinator.com/item?id=35857631

I wrote two articles on how to build custom #VoiceAssistants using just a Raspberry Pi and a microphone, one in 2019 https://blog.platypush.tech/article/Build-your-customizable-voice-assistant-with-Platypush and one in 2020 https://blog.platypush.tech/article/Build-custom-voice-assistants.
It's definitely doable and I still have my own custom assistants in the house. However, I had to make do with a #Snowboy model for hotword detection (and Snowboy is now basically abandoned), the Mozilla #DeepSpeech model for speech-to-text (and that's quite heavy), and #Mycroft's mimic3 text-to-speech model (and Mycroft is now basically bankrupt). Writing the integration is then relatively easy - I used #Platypush, but it can definitely be done with Home Assistant and OpenHAB too.

Compared to 3-4 years ago, I think we're now in a state where the content is no longer the issue (just plug into an LLM, and all of your text requests will get an answer), nor are integrations a problem (just write a Platypush event hook on speech detected, and you can connect it to everything, no need for "Works with Google/Alexa" labels). Text-to-speech synthesis has also become cheap and ubiquitous.

But the hotword detection and speech-to-text models are still IMHO the bottleneck. Hotword detection is a field where you need a very small and lightweight model that only detects a specific word or phrase in a very reliable way. Snowboy was an amazing FOSS project - which also came with this cool idea of "crowd-funded models", where in order to download a model for a certain hotword you were first supposed to provide three audio tracks where you say that word in order to improve the model. But it's now discontinued because it cost the volunteers too much to run the infra.

And Mozilla DeepSpeech is a relatively good choice for general-purpose speech-to-text, but it's heavy (it takes 100% of the CPU when it runs on a Raspberry Pi) and it's mostly optimized for English - even support for other Western languages is patchy. OpenAI's recent Whisper model seems like a solid alternative, but it's also plagued by the 100% CPU issue - also, I no longer trust anything that comes from OpenAI, no matter how noble some of their efforts may look.

If there are other open-source alternatives that solve these problems, I'd be very happy to learn about them. Once these blockers are removed, there should be really no reason for anyone to feed their audio streams to Google or Amazon.

In the meantime, I'm planning to spend some time playing with some self-hosted LLM model to see if I can replace the Google Assistant library on the last Raspberry Pi that runs it in my home.

#voiceassistants #snowboy #deepspeech #mycroft #platypush

Fabio Manganiello @[email protected] · 2023-05-08 · 12:05 UTC

It's good that other people are also bringing up the elephant in the room: why do you need to pay money for one more electronic gadget that listens to you 24/7, when voice assistants aren't supposed to be rocket science in 2023 anymore? https://news.ycombinator.com/item?id=35857631

I wrote two articles on how to build custom #VoiceAssistants using just a Raspberry Pi and a microphone, one in 2019 https://blog.platypush.tech/article/Build-your-customizable-voice-assistant-with-Platypush and one in 2020 https://blog.platypush.tech/article/Build-custom-voice-assistants.
It's definitely doable and I still have my own custom assistants in the house. However, I had to get around with a #Snowboy model for hotword detection (and Snowboy is now basically abandoned), Mozilla #DeepSpeech model for speech-to-text (and that's quite heavy), and #Mycroft's mimic3 text-to-speech model (and Mycroft is now basically bankrupt). Then writing the integration is relatively easy - I used #Platypush, but it can definitely be done with Home Assistant and OpenHAB too.

Compared to 3-4 years ago, I think we're now in a state where the content is no longer the issue (just plug into a LLM, and all of your text requests will get an answer), nor integrations are a problem (just write a Platypush event hook on speech detected, and you can connect it to everything, no need for "Works with Google/Alexa" labels). Text-to-speech synthesis has also become cheap and ubiquitous.

But the hotword detection and speech-to-text models are still IMHO the bottleneck. Hotword detection is a field where you need a very small and lightweight model that only detects a specific word or phrase in a very reliable way. Snowboy was an amazing FOSS project - which also came with this cool idea of "crowd-funded models", where in order to download a model for a certain hotword you were first supposed to provide three audio tracks where you say that word in order to improve the model. But it's now discontinued because it cost the volunteers too much to run the infra.

And Mozilla DeepSpeech is a relatively good choice for general-purpose speech-to-text, but it's heavy (it takes 100% of the CPU when it runs on a Raspberry Pi) and it's mostly optimized for English - even support for other Western languages is patchy. OpenAI's recent Whisper model seems like a solid alternative, but it's also plagued by the 100% CPU issue - also, I no longer trust anything that comes from OpenAI, no matter how noble some of their efforts may look.

If there are other open-source alternatives that solve these problems, I'd be very happy to learn about them. Once these blockers are removed, there should be really no reason for anyone to feed their audio streams to Google or Amazon.

In the meantime, I'm planning to spend some time playing with some self-hosted LLM model to see if I can replace the Google Assistant library on the last Raspberry Pi that runs it in my home.

#voiceassistants #snowboy #deepspeech #mycroft #platypush

Fabio Manganiello @[email protected] · 2023-05-08 · 12:05 UTC

It's good that other people are also bringing up the elephant in the room: why do you need to pay money for one more electronic gadget that listens to you 24/7, when voice assistants aren't supposed to be rocket science in 2023 anymore? https://news.ycombinator.com/item?id=35857631

I wrote two articles on how to build custom #VoiceAssistants using just a Raspberry Pi and a microphone, one in 2019 https://blog.platypush.tech/article/Build-your-customizable-voice-assistant-with-Platypush and one in 2020 https://blog.platypush.tech/article/Build-custom-voice-assistants.
It's definitely doable and I still have my own custom assistants in the house. However, I had to get around with a #Snowboy model for hotword detection (and Snowboy is now basically abandoned), Mozilla #DeepSpeech model for speech-to-text (and that's quite heavy), and #Mycroft's mimic3 text-to-speech model (and Mycroft is now basically bankrupt). Then writing the integration is relatively easy - I used #Platypush, but it can definitely be done with Home Assistant and OpenHAB too.

Compared to 3-4 years ago, I think we're now in a state where the content is no longer the issue (just plug into a LLM, and all of your text requests will get an answer), nor integrations are a problem (just write a Platypush event hook on speech detected, and you can connect it to everything, no need for "Works with Google/Alexa" labels). Text-to-speech synthesis has also become cheap and ubiquitous.

But the hotword detection and speech-to-text models are still IMHO the bottleneck. Hotword detection is a field where you need a very small and lightweight model that only detects a specific word or phrase in a very reliable way. Snowboy was an amazing FOSS project - which also came with this cool idea of "crowd-funded models", where in order to download a model for a certain hotword you were first supposed to provide three audio tracks where you say that word in order to improve the model. But it's now discontinued because it cost the volunteers too much to run the infra.

And Mozilla DeepSpeech is a relatively good choice for general-purpose speech-to-text, but it's heavy (it takes 100% of the CPU when it runs on a Raspberry Pi) and it's mostly optimized for English - even support for other Western languages is patchy. OpenAI's recent Whisper model seems like a solid alternative, but it's also plagued by the 100% CPU issue - also, I no longer trust anything that comes from OpenAI, no matter how noble some of their efforts may look.

If there are other open-source alternatives that solve these problems, I'd be very happy to learn about them. Once these blockers are removed, there should be really no reason for anyone to feed their audio streams to Google or Amazon.

In the meantime, I'm planning to spend some time playing with some self-hosted LLM model to see if I can replace the Google Assistant library on the last Raspberry Pi that runs it in my home.

#voiceassistants #snowboy #deepspeech #mycroft #platypush

Fabio Manganiello @[email protected] · 2023-05-08 · 12:05 UTC

It's good that other people are also bringing up the elephant in the room: why do you need to pay money for one more electronic gadget that listens to you 24/7, when voice assistants aren't supposed to be rocket science in 2023 anymore? https://news.ycombinator.com/item?id=35857631

I wrote two articles on how to build custom #VoiceAssistants using just a Raspberry Pi and a microphone, one in 2019 https://blog.platypush.tech/article/Build-your-customizable-voice-assistant-with-Platypush and one in 2020 https://blog.platypush.tech/article/Build-custom-voice-assistants.
It's definitely doable and I still have my own custom assistants in the house. However, I had to get around with a #Snowboy model for hotword detection (and Snowboy is now basically abandoned), Mozilla #DeepSpeech model for speech-to-text (and that's quite heavy), and #Mycroft's mimic3 text-to-speech model (and Mycroft is now basically bankrupt). Then writing the integration is relatively easy - I used #Platypush, but it can definitely be done with Home Assistant and OpenHAB too.

Compared to 3-4 years ago, I think we're now in a state where the content is no longer the issue (just plug into a LLM, and all of your text requests will get an answer), nor integrations are a problem (just write a Platypush event hook on speech detected, and you can connect it to everything, no need for "Works with Google/Alexa" labels). Text-to-speech synthesis has also become cheap and ubiquitous.

But the hotword detection and speech-to-text models are still IMHO the bottleneck. Hotword detection is a field where you need a very small and lightweight model that only detects a specific word or phrase in a very reliable way. Snowboy was an amazing FOSS project - which also came with this cool idea of "crowd-funded models", where in order to download a model for a certain hotword you were first supposed to provide three audio tracks where you say that word in order to improve the model. But it's now discontinued because it cost the volunteers too much to run the infra.

And Mozilla DeepSpeech is a relatively good choice for general-purpose speech-to-text, but it's heavy (it takes 100% of the CPU when it runs on a Raspberry Pi) and it's mostly optimized for English - even support for other Western languages is patchy. OpenAI's recent Whisper model seems like a solid alternative, but it has the same 100% CPU issue. Besides, I no longer trust anything that comes from OpenAI, no matter how noble some of their efforts may look.

If there are other open-source alternatives that solve these problems, I'd be very happy to learn about them. Once these blockers are removed, there should really be no reason for anyone to feed their audio streams to Google or Amazon.

In the meantime, I'm planning to spend some time playing with a self-hosted LLM to see if I can replace the Google Assistant library on the last Raspberry Pi that runs it in my home.

#platypush #mycroft #deepspeech #snowboy #voiceassistants

Kelley Graham @[email protected] · 2023-04-27 · 19:10 UTC

Today’s #process. Testing #DeepSpeech inference training. deepspeech.readthe...

#deepspeech #process