speech-recognition

3 articles
0 1/10

Dograh is an open-source, self-hosted, visual drag-and-drop platform for building production voice AI agents. It integrates telephony, STT/TTS, LLM support, and a knowledge base, and is pitched as eliminating per-minute API fees and deployment overhead.

Vapi Pipecat LiveKit Dograh Deepgram Cartesia OpenAI Speechmatics Sarvam Gemini Groq Openrouter Azure Twilio Vonage Cloudonix Asterisk
github.com · pritesh1908 · 1 day ago · details · hn
0 1/10

Lingle is a desktop application for serious language learners that uses AI to provide real-time conversation practice with detailed feedback on naturalness and pragmatics. It is a language learning tool rather than a security article.

Lingle Andrew
lingle.ai · andrewfhou · 2 days ago · details · hn
0 2/10

RunAnywhere released MetalRT, a Metal GPU-optimized inference engine for Apple Silicon that achieves 1.67x faster LLM decode than llama.cpp and 4.6x faster speech-to-text than mlx-whisper through custom GPU shaders and zero-allocation inference. They also open-sourced RCLI, a voice AI pipeline combining STT, LLM, and TTS with sub-600ms end-to-end latency entirely on-device.

RunAnywhere MetalRT RCLI YC W26 Sanchit Shubham llama.cpp Apple MLX Ollama sherpa-onnx mlx-whisper Qwen3 LFM2.5
github.com · sanchitmonga22 · 3 days ago · details · hn