Show HN: Push-to-talk dictation for Android apps and terminal workflows

pol_avec · 1 day ago · tool
AI Summary

An open-source Android app providing push-to-talk dictation with local on-device or cloud-based transcription (via OpenAI API), supporting terminal command workflows and text insertion into any app via floating overlay.

I built this because MacWhisper is not available on Android and voice typing on Android is pretty bad. Moreover, Gemini does not let you edit transcripts before they are auto-sent.

I like my SwiftKey keyboard, though, and did not want to replace it, so the only option was a floating push-to-talk button that sits on top of any app.

You tap the overlay, speak, and tap again; the app then transcribes the recording and inserts the text into the currently focused field.
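
To make the tap, speak, tap flow concrete, here is a minimal sketch of a two-state toggle. It is not the app's actual code: `PushToTalk`, `on_tap`, `on_audio`, and the `transcribe` / `insert` callbacks are hypothetical names, and the real app would wire these to Android's overlay and audio APIs.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()


class PushToTalk:
    """Hypothetical tap-to-toggle controller (illustrative only)."""

    def __init__(self, transcribe, insert):
        self.state = State.IDLE
        self.transcribe = transcribe  # assumed callable: list of audio chunks -> text
        self.insert = insert          # assumed callable: text -> None
        self.chunks = []

    def on_audio(self, chunk):
        # Audio is kept only while a recording is in progress.
        if self.state is State.RECORDING:
            self.chunks.append(chunk)

    def on_tap(self):
        if self.state is State.IDLE:
            # First tap: start a fresh recording.
            self.chunks = []
            self.state = State.RECORDING
        else:
            # Second tap: stop, transcribe, and insert the result.
            self.state = State.IDLE
            self.insert(self.transcribe(self.chunks))
```

With stand-in callbacks, two taps around some audio produce exactly one insertion, and audio that arrives while idle is dropped.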

It supports local on-device transcription, cloud transcription with your own OpenAI key, and optional post-processing/cleanup for punctuation, formatting, prompts, commands, etc.

A nice use case for me has been Termux / terminal workflows on Android. There is a "dev mode" in which you say "command mode" and everything after it is converted into a proper CLI command.
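
As a rough illustration of the "command mode" trigger, the sketch below only splits a transcript at the trigger phrase. The function name `split_transcript` and the regex are assumptions, not taken from the repo, and the step that turns the spoken request into an actual CLI command (presumably the post-processing stage) is deliberately left out.

```python
import re

# Assumed trigger phrase, matched case-insensitively, with trailing
# whitespace or punctuation that a transcriber might add.
TRIGGER = re.compile(r"\bcommand mode\b[\s,.:]*", re.IGNORECASE)


def split_transcript(transcript):
    """Return (dictation, command_request).

    command_request is None when the trigger phrase is absent or
    nothing was said after it.
    """
    match = TRIGGER.search(transcript)
    if match is None:
        return transcript.strip(), None
    dictation = transcript[:match.start()].strip()
    request = transcript[match.end():].strip()
    return dictation, request or None
```

For example, `split_transcript("Command mode, list all files")` yields an empty dictation part and the request `"list all files"`, which would then be handed to whatever converts spoken requests into shell syntax.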

The app is open source. There is no backend: in cloud mode, requests go directly from the phone to OpenAI using the user's own API key.
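
A direct, no-backend request of this kind might look like the sketch below, which builds (but does not send) a multipart upload to OpenAI's audio transcription endpoint using only the standard library. The helper name `build_transcription_request`, the default filename, and the choice of the `whisper-1` model are assumptions; the app itself may use a different model or HTTP client.

```python
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(api_key, audio_bytes,
                                filename="speech.wav", model="whisper-1"):
    """Build a multipart/form-data POST carrying the audio and model name."""
    boundary = uuid.uuid4().hex

    # Plain text form field for the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()

    # File form field; the raw audio bytes follow the part headers.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()

    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio_bytes + closing

    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            # The user's own key goes straight to OpenAI; no relay server.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; with the default response format, the JSON reply carries the transcript in its `text` field.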

Repo: https://github.com/kafkasl/phone-whisper
APK: https://github.com/kafkasl/phone-whisper/releases