Security & Privacy

How STT.ai protects your audio and transcripts with client-side encryption, HTTPS, and transparent data handling.

Client-Side Encrypted Storage

When you enable Privacy Mode, your transcript is encrypted in your browser before it is stored on our servers. The encryption key is derived from your password; we never see it, store it, or have access to it.

Note: during transcription, our GPU server processes your audio and returns the transcript in plaintext. The encryption protects what is stored, not what is processed.

What this protects: if our database is ever breached, your stored transcripts are unreadable without your password.

What it doesn't protect: the server sees your audio and transcript during processing, before encryption.

Audit the encryption code yourself (open-source, MIT license).

How Client-Side Encrypted Storage Works

1. You upload audio
Your audio file is sent to our GPU server for transcription. The audio is processed in memory and deleted immediately after transcription; it is never stored on disk.

2. Transcript returned to your browser
The raw transcript (text, timestamps, speakers) is sent back to your browser over HTTPS (TLS 1.3, encrypted in transit).

3. Your browser encrypts the transcript
The transcript is encrypted with AES-256-GCM using a key derived from your password via PBKDF2 (100,000 iterations). The key never leaves your browser, and we never see it.

4. Encrypted blob stored on our servers
We store only the encrypted data, which looks like random bytes to us. We cannot decrypt it, and neither can our database admins. If our servers are breached, your stored transcripts remain unreadable without your password.

5. Only you can decrypt
When you view your transcript, your browser derives the key from your password again and decrypts locally. Nobody else, including STT.ai staff, can read your stored transcripts.
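Steps 3 and 5 above can be sketched with the Web Crypto API. This is a minimal illustration under stated assumptions, not the code from the stt-encryption repository: the function names and the blob layout (12-byte IV followed by the ciphertext) are hypothetical, while AES-256-GCM, PBKDF2 with SHA-256, 100,000 iterations, the email address as salt, and a random 12-byte IV are the parameters this page lists under Technical Details.

```typescript
// Illustrative sketch only: names and blob layout are assumptions,
// not the actual stt-encryption implementation.
const subtle = globalThis.crypto.subtle; // browsers and recent Node.js

const PBKDF2_ITERATIONS = 100_000; // value stated on this page
const IV_BYTES = 12;               // value stated on this page

// Derive a non-extractable AES-256-GCM key from the password.
// The salt is the user's email address, as described on this page.
async function deriveKey(password: string, email: string): Promise<CryptoKey> {
  const enc = new TextEncoder();
  const baseKey = await subtle.importKey(
    "raw", enc.encode(password), "PBKDF2", false, ["deriveKey"],
  );
  return subtle.deriveKey(
    {
      name: "PBKDF2",
      salt: enc.encode(email),
      iterations: PBKDF2_ITERATIONS,
      hash: "SHA-256",
    },
    baseKey,
    { name: "AES-GCM", length: 256 },
    false, // key material cannot be exported from the browser
    ["encrypt", "decrypt"],
  );
}

// Encrypt a transcript; returns IV || ciphertext (assumed layout).
async function encryptTranscript(
  transcript: string,
  key: CryptoKey,
): Promise<Uint8Array> {
  // Fresh random IV for every encryption; it must never be reused with a key.
  const iv = globalThis.crypto.getRandomValues(new Uint8Array(IV_BYTES));
  const ciphertext = new Uint8Array(
    await subtle.encrypt(
      { name: "AES-GCM", iv },
      key,
      new TextEncoder().encode(transcript),
    ),
  );
  const blob = new Uint8Array(IV_BYTES + ciphertext.length);
  blob.set(iv, 0);
  blob.set(ciphertext, IV_BYTES);
  return blob;
}

// Decrypt a blob produced by encryptTranscript.
// Rejects if the key is wrong or the data was modified.
async function decryptTranscript(
  blob: Uint8Array,
  key: CryptoKey,
): Promise<string> {
  const iv = blob.slice(0, IV_BYTES);
  const ciphertext = blob.slice(IV_BYTES);
  const plaintext = await subtle.decrypt({ name: "AES-GCM", iv }, key, ciphertext);
  return new TextDecoder().decode(plaintext);
}
```

In this sketch a caller would derive the key once per session, upload only the bytes returned by encryptTranscript, and pass the downloaded bytes to decryptTranscript when viewing. Because AES-GCM is authenticated, decrypting with a key derived from the wrong password fails with an error instead of returning garbled text.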

Technical Details

Encryption algorithm: AES-256-GCM (authenticated encryption)
Key derivation: PBKDF2 with SHA-256, 100,000 iterations
Key salt: User's email address (unique per user)
IV (nonce): Random 12 bytes per encryption (never reused)
Key storage: Never stored; derived from password on each session
Transport encryption: TLS 1.3 (HTTPS)
Audio retention: Deleted immediately after processing (never stored on disk)
Implementation: Web Crypto API (browser-native, no external libraries)
Source code: github.com/sttaigit/stt-encryption (MIT license)

What We Can and Can't See

We CANNOT see (once your transcript is encrypted and stored):
- Your transcript text
- Speaker names or labels
- Timestamps or word-level data
- Your encryption key
- Your audio (deleted after processing)

We CAN see:
- File name and size (metadata)
- Audio duration
- Language detected
- Model used
- Timestamp of transcription

Privacy Mode Trade-offs

Client-side encrypted storage is opt-in because it limits some features. With encryption enabled:

Works with encryption:
- Viewing your transcripts
- Exporting (TXT, SRT, VTT, etc.)
- Downloading
- Editing (decrypted in browser)

Not available with encryption:
- Server-side search across transcripts
- AI summaries (server can't read data)
- Sharing via link (recipient needs key)
- Team workspace collaboration

Need True End-to-End Privacy?

For organizations that need audio to never leave their infrastructure, we offer dedicated and self-hosted options.

Private Cloud: $299/mo
Your own dedicated GPU server. Audio never leaves your infrastructure. True end-to-end privacy.
- Dedicated A100 GPU
- Isolated server, no shared infrastructure
- Audio processed on your hardware only
- Full API access + SLA

Self-Hosted License: $49/mo
Run STT.ai on your own hardware. Docker image, your servers, your rules.
- Docker image that runs on any NVIDIA GPU
- Air-gapped support, no internet required
- Model updates included
- Full control over your data

Data Handling (All Users)

Even without Privacy Mode enabled, we follow strict data handling practices:

- Audio files are never stored permanently. They are processed in GPU memory and deleted immediately after transcription completes.
- Your data is never used for training unless you explicitly opt in via Voice Lab. Paid plan data is never used.
- All traffic is encrypted in transit via TLS 1.3 (HTTPS).
- You can delete all your data at any time from Privacy Settings.
- We don't sell your data. Ever. To anyone. For any reason.

Open-Source Encryption

Our encryption library is fully open-source under the MIT license. Audit it yourself. Verify that we're doing what we say. No trust required, just math. The source is on GitHub at github.com/sttaigit/stt-encryption.

Ready to transcribe securely? Upload your first file free. Client-side encryption is included on all plans.

Frequently Asked Questions

How do I transcribe audio?
Upload your audio or video file to STT.ai. Select your preferred AI model and options, then click Transcribe. Your transcript will be ready in minutes. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is transcription free?
Yes. STT.ai offers 600 free minutes per month for all users. No signup is required for your first transcription. Paid plans with more minutes and features start at $5/month.

How accurate is the transcription?
Accuracy depends on the AI model you choose and on audio quality. Our best models achieve a 5-7% Word Error Rate on benchmarks, which corresponds to roughly 93-95% word accuracy. Clear audio with minimal background noise produces the best results.

What AI models can I use?
STT.ai offers 10+ models including Whisper Large V3, NVIDIA Canary, and more. You can compare results from different models on the same file.

Can I get subtitles and captions?
Yes. After transcribing, export your transcript as SRT or VTT subtitle files. These work with YouTube, Vimeo, and all major video platforms.

Does it detect different speakers?
Yes. STT.ai automatically identifies and labels different speakers using AI speaker diarization. This works across all models and languages.

How long does transcription take?
Most files are transcribed in under 5 minutes. A 1-hour audio file typically takes 2-3 minutes with our fastest models.

What file formats are supported?
STT.ai supports 20+ audio and video formats including MP3, WAV, M4A, FLAC, OGG, MP4, MKV, MOV, WebM, and AVI. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is my audio data kept private?
Yes. Audio files are processed and deleted after transcription. Your data is never used for training unless you explicitly opt in via Voice Lab. Client-side encryption is free on all plans; it encrypts stored transcripts with a key only you have. During processing, the server handles your audio in plaintext.

Can I access transcription via API?
Yes. STT.ai offers a REST API with Python and Node.js SDKs. The free tier includes 100 minutes/month.

Can I edit the transcript after?
Yes. STT.ai includes a built-in transcript editor where you can correct errors, rename speakers, and adjust timestamps.

How do I share my transcript?
Every transcript gets a unique shareable link. Export to DOCX or PDF for email. Pro plans offer password-protected and permanent links. Link sharing is not available for transcripts stored with Privacy Mode.

Get 600 Free Minutes
Sign up for a free account and get 600 minutes of transcription every month.