Documentation | SpeakToCode

Getting Started

System Requirements

Requirement	Minimum
OS	Windows 10+
RAM	4 GB min (8 GB recommended)
Disk	200 MB + model files (75 MB – 2.9 GB)
Audio	Any microphone input device

Installation

1. Download

Download the latest installer for Windows.

2. Install

Windows: Run SpeakToCode-Setup.exe. If SmartScreen shows a warning, click More info, then Run anyway.

3. First model

On first launch, the Tiny Whisper model (~75 MB) downloads automatically. This only happens once.

4. Microphone access

Grant microphone permission when your OS prompts you.

First Transcription

1. Open any text editor, IDE, or input field.
2. Press Ctrl+Alt+Space to start recording (or hold Ctrl+Space for push-to-talk).
3. Speak naturally into your microphone.
4. Press Ctrl+Alt+Space again (or release Ctrl+Space). Your speech is transcribed locally and pasted at your cursor.

How It Works

SpeakToCode runs entirely on your local machine. No cloud API calls, no audio uploads, no third-party services.

Record

Your microphone captures audio locally as a temporary WAV file.

Transcribe

Audio is processed by Whisper.cpp running natively on your CPU/GPU.

Paste

Transcribed text is copied to clipboard and auto-pasted at your cursor.

The entire pipeline completes in 1–15 seconds depending on the model and recording length.
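
As a rough illustration of the Record and Transcribe stages above, the sketch below writes a 16 kHz mono WAV file and builds a whisper.cpp command line for it. This is not SpeakToCode's actual implementation: the `whisper-cli` binary name, the model path, and the helper names `write_wav` and `build_transcribe_command` are assumptions made for the example, and the Paste stage is left out.

```python
import struct
import tempfile
import wave
from pathlib import Path

SAMPLE_RATE = 16_000  # whisper.cpp expects 16 kHz, mono, 16-bit PCM input


def write_wav(samples, path):
    """Record step (simulated): save 16-bit mono samples as a WAV file."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return path


def build_transcribe_command(wav_path, model_path, language="auto"):
    """Transcribe step: argument list for the whisper.cpp command-line tool.

    The binary name and model path are assumptions, not SpeakToCode internals.
    """
    return [
        "whisper-cli",
        "-m", str(model_path),  # model file
        "-f", str(wav_path),    # input audio
        "-l", language,         # spoken language, or "auto" to detect it
        "-nt",                  # plain text output, no timestamps
    ]


# One second of silence stands in for real microphone input.
tmp_dir = Path(tempfile.mkdtemp())
wav_file = write_wav([0] * SAMPLE_RATE, tmp_dir / "recording.wav")
command = build_transcribe_command(wav_file, "models/ggml-tiny.bin")
print(" ".join(command))
```

The command list could be passed to `subprocess.run` on a machine where whisper.cpp and a model file are available.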

Keyboard Shortcuts

Shortcut	Action
`Ctrl+Alt+Space`	Toggle recording on/off
`Ctrl+Space`	Push-to-talk: hold to record, release to transcribe

Shortcuts work globally from any application. Customize them in Settings.

Recording Modes

Toggle Mode

Press Ctrl+Alt+Space to start. Press again to stop and transcribe. Best for longer dictation sessions.

Push-to-Talk

Hold Ctrl+Space while speaking. Release to transcribe. Best for quick bursts like variable names, commit messages, and short comments.

Both modes show a floating microphone indicator so you always know when SpeakToCode is listening.
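
The difference between the two modes can be sketched as a small state machine. This is only an illustration under assumed names (`RecorderState` and its methods are hypothetical); it shows how toggle presses flip the recording state while push-to-talk follows the key being held, ignoring key-repeat events.

```python
class RecorderState:
    """Tracks whether recording is active (hypothetical helper for illustration)."""

    def __init__(self):
        self.recording = False
        self.events = []  # log of "start" / "stop" transitions

    def _set(self, active):
        # Only log real transitions, so repeated key-down events are ignored.
        if active != self.recording:
            self.recording = active
            self.events.append("start" if active else "stop")

    def toggle_pressed(self):
        """Toggle mode: each press flips the recording state."""
        self._set(not self.recording)

    def push_to_talk_down(self):
        """Push-to-talk: record while the key is held."""
        self._set(True)

    def push_to_talk_up(self):
        """Push-to-talk: releasing the key stops recording."""
        self._set(False)


state = RecorderState()
state.toggle_pressed()        # start
state.toggle_pressed()        # stop
state.push_to_talk_down()     # start
state.push_to_talk_down()     # key repeat, no change
state.push_to_talk_up()       # stop
print(state.events)
```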

Whisper Models

SpeakToCode supports all 7 official Whisper models. Larger models are more accurate but slower.

Model	Size	Speed	Accuracy
Tiny	75 MB	~1s	Good
Base	142 MB	~1.5s	Better
Small	466 MB	~3s	Great
Medium	1.5 GB	~7s	Excellent
Large-v1	2.9 GB	~15s	Best
Large-v2	2.9 GB	~15s	Best
Large-v3	2.9 GB	~15s	Best

Which model should I use?

Tiny / Base: Quick dictation, low latency. Ideal for limited RAM or when speed matters most.
Small: Best balance of speed and accuracy. Recommended starting point.
Medium: Excellent accuracy. Best for 8+ GB RAM.
Large (v1/v2/v3): Maximum accuracy, slower. For 16+ GB RAM. v3 is the latest.

Switch models anytime in Settings → Model. Models download once and are cached locally.
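
The guidance above can be written down as a small helper. This is a sketch, not part of SpeakToCode: `suggest_model` and `disk_needed_mb` are hypothetical names, the sizes come from the model table (GB values rounded to MB), and treating 4 GB or less as "limited RAM" is an assumption. It returns the most accurate model the guidance suggests for a given amount of RAM; Small remains the recommended starting point.

```python
# Download sizes in MB, taken from the model table (GB values rounded to MB).
MODEL_SIZE_MB = {
    "Tiny": 75,
    "Base": 142,
    "Small": 466,
    "Medium": 1500,
    "Large-v1": 2900,
    "Large-v2": 2900,
    "Large-v3": 2900,
}
APP_SIZE_MB = 200  # base install size from System Requirements


def suggest_model(ram_gb, prefer_speed=False):
    """Largest model the guidance suggests for this much RAM."""
    if prefer_speed or ram_gb <= 4:  # assumption: 4 GB or less is "limited RAM"
        return "Tiny"
    if ram_gb < 8:
        return "Small"
    if ram_gb < 16:
        return "Medium"
    return "Large-v3"


def disk_needed_mb(model):
    """Approximate disk space for the app plus one model."""
    return APP_SIZE_MB + MODEL_SIZE_MB[model]


choice = suggest_model(ram_gb=8)
print(choice, disk_needed_mb(choice), "MB")
```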

Settings

Access settings via the gear icon in the sidebar or the Settings page in the app.

Audio

Input Device: Choose your microphone (defaults to system default)
Silence Detection: Auto-stop recording after a silence period
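
One common way to implement silence detection is to measure the energy of short audio frames and stop once the most recent frames stay below a threshold. The sketch below shows that idea only; how SpeakToCode actually detects silence is not documented here, and the frame length, silence period, and threshold values are assumptions.

```python
import math


def rms(frame):
    """Root-mean-square level of a frame of 16-bit samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def should_stop(frames, frame_ms=20, silence_ms=1500, threshold=500.0):
    """True once the trailing frames have been quiet for at least silence_ms.

    All default values are illustrative assumptions.
    """
    needed = silence_ms // frame_ms  # number of consecutive quiet frames required
    if len(frames) < needed:
        return False
    return all(rms(f) < threshold for f in frames[-needed:])


# 20 ms at 16 kHz is 320 samples per frame.
speech = [4000] * 320
quiet = [0] * 320
print(should_stop([speech] * 10 + [quiet] * 75))
```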

Model

Whisper Model: Select transcription model size
Language: Set expected language or leave on Auto

Hotkeys

Toggle Shortcut: Customize toggle recording key (default: Ctrl+Alt+Space)
Push-to-Talk: Customize push-to-talk key (default: Ctrl+Space)

General

Launch at Startup: Start with Windows
Minimize to Tray: Keep running in system tray
Theme: Light, dark, or system preference
Cloud Sync (Pro): Sync across devices

Integrations

SpeakToCode works with any application that accepts text input. It pastes transcribed text at your cursor via the system clipboard.

IDEs

VS Code, JetBrains, Vim/Neovim, Sublime Text, Notepad++

Terminals

Windows Terminal, PowerShell, CMD, WSL

Browsers

Chrome, Edge, Firefox, any text field

Communication

Slack, Discord, Teams, Outlook

Productivity

Notion, Google Docs, Word, Obsidian

IDE tip: Place your cursor where you want text, then use push-to-talk for variable names, commit messages, and short comments.

Privacy & Security

Privacy is a core design principle, not an afterthought.

100% local processing

All transcription happens on your machine. Audio is never sent to any server.

No telemetry

No usage analytics, crash reports, or behavioral data are collected.

No cloud dependency

Works fully offline. An internet connection is needed only for model downloads and optional sync.

Local storage

History and audio are stored in %LOCALAPPDATA%\SpeakToCode\ on your machine.

Open source engine

Whisper.cpp is open source and auditable.

For full details, see our Privacy Policy.

Troubleshooting

Microphone not detected

Check that the microphone is plugged in and set as the default input in Windows Sound Settings.
Verify SpeakToCode has microphone permission: Windows Settings → Privacy → Microphone (Privacy & security on Windows 11).
Try selecting a specific input device in Settings → Audio.

Poor transcription accuracy

Try a larger model. Small or Medium gives a significant improvement over Tiny or Base.
Reduce background noise or use a noise-canceling mic.
Speak clearly. Fast speech or mumbling reduces accuracy.
Set language explicitly rather than using auto-detect.

High CPU usage during transcription

This is normal. Whisper uses significant CPU during transcription and returns to idle after.
Use a smaller model (Tiny or Base) to reduce load.
Keep recordings under 30 seconds.

Model download fails

Check your internet connection. Downloads from Hugging Face may be blocked by firewalls.
Delete partial downloads in %LOCALAPPDATA%\SpeakToCode\models\ and retry.
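
To spot a partial download, it can help to list the files in the models folder with their sizes and compare them against the model table; a file much smaller than the listed size is likely incomplete. The sketch below only lists files and deletes nothing. The helper names are hypothetical, and the file names SpeakToCode uses for models are not documented here.

```python
import os
from pathlib import Path


def default_models_dir():
    """Models folder from this page: %LOCALAPPDATA%\\SpeakToCode\\models."""
    base = os.environ.get("LOCALAPPDATA", "")
    return Path(base) / "SpeakToCode" / "models"


def list_model_files(models_dir):
    """Return (file name, size in MB) pairs, sorted by name."""
    models_dir = Path(models_dir)
    if not models_dir.is_dir():
        return []
    return sorted(
        (p.name, round(p.stat().st_size / 1_000_000, 1))
        for p in models_dir.iterdir()
        if p.is_file()
    )


for name, size_mb in list_model_files(default_models_dir()):
    print(f"{name}\t{size_mb} MB")
```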

Text not pasting into target application

Make sure the target window is focused before starting transcription.
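Some apps handle paste in their own way and may ignore the automatic paste. Because the text is also copied to the clipboard, try pasting it with the app's own paste command or menu.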
Check that your clipboard isn't locked by another application.

Windows SmartScreen warning

Click "More info", then "Run anyway". The app is safe; the warning appears because the installer is not yet code-signed. Code signing is coming soon.

FAQ

Is SpeakToCode free?

Yes! SpeakToCode is free during the beta with all features unlocked. Just create an account, download the app, and start dictating. No credit card required.

Does it work offline?

Yes. Once you've downloaded a model, SpeakToCode works completely offline.

What languages are supported?

Whisper supports 99+ languages. Set your preferred language in Settings for best results.

Can I use it for coding?

Yes! It works with all major IDEs and editors. Dictate comments, docs, variable names, commit messages, and more.

Does it support Linux?

Not yet. SpeakToCode is currently available for Windows only. macOS and Linux support is planned. Follow our changelog for updates.

How do I uninstall?

Windows Settings → Apps → Installed apps, find SpeakToCode, click Uninstall. To remove local data too, delete %LOCALAPPDATA%\SpeakToCode\.

Where can I get help?

Ask in the Discord community, open a GitHub issue, or contact us directly.