Speechmatics
Enterprise speech-to-text with very broad language coverage and real on-prem options, for teams who self-host.
Paid link: we may earn a commission.
Scored on the same voice-agent rubric as the full platforms, so a building block like this scores low on the axes it does not address. Read its value score against its job.
The languages-and-deployment specialist. Speechmatics turns speech into text in 55+ languages and will run inside your own data centre, not just its cloud. It is only one building block, though, not a whole phone agent: no voice, no language model, no phone line.
About $0.004 to $0.012 for a minute of transcription. The phone line and the AI are not included and are billed on top.
That's roughly $0.24–0.72 an hour. Plans: $0/mo (Free).
Pricing
| Cost component | Rate |
|---|---|
| What the platform charges to run the agent, before the phone line and the AI usage are added on. | — |
| The step that turns what the caller says out loud into text the AI can read. | $0.004 /min |
| The AI 'brain' that reads what the caller said and works out what to say back. | — |
| The step that turns the AI's written reply back into a spoken voice. | — |
| The phone line itself: the service that connects the call to a real phone number. Usually billed on top of the platform. | — |
| The total you actually pay for one minute of conversation once every piece is added up: the platform, the AI, the voice and the phone line. | $0.004–0.012 /min |
Speechmatics prices per hour of audio, not per minute. The pricing page states the Pro plan starts from $0.24/hr, which is about $0.004 a minute, the headline figure carried here. That is the floor: per-service rates run higher and vary by model and mode. A third-party 2026 breakdown (PulseSignal) lists Pro batch at $0.0050/min Standard and $0.0083/min Enhanced, and real-time at $0.0067/min Standard and $0.0117/min Enhanced, which is roughly $0.30 to $0.70 an hour depending on model and mode. We treat the vendor's own "from $0.24/hr" as primary and the per-service split as indicative until confirmed from the account portal. The free tier is 480 minutes a month (8 hours). A 20% volume discount applies above 500 hours a month per service. This is speech-to-text only: there is no language model, no text-to-speech and no telephony, so those components carry no charge here. To run a full phone agent you add an LLM, a voice engine and a phone line separately, each a cost on top.
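To make the hour-to-minute arithmetic easy to check, here is a minimal Python sketch of the conversions behind the figures above. The names `FLOOR_PER_HOUR`, `PER_MINUTE`, `per_minute` and `per_hour` are ours, chosen for illustration. The per-service rates are the third-party figures quoted above, so treat them as indicative rather than a confirmed rate card.

```python
# Vendor's stated "from" price for the Pro plan, in USD per hour of audio.
FLOOR_PER_HOUR = 0.24

# Indicative per-minute rates from the third-party breakdown (USD).
# Keys are (mode, model); these are not confirmed against the vendor's rate card.
PER_MINUTE = {
    ("batch", "standard"): 0.0050,
    ("batch", "enhanced"): 0.0083,
    ("realtime", "standard"): 0.0067,
    ("realtime", "enhanced"): 0.0117,
}


def per_minute(rate_per_hour: float) -> float:
    """Convert a per-hour price to a per-minute price."""
    return rate_per_hour / 60


def per_hour(rate_per_minute: float) -> float:
    """Convert a per-minute price to a per-hour price."""
    return rate_per_minute * 60


if __name__ == "__main__":
    print(f"floor: ${per_minute(FLOOR_PER_HOUR):.4f}/min")
    for (mode, model), rate in PER_MINUTE.items():
        print(f"{mode:8} {model:8} ${rate:.4f}/min = ${per_hour(rate):.2f}/hr")
```

Running it lists each service at both scales, which is a quick way to confirm that the cheapest and dearest services bracket the hourly range quoted above.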
Every plan in one place: the monthly fee, what each one includes, and the features it unlocks. Anything beyond a plan's allowance, or on a pay-as-you-go tier, is billed at the per-minute rate above. A blank in the features means the vendor's plan page does not state it for that plan, not that it is unavailable.
| | Free | Pro | Enterprise |
|---|---|---|---|
| Price | Free | — | Custom |
| Included | 480 minutes | Pay per use | — |
| Plan notes | 480 free minutes per month, no card required, 2 concurrent real-time sessions | Pay-as-you-go on usage, from $0.24/hr, capped at 6,000 hours/month, 50 concurrent real-time sessions | Custom pricing, no rate limits, on-prem/container deployment, volume discounts from 24,000 hours/year |
| What each plan unlocks | | | |
| API access | Yes | Yes | — |
| Concurrent calls | 2 real-time sessions | 50 real-time sessions | — |
| Priority support | — | — | Custom deployment + volume pricing |
Each plan bundles a set amount of talk time a month.
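As a rough way to read the plan limits, the sketch below picks the smallest plan that fits a given workload. It uses only the limits listed in the plan notes (480 free minutes and 2 sessions, 6,000 hours and 50 sessions on Pro, on-prem under Enterprise). The function name `pick_plan` and the choice to treat each limit as inclusive are our assumptions, not the vendor's wording.

```python
# Limits taken from the plan notes on this page.
FREE_MINUTES = 480      # free minutes per month
FREE_SESSIONS = 2       # concurrent real-time sessions on Free
PRO_HOURS_CAP = 6_000   # monthly hours cap on Pro
PRO_SESSIONS = 50       # concurrent real-time sessions on Pro


def pick_plan(hours_per_month: float,
              concurrent_sessions: int,
              needs_on_prem: bool = False) -> str:
    """Return the smallest plan whose stated limits cover the workload.

    Assumption: limits are inclusive (exactly 480 minutes still fits Free).
    """
    if needs_on_prem:
        # On-prem/container deployment is listed under Enterprise only.
        return "Enterprise"
    if hours_per_month * 60 <= FREE_MINUTES and concurrent_sessions <= FREE_SESSIONS:
        return "Free"
    if hours_per_month <= PRO_HOURS_CAP and concurrent_sessions <= PRO_SESSIONS:
        return "Pro"
    return "Enterprise"


if __name__ == "__main__":
    print(pick_plan(hours_per_month=5, concurrent_sessions=1))
    print(pick_plan(hours_per_month=300, concurrent_sessions=20))
    print(pick_plan(hours_per_month=300, concurrent_sessions=20, needs_on_prem=True))
```

The point of the sketch is that concurrency, not just volume, can push you up a tier: a small workload with three simultaneous live calls already exceeds the Free plan.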
Prices in USD as set by the vendor · last checked 2026-06-03 · vendor pricing →
At a glance
- Speech-to-text: Speechmatics Ursa
- Text-to-speech: none (speech-to-text only)
- Languages: en, es, fr, de, it, pt, nl, pl, ru, ar, hi, zh, ja, ko, cy, plus others (55+ in total)
- Integrations: Real-time API (streaming), Batch API (recorded files), On-prem containers (CPU / GPU), Kubernetes self-host, Virtual Appliance (on-prem VM), Native SDKs
- Compliance: HIPAA, SOC 2 Type II, GDPR
Our full take
Speechmatics is a speech-to-text engine, and that is the first thing to get straight. It listens to audio and writes down the words. It does not generate a reply, it does not speak back, and it does not dial a phone. So if you are shopping for a finished voice agent that answers your calls, this is not that. It is one of the parts you would build that agent from, and it is a good one.
Where it earns its place is languages. Speechmatics transcribes 55+ languages off a single model, which means you get the regional accents and dialects (Brazilian Portuguese, Canadian French, and so on) without bolting on a separate pack for each. Most of the cheaper speech-to-text engines top out around seven or ten languages. Deepgram, the closest building-block vendor we cover, lists seven. If your callers speak Tagalog, Welsh, Swahili or Urdu, that gap is the entire reason to look here.
The second reason is where it runs. Most speech-to-text APIs only run in the vendor’s cloud: you send them audio and they send back text. Speechmatics will also run inside your own data centre, as a container on your own hardware (CPU or GPU), on Kubernetes, or as a pre-built virtual machine they call a Virtual Appliance. For a hospital or a bank that cannot let call audio leave the building, that on-premises option (meaning it runs on your own servers, not someone else’s cloud) is often a hard requirement, not a nice-to-have. It is the kind of thing you cannot retrofit, so it matters that it is there from the start.
Now the pricing, and here is the bit that trips people up. Speechmatics bills per hour of audio, not per minute like the agent platforms. The pricing page lists the Pro plan as starting from $0.24 an hour, which works out at about $0.004 a minute, the figure shown at the top of this page. Treat that as the floor, not the average. The headline is the cheapest service at the lowest tier, and the real rate climbs with the model you pick and the mode you run.
Here is roughly how it splits, so you can see the workings. A third-party 2026 breakdown puts Pro batch transcription (processing a recorded file after the fact) at about $0.0050 a minute on the Standard model and $0.0083 on the higher-accuracy Enhanced model. Real-time transcription (live, as the audio streams in) runs around $0.0067 Standard and $0.0117 Enhanced. In per-hour terms that is roughly $0.30 to $0.70 an hour depending on model and mode. We are flagging that split as indicative rather than gospel, because we are sourcing it from a pricing aggregator, not from the vendor’s own rate card, which loads behind the account portal. The vendor’s own from $0.24/hr is what we are treating as primary. There is a free tier of 480 minutes a month (8 hours) to test with, no card needed, and a 20% volume discount once you cross 500 hours a month.
One honest caveat on cost. That per-minute number looks tiny next to a $0.06-a-minute agent platform, and it is, but it is not comparing like for like. Speechmatics is charging you for one job, the transcription. The platforms are charging for transcription plus the language model plus the voice plus the phone line bundled together. To build a full phone agent on Speechmatics you still have to pay for an LLM, a text-to-speech engine and telephony separately. Add those up and the real per-minute cost lands a lot closer to the bundled platforms than the $0.004 headline suggests.
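The like-for-like point can be sketched in a few lines of Python: add up every component of a self-assembled agent before comparing it with a bundled platform. Only the transcription figure below comes from this page (the $0.004 a minute floor). The LLM, text-to-speech and telephony rates are hypothetical placeholders we made up to show the arithmetic, not quotes from any vendor, and `StackRates` is an illustrative name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackRates:
    """USD per minute of conversation for each part of a phone agent."""
    stt: float        # speech-to-text (the only part Speechmatics covers)
    llm: float        # language model
    tts: float        # text-to-speech
    telephony: float  # phone line

    def total(self) -> float:
        """All-in cost per minute across the four components."""
        return self.stt + self.llm + self.tts + self.telephony


# stt is the page's floor figure; the other three rates are HYPOTHETICAL
# placeholders for illustration only. Substitute your own vendors' prices.
example = StackRates(stt=0.004, llm=0.02, tts=0.02, telephony=0.01)

if __name__ == "__main__":
    print(f"all-in: ${example.total():.3f}/min")
    print(f"transcription share: {example.stt / example.total():.0%}")
```

Swap in real quotes for the three placeholder rates and compare the total, not the transcription line, against a bundled platform's per-minute price.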
On compliance, Speechmatics is unusually well-documented for a building block. Its own security page states SOC 2 Type II, ISO/IEC 27001:2022, GDPR and full HIPAA compliance, with AES 256 encryption at rest and TLS 1.2 or higher in transit, plus a public trust centre where you can pull the actual reports. We have ticked HIPAA, SOC 2 Type II and GDPR here because the vendor states them directly. We left SOC 2 Type I unticked: the page names Type II, not Type I, and we do not assume one from the other. For a regulated buyer, that combination of on-prem deployment plus written certifications is the strong card.
My read: Speechmatics is the one you reach for when language coverage or on-premises deployment is non-negotiable, and you have the engineering to assemble the rest of the agent around it. The voice-quality and ease-of-use scores sit lower here than for a finished platform, and that is fair: this is infrastructure, not a product you switch on. If you just want calls answered without standing up your own stack, a bundled platform will get you there faster. If you need to transcribe twenty languages, or keep the audio on your own servers, very little else competes.
The 1 to 10 scores on this page are an editorial preview, our provisional read to get the framework in place, not a measured result. We have not run Speechmatics through our own test calls yet, so there is no Voxrater latency figure here. The pricing, language, deployment and compliance detail is sourced from Speechmatics’ own pricing, security, languages and deployments pages plus one third-party pricing breakdown, captured 2026-05-31.
Sources
- Speechmatics pricing verified 2026-06-02: Pro from $0.24/hr (= $0.004/min), 480 free minutes/mo; speech-to-text only, so no per-minute voice-output rate. · captured 2026-06-02
- Speechmatics pricing page re-captured 2026-06-02 for the quarterly re-verification (screenshot in evidence/). · captured 2026-06-02
- Speechmatics pricing page: per-plan features (Free, Pro, Enterprise), Pro from $0.24/hr, Free 480 min/month, 6,000 hr/month cap, 24,000 hr/year enterprise discount · captured 2026-05-31
- Third-party 2026 per-service breakdown: batch $0.0050/$0.0083 per min, real-time $0.0067/$0.0117 per min (Standard/Enhanced) · captured 2026-05-31
- Speechmatics security page: SOC 2 Type II, ISO/IEC 27001:2022, GDPR and HIPAA claims, AES 256 / TLS 1.2+, Azure + on-prem · captured 2026-05-31
- Speechmatics languages page: 55+ languages for speech-to-text with accent/dialect coverage · captured 2026-05-31
- Features and deployments: SaaS, on-prem containers (CPU/GPU), Kubernetes, Virtual Appliance, real-time + batch · captured 2026-05-31
- Speechmatics partner programme: build/market/sell tracks plus partner marketplace · captured 2026-05-31