Release Notes

Follow new updates and improvements to Speechmatics.

May 14th, 2026

Realtime Kubernetes

Updates


  • Updates the Redis dependency to use the 8.6.3-alpine image.

    • Redis is now deployed using the community image from a dependent Helm chart (“sm-redis”) instead of the Bitnami image and Helm chart.

  • Updates Realtime container version to 15.7.0. For more information, see the 15.7.0 container release notes.

  • Updates sessiongroups CustomResourceDefinitions and controller. Refer to the sm-realtime Helm chart RELEASE_NOTES.md for upgrade details.

Security fixes


Vulnerability Management

  • Software Bill of Materials (SBOM) is available for download from the corresponding release page in our Support Portal.

  • libgstreamer (CVE-2025-3887): This component has been manually patched to address CVE-2025-3887. Note: Security scanners may still flag this component as vulnerable because the base version string remains unchanged.

Non-Applicable CVEs

The following vulnerabilities were reviewed and determined to have no security impact on this release due to specific configurations or the use of closed, trusted environments:

| Component | Identified CVEs |
| --- | --- |
| python-multipart | CVE-2026-42561 |
| urllib3 | CVE-2026-44431, CVE-2026-44432 |
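
Because scanners may keep reporting the items listed above, it can help to separate already-reviewed findings from new ones. The following is a minimal sketch, not a Speechmatics tool: the `triage` helper, the `(component, cve)` tuple format, and the `examplelib` entry are hypothetical and only illustrate the idea.

```python
# CVEs reviewed as non-applicable for this release (from the list above).
NON_APPLICABLE = {
    "python-multipart": {"CVE-2026-42561"},
    "urllib3": {"CVE-2026-44431", "CVE-2026-44432"},
}
# CVEs patched in place without a change to the version string.
PATCHED_IN_PLACE = {"libgstreamer": {"CVE-2025-3887"}}


def triage(findings):
    """Split (component, cve) findings into reviewed and unreviewed lists."""
    reviewed, unreviewed = [], []
    for component, cve in findings:
        known = NON_APPLICABLE.get(component, set()) | PATCHED_IN_PLACE.get(component, set())
        (reviewed if cve in known else unreviewed).append((component, cve))
    return reviewed, unreviewed


# Hypothetical scanner output; the last entry is a made-up placeholder.
findings = [
    ("urllib3", "CVE-2026-44431"),
    ("libgstreamer", "CVE-2025-3887"),
    ("examplelib", "CVE-0000-0000"),
]
reviewed, unreviewed = triage(findings)
print(unreviewed)
```

Only the entries left in `unreviewed` would then need a fresh assessment.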

May 12th, 2026

Batch SaaS

Fixes

Fixed transcription job failures when speaker diarization was enabled for WAV files with missing or incorrect duration metadata.

  • Bug introduced in release version: 2026.04.23

Updated Orchestrator Version: 2026.05.08+3477f55380+15.9.0
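
The Orchestrator version strings in these notes consist of three `+`-separated fields. The sketch below splits one into its parts; the field names and their interpretation (build date, revision identifier, container version) are assumptions inferred from the strings' appearance, not something these notes state.

```python
from typing import NamedTuple


class OrchestratorVersion(NamedTuple):
    build: str      # first field, e.g. "2026.05.08" (assumed to be a build date)
    revision: str   # second field, e.g. "3477f55380" (assumed to be a revision id)
    container: str  # third field, e.g. "15.9.0" (assumed to be a container version)


def parse_orchestrator_version(value: str) -> OrchestratorVersion:
    """Split a version string into its three '+'-separated fields."""
    parts = value.split("+")
    if len(parts) != 3:
        raise ValueError(f"expected three '+'-separated fields, got {value!r}")
    return OrchestratorVersion(*parts)


v = parse_orchestrator_version("2026.05.08+3477f55380+15.9.0")
print(v.build, v.revision, v.container)
```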

May 7th, 2026

Batch SaaS

Fixes

Fixed failure to generate transcripts for a small number of stereo files when one channel contains leading silence.

  • Resolves the following ticket: 32372

  • Bug introduced in release version: 2026.04.23

Updated Orchestrator Version: 2026.05.05+c019d07f67+15.8.0

May 1st, 2026

Batch Container

Realtime Container

GPU Transcription Inference Container

GPU Translation Inference Container

Version 15.7.0 is now available for Batch Container, Realtime Container, GPU Transcription Inference Container and GPU Translation Inference Container.

New


GPU & CPU

  • HTTP Batch Transcription – Processes multiple jobs using persistent workers, reducing turnaround time and improving CPU/GPU utilization. See the documentation for more details.

Improvements


GPU

  • New model (Enhanced Operating Point) for English (en) improves accuracy across:

    • Numbers, spellouts, and other alphanumerics

    • Medical measurements and terminology

    • Mixed spoken alphanumeric sequences of numbers and characters

    • Character sequences, for example spell outs of names or abbreviations

    • Formatting consistency for letter sequences, now returned as upper case letters

    • Email addresses and web URLs

    • Large and compound monetary amounts

| Category | Relative Improvement (WER) |
| --- | --- |
| Numbers | 69% |
| Spellouts | 89% |
| Mixed alphanumerics | 42% |
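
The figures above are relative improvements in word error rate (WER). These notes do not state the formula; the sketch below assumes the conventional definition, where the reduction in WER is expressed as a percentage of the baseline WER. The input numbers are invented for illustration and are not Speechmatics measurements.

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# Made-up example: WER falls from 10.0 to 5.0 (in percentage points).
print(relative_improvement(10.0, 5.0))  # 50.0
```

Under this definition, a relative improvement of 50% means the error rate has halved, whatever its absolute level.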

Updates


GPU

  • Updated Japanese (ja) model for Enhanced Operating Point

Security fixes


Vulnerability Management

  • Software Bill of Materials (SBOM) is available for download from the corresponding release page in our Support Portal.

  • libgstreamer (CVE-2025-3887): This component has been manually patched to address CVE-2025-3887. Note: Security scanners may still flag this component as vulnerable because the base version string remains unchanged.

April 24th, 2026

Realtime SaaS

Improvements

  • New model (Enhanced Operating Point) for English (en) improves accuracy across:

    • Medical measurements and terminology

    • Mixed spoken alphanumeric sequences of numbers and characters

    • Character sequences, for example spell outs of names or abbreviations

    • Formatting consistency for letter sequences, now returned as upper case letters

    • Email addresses and web URLs

    • Large and compound monetary amounts

  • Improved accuracy at low latency when using ForceEndOfUtterance
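
As a rough illustration of how a client might request early finalisation, the sketch below builds a `ForceEndOfUtterance` message. Only the message name comes from these notes; the JSON shape (an object with a `message` field) is an assumption, and the helper name is hypothetical. Check the Realtime API reference for the exact schema before relying on it.

```python
import json


def force_end_of_utterance_message() -> str:
    # Assumed shape: a JSON object whose "message" field names the message type.
    return json.dumps({"message": "ForceEndOfUtterance"})


# A client would send this text frame over its open Realtime session,
# for example when an external VAD or turn detector reports end of speech.
print(force_end_of_utterance_message())
```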

Updates

  • Updated Japanese (ja) model for Enhanced Operating Point

Updated Orchestrator Version: 2026.04.21+fd908134bc+15.7.0

April 16th, 2026

Batch SaaS

Improvements

New model (Enhanced Operating Point) for English (en) improves accuracy across:

  • Medical measurements and terminology

  • Mixed spoken alphanumeric sequences of numbers and characters

  • Character sequences, for example spell outs of names or abbreviations

  • Formatting consistency for letter sequences, now returned as upper case letters

  • Email addresses and web URLs

  • Large and compound monetary amounts

Updates

Updated Japanese (ja) model for Enhanced Operating Point

Updated Orchestrator Version: 2026.04.10+192b655fa8+15.5.0

March 12th, 2026

Realtime SaaS

Improvements

New model (Enhanced Operating Point) for English (en) improves accuracy across numbers, spellouts, and other alphanumerics.

| Category | Relative Improvement (WER) |
| --- | --- |
| Numbers | 69% |
| Spellouts | 89% |
| Mixed alphanumerics | 42% |

Updated Orchestrator Version: 2026.02.27+2ce3ed4fc8+15.2.0

March 9th, 2026

Batch SaaS

Improvements

  • Faster processing for audio files up to 12 minutes when using a custom dictionary.

    • Supported languages: Arabic, Catalan, Dutch, English, French, German, Greek, Hebrew, Hindi, Japanese, Norwegian, Persian, Portuguese, Russian, Spanish, and Swedish.

  • New model (Enhanced Operating Point) for English (en) improves accuracy across numbers, spellouts, and other alphanumerics.

| Category | Relative Improvement (WER) |
| --- | --- |
| Numbers | 69% |
| Spellouts | 89% |
| Mixed alphanumerics | 42% |

Updated Orchestrator Version: 2026.02.27+2ce3ed4fc8+15.2.0

March 5th, 2026

Realtime Kubernetes

Updates


  • Enables repopulation to automatically restore cluster state in the event of Redis data loss.

  • Supports language and operating-point based model cost generation for the custom inference server recipe. Refer to the sm-realtime Helm chart README.md for configuration details.

  • Updates Realtime container version to 15.0.0. For more information, see the 15.0.0 container release notes.

Security fixes


Vulnerability Management

  • Software Bill of Materials (SBOM) is available for download from the corresponding release page in our Support Portal.

  • libgstreamer (CVE-2025-3887): This component has been manually patched to address CVE-2025-3887. Note: Security scanners may still flag this component as vulnerable because the base version string remains unchanged.

  • Redis versioning: By default, this release uses Redis version 8.2.1, which has publicly disclosed security vulnerabilities (CVEs) that have been assessed as not impacting this product. Users who require additional security controls or remediation for these CVEs can deploy a different Redis version that remediates the issues of concern.

Non-Applicable CVEs

The following vulnerabilities (including those reported for the third-party Redis package) were reviewed and determined to have no security impact on this release due to specific configurations or the use of closed, trusted environments:

| Component | Identified CVEs |
| --- | --- |
| redis | CVE-2025-49844, CVE-2025-46817, CVE-2025-46818, CVE-2025-46819, CVE-2025-62507 |
| stdlib | CVE-2025-58183, CVE-2025-61726, CVE-2025-61728, CVE-2025-61729, CVE-2025-68121 |
| protobuf | CVE-2026-0994 |
| cryptography | CVE-2026-26007 |
| General Libs | libc (CVE-2025-4802, CVE-2026-0861), libpam (CVE-2025-6020), libssl3 (CVE-2025-15467, CVE-2025-69419, CVE-2025-69421), zlib (CVE-2023-45853), gpgv (CVE-2025-68973, CVE-2026-24882) |
| Others | perl (CVE-2023-31484) |

March 4th, 2026

Realtime Appliance

New


GPU & CPU

  • Word Replacement – Enables words in the transcript to be modified using a search and replace pattern. Refer to the documentation for details.

  • Supports the prefer_current_speaker configurable parameter to reduce the likelihood of incorrectly switching between similar sounding speakers. Refer to the documentation for more details.

  • Supports the End of Utterance feature. When enabled, it helps detect the end of a turn in a conversation, which can reduce latency for use cases such as voice agents, dictation and translation. Refer to the documentation for more details.

  • Supports the speaker_sensitivity parameter to configure the sensitivity of speaker detection. Refer to the documentation for more details.

  • Tagalog is now available as a new language. It supports code-switching between Filipino and English for bilingual speech.
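
The two diarization parameters named above, prefer_current_speaker and speaker_sensitivity, could be set in a transcription config along the following lines. This is a sketch under assumptions: the parameter names come from these notes, but the surrounding structure (a `speaker_diarization_config` object alongside `language` and `diarization`), the helper name, and the example value 0.7 are illustrative and should be checked against the documentation.

```python
import json


def build_transcription_config(language: str,
                               speaker_sensitivity: float,
                               prefer_current_speaker: bool) -> dict:
    """Build a config dict carrying the two speaker diarization parameters."""
    return {
        "language": language,
        "diarization": "speaker",
        # Assumed nesting; confirm the exact location in the documentation.
        "speaker_diarization_config": {
            "speaker_sensitivity": speaker_sensitivity,
            "prefer_current_speaker": prefer_current_speaker,
        },
    }


config = build_transcription_config("en", 0.7, True)
print(json.dumps(config, indent=2))
```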

GPU

  • Channel diarization – Enables perfect speaker separation when there is one speaker per channel. Refer to the documentation for details.

  • Channel and Speaker diarization – Enables separation of multiple speakers per channel. Refer to the documentation for details.

  • Force end of utterance – Enables the client to force finalise transcription at the end of speech for faster finals (200ms), ideal when using external VAD or turn detection models for voice agents. Refer to the documentation for details.

  • Standard and Enhanced operating point

    • New bilingual transcription languages: Malay English (en_ms), Tamil English (en_ta), Mandarin English (cmn_en), Arabic English (ar_en). Refer to the documentation for details.

    • New multilingual transcription language pack, Mandarin Malay Tamil English (cmn_en_ms_ta), is now available. Refer to the documentation for more details.

  • Enhanced operating point

    • New medical domain-specific models for Danish, Dutch, English, Finnish, French, German, Norwegian, Spanish and Swedish, giving the highest accuracy for healthcare use cases. Refer to the documentation for more details.

| Domain | Language | Relative Improvement |
| --- | --- | --- |
| Medical | Danish | 46% |
| Medical | Dutch | 70% |
| Medical | English | 14% |
| Medical | Finnish | 40% |
| Medical | French | 51% |
| Medical | German | 36% |
| Medical | Norwegian | 42% |
| Medical | Spanish | 63% |
| Medical | Swedish | 60% |

Improvements


GPU & CPU

  • Speaker Diarization – Improved speaker change detection accuracy for long audio streams (1+ hours)

GPU

  • Significant increase in session density for GPU inference.

  • Standard Operating Point

    • Enables faster transcription and higher throughput. Refer to the documentation for more details.

    • New models released with notable accuracy uplifts for the below languages:

      | Domain | Language | Relative Improvement |
      | --- | --- | --- |
      | General | Belarusian | 68.9% |
      | General | Bulgarian | 13.1% |
      | General | Catalan | 57.5% |
      | General | Croatian | 5.0% |
      | General | Czech | 3.4% |
      | General | Danish | 34.3% |
      | General | Esperanto | 37.9% |
      | General | Estonian | 5.1% |
      | General | Persian | 9.7% |
      | General | Finnish | 38.8% |
      | General | French | 3.1% |
      | General | Galician | 85.4% |
      | General | Greek | 7.2% |
      | General | Hebrew | 2.1% |
      | General | Hindi | 9.3% |
      | General | Hungarian | 15.8% |
      | General | Indonesian | 19.1% |
      | General | Japanese | 2.1% |
      | General | Korean | 17.1% |
      | General | Latvian | 25.1% |
      | General | Lithuanian | 11.2% |
      | General | Malay | 4.2% |
      | General | Marathi | 28.8% |
      | General | Mongolian | 19.6% |
      | General | Norwegian | 18.5% |
      | General | Polish | 2.0% |
      | General | Romanian | 8.5% |
      | General | Slovak | 4.1% |
      | General | Slovenian | 9.8% |
      | General | Swedish | 5.4% |
      | General | Thai | 22.1% |
      | General | Turkish | 33.2% |
      | General | Ukrainian | 29.9% |
      | General | Urdu | 16.1% |
      | General | Vietnamese | 110.9% |
      | General | Welsh | 20.3% |

  • Enhanced Operating Point

    • New models for English with improved accuracy in transcribing initialisms

    • New models released with notable accuracy uplifts for the below languages:

      | Domain | Language | Relative Improvement |
      | --- | --- | --- |
      | General | Danish | 20.6% |
      | General | Dutch | 16.0% |
      | General | Spanish | 4.1% |
      | General | Finnish | 31.0% |
      | General | French | 2.1% |
      | General | Interlingua | 8.7% |
      | General | Japanese | 5.8% |
      | General | Malay | 6.8% |
      | General | Maltese | 5.5% |
      | General | Norwegian | 9.3% |
      | General | Swedish | 10.4% |
      | General | Uyghur | 8.0% |

CPU

  • Standard Operating Point

    • New models released with notable accuracy uplifts for the below languages:

      | Domain | Language | Relative Improvement |
      | --- | --- | --- |
      | General | Belarusian | 61.3% |
      | General | Bosnian | 12.2% |
      | General | Bulgarian | 6.4% |
      | General | Catalan | 19.9% |
      | General | Welsh | 5.4% |
      | General | Danish | 2.8% |
      | General | Esperanto | 43.9% |
      | General | Greek | 4.1% |
      | General | Persian | 5.9% |
      | General | Finnish | 40.9% |
      | General | Galician | 64.3% |
      | General | Irish | 3.1% |
      | General | Hindi | 4.0% |
      | General | Interlingua | 24.9% |
      | General | Korean | 7.6% |
      | General | Latvian | 4.2% |
      | General | Malay | 2.9% |
      | General | Maltese | 3.5% |
      | General | Mongolian | 19.4% |
      | General | Marathi | 11.1% |
      | General | Norwegian | 4.1% |
      | General | Romanian | 2.7% |
      | General | Swedish | 5.3% |
      | General | Thai | 10.1% |
      | General | Ukrainian | 19.3% |
      | General | Vietnamese | 15.1% |

  • Enhanced Operating Point

    • Improved Speaker Diarization accuracy for Enhanced Operating Point

    • New models released with notable accuracy uplifts for the below languages:

      | Domain | Language | Relative Improvement |
      | --- | --- | --- |
      | General | Belarusian | 61.7% |
      | General | Bosnian | 8.7% |
      | General | Bulgarian | 7.4% |
      | General | Catalan | 6.2% |
      | General | Welsh | 3.3% |
      | General | Danish | 27.4% |
      | General | Greek | 3.2% |
      | General | Esperanto | 41.8% |
      | General | Persian | 4.7% |
      | General | Finnish | 55.0% |
      | General | French | 4.5% |
      | General | Irish | 5.7% |
      | General | Galician | 72.8% |
      | General | Hindi | 3.1% |
      | General | Interlingua | 35.1% |
      | General | Korean | 4.1% |
      | General | Latvian | 3.1% |
      | General | Mongolian | 21.4% |
      | General | Marathi | 12.4% |
      | General | Dutch | 12.9% |
      | General | Norwegian | 11.7% |
      | General | Swedish | 5.9% |
      | General | Thai | 7.0% |
      | General | Ukrainian | 18.7% |
      | General | Uyghur | 2.0% |
      | General | Vietnamese | 20.1% |

Fixes


  • Fix for failure to process some audio files starting with non-speech audio when speaker diarization is enabled.

  • Fix for failure to generate Final transcripts for numbers when punctuation is disabled.

  • Fix for Japanese to address decimals being occasionally transcribed as full stops.

  • Fix for number formatting in Malay & English bilingual (en_ms) and Tamil & English bilingual (en_ta) language packs

    • Resolves the following ticket: 30055

  • Fix for incorrect casing in Japanese (ja) transcription output

    • Resolves the following ticket: 28276

  • Fixed a session failure when a custom dictionary’s first item is only a hyphen.

Security fixes


Vulnerability Management

  • A Software Bill of Materials (SBOM) is available for download from the corresponding release page in our Support Portal.

  • libgstreamer (CVE-2025-3887): This component has been manually patched to address CVE-2025-3887. Note: Security scanners may still flag this component as vulnerable because the base version string remains unchanged.

Non-Applicable CVEs

The following vulnerabilities were reviewed and determined to have no security impact on this release due to specific configurations or the use of closed, trusted environments:

| Component | Identified CVEs |
| --- | --- |
| stdlib | CVE-2025-68121, CVE-2025-61726, CVE-2025-61729, CVE-2025-58183, CVE-2025-61728, CVE-2025-61730, CVE-2025-47907, CVE-2025-22874 |
| OpenSSL (libcrypto3, libssl3) | CVE-2025-15467, CVE-2025-69419, CVE-2025-69421 |
| gpgv | CVE-2025-68973 |
| linux headers | CVE-2024-35870, CVE-2024-53179, CVE-2025-37899, CVE-2025-37849, CVE-2025-38118 |
| jose2go | CVE-2025-63811 |
| expr | CVE-2025-68156 |
| runc / selinux | CVE-2025-31133, CVE-2025-52565, CVE-2025-52881 |
| oauth2 | CVE-2025-22868 |
| jaraco.context | CVE-2026-23949 |
| pyasn1 | CVE-2026-23490 |
| urllib3 | CVE-2025-66418, CVE-2025-66471, CVE-2026-21441 |
| wheel | CVE-2026-24049 |
| coredns | CVE-2023-28452, CVE-2025-47950 |
| libc6 | CVE-2026-0861 |
| crypto | CVE-2025-22869 |
| azure-core | CVE-2026-21226 |
| cryptography | CVE-2026-26007 |
| protobuf | CVE-2026-0994 |
| python-multipart | CVE-2026-24486 |
| zipp | CVE-2024-5569 |