July 1st, 2025
Batch Container
Real-Time Container
GPU Transcription Inference Container
GPU Translation Inference Container
Version 13.4.0 is now available for Batch Container, Real-Time Container, GPU Transcription Inference Container and GPU Translation Inference Container.
Real-time transcription:
Supports the End of Utterance feature. When enabled, it helps detect the end of a turn in a conversation, which can reduce latency in use cases such as voice agents, dictation and translation. Refer to the documentation for more details.
Supports the speaker_sensitivity parameter to configure the sensitivity of speaker detection. Refer to the documentation for more details.
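As a rough sketch of how these two real-time options might be set, the Python below builds a StartRecognition-style message. Only speaker_sensitivity is named in these notes. The surrounding key names (conversation_config, end_of_utterance_silence_trigger, speaker_diarization_config), the helper name and the example values are assumptions, so check them against the documentation before use.

```python
import json


def build_start_recognition(language, eou_silence_trigger, speaker_sensitivity):
    """Build a StartRecognition-style message (hypothetical helper)."""
    transcription_config = {
        "language": language,
        "diarization": "speaker",
        # Assumed key names for End of Utterance detection; the value is
        # taken here to be a silence duration in seconds.
        "conversation_config": {
            "end_of_utterance_silence_trigger": eou_silence_trigger,
        },
        # speaker_sensitivity is named in the release notes; its nesting
        # under speaker_diarization_config is an assumption.
        "speaker_diarization_config": {
            "speaker_sensitivity": speaker_sensitivity,
        },
    }
    return {
        "message": "StartRecognition",
        "transcription_config": transcription_config,
    }


message = build_start_recognition("en", 0.5, 0.6)
print(json.dumps(message, indent=2))
```

The resulting dictionary would be serialised to JSON and sent at the start of a real-time session. The values 0.5 and 0.6 are placeholders rather than recommended settings.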
Batch and Real-time transcription:
Supports the configurable prefer_current_speaker parameter, which reduces the likelihood of incorrectly switching between similar-sounding speakers. Refer to the documentation for more details.
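A minimal sketch of enabling this option on an existing transcription config is shown below. The parameter name prefer_current_speaker comes from these notes. Its placement under a speaker_diarization_config key, the "diarization": "speaker" setting and the helper name are assumptions to verify against the documentation.

```python
import json


def with_prefer_current_speaker(transcription_config, enabled=True):
    """Return a copy of the config with prefer_current_speaker set.

    Hypothetical helper: the nesting under speaker_diarization_config
    is an assumption, not confirmed by the release notes.
    """
    updated = dict(transcription_config)
    # Copy the nested dict so the caller's config is not mutated.
    diarization = dict(updated.get("speaker_diarization_config", {}))
    diarization["prefer_current_speaker"] = enabled
    updated["speaker_diarization_config"] = diarization
    # Speaker diarization is assumed to be required for the option to apply.
    updated.setdefault("diarization", "speaker")
    return updated


base = {"language": "en"}
config = with_prefer_current_speaker(base)
print(json.dumps(config, indent=2))
```

Because the helper returns a copy, the same base config can be reused for both Batch and Real-time requests with or without the option.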
Batch and Real-time transcription
Standard Operating Point: New models released with notable accuracy uplifts for the languages below:
Relative improvements: Bashkir (ba) - 8%, Belarusian (be) - 37%, Bulgarian (bg) - 6%, Catalan (ca) - 18%, Esperanto (eo) - 30%, Galician (gl) - 41%, Interlingua (ia) - 21%, Korean (ko) - 7%, Latvian (lv) - 12%, Marathi (mr) - 9%, Mongolian (mn) - 5%, Thai (th) - 6%, Ukrainian (uk) - 18%, Vietnamese (vi) - 14%
Enhanced Operating Point: Updated Mandarin (cmn) models improve accuracy by up to 5%.
Real-time transcription
Standard Operating Point: Significant increase in session density for GPU inference.
Standard Operating Point: New models released with notable accuracy uplifts for the languages below:
Relative improvements: Bashkir (ba) - 10%, Belarusian (be) - 38%, Bulgarian (bg) - 6%, Catalan (ca) - 16%, Esperanto (eo) - 30%, Galician (gl) - 39%, Interlingua (ia) - 19%, Korean (ko) - 7%, Latvian (lv) - 4%, Marathi (mr) - 10%, Mongolian (mn) - 9%, Thai (th) - 9%, Ukrainian (uk) - 15%, Vietnamese (vi) - 13%
Enhanced Operating Point: New models released with notable accuracy uplifts for the languages below:
Relative improvements: Bashkir (ba) - 7%, Belarusian (be) - 38%, Bulgarian (bg) - 7%, Catalan (ca) - 5%, Esperanto (eo) - 30%, Galician (gl) - 42%, Interlingua (ia) - 26%, Korean (ko) - 4%, Latvian (lv) - 3%, Marathi (mr) - 10%, Mongolian (mn) - 4%, Thai (th) - 7%, Ukrainian (uk) - 15%, Vietnamese (vi) - 16%
Batch and Real-time transcription: Fixed a session failure when a custom dictionary’s first item is only a hyphen.
Real-time transcription: Fixed a failure to generate Final transcripts for numbers when punctuation is disabled.
A Software Bill of Materials (SBOM) is available for download from the corresponding release page in our Support Portal.
Patched libgstreamer to address CVE-2025-3887. Vulnerability scanners are still likely to report it as vulnerable because the base version string is unchanged.
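One way to handle the expected false positive is to separate scanner findings for CVEs that these notes list as patched from findings that still need review. The sketch below assumes a simple list-of-dictionaries finding format. That format, the helper name and the second CVE identifier are hypothetical placeholders and do not reflect any particular scanner's output.

```python
# CVEs that these release notes state are patched in the image.
PATCHED_CVES = {"CVE-2025-3887"}


def split_findings(findings, patched=PATCHED_CVES):
    """Separate findings for known-patched CVEs from those needing review."""
    acknowledged, to_review = [], []
    for finding in findings:
        if finding["cve"] in patched:
            acknowledged.append(finding)
        else:
            to_review.append(finding)
    return acknowledged, to_review


# Hypothetical scanner output; the second CVE ID is a placeholder.
findings = [
    {"cve": "CVE-2025-3887", "package": "libgstreamer"},
    {"cve": "CVE-0000-0000", "package": "example-package"},
]
acknowledged, to_review = split_findings(findings)
print("patched per release notes:", [f["cve"] for f in acknowledged])
print("needs review:", [f["cve"] for f in to_review])
```

Keeping the acknowledged list, rather than discarding those findings, leaves an audit trail showing why each one was set aside.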
💡GPU Transcription and GPU Translation - Inference Containers
GPU Inference Containers are released in sync with the Real-time/Batch containers they support. You should only rely on an Inference Container working with a Real-time/Batch container if it has the same version number.
For full details and a guide to implementation, see GPU Transcription Inference Container and GPU Translation Inference Container.
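The version-matching rule above can be enforced with a small pre-deployment check. This is only a sketch: the function name is hypothetical, and how the version strings are obtained (for example from image tags) is left to the deployment.

```python
def versions_match(container_version, inference_version):
    """Return True only when both containers carry the same version number."""
    return container_version.strip() == inference_version.strip()


# Example: a 13.4.0 Real-time container paired with two Inference Containers.
print(versions_match("13.4.0", "13.4.0"))
print(versions_match("13.4.0", "13.3.0"))
```

An exact string comparison is used deliberately, since the notes only state that identical version numbers can be relied on to work together.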