July 1st, 2025

Batch Container

Real-Time Container

GPU Transcription Inference Container

GPU Translation Inference Container

13.4.0 - Containers

Version 13.4.0 is now available for Batch Container, Real-Time Container, GPU Transcription Inference Container and GPU Translation Inference Container.

New

GPU & CPU

  • Real-time transcription:

    • Supports the End of Utterance feature. When enabled, it helps detect the end of a turn in a conversation, which can reduce latency in use cases such as voice agents, dictation and translation. Refer to the documentation for more details.

    • Supports the speaker_sensitivity parameter, which configures the sensitivity of speaker detection. Refer to the documentation for more details.

  • Batch and Real-time transcription:

    • Supports the configurable prefer_current_speaker parameter, which reduces the likelihood of incorrectly switching between similar-sounding speakers. Refer to the documentation for more details.
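As an illustration of how the three new settings might be combined in a real-time session, here is a minimal Python sketch that builds a session-start message as a plain dictionary. Only the names speaker_sensitivity and prefer_current_speaker come from these notes. The message layout, the conversation_config and speaker_diarization_config groupings, the end_of_utterance_silence_trigger key and all default values are assumptions made for the example, so check the documentation for the authoritative schema before using it.

```python
import json


def build_start_recognition(language="en",
                            eou_silence_trigger=0.5,
                            speaker_sensitivity=0.5,
                            prefer_current_speaker=True):
    """Build a hypothetical session-start message as a plain dict.

    The nesting and every key other than speaker_sensitivity and
    prefer_current_speaker are assumptions for illustration only.
    """
    return {
        "message": "StartRecognition",
        "transcription_config": {
            "language": language,
            "diarization": "speaker",
            # Assumed location and name of the End of Utterance setting:
            # seconds of silence before an end of turn is signalled.
            "conversation_config": {
                "end_of_utterance_silence_trigger": eou_silence_trigger,
            },
            # Assumed grouping for the two speaker parameters named above.
            "speaker_diarization_config": {
                "speaker_sensitivity": speaker_sensitivity,
                "prefer_current_speaker": prefer_current_speaker,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_start_recognition(), indent=2))
```

Keeping the message as a plain dictionary makes it easy to serialise with json.dumps and to adjust once the real schema has been confirmed.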

Improvements

GPU

  • Batch and Real-time transcription:

    • Standard Operating Point: New models released with notable accuracy uplifts for the following languages:

      • Relative improvements: Bashkir (ba) - 8%, Belarusian (be) - 37%, Bulgarian (bg) - 6%, Catalan (ca) - 18%, Esperanto (eo) - 30%, Galician (gl) - 41%, Interlingua (ia) - 21%, Korean (ko) - 7%, Latvian (lv) - 12%, Marathi (mr) - 9%, Mongolian (mn) - 5%, Thai (th) - 6%, Ukrainian (uk) - 18%, Vietnamese (vi) - 14%

    • Enhanced Operating Point: Updated Mandarin (cmn) models give an accuracy improvement of up to 5%.

  • Real-time transcription:

    • Standard Operating Point: Significant increase in session density for GPU inference.

CPU

  • Standard Operating Point: New models released with notable accuracy uplifts for the following languages:

    • Relative improvements: Bashkir (ba) - 10%, Belarusian (be) - 38%, Bulgarian (bg) - 6%, Catalan (ca) - 16%, Esperanto (eo) - 30%, Galician (gl) - 39%, Interlingua (ia) - 19%, Korean (ko) - 7%, Latvian (lv) - 4%, Marathi (mr) - 10%, Mongolian (mn) - 9%, Thai (th) - 9%, Ukrainian (uk) - 15%, Vietnamese (vi) - 13%

  • Enhanced Operating Point: New models released with notable accuracy uplifts for the following languages:

    • Relative improvements: Bashkir (ba) - 7%, Belarusian (be) - 38%, Bulgarian (bg) - 7%, Catalan (ca) - 5%, Esperanto (eo) - 30%, Galician (gl) - 42%, Interlingua (ia) - 26%, Korean (ko) - 4%, Latvian (lv) - 3%, Marathi (mr) - 10%, Mongolian (mn) - 4%, Thai (th) - 7%, Ukrainian (uk) - 15%, Vietnamese (vi) - 16%
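To make the relative figures above concrete, the following Python sketch shows one way to read them. It assumes that a relative improvement means a proportional reduction in an error rate such as word error rate, which these notes do not state. The baseline value in the example is hypothetical and is not a measured result.

```python
def error_rate_after(baseline_error_rate, relative_improvement_pct):
    """Error rate after a relative improvement, assuming the percentage
    is a proportional reduction of the baseline error rate."""
    return baseline_error_rate * (1 - relative_improvement_pct / 100)


if __name__ == "__main__":
    # Hypothetical baseline: 20% error rate, with a 37% relative improvement.
    print(round(error_rate_after(20.0, 37), 2))
```

Under this reading, a 37% relative improvement on a hypothetical 20% baseline removes 37% of the errors. It does not subtract 37 percentage points.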

Fixes

GPU & CPU

  • Batch and Real-time transcription: Fixed a session failure when a custom dictionary’s first item is only a hyphen.

  • Real-time transcription: Fixed a failure to generate Final transcripts for numbers when punctuation is disabled.

Security fixes

GPU & CPU

  • A Software Bill of Materials (SBOM) is available for download from the corresponding release page in our Support Portal.

  • Patched libgstreamer to address CVE-2025-3887. Scanners are likely to still report it as vulnerable because the base version string is unchanged.

💡GPU Transcription and GPU Translation - Inference Containers

GPU Inference Containers are released in sync with the Real-time/Batch containers they support. Only rely on an Inference Container working with a Real-time/Batch container that has the same version number.

For full details and a guide to implementation, see GPU Transcription Inference Container and GPU Translation Inference Container.
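The version-matching rule above can be enforced in a deployment script before any containers are started. The sketch below is a minimal Python check that compares the tags of two image references. The image names are hypothetical placeholders, and the check assumes the image tag is exactly the release version number.

```python
def image_version(image_ref):
    """Return the tag of an image reference such as 'repo/name:13.4.0'."""
    _, sep, tag = image_ref.rpartition(":")
    # A '/' after the last ':' means the colon belonged to a registry port,
    # so the reference carries no tag at all.
    if not sep or "/" in tag:
        raise ValueError(f"no version tag in {image_ref!r}")
    return tag


def versions_match(worker_image, inference_image):
    """True only if both image references carry the same version tag."""
    return image_version(worker_image) == image_version(inference_image)


if __name__ == "__main__":
    # Hypothetical image names, used only to exercise the check.
    worker = "registry.example/realtime-container:13.4.0"
    inference = "registry.example/gpu-inference-container:13.4.0"
    print(versions_match(worker, inference))
```

Failing fast on a mismatch is a design choice for this sketch: it surfaces an unsupported pairing at deploy time instead of at the first transcription request.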