March 16th, 2022

Batch SaaS

2022.03.16 - Batch SaaS

  • Improved accuracy for all 31 language packs. The gains apply to both the Standard and Enhanced operating points
  • Biggest gains: Danish, Dutch, Norwegian, Lithuanian and Turkish
  • New Cantonese (yue) and Indonesian (id) language packs
  • Improved formatting of numeric entities such as dates, currencies and large numbers for the following languages: cmn, de, en, es, fr, hi, it, ja, pt, ru, yue. Additional metadata about these entities can be requested with the new enable_entities config parameter. For more information, please see the documentation.
  • Improved Speaker Diarization in scenarios where a single speaker was incorrectly labelled as two speakers
  • Improved custom dictionary functionality. Custom dictionary entries should now produce fewer false positives
  • Languages updated with additional punctuation marks:
      • Japanese (。 、)
      • Italian (. ? , !)
      • Portuguese (. ? , !)
      • Russian (. ? , !)
      • Mandarin (。 ? ! 、)
      • Hindi (। ? , !)
  • The JSON-v2 output version is now 2.7
  • A single word can now contain non-breaking spaces
  • Speaker Diarization sensitivity parameters (deprecated in March 2021) have been removed from the API
      • Jobs that include these parameters in the job config will now be rejected
      • The removed parameters are speaker_diarization_params, new_speaker_sensitivity and segment_boundary_sensitivity
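
Two of the changes above affect the job config: the new enable_entities parameter and the removal of the diarization sensitivity parameters. The following Python sketch shows one way a client might build a config that requests entity metadata and check an existing config for the removed parameters before submitting it. Only enable_entities and the three removed parameter names come from this release note. The surrounding config layout (the "type", "transcription_config" and "language" keys) and the helper names build_job_config and find_removed_params are assumptions made for illustration, so check them against the API reference before relying on them.

```python
import json

# Parameter names listed in this release note as removed from the API.
REMOVED_PARAMS = {
    "speaker_diarization_params",
    "new_speaker_sensitivity",
    "segment_boundary_sensitivity",
}


def find_removed_params(config, path=""):
    """Return the dotted paths of any removed parameters found in config."""
    found = []
    if isinstance(config, dict):
        for key, value in config.items():
            key_path = f"{path}.{key}" if path else key
            if key in REMOVED_PARAMS:
                found.append(key_path)
            # Keep searching nested values so deeply placed keys are caught.
            found.extend(find_removed_params(value, key_path))
    elif isinstance(config, list):
        for index, item in enumerate(config):
            found.extend(find_removed_params(item, f"{path}[{index}]"))
    return found


def build_job_config(language, enable_entities=True):
    """Build a job config dict (hypothetical layout, assumed for this sketch)."""
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "enable_entities": enable_entities,
        },
    }


if __name__ == "__main__":
    config = build_job_config("en")
    print(json.dumps(config, indent=2))

    # An older config that still carries a removed parameter.
    legacy = {
        "type": "transcription",
        "transcription_config": {
            "language": "en",
            "speaker_diarization_params": {"new_speaker_sensitivity": 0.6},
        },
    }
    for param_path in find_removed_params(legacy):
        print(f"Removed parameter present: {param_path}")
```

Running a check like this on the client side before submission avoids a round trip that would end in a rejected job. The search is recursive because the note does not state where in the config the removed parameters may appear.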