May 1st, 2024
Batch Appliance
14 new languages: Bashkir, Basque, Belarusian, Esperanto, Estonian, Galician, Interlingua, Marathi, Mongolian, Tamil, Thai, Uyghur, Vietnamese, and Welsh
The JSON-v2 output version is now 2.8. Specific changes are:
Additional language pack information has been added to the metadata section of the transcription results. There is now more detailed information about properties of the language being used, such as writing direction and word delimiter.
We now also record the correct attachment direction for punctuation (e.g. before or after a space) in a new attaches_to field.
Improved accuracy for 20 languages: Latvian (lv), Swedish (sv), Hungarian (hu), Portuguese (pt), Polish (pl), Mandarin Chinese (cmn), Arabic (ar), Dutch (nl), Slovak (sk), Bulgarian (bg), Romanian (ro), Slovenian (sl), Lithuanian (lt), Croatian (hr), Malay (ms), Catalan (ca), Czech (cs), Danish (da), Greek (el), Turkish (tr)
Improved formatting of numeric entities such as dates, currencies and large numbers for Swedish (sv), Norwegian (no), and Dutch (nl).
Fix for accurately handling "p" as "pence" when transcribing currency in English (en).
Fix for handling small-denominator fractions in Italian (it) and not converting to similar English homonyms, e.g. "un terzo" being converted to "1/3".
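As an illustration of the JSON-v2 2.8 additions described above, the following minimal sketch rebuilds display text from a transcript using the language pack's word delimiter and the punctuation attachment direction. The sample payload is hypothetical. Only the attaches_to field name comes from these notes; the key names language_pack_info, word_delimiter and writing_direction, and the attaches_to values "previous", "next", "both" and "none", are assumptions made for this example and should be checked against the actual output.

```python
# Hypothetical sample shaped like a JSON-v2 transcript. Key names and
# attaches_to values are assumptions for illustration, not a schema.
SAMPLE = {
    "format": "2.8",
    "metadata": {
        "language_pack_info": {
            "word_delimiter": " ",
            "writing_direction": "left-to-right",
        }
    },
    "results": [
        {"type": "word", "alternatives": [{"content": "Hello"}]},
        {"type": "punctuation", "attaches_to": "previous",
         "alternatives": [{"content": ","}]},
        {"type": "word", "alternatives": [{"content": "world"}]},
        {"type": "punctuation", "attaches_to": "previous",
         "alternatives": [{"content": "."}]},
    ],
}


def render_transcript(transcript):
    """Join result items, honouring the word delimiter and attaches_to."""
    info = transcript.get("metadata", {}).get("language_pack_info", {})
    # Fall back to a space if the (assumed) delimiter key is absent.
    delimiter = info.get("word_delimiter", " ")

    pieces = []
    glue_next = False  # True when the previous item attaches to this one
    for item in transcript.get("results", []):
        alternatives = item.get("alternatives") or []
        if not alternatives:
            continue
        content = alternatives[0]["content"]

        # Only punctuation carries an attachment direction in this sketch.
        attaches_to = "none"
        if item.get("type") == "punctuation":
            attaches_to = item.get("attaches_to", "none")

        glue_prev = attaches_to in ("previous", "both")
        # Insert the delimiter unless either neighbour asks to be joined.
        if pieces and not glue_prev and not glue_next:
            pieces.append(delimiter)
        pieces.append(content)
        glue_next = attaches_to in ("next", "both")
    return "".join(pieces)


print(render_transcript(SAMPLE))
```

Reading the delimiter from the metadata rather than hard-coding a space means the same function could serve languages that do not separate words with spaces, provided the language pack reports an empty delimiter for them.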
The following are known issues in this release: