May 1st, 2024

Batch Appliance

4.2.0 - Batch Appliance

New

  • 14 new languages: Bashkir, Basque, Belarusian, Esperanto, Estonian, Galician, Interlingua, Marathi, Mongolian, Tamil, Thai, Uyghur, Vietnamese, and Welsh

  • The JSON-v2 output version is now 2.8. Specific changes are:

      • Additional language pack information has been added to the metadata section of the transcription results. It gives more detail about the properties of the language being used, such as writing direction and word delimiter.

      • The correct attachment direction for punctuation (e.g. before or after a space) is now recorded in a new attaches_to field.
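To illustrate how a client might use the new fields, here is a minimal Python sketch that rebuilds display text from a transcript. Only the attaches_to field name comes from these notes. The key names language_pack_info, word_delimiter and writing_direction, the values "previous", "next", "both" and "none", and the sample data are assumptions made for illustration, so check them against real 2.8 output before relying on them.

```python
# Hypothetical JSON-v2 excerpt. Everything except the "attaches_to" field name
# is an assumed structure used only for this illustration.
SAMPLE = {
    "format": "2.8",
    "metadata": {
        "language_pack_info": {
            "writing_direction": "left-to-right",
            "word_delimiter": " ",
        }
    },
    "results": [
        {"type": "word", "alternatives": [{"content": "Hello"}]},
        {"type": "punctuation", "attaches_to": "previous",
         "alternatives": [{"content": ","}]},
        {"type": "word", "alternatives": [{"content": "world"}]},
        {"type": "punctuation", "attaches_to": "previous",
         "alternatives": [{"content": "."}]},
    ],
}


def render_transcript(transcript):
    """Join result items, using the delimiter and punctuation attachment hints."""
    info = transcript.get("metadata", {}).get("language_pack_info", {})
    delimiter = info.get("word_delimiter", " ")

    pieces = []
    glue_next = False  # True when the previous item attaches to what follows it
    for item in transcript.get("results", []):
        content = item["alternatives"][0]["content"]
        if item.get("type") == "punctuation":
            attaches = item.get("attaches_to", "none")
        else:
            attaches = "none"
        glue_prev = attaches in ("previous", "both")
        # Insert the word delimiter only when neither neighbour asks to be joined.
        if pieces and not glue_prev and not glue_next:
            pieces.append(delimiter)
        pieces.append(content)
        glue_next = attaches in ("next", "both")
    return "".join(pieces)


print(render_transcript(SAMPLE))
```

Reading the delimiter from the metadata, rather than hard-coding a space, is what lets the same routine handle languages that do not separate words with spaces.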

Improvements

  • Improved accuracy for 20 languages: Latvian (lv), Swedish (sv), Hungarian (hu), Portuguese (pt), Polish (pl), Mandarin Chinese (cmn), Arabic (ar), Dutch (nl), Slovak (sk), Bulgarian (bg), Romanian (ro), Slovenian (sl), Lithuanian (lt), Croatian (hr), Malay (ms), Catalan (ca), Czech (cs), Danish (da), Greek (el), Turkish (tr)

  • Improved formatting of numeric entities such as dates, currencies and large numbers for Swedish (sv), Norwegian (no), and Dutch (nl).

Fixes

  • Fix for accurately handling "p" as "pence" when transcribing currency in English (en).

  • Fix for handling small-denominator fractions in Italian (it) so that they are no longer converted to similar-sounding English words, e.g. "un terzo" is now converted to "1/3".

Known Limitations

The following are known issues in this release:

| Issue ID | Summary | Detailed Description and Possible Workarounds |
| --- | --- | --- |
| REQ-1409 | Proteus HCL with <unk> causes out of memory error | A Custom Dictionary list that contains the word <unk> causes the worker to crash. |
| REQ-7549 | Memory leak affecting gRPC | There is a small memory leak in the gRPC Python server (see https://github.com/grpc/grpc/issues/5913). |
| REQ-10160 | Advanced punctuation for Spanish (es) does not contain inverted marks | Inverted marks [ ¿ ¡ ] are not currently available for Spanish advanced punctuation. |
| REQ-10627 | Double full stops when an acronym is at the end of the sentence | If there is an acronym at the end of the sentence, a double full stop is output, for example: "team G.B.." |
| REQ-10634 | Putting "-" as an item in the additional vocab configuration causes the container to fail | Do not enter a "-" on its own in Custom Dictionary, either as an additional vocab item or in the sounds_like property. Hyphens are still supported when entered as part of phrases or words. |
| REQ-14402 | Running very large numbers of short jobs (less than 10 seconds) offline may cause some of the jobs to be rejected | If you encounter this issue, please ensure licensing is in offline mode when running the appliance offline. |
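
Two of the limitations above, REQ-10634 and REQ-10627, can be handled on the client side. The Python sketch below is one possible approach and not an official workaround. It assumes additional vocab entries are dictionaries with a "content" key and an optional "sounds_like" list, and both function names are hypothetical.

```python
import re


def clean_additional_vocab(items):
    """Drop entries that are a lone "-" and strip lone "-" values from sounds_like.

    Assumes each item looks like {"content": str, "sounds_like": [str, ...]},
    with "sounds_like" being optional.
    """
    cleaned = []
    for item in items:
        if item.get("content", "").strip() == "-":
            continue  # REQ-10634: a bare hyphen must not be submitted
        entry = dict(item)  # copy so the caller's data is left untouched
        if "sounds_like" in entry:
            kept = [s for s in entry["sounds_like"] if s.strip() != "-"]
            if kept:
                entry["sounds_like"] = kept
            else:
                del entry["sounds_like"]
        cleaned.append(entry)
    return cleaned


def collapse_double_full_stop(text):
    """Replace exactly two consecutive full stops with one (REQ-10627).

    Runs of three or more dots, such as an ellipsis, are left unchanged.
    """
    return re.sub(r"(?<!\.)\.\.(?!\.)", ".", text)


vocab = [
    {"content": "-"},
    {"content": "well-known"},
    {"content": "gnocchi", "sounds_like": ["-", "nyohki"]},
]
print(clean_additional_vocab(vocab))
print(collapse_double_full_stop("team G.B.."))
```

Hyphens inside words or phrases, such as "well-known", pass through untouched, which matches the note that they remain supported.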