The Speech-to-Text solution supports various audio formats for flexible integration:

| Audio Format | Description |
| --- | --- |
| 16k_int16 | Default format: signed 16-bit, 16 kHz sampling rate, WAV format |
| 16k_uint8 | Unsigned 8-bit, 16 kHz sampling rate, WAV format |
| 8k_int16 | Signed 16-bit, 8 kHz sampling rate, WAV format |
| 8k_uint8 | Unsigned 8-bit, 8 kHz sampling rate, WAV format |
| opus_16k | Opus-encoded audio frames, 16 kHz sampling rate |
| opus_8k | Opus-encoded audio frames, 8 kHz sampling rate |
| ogg_opus | Opus-encoded audio frames in an Ogg container |
| 16k_ulaw | µ-law audio frames, 16 kHz sampling rate |
| 8k_ulaw | µ-law audio frames, 8 kHz sampling rate |
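
As an illustration of how a client might choose among the four uncompressed WAV identifiers listed above, the following sketch reads a WAV header with Python's standard `wave` module and maps the sampling rate and sample width to a format name. The helper names `pick_pcm_format` and `make_silent_wav` are hypothetical and not part of the Speech-to-Text API, and the sketch assumes that the identifier only needs to reflect the sampling rate and sample width. It does not cover the Opus or µ-law formats, which the `wave` module does not decode.

```python
import io
import wave

# Format identifiers copied from the list above. Keys are
# (sampling rate in Hz, sample width in bytes) for uncompressed PCM WAV.
# In WAV files, 8-bit PCM samples are unsigned and 16-bit samples are signed.
PCM_FORMATS = {
    (16000, 2): "16k_int16",
    (16000, 1): "16k_uint8",
    (8000, 2): "8k_int16",
    (8000, 1): "8k_uint8",
}


def pick_pcm_format(wav_file):
    """Return the format identifier for a PCM WAV file (hypothetical helper).

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        key = (wav.getframerate(), wav.getsampwidth())
    if key not in PCM_FORMATS:
        raise ValueError(
            f"unsupported PCM layout: {key[0]} Hz, {key[1]} byte(s) per sample"
        )
    return PCM_FORMATS[key]


def make_silent_wav(rate, width, n_frames=160):
    """Build a short mono WAV of zero bytes in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (n_frames * width))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(pick_pcm_format(make_silent_wav(16000, 2)))  # 16k_int16
    print(pick_pcm_format(make_silent_wav(8000, 1)))   # 8k_uint8
```

Audio recorded at any other rate or bit depth (for example 44.1 kHz) raises `ValueError` here; in practice it would have to be resampled or re-encoded to one of the listed formats before submission.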