Setup
All connections to the Streaming STT API start as a WebSocket request. After successful authorization, the client can start sending binary WebSocket messages containing audio data in one of the supported formats. As speech is detected, the API returns the recognized speech as text.
The appname query parameter identifies the API request. From the appid and apikey, the API determines each customer account's default settings, such as the transcription language and domain.
Note: Once your request is received, the REV-APP-ID and REV-API-KEY are sent to your email address so that you can test the API.
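As an illustration of how the connection settings travel as query parameters, here is a minimal Python sketch that assembles a WebSocket URL. The parameter names (apikey, appid, appname, src_lang, domain) are the ones this page lists. The endpoint in BASE_URL is a placeholder and not the real service address, and the function name build_stream_url is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace it with the URL from your onboarding details.
BASE_URL = "wss://stt.example.invalid/stream"


def build_stream_url(apikey, appid, appname, src_lang, domain, base_url=BASE_URL):
    """Return the WebSocket URL with the connection settings as query parameters."""
    params = {
        "apikey": apikey,
        "appid": appid,
        "appname": appname,
        "src_lang": src_lang,
        "domain": domain,
    }
    # Refuse to build a URL with an empty value, since a missing
    # parameter would only cause the service to close the connection.
    missing = [name for name, value in params.items() if not value]
    if missing:
        raise ValueError("missing query parameters: " + ", ".join(missing))
    # urlencode percent-encodes any characters that are unsafe in a query string.
    return base_url + "?" + urlencode(params)
```

A call such as build_stream_url("MY_KEY", "MY_APP", "my_app_name", "hi", "generic") returns the full URL to open. The argument values here are illustrative only; use the values supplied for your account.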
Authentication
The Streaming STT API uses the App Id and API key to authenticate requests. If these credentials are invalid or the corresponding query parameters are not present, the WebSocket connection is closed.
Note: The Reverie team shares the App Id and API Key at the start of the project. Assign them to the appid and apikey parameters, respectively.
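Because a connection with missing credentials is closed by the service, it can help to check for them before connecting. The sketch below is one possible approach, not part of the API. It assumes the credentials are kept in environment variables named REV_APP_ID and REV_API_KEY; those variable names and the function load_credentials are hypothetical.

```python
import os


def load_credentials(env=os.environ):
    """Read the App Id and API key from a mapping such as os.environ.

    The variable names REV_APP_ID and REV_API_KEY are assumptions
    made for this example, not names defined by the API.
    """
    appid = env.get("REV_APP_ID", "").strip()
    apikey = env.get("REV_API_KEY", "").strip()
    if not appid or not apikey:
        # Fail early instead of letting the service close the connection.
        raise RuntimeError("REV_APP_ID and REV_API_KEY must both be set")
    # The keys match the appid and apikey query parameters.
    return {"appid": appid, "apikey": apikey}
```

Keeping the credentials outside the source code also avoids committing them to version control.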
How Does the API Work?
To transcribe a continuous audio input:
1. Open the connection to the STT service, specifying apikey, appid, appname, domain, and src_lang.
2. If cause = Ready in the API response, the connection has been established successfully.
3. Write the speech data to the upstream and continuously receive the transcribed data.
Note: If final = false in the response, the audio is only partially transcribed and the service is still processing the input data.
4. Write --EOF-- to the upstream to stop the recognition process.
Note: If you do not write --EOF-- to the upstream, the STT service terminates the recognition process automatically.
The service auto-terminates the recognition process in the following scenarios:
- On connection timeout
- When the user remains silent for longer than the defined duration after recording starts
5. If final = true in the API response, the text received is the final transcript.
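The steps above can be sketched in Python without committing to a particular WebSocket library. This is a minimal illustration under several assumptions: responses arrive as JSON text, the transcript is carried in a field named text, the cause value is compared case-insensitively, and the --EOF-- marker is sent as those literal bytes. The names audio_frames, interpret, and final_transcript are hypothetical helpers, not part of the API.

```python
import json
from typing import Iterable, Iterator, Optional

# Assumption: the end-of-stream marker is sent as these literal bytes.
EOF_MARKER = b"--EOF--"


def audio_frames(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield the audio in fixed-size chunks, then the --EOF-- marker."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
    yield EOF_MARKER


def interpret(raw: str) -> dict:
    """Extract the fields this page describes from one JSON response."""
    msg = json.loads(raw)
    return {
        # cause = Ready signals that the connection is established.
        "ready": str(msg.get("cause", "")).lower() == "ready",
        # final = false means partial text; final = true means final text.
        "final": bool(msg.get("final", False)),
        # Assumption: the transcript is carried in a field named "text".
        "text": msg.get("text", ""),
    }


def final_transcript(raw_messages: Iterable[str]) -> Optional[str]:
    """Return the first final transcript seen after the Ready response."""
    ready = False
    for raw in raw_messages:
        info = interpret(raw)
        if info["ready"]:
            ready = True
            continue
        if ready and info["final"]:
            return info["text"]
    # The stream ended (for example, on auto-termination) without a final result.
    return None
```

In a real client, each item from audio_frames would be written to the open connection, and each incoming message would be passed through interpret so that partial results can be displayed until the final one arrives.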
Code Snippets for Integration
Initiating Speech Service
URL Elements
Successful Response - Establishing the connection
Partial Utterance - In-between an utterance
The Final Successful Response
Error Response