Setup

Every connection to the Streaming STT API starts as a WebSocket request. Once authorization succeeds, the client can begin sending binary WebSocket messages containing audio data in one of the supported formats. As speech is detected, the API returns the recognized speech as text.

The appname query parameter identifies the API being requested. The API determines default settings, such as the transcription language and domain, from the appid and apikey of each customer account.

Note: Once your request is received, the REV-APP-ID and REV-API-KEY are sent to your email address so that you can test the API.

Authentication

The Streaming STT API uses the App Id and API Key to authenticate requests. If these credentials are invalid or the corresponding query parameters are not present, the WebSocket connection is closed.

Note: The Reverie team shares the App Id and API Key at the start of the project. Assign them to appid and apikey, respectively.

How Does the API Work?

The process for transcribing continuous audio input:

  1. Open the connection with the STT service by defining apikey, appid, appname, domain, and src_lang.

  2. In the API response, if cause = ready, the connection has been established successfully.

  3. Write the speech data into the upstream and continuously receive the transcribed data.

    • Note: If final = false in the response, the audio has been only partially transcribed and the service is still processing the input data.

  4. Write --EOF-- into the upstream to stop the recognition process.

    • Note: If you do not write --EOF-- into the upstream, the STT service terminates the recognition process automatically.

    • The service auto-terminates the recognition process in the following scenarios:

      • On connection timeout

      • When the user remains silent for longer than the defined duration after recording starts

  5. In the API response, if final = true, the text received is the final transcript.
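
The five steps above can be sketched in Python without committing to a particular WebSocket client. In this minimal sketch, send and recv are hypothetical callables that wrap an already-open connection, and transcribe is an illustrative name that is not part of the API. Sending --EOF-- as a text message is also an assumption. For simplicity the sketch reads responses only after all audio has been written; a live client would more likely read while it is still sending.

```python
import json

EOF_MARKER = "--EOF--"  # written upstream to stop recognition (step 4)


def transcribe(send, recv, audio_chunks):
    """Run one recognition session over an already-open connection.

    send(message) and recv() are hypothetical wrappers around whatever
    WebSocket client is in use; recv() returns one JSON text message.
    """
    # Step 2: the first message should report cause "ready".
    first = json.loads(recv())
    if not first.get("success") or first.get("cause") != "ready":
        raise RuntimeError(f"connection not ready: {first.get('cause')}")

    # Step 3: write the audio upstream as binary messages.
    for chunk in audio_chunks:
        send(chunk)

    # Step 4: ask the service to stop recognition.
    # Assumption: the marker is sent as a text message.
    send(EOF_MARKER)

    # Steps 3 and 5: collect partial texts until final = true arrives.
    partials = []
    while True:
        message = json.loads(recv())
        if message.get("final"):
            return message, partials
        partials.append(message.get("text", ""))
```

The caller should still check the success field of the returned message, because an error response also arrives with final = true.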

Code Snippets for Integration

Initiating Speech Service

URL Elements

wss://revapi.reverieinc.com/stream?apikey=<your apikey>&appid=<your app id>&appname=stt_stream&src_lang=hi&domain=generic
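
As an illustration, the URL above can be assembled from its five query parameters with the Python standard library. The function name build_stream_url and the placeholder credentials are hypothetical; the default values are the ones shown in the example URL.

```python
from urllib.parse import urlencode

BASE_URL = "wss://revapi.reverieinc.com/stream"


def build_stream_url(apikey, appid, src_lang="hi", domain="generic",
                     appname="stt_stream"):
    """Return the WebSocket URL carrying the five query parameters."""
    params = {
        "apikey": apikey,
        "appid": appid,
        "appname": appname,
        "src_lang": src_lang,
        "domain": domain,
    }
    # urlencode percent-encodes any characters that are unsafe in a URL.
    return f"{BASE_URL}?{urlencode(params)}"


# Placeholder credentials; substitute the values issued for your account.
print(build_stream_url("MY_API_KEY", "MY_APP_ID"))
```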

Successful Response - Establishing the connection

{
  "id": "bb261bd789af4ba487a2667f8d942d4d7e0195fd1c8e4073",
  "success": true,
  "text": "",
  "display_text": "",
  "final": false,
  "confidence": 1,
  "cause": "ready"
}

Partial Utterance - In-between an utterance

{
  "id": "bb261bd789af4ba487a2667f8d942d4d7e0195fd1c8e4073",
  "success": true,
  "text": "आत्म निर्भर योजना इस योजना अथवा अभियान का उद्देश्य एक सौ तीस करोड़ भारतवासियो",
  "display_text": "आत्म निर्भर योजना इस योजना अथवा अभियान का उद्देश्य 130 करोड़ भारतवासियों",
  "final": false,
  "confidence": 0.797274,
  "cause": "partial"
}

The Final Successful Response

{
  "id": "bb261bd789af4ba487a2667f8d942d4d7e0195fd1c8e4073",
  "success": true,
  "text": "आत्म निर्भर योजना इस योजना अथवा अभियान का उद्देश्य एक सौ तीस करोड़ भारतवासियों को आत्मनिर्भर बनाना है ताकि देश का हर नागरिक संकट की इस घड़ी में कदम से कदम मिलाकर चल सके",
  "display_text": "आत्म निर्भर योजना इस योजना अथवा अभियान का उद्देश्य 130 करोड़ भारतवासियों को आत्मनिर्भर बनाना है",
  "final": true,
  "confidence": 0.743304,
  "cause": "EOF received"
}

Error Response

{
  "id": "2ab3a0b76c854953a022df742f6b3857a76494acd72e4489",
  "success": false,
  "text": "",
  "final": true,
  "confidence": 1,
  "cause": "no `domain` given"
}
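
The four response shapes above can be told apart from the success, cause, and final fields. The following is a minimal sketch; classify_response and its labels are illustrative names, and preferring display_text over text is an assumption made for the example rather than documented behaviour.

```python
import json


def classify_response(raw):
    """Map one JSON message from the service to a (label, payload) pair."""
    message = json.loads(raw)

    # Error responses carry success = false and explain why in "cause".
    if not message.get("success", False):
        return "error", message.get("cause", "")

    # The first successful message only confirms the connection.
    if message.get("cause") == "ready":
        return "ready", ""

    # Assumption: use display_text when it is present and non-empty,
    # otherwise fall back to text.
    transcript = message.get("display_text") or message.get("text", "")
    label = "final" if message.get("final") else "partial"
    return label, transcript
```

For the error sample above, this returns the label "error" together with the cause string, which a client can log or show to the user.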
