Speech-to-Text (Batch)

Getting Started

This guide walks you through transcribing pre-recorded audio with the Reverie API. It covers two scenarios: transcribing a remote file and transcribing a local file.

Before you start, follow the steps in the Get your API Credentials guide to obtain your API key.

Install Dependencies

npm i @reverieit/reverie-client

Transcribe Audio from a Remote Stream

To transcribe pre-recorded audio using the Reverie API, create a client with your credentials and call transcribeAudio as shown below.

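// The package name matches the one installed in the previous step.
const ReverieClient = require("@reverieit/reverie-client");

const reverieClient = new ReverieClient({
  apiKey: "YOUR-API-KEY",
  appId: "YOUR-APP-ID",
});

// Placeholder inputs: replace these with your own values.
const file = "YOUR-AUDIO-FILE"; // the audio input to transcribe
const lang = "en"; // a language code from Supported Languages
const subtitles = true; // assumed to be a boolean flag requesting subtitles

// In a CommonJS module, await is only valid inside an async function.
async function main() {
  const response = await reverieClient.transcribeAudio({
    audioFile: file,
    language: lang,
    subtitles: subtitles,
  });

  console.log("Response from API:", response);
}

main().catch((err) => console.error("Transcription failed:", err));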

Results

To see the results from Reverie, run your application from the terminal. The transcript is printed to your shell.

# Run your application using the file you created in the previous step
# (npm start requires a "start" script in your package.json)
# Example:
npm start

Analyzing the Response

{
  "job_id": "e21f356d-cbf9-4d62-a960-1e9da1805d19",
  "code": "000",
  "message": "Transcript ready.",
  "result": {
    "transcript": "Hello. Welcome to Reverie.",
    "original_transcript": "HELLO. WELCOME TO REVERIE.",
    "channel_number": 1,
    "words": [
      [
        {
          "conf": 0.991683,
          "end": 0.21,
          "start": 0.09,
          "word": "HELLO"
        },
        {
          "conf": 1.0,
          "end": 0.6,
          "start": 0.21,
          "word": "WELCOME"
        },
        {
          "conf": 0.99723,
          "end": 0.72,
          "start": 0.6,
          "word": "TO"
        },
        {
          "conf": 1.0,
          "end": 1.320315,
          "start": 0.72,
          "word": "REVERIE"
        }
      ]
    ],
    "subtitles": "1\n00:00:00,090 --> 00:00:06,900\nHELLO. WELCOME TO REVERIE.\n\n"
  }
}

In this response we see:

job_id: A unique identifier automatically assigned by the API to each request.
code: A message code that indicates the nature of the response returned by the API. See API Messages for the full list.
message: A brief description of the response returned by the API.
result: The transcription output, including transcript, original_transcript, channel_number, a list of words with the start time, end time, and confidence of each word, and subtitles. See the sample response above.
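
As a minimal sketch of reading these fields, the snippet below flattens the nested words list from the sample response and flags words whose confidence falls under a threshold. The helper name summarizeTranscript and the 0.995 threshold are illustrative assumptions and are not part of the Reverie client.

```javascript
// Trimmed copy of the sample response shown above.
const sampleResponse = {
  job_id: "e21f356d-cbf9-4d62-a960-1e9da1805d19",
  code: "000",
  message: "Transcript ready.",
  result: {
    transcript: "Hello. Welcome to Reverie.",
    words: [
      [
        { conf: 0.991683, end: 0.21, start: 0.09, word: "HELLO" },
        { conf: 1.0, end: 0.6, start: 0.21, word: "WELCOME" },
        { conf: 0.99723, end: 0.72, start: 0.6, word: "TO" },
        { conf: 1.0, end: 1.320315, start: 0.72, word: "REVERIE" },
      ],
    ],
  },
};

// Hypothetical helper: summarizes a response shaped like the sample.
function summarizeTranscript(response, minConfidence = 0.995) {
  const result = response.result || {};
  // "words" is a list of lists in the sample, so flatten one level.
  const words = (result.words || []).flat();
  const lowConfidence = words
    .filter((w) => w.conf < minConfidence)
    .map((w) => w.word);
  // The end time of the last word, in seconds.
  const speechEnd = words.length > 0 ? words[words.length - 1].end : 0;
  return {
    transcript: result.transcript || "",
    wordCount: words.length,
    lowConfidence,
    speechEnd,
  };
}

const summary = summarizeTranscript(sampleResponse);
console.log(summary);
```

A summary like this can help decide which words to review manually before the transcript is used downstream.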

Key Features

Real-time Transcription

Transcribe audio into text with high accuracy in real time, even from lower-quality inputs.

Personalized Speech Model

Customize recognition for domain-specific terms to boost accuracy of unique words or phrases.

Noise Resistance

Decode moderately noisy audio from various environments without extra noise cancellation.

Content Filtering

Filter out inappropriate content with an obscenity detector for clean text output.

Cloud-based Deployment

Scalable and accessible from anywhere for dynamic, distributed teams.

On-premise Deployment

Secure and customizable to integrate with your existing infrastructure.

Sample Code

Python

Access Python SDK samples for real-time speech transcription on GitHub

JavaScript

Explore JavaScript SDK samples for speech-to-text streaming on GitHub

GoLang

Find GoLang SDK samples for speech transcription on GitHub

FAQs

What is the accuracy of real-time transcription?

Can I customize the speech model for my industry?

Does it work in noisy environments?

How do I deploy the solution?

Which domains are supported?

Supported Languages

The Speech-to-Text solution supports transcription in multiple languages, tailored for diverse regional and linguistic needs:

hi - Hindi
bn - Bengali
gu - Gujarati
kn - Kannada
ml - Malayalam
mr - Marathi
pa - Punjabi
ta - Tamil
te - Telugu
en - Indian English
as - Assamese
or - Odia
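
One way to catch an unsupported language code before sending a request is a small lookup built from the list above. This is a sketch only: the helper name resolveLanguage is hypothetical, and the API may apply its own validation.

```javascript
// Language codes copied from the Supported Languages list above.
const SUPPORTED_LANGUAGES = new Map([
  ["hi", "Hindi"],
  ["bn", "Bengali"],
  ["gu", "Gujarati"],
  ["kn", "Kannada"],
  ["ml", "Malayalam"],
  ["mr", "Marathi"],
  ["pa", "Punjabi"],
  ["ta", "Tamil"],
  ["te", "Telugu"],
  ["en", "Indian English"],
  ["as", "Assamese"],
  ["or", "Odia"],
]);

// Hypothetical helper: normalizes a code and rejects unsupported ones early.
function resolveLanguage(code) {
  const normalized = String(code).trim().toLowerCase();
  if (!SUPPORTED_LANGUAGES.has(normalized)) {
    throw new Error(`Unsupported language code: ${code}`);
  }
  return normalized;
}

console.log(resolveLanguage("ta"), "=", SUPPORTED_LANGUAGES.get("ta"));
```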

Supported Audio Formats

The Speech-to-Text solution supports various audio formats for flexible integration:

Audio Format	Description
16k_int16	Default format: signed 16-bit, 16 kHz sampling rate in WAV format
16k_uint8	Unsigned 8-bit, 16 kHz sampling rate in WAV format
8k_int16	Signed 16-bit, 8 kHz sampling rate in WAV format
8k_uint8	Unsigned 8-bit, 8 kHz sampling rate in WAV format
opus_16k	Opus-encoded audio frames, 16 kHz sampling rate
opus_8k	Opus-encoded audio frames, 8 kHz sampling rate
ogg_opus	Opus-encoded audio frames in Ogg container
16k_ulaw	µ-Law audio frames, 16 kHz sampling rate
8k_ulaw	µ-Law audio frames, 8 kHz sampling rate
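
To illustrate how the four WAV entries in the table relate to a file's header, the sketch below reads the sample rate and bit depth from a PCM WAV header and derives the matching format name. It assumes the canonical layout in which the "fmt " chunk directly follows the RIFF header. The helpers detectWavFormat and buildWavHeader are hypothetical, and this guide does not show how a format name is passed to the client.

```javascript
// Hypothetical helper: maps a PCM WAV header to one of the WAV format names in the table.
function detectWavFormat(buffer) {
  const isCanonical =
    buffer.length >= 36 &&
    buffer.toString("ascii", 0, 4) === "RIFF" &&
    buffer.toString("ascii", 8, 12) === "WAVE" &&
    buffer.toString("ascii", 12, 16) === "fmt ";
  if (!isCanonical) {
    throw new Error("Not a canonical WAV header");
  }
  const audioFormat = buffer.readUInt16LE(20); // 1 means linear PCM
  const sampleRate = buffer.readUInt32LE(24);
  const bitsPerSample = buffer.readUInt16LE(34);
  if (audioFormat !== 1) {
    throw new Error("Only linear PCM is handled in this sketch");
  }
  const rate = { 16000: "16k", 8000: "8k" }[sampleRate];
  // In WAV files, 16-bit PCM samples are signed and 8-bit samples are unsigned.
  const depth = { 16: "int16", 8: "uint8" }[bitsPerSample];
  if (!rate || !depth) {
    throw new Error(`Unsupported WAV parameters: ${sampleRate} Hz, ${bitsPerSample}-bit`);
  }
  return `${rate}_${depth}`;
}

// Builds a 44-byte mono PCM header with no audio data, for demonstration only.
function buildWavHeader(sampleRate, bitsPerSample) {
  const header = Buffer.alloc(44);
  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36, 4); // RIFF chunk size when the data chunk is empty
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16); // size of the fmt chunk for PCM
  header.writeUInt16LE(1, 20); // linear PCM
  header.writeUInt16LE(1, 22); // one channel
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE((sampleRate * bitsPerSample) / 8, 28); // byte rate
  header.writeUInt16LE(bitsPerSample / 8, 32); // block align
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(0, 40); // zero bytes of audio data
  return header;
}

console.log(detectWavFormat(buildWavHeader(16000, 16)));
console.log(detectWavFormat(buildWavHeader(8000, 8)));
```

For a real file, the first 44 bytes could be read with fs.readFileSync and passed to detectWavFormat. Files with extra chunks before "fmt " would need a fuller parser.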

API Messages

Code	Message
000	Transcript ready
001	Invalid JOB ID
002	Invalid JOB ID
003	Your request is in the queue and will be processed shortly
004	Your request is being processed
005	Job failed. Please contact the developers
999	Unknown error
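
The codes above can be grouped by what a caller would typically do next. The sketch below is one possible grouping, not behavior defined by the API: the helper name classifyCode and the labels "done", "pending", and "failed" are assumptions made for illustration.

```javascript
// Codes and messages copied from the API Messages table above.
const API_MESSAGES = {
  "000": "Transcript ready",
  "001": "Invalid JOB ID",
  "002": "Invalid JOB ID",
  "003": "Your request is in the queue and will be processed shortly",
  "004": "Your request is being processed",
  "005": "Job failed. Please contact the developers",
  "999": "Unknown error",
};

// Hypothetical helper: maps a response code to a coarse state.
function classifyCode(code) {
  switch (code) {
    case "000":
      return "done"; // the transcript is available in the response
    case "003":
    case "004":
      return "pending"; // queued or still processing, so check again later
    default:
      return "failed"; // invalid job ID, failed job, or unknown error
  }
}

for (const code of Object.keys(API_MESSAGES)) {
  console.log(code, classifyCode(code), "-", API_MESSAGES[code]);
}
```

Treating unrecognized codes as "failed" is a conservative default that keeps a caller from waiting indefinitely on a response it does not understand.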
