Speech-to-Text (Batch)
An introduction to getting transcription data from pre-recorded audio files.
Getting Started
This guide will walk you through how to transcribe pre-recorded audio with the Reverie API. We provide two scenarios to try: transcribe a remote file and transcribe a local file.
Before you start, follow the steps in Get your API Credentials to obtain your API key.
Install Dependencies
Transcribe Audio from a Remote File
To transcribe pre-recorded audio using Reverie's API, follow these steps.
Results
To see the results from Reverie, run your application from the terminal. Your transcripts will appear in your shell.
Analyzing the Response
In this response we see:
job_id: A unique ID auto-assigned by the API for each request.
code: A message code that can be used to look up the nature of the response returned by the API.
message: A brief description of the response returned by the API.
result: An array of transcript objects, each including channel_number, transcript, and a list of words with start time, end time, and confidence. Please check the sample response.
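As a rough illustration of reading these fields, the sketch below summarizes a response that has already been parsed into a Python dict. The top-level names job_id, code, message, and result, and the per-item names channel_number and transcript, come from the description above. The word-level key names (words, word, start_time, end_time, confidence) and the sample values are assumptions made for illustration only; check the sample response for the real names.

```python
def summarize_response(response):
    """Return one header line plus one line per transcript object."""
    lines = [
        f"job_id={response.get('job_id')} "
        f"code={response.get('code')} "
        f"message={response.get('message')}"
    ]
    # `result` may be missing while a job is still queued or being processed.
    for item in response.get("result") or []:
        words = item.get("words") or []  # assumed key name
        scores = [w["confidence"] for w in words if "confidence" in w]
        mean = f"{sum(scores) / len(scores):.2f}" if scores else "n/a"
        lines.append(
            f"channel {item.get('channel_number')}: {item.get('transcript')!r} "
            f"({len(words)} words, mean confidence {mean})"
        )
    return lines


# Hypothetical response, shaped after the field descriptions above.
SAMPLE_RESPONSE = {
    "job_id": "abc-123",
    "code": "000",
    "message": "Transcript ready",
    "result": [
        {
            "channel_number": 1,
            "transcript": "hello world",
            "words": [
                {"word": "hello", "start_time": 0.0, "end_time": 0.4, "confidence": 0.9},
                {"word": "world", "start_time": 0.5, "end_time": 0.9, "confidence": 0.7},
            ],
        }
    ],
}

if __name__ == "__main__":
    for line in summarize_response(SAMPLE_RESPONSE):
        print(line)
```

Reading fields with `.get` keeps the helper usable for responses that carry only job_id, code, and message.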
Key Features
Real-time Transcription
Transcribe pre-recorded audio into text with high accuracy in real time, even from lower-quality inputs.
Personalized Speech Model
Customize recognition for domain-specific terms to boost accuracy of unique words or phrases.
Noise Resistance
Decode moderately noisy audio from various environments without extra noise cancellation.
Content Filtering
Filter out inappropriate content with an obscenity detector for clean text output.
Cloud-based Deployment
Scalable and accessible from anywhere for dynamic, distributed teams.
On-premise Deployment
Secure and customizable to integrate with your existing infrastructure.
Sample Code
Python
Access Python SDK samples for real-time speech transcription on GitHub
JavaScript
Explore JavaScript SDK samples for speech-to-text streaming on GitHub
GoLang
Find GoLang SDK samples for speech transcription on GitHub
FAQs
Supported Languages
The Speech-to-Text solution supports transcription in multiple languages, tailored for diverse regional and linguistic needs:
hi - Hindi
bn - Bengali
gu - Gujarati
kn - Kannada
ml - Malayalam
mr - Marathi
pa - Punjabi
ta - Tamil
te - Telugu
en - Indian English
as - Assamese
or - Odia
Supported Audio Formats
The Speech-to-Text solution supports various audio formats for flexible integration:
Audio Format | Description |
---|---|
16k_int16 | Default format: Signed 16-bit, 16 kHz sampling rate in WAV format |
16k_uint8 | Unsigned 8-bit, 16 kHz sampling rate in WAV format |
8k_int16 | Signed 16-bit, 8 kHz sampling rate in WAV format |
8k_uint8 | Unsigned 8-bit, 8 kHz sampling rate in WAV format |
opus_16k | Opus-encoded audio frames, 16 kHz sampling rate |
opus_8k | Opus-encoded audio frames, 8 kHz sampling rate |
ogg_opus | Opus-encoded audio frames in Ogg container |
16k_ulaw | µ-Law audio frames, 16 kHz sampling rate |
8k_ulaw | µ-Law audio frames, 8 kHz sampling rate |
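For PCM WAV files, the matching format name can usually be worked out from the file header. The sketch below uses Python's standard wave module to read the sampling rate and sample width, then maps them to the four WAV entries in the table. The helper name detect_wav_format is hypothetical and not part of any Reverie SDK, and the sketch does not cover the Opus or µ-Law formats.

```python
import wave

# (sampling rate in Hz, sample width in bytes) -> format name from the table.
# WAV stores 8-bit samples as unsigned and 16-bit samples as signed.
PCM_FORMATS = {
    (16000, 2): "16k_int16",
    (16000, 1): "16k_uint8",
    (8000, 2): "8k_int16",
    (8000, 1): "8k_uint8",
}


def detect_wav_format(source):
    """Return the format name for a PCM WAV file.

    `source` is a file path or a binary file object opened for reading.
    Raises ValueError if the rate/width pair is not in the table.
    """
    with wave.open(source, "rb") as wav:
        key = (wav.getframerate(), wav.getsampwidth())
    if key not in PCM_FORMATS:
        raise ValueError(f"unsupported WAV: {key[0]} Hz, {key[1] * 8}-bit samples")
    return PCM_FORMATS[key]
```

A call such as `detect_wav_format("recording.wav")` would then give the value to pass as the audio format, assuming the file is uncompressed PCM.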
API Messages
Code | Message |
---|---|
000 | Transcript ready |
001 | Invalid JOB ID |
002 | Invalid JOB ID |
003 | Your request is in the queue and will be processed shortly |
004 | Your request is being processed |
005 | Job failed. Please contact the developers |
999 | Unknown error |
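One way to act on these codes is to group them into three states and poll until the job leaves the pending state. The sketch below treats 000 as ready, 003 and 004 as pending, and every other code as a failure. fetch_status is a hypothetical caller-supplied function that returns the parsed status response for a job; no real endpoint or SDK call is assumed here. Whether the API returns code as a string or a number is also an assumption, so the value is normalized first.

```python
import time

READY_CODES = {"000"}
PENDING_CODES = {"003", "004"}


def classify(code):
    """Map an API message code to 'ready', 'pending', or 'failed'."""
    normalized = str(code).zfill(3)  # accepts "003" as well as 3
    if normalized in READY_CODES:
        return "ready"
    if normalized in PENDING_CODES:
        return "pending"
    return "failed"  # 001, 002, 005, 999, and anything unrecognized


def wait_for_transcript(fetch_status, job_id, interval=2.0, max_attempts=30,
                        sleep=time.sleep):
    """Poll fetch_status(job_id) until the transcript is ready.

    Returns the final response dict, raises RuntimeError on a failure code,
    and raises TimeoutError if the job is still pending after max_attempts.
    """
    for _ in range(max_attempts):
        response = fetch_status(job_id)
        state = classify(response.get("code"))
        if state == "ready":
            return response
        if state == "failed":
            raise RuntimeError(
                f"job {job_id} failed: [{response.get('code')}] "
                f"{response.get('message')}"
            )
        sleep(interval)  # still queued or processing; wait and retry
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} attempts")
```

Passing sleep as a parameter makes the loop easy to exercise in tests without real delays; the interval and attempt limit are arbitrary defaults and should be tuned to typical job durations.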