Use Cases
Voice Agent
This Python script demonstrates how to capture real-time audio from a microphone and convert it into text using the Reverie SDK’s Speech-to-Text (STT) service. The script runs asynchronously: audio is streamed to the service as it is captured, so transcription results arrive while the user is still speaking.
Install Dependencies
How Does It Work?
The main steps involved in the script are as follows:
- Environment Configuration:
The script loads API credentials (REVERIE_APP_ID and REVERIE_API_KEY) from environment variables using the dotenv package. These credentials are essential for interacting with the Reverie SDK’s ASR (Automatic Speech Recognition) service.
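As an illustration of this step, the sketch below reads the two variables and fails early if either is missing. The helper name load_credentials is hypothetical and not part of the Reverie SDK, and the dotenv import is treated as optional so the snippet also runs where python-dotenv is not installed.

```python
import os

try:
    # python-dotenv, if installed, copies values from a local .env file
    # into the process environment.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass


def load_credentials(env=None):
    """Return (app_id, api_key); raise if either variable is missing or empty.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    names = ("REVERIE_APP_ID", "REVERIE_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["REVERIE_APP_ID"], env["REVERIE_API_KEY"]
```

Checking for missing values up front gives a clear message at startup instead of an authentication failure later in the stream.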
- Real-Time Audio Capture:
The script uses the pyaudio library to stream audio from the user’s microphone in real time. The audio is captured as 16 kHz mono with 16-bit signed integer samples and streamed asynchronously to Reverie’s ASR service for transcription.
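A minimal sketch of such a capture loop with pyaudio is shown here. The chunk size of 1600 frames (100 ms at 16 kHz) and the names microphone_chunks and chunk_size_bytes are assumptions made for illustration; the original script may use different values.

```python
SAMPLE_RATE = 16000      # 16 kHz, as described above
CHANNELS = 1             # mono
SAMPLE_WIDTH = 2         # bytes per 16-bit sample
FRAMES_PER_CHUNK = 1600  # assumed: 100 ms of audio per read


def chunk_size_bytes(frames=FRAMES_PER_CHUNK):
    """Number of bytes in one chunk of the given frame count."""
    return frames * CHANNELS * SAMPLE_WIDTH


def microphone_chunks(max_chunks=None):
    """Yield raw PCM chunks from the default input device."""
    import pyaudio  # imported lazily so the constants work without a microphone

    pa = pyaudio.PyAudio()
    try:
        stream = pa.open(
            format=pyaudio.paInt16,
            channels=CHANNELS,
            rate=SAMPLE_RATE,
            input=True,
            frames_per_buffer=FRAMES_PER_CHUNK,
        )
        try:
            count = 0
            while max_chunks is None or count < max_chunks:
                # Do not raise on input overflow; a late read just drops frames.
                yield stream.read(FRAMES_PER_CHUNK, exception_on_overflow=False)
                count += 1
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        pa.terminate()
```

Each call to stream.read blocks for roughly 100 ms, so an asynchronous caller would typically run this generator in a worker thread rather than directly on the event loop.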
- Speech-to-Text (STT) Conversion:
The captured audio is sent in small chunks to the Reverie ASR system, which transcribes the speech into text. Results are processed as they arrive, so the transcript is updated in real time while audio is still being streamed.
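The chunking itself can be sketched independently of the SDK. In the example below, send is a placeholder for whatever call the Reverie SDK provides to accept audio bytes (its real name and signature are not shown in this section), and the 3200-byte default is an assumption corresponding to 100 ms of 16 kHz, 16-bit mono audio.

```python
def iter_chunks(pcm, chunk_bytes=3200):
    """Yield consecutive slices of pcm, each at most chunk_bytes long."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


def stream_chunks(pcm, send, chunk_bytes=3200):
    """Pass each chunk to send (a stand-in for the SDK call); return the chunk count."""
    count = 0
    for chunk in iter_chunks(pcm, chunk_bytes):
        send(chunk)
        count += 1
    return count
```

For example, stream_chunks(audio_bytes, buffer.append) collects the chunks into a list, which is a convenient way to check chunk boundaries before wiring in a real client.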
- Asynchronous Processing:
An asynchronous function handles the audio streaming and transcription. A callback function receives the incoming transcription results, which are accumulated until the service delivers a final result.
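The callback pattern described here might look like the following sketch. The result shape (a text string plus an is_final flag) and the fake_asr coroutine are assumptions standing in for the real streaming client, which is not shown in this section.

```python
import asyncio


class TranscriptCollector:
    """Accumulates partial results and records the final transcript."""

    def __init__(self):
        self.partials = []
        self.final_text = None
        self.done = asyncio.Event()

    def on_result(self, text, is_final):
        # Assumed callback signature: (text, is_final).
        if is_final:
            self.final_text = text
            self.done.set()
        else:
            self.partials.append(text)


async def fake_asr(chunks, callback):
    """Stand-in for a streaming ASR client: one partial per chunk, then a final."""
    seen = 0
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as a network round trip would
        seen += len(chunk)
        callback(f"partial after {seen} bytes", False)
    callback(f"final after {seen} bytes", True)


async def transcribe(chunks):
    collector = TranscriptCollector()
    await fake_asr(chunks, collector.on_result)
    await collector.done.wait()  # returns once the final result has arrived
    return collector
```

Calling asyncio.run(transcribe([b"ab", b"cde"])) returns a collector holding two partial results and one final result. Using an asyncio.Event lets the caller wait for the final result without polling.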
- Error Handling:
The script handles errors that may arise during microphone input, audio streaming, or ASR transcription, so that a failure in any stage is reported instead of crashing the program.
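One way to structure this kind of error handling is sketched below. The function name run_with_error_handling, the 30-second default timeout, and the mapping of exception types to messages are illustrative assumptions, not the original script's behaviour; pipeline is any coroutine function that returns the final transcript.

```python
import asyncio


async def run_with_error_handling(pipeline, timeout=30.0):
    """Await pipeline() and return (text, error_message)."""
    try:
        text = await asyncio.wait_for(pipeline(), timeout=timeout)
        return text, None
    except asyncio.TimeoutError:
        # Checked first: on newer Python versions this is also an OSError.
        return None, "timed out waiting for a final transcript"
    except OSError as exc:
        # Covers missing or busy audio devices and socket-level failures.
        return None, f"audio device or network error: {exc}"
    except Exception as exc:
        return None, f"transcription failed: {exc}"
```

Returning an error message rather than re-raising keeps the calling code simple; a production script might log the exception and retry instead.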
- Output:
Once the transcription is complete, the final text is printed. The transcript is in the source language (Hindi in this example) and is ready for further processing or translation if needed.