Use Cases
Voice Search
This Python script demonstrates how to implement a real-time voice search system using Reverie SDK’s Speech-to-Text (STT) service. The script captures audio from the microphone, transcribes it into text, and hands the text on for downstream use, such as running a search query or triggering an action.
Install Dependencies
How Does It Work?
The main steps involved in the script are as follows:
- Environment Configuration:
The script loads essential credentials (REVERIE_APP_ID and REVERIE_API_KEY) from environment variables using the dotenv package. These credentials are required for accessing the Reverie SDK and its services.
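A minimal sketch of this step is shown here. The helper name `load_credentials` is hypothetical, and the use of python-dotenv is optional in this sketch: if the package is not installed, the variables are read from the process environment only.

```python
import os


def load_credentials(env=os.environ):
    """Return (app_id, api_key); raise if either variable is missing."""
    try:
        # Optional: copy values from a local .env file into os.environ.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # python-dotenv not installed; rely on the existing environment

    names = ("REVERIE_APP_ID", "REVERIE_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["REVERIE_APP_ID"], env["REVERIE_API_KEY"]
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the service.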
- Real-Time Audio Capture:
The script uses the pyaudio library to stream audio from the microphone in real time. The audio is captured as 16 kHz mono in 16-bit signed integer format.
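The capture settings above could be expressed with pyaudio roughly as follows. This is a sketch, not the script's actual code: the chunk size of 1,600 frames and the function names are assumptions chosen for illustration.

```python
RATE = 16_000         # samples per second (16 kHz)
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # bytes per sample (16-bit signed integer)
CHUNK_FRAMES = 1_600  # frames per read; assumed value, 100 ms at 16 kHz


def chunk_size_bytes(frames=CHUNK_FRAMES):
    """Number of bytes returned by one read of `frames` frames."""
    return frames * CHANNELS * SAMPLE_WIDTH


def microphone_chunks():
    """Yield raw PCM chunks from the default input device until closed."""
    import pyaudio  # imported lazily so the constants work without audio hardware

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=CHANNELS, rate=RATE,
                     input=True, frames_per_buffer=CHUNK_FRAMES)
    try:
        while True:
            # Do not raise if the consumer falls slightly behind the device.
            yield stream.read(CHUNK_FRAMES, exception_on_overflow=False)
    finally:
        # Always release the audio device, even on Ctrl+C or an error.
        stream.stop_stream()
        stream.close()
        pa.terminate()
```

Smaller chunks reduce latency at the cost of more frequent reads; 100 ms is a common middle ground.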
- Speech-to-Text (STT) Conversion:
The audio data is continuously streamed and sent to Reverie’s Automatic Speech Recognition (ASR) service for transcription. The system listens for spoken words and converts them into text in the specified source language (in this case, English).
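Because microphone reads are blocking while streaming clients are typically asynchronous, the two sides need a bridge. The sketch below shows one way to do this with a background thread and an `asyncio.Queue`. It does not call the Reverie SDK; `async_chunks` is a hypothetical helper, and how the chunks are actually handed to the ASR service depends on the SDK.

```python
import asyncio
import threading


async def async_chunks(blocking_iter, max_queue=50):
    """Expose a blocking iterator of audio chunks as an async generator."""
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue(maxsize=max_queue)
    done = object()  # sentinel marking the end of the stream

    def pump():
        # Runs in a worker thread; waits for each put to complete on the
        # event loop, so a full queue applies back-pressure to the reader.
        for chunk in blocking_iter:
            asyncio.run_coroutine_threadsafe(queue.put(chunk), loop).result()
        asyncio.run_coroutine_threadsafe(queue.put(done), loop).result()

    threading.Thread(target=pump, daemon=True).start()
    while True:
        item = await queue.get()
        if item is done:
            break
        yield item
```

The bounded queue keeps memory use flat if the network side is slower than the microphone.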
- Voice Search Use Case:
The stt_stream_async method is called with the “voice_search” domain. This suits voice search applications, where users speak their search queries and the system transcribes them into text for further action.
- Real-Time Transcription:
The transcription occurs in real-time, with each segment of speech being converted into text and displayed immediately. Once the system recognizes a complete, final transcription, the result is returned and printed.
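The partial-versus-final behaviour can be sketched as a small loop over incoming results. The result shape used here (a dict with `final`, `display_text`, and `text` keys) is an assumption for illustration and should be checked against the SDK's actual response objects; `final_transcript` is a hypothetical helper.

```python
def final_transcript(results, on_partial=print):
    """Report partial results as they arrive; return the first final text."""
    for result in results:
        # Assumed fields: prefer the formatted text, fall back to raw text.
        text = result.get("display_text") or result.get("text", "")
        if result.get("final"):
            return text
        if text:
            on_partial(text)
    return None  # stream ended without a final result
```

For example, feeding it `[{"text": "weather"}, {"text": "weather in", "final": False}, {"display_text": "Weather in Pune", "final": True}]` reports the two partial strings and returns `"Weather in Pune"`.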
- Error Handling:
The script includes error handling to manage potential issues during the audio capture, streaming, and transcription process, ensuring the system continues functioning smoothly in case of interruptions.
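One possible pattern for this kind of resilience is to retry a failed session with exponential backoff. The sketch below is illustrative and not necessarily what the script does; `run_with_retries` and its parameters are hypothetical. It retries on `OSError`, which covers audio-device errors raised by pyaudio as well as the built-in `ConnectionError` and `TimeoutError`.

```python
import time


def run_with_retries(session, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call session(); on OSError, wait and retry up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return session()
        except OSError as exc:
            if attempt == attempts:
                raise  # out of retries; let the caller see the error
            delay = base_delay * 2 ** (attempt - 1)  # 0.5 s, 1 s, 2 s, ...
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay}s")
            sleep(delay)
```

`KeyboardInterrupt` is not an `OSError`, so pressing Ctrl+C still stops the script immediately.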
- Output:
The final transcribed text is printed to the console, which could then be used for various purposes, such as initiating a search query, performing a task, or further processing based on the transcription result.
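As an illustration of using the transcript to initiate a search, the sketch below turns the text into a URL-encoded query. The helper `build_search_url` and the `https://example.com/search` endpoint are placeholders, not part of the SDK.

```python
from urllib.parse import urlencode


def build_search_url(transcript, base_url="https://example.com/search"):
    """Normalize a transcript and return a search URL, or None if empty."""
    # Collapse repeated whitespace and drop trailing sentence punctuation.
    query = " ".join(transcript.split()).rstrip(".?!")
    if not query:
        return None
    return f"{base_url}?{urlencode({'q': query})}"
```

Calling `build_search_url("  weather in   Bengaluru today. ")` returns `https://example.com/search?q=weather+in+Bengaluru+today`.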