The Real-Time Subtitles Generator is a Python application that transcribes speech into text as it is spoken. It uses speech recognition to convert live microphone audio into subtitles, making spoken content accessible to people with hearing impairments and useful wherever speech needs to be read rather than heard.
Real-Time Audio Transcription:
Captures live audio from a microphone and converts it into text in real time.
Displays each recognized phrase as text as soon as it has been transcribed.
Noise Adaptation:
Automatically adjusts to ambient noise levels to improve transcription accuracy.
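A minimal sketch of how this calibration could look, assuming the project relies on the SpeechRecognition library's `Recognizer.adjust_for_ambient_noise` method. The helper name `calibrate` and its `seconds` parameter are hypothetical and only serve the illustration.

```python
def calibrate(recognizer, source, seconds=1.0):
    """Sample background noise from `source` and return the energy
    threshold before and after calibration.

    `recognizer` is expected to be a speech_recognition.Recognizer and
    `source` an already opened audio source such as sr.Microphone().
    """
    before = recognizer.energy_threshold
    # Listens for `seconds` and raises or lowers the threshold that
    # separates speech from background noise.
    recognizer.adjust_for_ambient_noise(source, duration=seconds)
    return before, recognizer.energy_threshold


# Usage with real hardware (needs SpeechRecognition and PyAudio installed):
#   import speech_recognition as sr
#   recognizer = sr.Recognizer()
#   with sr.Microphone() as source:
#       print(calibrate(recognizer, source))
```

Calibrating once before the listening loop starts is usually enough; a longer `seconds` value gives a steadier estimate at the cost of a slower start.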
Continuous Listening:
The program listens for speech continuously until manually stopped.
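One way to structure such a loop is sketched below, under the assumption that "manually stopped" means pressing Ctrl+C in the terminal. The function name `listen_until_stopped`, the `on_phrase` callback, and the 5-second phrase limit are hypothetical choices, not details taken from the project.

```python
def listen_until_stopped(recognizer, source, on_phrase, phrase_time_limit=5):
    """Capture phrases from `source` and pass each one to `on_phrase`
    until the user interrupts the program. Returns the phrase count."""
    handled = 0
    try:
        while True:
            # Blocks until a phrase is heard; phrase_time_limit caps how
            # long a single phrase may run before it is handed over.
            audio = recognizer.listen(source, phrase_time_limit=phrase_time_limit)
            on_phrase(audio)
            handled += 1
    except KeyboardInterrupt:
        # Ctrl+C ends the loop cleanly instead of printing a traceback.
        pass
    return handled


# Usage with real hardware (needs SpeechRecognition and PyAudio installed):
#   import speech_recognition as sr
#   recognizer = sr.Recognizer()
#   with sr.Microphone() as source:
#       listen_until_stopped(recognizer, source, lambda audio: print("phrase captured"))
```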
Multi-Language Support (Optional):
Transcribes other languages when given a language code (e.g., "en-US" for US English, "es-ES" for Spanish as spoken in Spain).
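As a rough illustration, the language code can be passed through the `language` parameter of `Recognizer.recognize_google`. The `LANGUAGE_CODES` table and the `transcribe` helper are hypothetical conveniences and contain only the two codes mentioned above.

```python
# Hypothetical lookup table so callers can write "spanish" instead of "es-ES".
LANGUAGE_CODES = {
    "english": "en-US",
    "spanish": "es-ES",
}


def transcribe(recognizer, audio, language="en-US"):
    """Transcribe `audio` in the requested language.

    `language` may be a friendly name from LANGUAGE_CODES or a language
    code such as "es-ES"; unknown values are passed through unchanged.
    """
    code = LANGUAGE_CODES.get(language.lower(), language)
    return recognizer.recognize_google(audio, language=code)
```

Calling `transcribe(recognizer, audio, "spanish")` would therefore request "es-ES", while a code that is not in the table, such as "fr-FR", is forwarded as given.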
Error Handling:
Handles unintelligible audio and network or API request failures without crashing.
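The sketch below shows one plausible shape for this handling, assuming the SpeechRecognition exceptions `UnknownValueError` (speech could not be understood) and `RequestError` (the recognition service could not be reached or rejected the request). The helper name `safe_transcribe` and the message texts are hypothetical.

```python
import sys


def safe_transcribe(recognizer, audio, language="en-US"):
    """Return the recognized text, or None if this phrase could not be
    transcribed. Problems are reported on stderr instead of raising."""
    # Imported inside the function so this helper can be defined even
    # where the SpeechRecognition package is not installed.
    import speech_recognition as sr

    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The audio contained no intelligible speech.
        print("[could not understand audio]", file=sys.stderr)
    except sr.RequestError as exc:
        # Network problem or a failed request to the recognition service.
        print(f"[speech service unavailable: {exc}]", file=sys.stderr)
    return None
```

Returning `None` rather than raising lets a listening loop skip the failed phrase and carry on with the next one.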
Tech Stack:
Language: Python
Libraries:
SpeechRecognition (imported as speech_recognition) for capturing audio and sending it for recognition.
PyAudio for interfacing with the microphone.
Audio Capture:
The application uses the microphone to capture real-time audio input.
Speech Recognition:
Sends the captured audio to the Google Web Speech API for transcription, which requires an internet connection.
Subtitle Display:
Outputs the recognized text immediately in the terminal.
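The display step is probably a plain `print` call in the project itself. The sketch below is an optional refinement using only the standard library: it wraps long phrases to the terminal width and flushes the stream so each subtitle appears at once. The helper `show_subtitle` and its parameters are hypothetical.

```python
import shutil
import sys
import textwrap


def show_subtitle(text, stream=None, width=None):
    """Write `text` to the terminal as one or more wrapped lines and
    return the lines that were written."""
    if stream is None:
        stream = sys.stdout
    if width is None:
        # Falls back to 80 columns when the terminal size is unknown.
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    lines = textwrap.wrap(text.strip(), width=width)
    for line in lines:
        stream.write(line + "\n")
    # Flush so the subtitle shows up immediately, even when output is buffered.
    stream.flush()
    return lines
```

Passing a different `stream`, such as an open file, would keep a transcript alongside the live display.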