AI Video Transcription with OpenAI Whisper
A small Python application that transcribes the speech in video files to text using OpenAI Whisper. Place video files in the videos folder, run the script, and collect the transcripts from the transcripts folder.
Key Features
- Video to text conversion - Accurate speech-to-text transcription
- Multiple audio formats - Supports mp3, wav, m4a and more
- Translation support - Translates non-English speech to English
- Subtitle generation - Creates .srt and .vtt subtitle files
- Multilingual - Handles content in various languages
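To illustrate the subtitle feature, here is a minimal sketch of how timed segments can be rendered as .srt text. It assumes segments shaped like the entries in Whisper's transcription result (dicts with "start", "end", and "text", times in seconds). The helper names srt_timestamp and segments_to_srt are hypothetical and are not taken from app.py, which may use a different approach.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each cue: sequence number, time range, then the caption text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

A .vtt file follows the same pattern, with a WEBVTT header line and a period instead of a comma before the milliseconds.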
Use Cases
- Transcribing voice recordings
- Converting lectures to text
- Creating subtitles for videos
- Translating foreign language content
- Accessibility improvements
Project Structure
├── app.py # Main transcription script
├── videos/ # Input folder for video files
├── transcripts/ # Output folder for transcriptions
└── requirements.txt
Quick Start
# Create and activate a virtual environment (macOS/Linux)
python -m venv venv
source venv/bin/activate
# On Windows, activate with: venv\Scripts\activate
# Install dependencies
pip install openai-whisper
# Install FFmpeg (macOS)
brew install ffmpeg
# Place videos in the videos folder, then run
python app.py
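For orientation, the following is a rough sketch of what a script like app.py could do when it runs. It is not the actual contents of app.py. The function names, the extension list, and the "base" model default are assumptions made for illustration. The Whisper calls used are whisper.load_model and model.transcribe, and the transcriber is passed in as a function so the folder loop can be exercised without loading a model.

```python
from pathlib import Path

# Assumed set of extensions; adjust to whatever the videos folder holds.
MEDIA_EXTENSIONS = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def whisper_transcriber(model_name="base", task="transcribe"):
    """Return a function that transcribes one file with Whisper.

    Passing task="translate" asks Whisper to produce English text.
    """
    import whisper  # imported lazily so the folder logic works without it

    model = whisper.load_model(model_name)
    return lambda path: model.transcribe(str(path), task=task)


def transcribe_folder(transcribe, videos_dir="videos", transcripts_dir="transcripts"):
    """Transcribe every media file in videos_dir and write one .txt per file."""
    out_dir = Path(transcripts_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for media in sorted(Path(videos_dir).iterdir()):
        # Skip subfolders and files that are not recognised media.
        if not media.is_file() or media.suffix.lower() not in MEDIA_EXTENSIONS:
            continue
        result = transcribe(media)  # expected to return a dict with a "text" key
        out_path = out_dir / f"{media.stem}.txt"
        out_path.write_text(result["text"].strip() + "\n", encoding="utf-8")
        written.append(out_path)
    return written
```

With Whisper installed, calling transcribe_folder(whisper_transcriber("base")) would process the videos folder, and whisper_transcriber("base", task="translate") would write English translations instead.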
Technologies
- Python 3
- OpenAI Whisper (speech-to-text engine)
- FFmpeg (audio extraction)
- Virtual environment for dependency management
This tool wraps Whisper's speech recognition in a single script, so transcribing a folder of videos takes one command.