AI Video Transcription with OpenAI Whisper

A small Python application that transcribes the speech in video files to text using OpenAI Whisper. Place video files in the videos/ folder, run the script, and the transcripts are written to the transcripts/ folder.

Key Features

Video-to-text conversion - Speech-to-text transcription of the audio track
Multiple audio formats - Supports mp3, wav, m4a, and more
Translation support - Translates non-English speech to English
Subtitle generation - Creates .srt and .vtt subtitle files
Multilingual - Handles content in various languages
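
As an illustration of the subtitle feature, here is a minimal sketch of turning timed segments into .srt text. It assumes segments shaped like the entries in the "segments" list of a Whisper transcription result (dicts with "start", "end", and "text"). The helper names are hypothetical and are not taken from app.py.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from dicts with "start", "end", and "text" keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg["start"])
        end = format_srt_timestamp(seg["end"])
        # Each cue: running number, time range, text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

A .vtt file follows the same idea but starts with a WEBVTT header line and uses a period instead of a comma before the milliseconds.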

Use Cases

Transcribing voice recordings
Converting lectures to text
Creating subtitles for videos
Translating foreign language content
Accessibility improvements

Project Structure

├── app.py          # Main transcription script
├── videos/         # Input folder for video files
├── transcripts/    # Output folder for transcriptions
└── requirements.txt
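
The layout above suggests a simple loop: read each file in videos/, transcribe it, and write a text file to transcripts/. The following is a sketch of how such a script might look, not the actual contents of app.py. The extension list, the function names, and the "base" model choice are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to match your files.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv", ".avi", ".webm"}


def find_videos(folder):
    """Return the video files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def transcript_path(video, out_dir):
    """Map videos/talk.mp4 to <out_dir>/talk.txt."""
    return Path(out_dir) / (Path(video).stem + ".txt")


def transcribe_folder(in_dir="videos", out_dir="transcripts",
                      model_name="base", translate=False):
    """Transcribe every video in `in_dir` and write one .txt per video."""
    import whisper  # imported here so the helpers above work without it

    model = whisper.load_model(model_name)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # task="translate" asks Whisper for English output from non-English speech.
    task = "translate" if translate else "transcribe"
    for video in find_videos(in_dir):
        result = model.transcribe(str(video), task=task)
        text = result["text"].strip() + "\n"
        transcript_path(video, out_dir).write_text(text, encoding="utf-8")
```

Calling transcribe_folder() with no arguments would process videos/ into transcripts/. Passing translate=True would produce English text for non-English speech.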

Quick Start

# Create and activate a virtual environment (macOS/Linux; on Windows run venv\Scripts\activate instead)
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install openai-whisper

# Install FFmpeg (macOS)
brew install ffmpeg

# Place videos in the videos folder, then run
python app.py
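
If the script fails right away, the usual cause is a missing prerequisite. Here is a small optional preflight check, sketched under the assumption that FFmpeg should be on the PATH and that the openai-whisper package is importable as whisper. The function name is hypothetical.

```python
import importlib.util
import shutil


def missing_prerequisites(which=shutil.which,
                          find_spec=importlib.util.find_spec):
    """Return the names of prerequisites that cannot be found."""
    missing = []
    if which("ffmpeg") is None:  # FFmpeg executable not on PATH
        missing.append("ffmpeg")
    if find_spec("whisper") is None:  # openai-whisper installs as `whisper`
        missing.append("openai-whisper")
    return missing
```

An empty list means both prerequisites were found. Otherwise the list names what still needs to be installed.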

Technologies

Python 3
OpenAI Whisper (speech-to-text engine)
FFmpeg (audio extraction)
Virtual environment for dependency management
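
Whisper invokes FFmpeg itself when it loads a media file, so a separate extraction step is optional. If you want to extract the audio track yourself, for example to keep a copy, the following sketch builds an FFmpeg command for it. The function names and the 16 kHz mono WAV target are illustrative assumptions.

```python
import subprocess


def build_extract_command(video, audio):
    """Build an FFmpeg command that writes mono 16 kHz audio from a video."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", str(video),
        "-vn",           # drop the video stream
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        str(audio),
    ]


def extract_audio(video, audio):
    """Run FFmpeg; raises CalledProcessError if the command fails."""
    subprocess.run(build_extract_command(video, audio),
                   check=True, capture_output=True)
```

For example, extract_audio("videos/talk.mp4", "talk.wav") would write talk.wav in the current directory, provided FFmpeg is installed.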

This tool wraps Whisper's speech recognition in a single script, so transcribing a folder of videos takes one command.