Convert a Video to Subtitles with Whisper AI on Ubuntu 24.04 LTS (AMD GPU)
When you’re creating video content or you have some videos that just need subtitles, it can be a hassle to create them manually. The solution is Whisper AI by OpenAI. Whisper AI will create an .srt file for you, transcribed from the audio of the .mp4 or .mkv video file.
To get this working on Ubuntu 24.04 LTS with an AMD GPU, you need a Python virtual environment with the ROCm build of PyTorch from AMD. I explained that here.
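A quick way to confirm that an environment meets this requirement is to ask PyTorch itself. The sketch below is meant to be run with the virtual environment active. It relies on `torch.version.hip`, which is set on ROCm builds of PyTorch and is `None` on other builds; the function name `describe_torch_backend` is made up for this example.

```python
"""Report whether the active Python environment has a ROCm build of PyTorch."""
import importlib.util


def describe_torch_backend() -> str:
    """Return a one-line summary of the PyTorch build and GPU visibility."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch  # imported lazily so the script still runs without torch

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    if hip_version is None:
        return f"torch {torch.__version__} is not a ROCm build"

    # ROCm builds expose AMD GPUs through the torch.cuda API.
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} (HIP {hip_version}) sees no GPU"

    gpu_name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__} (HIP {hip_version}) sees {gpu_name}"


if __name__ == "__main__":
    print(describe_torch_backend())
```

If the output reports no GPU or a non-ROCm build, Whisper will most likely fall back to the CPU or fail when `--device cuda` is requested, so it is worth fixing the environment first.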
In order to run git commands and do other useful things, make sure you install the following tools.
Installation
- Open a terminal: press the Super (Windows) key, type ‘cmd’ and press Enter, or press Ctrl+Alt+T. Then go to the home directory of the logged-in user:
cd ~
- Install ffmpeg in case it is not installed yet:
sudo apt install ffmpeg -y
- Clone the Whisper AI repository:
git clone https://github.com/openai/whisper
- Change into the cloned whisper repository:
cd whisper
- Copy the venv folder from this article into the git repository:
cp -a ../ai-venv/venv .
- Activate the virtual environment:
source venv/bin/activate
- Install the Python requirements of whisper:
pip install -r requirements.txt
- Then install Whisper itself from the cloned repository:
pip install .
- Run the following command to transcribe a video:
whisper video.mp4 --model small --language en --output_format srt --output_dir ./subtitles --device cuda
Explanation:
video.mp4: This is your video file.
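--model small: Selects the model size. Available sizes include tiny, base, small, medium and large; larger models are more accurate but slower and need more GPU memory.
--language en: Sets the spoken language to English, which skips automatic language detection; change it to another language code, for example es for Spanish.
--task translate: Optional and not part of the command above. Add it to translate non-English audio into English subtitles.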
--output_format srt: Writes the result as an .srt subtitle file.
--output_dir ./subtitles: The directory where the .srt file is stored.
--device cuda: Runs the transcription on the GPU. The ROCm build of PyTorch exposes AMD GPUs under the cuda device name, so this value also applies to AMD hardware.
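If you have a whole folder of videos, the same command can be generated for each file. The following is a minimal sketch rather than part of Whisper: it only uses the flags explained above, and the function names, the `translate` switch and the script name `batch_subtitles.py` are assumptions made for this example. It expects the `whisper` command to be available, so run it inside the activated virtual environment.

```python
"""Run the whisper CLI over every video in a folder (illustrative sketch)."""
import subprocess
import sys
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".mkv"}


def build_whisper_command(video, model="small", language="en",
                          output_dir="./subtitles", device="cuda",
                          translate=False):
    """Return the whisper command line for one video as an argument list."""
    command = [
        "whisper", str(video),
        "--model", model,
        "--language", language,
        "--output_format", "srt",
        "--output_dir", str(output_dir),
        "--device", device,
    ]
    if translate:
        # Optional: English subtitles from non-English audio. Set `language`
        # to the language that is actually spoken in the video.
        command += ["--task", "translate"]
    return command


def find_videos(folder):
    """Return the .mp4 and .mkv files directly inside folder, sorted by name."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() in VIDEO_SUFFIXES)


def transcribe_folder(folder):
    """Run whisper once per video; stop at the first failure."""
    for video in find_videos(folder):
        # check=True raises CalledProcessError if whisper exits with an error.
        subprocess.run(build_whisper_command(video), check=True)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        transcribe_folder(sys.argv[1])
    else:
        print("usage: python batch_subtitles.py FOLDER")
```

Passing the arguments as a list instead of one shell string means file names with spaces need no extra quoting.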