Convert a Video to Subtitles with Whisper AI on Ubuntu 24.04 LTS (AMD GPU)

When you’re creating video content or you have some videos that just need subtitles, it can be a hassle to create them manually. The solution is Whisper AI by OpenAI. Whisper AI will create an .srt file for you, transcribed from the audio of the .mp4 or .mkv video file.
To get this working on Ubuntu 24.04 LTS with an AMD GPU, you need a Python virtual environment with the ROCm version of Torch by AMD. I explained that here.
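Once that virtual environment is active, a quick check can confirm that Torch sees the GPU. The sketch below is only an illustration: pick_device is a hypothetical helper name, and it relies on the ROCm version of Torch reporting AMD GPUs through the torch.cuda API.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when Torch reports a usable GPU, otherwise "cpu"."""
    # Torch may not be importable outside the virtual environment.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # ROCm builds of Torch expose AMD GPUs through the torch.cuda API.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints cpu inside the virtual environment, the ROCm version of Torch is probably not set up correctly yet.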
In order to run git commands and do other useful things, make sure you install the following tools.
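To see which of these tools are already present, a small check along these lines may help. It is a minimal sketch: missing_tools is a hypothetical helper, and the default list of git and ffmpeg simply reflects the tools this article uses.

```python
import shutil


def missing_tools(tools=("git", "ffmpeg")):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("git and ffmpeg are available")
```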
Installation
- Open a terminal by pressing the Super (Windows) key, typing ‘terminal’ and pressing Enter, or by pressing Ctrl+Alt+T. Go to the home directory of the logged-in user:
cd ~
- Install ffmpeg in case it is not installed yet:
sudo apt install ffmpeg -y
- Clone the Whisper AI repository:
git clone https://github.com/openai/whisper
- Change the directory into the cloned whisper repository:
cd whisper
- Copy the venv folder from the virtual environment article mentioned above into the git repository:
cp -a ../ai-venv/venv .
- Activate the virtual environment:
source venv/bin/activate
- Install the Python requirements of whisper:
pip install -r requirements.txt
- Then install whisper itself into the virtual environment:
pip install .
- Run the following command to create the subtitles:
whisper video.mp4 --model small --language en --output_format srt --output_dir ./subtitles --device cuda
Explanation:
video.mp4: This is your video file.
--model: Selects the model to use. There are several: tiny, base, small, medium, large.
--language en: Tells Whisper the audio is in English instead of letting it detect the language; you can change it to another language, like es (Spanish).
--output_format srt: Writes the result as an .srt file.
--task translate: Optional and not used in the command above. Translates non-English audio to English subtitles.
--output_dir ./subtitles: Choose the output directory where you want to store the .srt file.
--device cuda: This parameter makes sure the GPU is used during transcribing. The ROCm version of Torch exposes AMD GPUs under the cuda device name.
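If you have a whole folder of videos, the same command can be run from a short Python script. This is a rough sketch and not part of Whisper: build_whisper_command and transcribe_folder are hypothetical helper names, and the defaults simply mirror the command above. It assumes the whisper command is available in the active virtual environment.

```python
import subprocess
from pathlib import Path


def build_whisper_command(video, model="small", language="en",
                          output_dir="./subtitles", device="cuda"):
    """Build the argument list for the same whisper command used above."""
    return [
        "whisper", str(video),
        "--model", model,
        "--language", language,
        "--output_format", "srt",
        "--output_dir", str(output_dir),
        "--device", device,
    ]


def transcribe_folder(folder, **options):
    """Run whisper once for every .mp4 or .mkv file in the folder."""
    videos = sorted(
        path for path in Path(folder).iterdir()
        if path.suffix.lower() in {".mp4", ".mkv"}
    )
    for video in videos:
        # check=True stops the loop if whisper exits with an error.
        subprocess.run(build_whisper_command(video, **options), check=True)
    return videos
```

Calling transcribe_folder("videos") would then run the command once per video, where videos is a placeholder folder name. Extra options such as model="medium" are passed through to the command.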