
Convert a Video to Subtitles with Whisper AI on Ubuntu 24.04 LTS (AMD GPU)

Creating subtitles by hand is a hassle, whether you produce your own video content or just have a few videos that need them. Whisper AI by OpenAI solves this: it transcribes the audio track of an .mp4 or .mkv video file and writes the result to an .srt file.

To get this working on Ubuntu 24.04 LTS with an AMD GPU, you need a Python virtual environment that contains the ROCm build of PyTorch. I explained that here.

To run the git commands and the other steps below, make sure the following tools are installed.

Installation

  • Open a terminal: press the Super (Windows) key, type ‘cmd’ and press Enter. Then go to the home directory of the logged-in user:
cd ~
  • Install ffmpeg in case it is not installed yet:
sudo apt install ffmpeg -y
  • Clone the Whisper AI repository:
git clone https://github.com/openai/whisper
  • Change the directory into the cloned whisper repository:
cd whisper
  • Copy the venv folder from this article into the git repository:
cp -a ../ai-venv/venv .
  • Activate the virtual environment:
source venv/bin/activate
  • Install the Python requirements of whisper:
pip install -r requirements.txt
  • Install Whisper itself from the cloned repository:
pip install .
  • Transcribe a video to an .srt file:
whisper video.mp4 --model small --language en --output_format srt --output_dir ./subtitles --device cuda
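
The last step handles a single file. For a folder full of videos, the same command can be wrapped in a loop. The following is a minimal sketch, not part of Whisper: the function name `transcribe_dir` and the `WHISPER_CMD` variable are made up for this example, and the flags are simply copied from the command above.

```shell
# Hypothetical helper (not part of Whisper): run the command from the last
# step on every .mp4 and .mkv file in a directory.
# WHISPER_CMD is an illustrative override; set it to "echo" to preview the
# commands without transcribing anything.
# The body is wrapped in ( ) so its variables stay out of your shell session.
transcribe_dir() (
    dir="${1:-.}"
    out_dir="${2:-./subtitles}"
    for video in "$dir"/*.mp4 "$dir"/*.mkv; do
        # A pattern without matches stays literal, so skip non-files.
        [ -f "$video" ] || continue
        "${WHISPER_CMD:-whisper}" "$video" --model small --language en \
            --output_format srt --output_dir "$out_dir" --device cuda
    done
)
```

With the virtual environment still active, calling `transcribe_dir ~/Videos ./subtitles` would process each video in turn; the `~/Videos` path is only an example.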

Explanation:

video.mp4: This is your video file.

--model small: Selects the model. The available sizes are tiny, base, small, medium and large; larger models are more accurate, but slower and more memory-hungry.

--language en: Sets the spoken language to English instead of letting Whisper detect it; you can change it to another language code, such as es (Spanish).

--task translate: Optional and not part of the command above. Add it to translate non-English audio into English subtitles.

--output_format srt: Writes the transcript as an .srt subtitle file.

--output_dir ./subtitles: The directory where the .srt file is stored.

--device cuda: Runs the transcription on the GPU. The ROCm build of PyTorch exposes AMD GPUs under the cuda device name, so this value is also the right one for an AMD card.
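
If you are not sure whether PyTorch can actually see the GPU, you can ask it before starting a long transcription. The sketch below assumes the virtual environment is active; the function name `pick_device` and the `PYTHON` override variable are hypothetical and only exist in this example. It relies on `torch.cuda.is_available()`, which also reports AMD GPUs when the ROCm build of PyTorch is installed.

```shell
# Hypothetical helper: print "cuda" when PyTorch in the active venv can see
# a GPU, otherwise "cpu". PYTHON is an illustrative override for the
# interpreter and defaults to python3.
pick_device() {
    if "${PYTHON:-python3}" -c \
        'import sys, torch; sys.exit(0 if torch.cuda.is_available() else 1)' \
        2>/dev/null; then
        echo cuda
    else
        # Also reached when torch is missing or the interpreter is not found.
        echo cpu
    fi
}
```

The result can be passed straight to the command, for example `whisper video.mp4 --model small --device "$(pick_device)"`. If this prints cpu on a machine with an AMD card, the virtual environment most likely does not contain the ROCm build of PyTorch.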