I recently watched a movie on Netflix with scenes in multiple languages: English, Korean, French, and Italian. During the foreign language scenes, there was no translation, just the name of the language spoken, like “[Korean]”. How disappointing…
In a nerdy fit of revenge I decided to fix this myself. So, I obtained an .mp4 video file of the movie and went to work. The tech I’m about to describe uses AI to listen to your movie’s audio, translate it from almost any language, and create subtitles. You could also use these tools for other tasks such as generating lyrics for music.
The tools involved are a combination of ffmpeg and mlx-whisper – a version of OpenAI’s Whisper model optimized to take advantage of Apple Silicon chips. Generating subtitles for the hour-and-a-half movie I mentioned took less than 5 minutes on my Apple M2 Max MacBook Pro with 32 GB of memory. I asked ChatGPT what makes mlx-whisper faster on Apple Silicon chips and this is what it said:
- Unified Memory Architecture (UMA): Apple Silicon chips use a unified memory architecture, allowing the CPU, GPU, and Neural Engine to access the same memory pool. This reduces latency and improves data transfer speeds, which are crucial for machine learning tasks.
- Metal Performance Shaders (MPS): MLX leverages Metal Performance Shaders, Apple’s framework optimized for machine learning on Apple Silicon. This ensures efficient execution of operations compared to traditional frameworks like PyTorch with MPS backend.
- Specialized Machine Learning Accelerators: The Apple Silicon chips include dedicated machine learning accelerators that provide hardware-level support for common algorithms and models, significantly boosting performance for tasks like transcription in Whisper.
- Batched Decoding: Implementations like Lightning Whisper MLX utilize features like batched decoding, which improves throughput by processing multiple inputs simultaneously.
- Neural Engine: The Neural Engine on Apple Silicon is designed for AI tasks, offering high-speed performance for operations related to speech recognition and transcription.
- Optimized Operations: MLX uses custom-implemented operations that are optimized for Apple Silicon, outperforming traditional CUDA and other GPU-based setups in some benchmarks.
What you’ll need
- A modern Mac using Apple Silicon
- The Terminal app
This is how you get ffmpeg and mlx-whisper on your Mac.
- Homebrew
- https://brew.sh/
- On the web page, you can copy the install command for your terminal
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Follow the resulting instructions displayed in the terminal to add brew to your PATH. These were mine, specific to my user name on the machine; copy yours from the terminal
echo >> /Users/bubba/.bash_profile
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> /Users/bubba/.bash_profile
eval "$(/opt/homebrew/bin/brew shellenv)"
- Give it a quick test: type
brew
and hit return to see if it works
- FFmpeg
brew install ffmpeg
- Give it a quick test by typing
ffmpeg
and hit return
- Python
- https://www.python.org/
- Download and run the package installer
- Pip
- Download the script from https://bootstrap.pypa.io/get-pip.py into a folder you can run the terminal from. You can also right-click the link and save it
python3 get-pip.py
- Give it a quick test by typing
pip
and hit return
- MLX-Whisper
pip install mlx-whisper
- Give it a quick test by typing
mlx_whisper
and hit return
- The Whisper model – a roughly 3 GB speech-to-text model
pip install huggingface_hub hf_transfer
export HF_HUB_ENABLE_HF_TRANSFER=1
huggingface-cli download --local-dir whisper-large-v3-mlx mlx-community/whisper-large-v3-mlx
- Run this in the folder where the video will reside, so the model downloads there
Now that ffmpeg and mlx_whisper are installed, along with the Whisper model, let’s assume you have a video to subtitle, called input.mp4.
To create an external subtitle file in the .srt format (--task translate translates the speech into English; --condition-on-previous-text False helps keep the model from getting stuck repeating a phrase):
mlx_whisper input.mp4 --task translate --model whisper-large-v3-mlx --output-format srt --verbose False --condition-on-previous-text False
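If you have several files to translate, the same invocation is easy to script from Python. The helper below is a hypothetical convenience of my own (not part of mlx-whisper); it just assembles the argument list from the command above so you can pass it to subprocess.run in a loop:

```python
import subprocess

def whisper_cmd(video, model="whisper-large-v3-mlx"):
    """Build the mlx_whisper command line used above for one video file."""
    return [
        "mlx_whisper", video,
        "--task", "translate",
        "--model", model,
        "--output-format", "srt",
        "--verbose", "False",
        "--condition-on-previous-text", "False",
    ]

for movie in ["input.mp4"]:
    cmd = whisper_cmd(movie)
    print(" ".join(cmd))
    # Uncomment to actually run the transcription:
    # subprocess.run(cmd, check=True)
```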
You can open the .srt file with a text editor to take a look and make manual edits if desired. Then you can either burn the subtitles into the video, or add them as a track that can be turned on and off when viewing.
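One edit you may want beyond fixing wording: if the subtitles come out consistently early or late, you can shift every cue by a fixed offset. The sketch below (my own helper, not part of any of the tools above) finds the SRT timestamps, which use the HH:MM:SS,mmm format, with a regular expression and offsets them:

```python
import re

# Matches one SRT timestamp, e.g. 00:01:23,456
TS = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

def shift_srt(text, offset_ms):
    """Shift every SRT timestamp in text by offset_ms milliseconds (may be negative)."""
    def bump(m):
        h, mnt, s, ms = (int(g) for g in m.groups())
        total = max(0, ((h * 60 + mnt) * 60 + s) * 1000 + ms + offset_ms)
        h, rem = divmod(total, 3_600_000)
        mnt, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02}:{mnt:02}:{s:02},{ms:03}"
    return TS.sub(bump, text)

cue = "1\n00:00:01,500 --> 00:00:03,000\nHello\n"
print(shift_srt(cue, 250))  # cue now starts at 00:00:01,750
```

Read input.srt, pass its contents through shift_srt, and write the result back out to apply the offset to the whole file.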
To burn the subtitles into the video (this re-encodes the video stream):
ffmpeg -i input.mp4 -vf subtitles=input.srt -c:a copy output.mp4
To add the subtitles as an optional track instead (no re-encoding, so this is much faster):
ffmpeg -i input.mp4 -i input.srt -c copy -c:s mov_text output.mp4
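The two ffmpeg variants differ only in their arguments, so if you are batch-processing files you can pick between them with a flag. The helper below is again hypothetical glue code of my own, building either argument list from above:

```python
def ffmpeg_cmd(video, srt, out, burn_in=False):
    """Build one of the two ffmpeg commands above: burn-in re-encodes the
    video with the subtitles filter; otherwise the SRT is stream-copied in
    as a mov_text subtitle track."""
    if burn_in:
        return ["ffmpeg", "-i", video, "-vf", f"subtitles={srt}",
                "-c:a", "copy", out]
    return ["ffmpeg", "-i", video, "-i", srt,
            "-c", "copy", "-c:s", "mov_text", out]

print(" ".join(ffmpeg_cmd("input.mp4", "input.srt", "output.mp4")))
```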