Just Jeff

Converting video streams to audio-only and removing silence

[Image: a hand-drawn black ink sketch of a film projector emitting a few musical notes, depicting the conversion of a motion picture to audio.]

One night, lying awake in bed, I wondered whether I could convert a favorite streamer's latest stream to audio only, to occupy my mind while keeping good sleep hygiene (no screen).

As a fun learning exercise, I set out the following day to see if I could accomplish this.

I was able to, using a CLI utility called yt-dlp [1]. Please see my disclaimer below:

yt-dlp for Educational Use Under Fair Use Policy

My use of yt-dlp, a tool for downloading videos from online platforms, is solely for educational purposes and falls under the scope of fair use. This use is strictly non-commercial and is intended to facilitate learning and academic exploration.

I comply with the terms of service of the websites from which I download content. The downloaded material is not distributed, shared, or used for any commercial purposes. My usage of yt-dlp adheres to copyright and intellectual property laws, and respects the principles of fair use in an educational context.

Now, where were we… oh yes!

I created a shell function, but you can also run the yt-dlp command inside it as a one-off. I use aria2c [2] as the external downloader to shorten the download time.

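# Download audio-only content using yt-dlp, with aria2c as the external downloader.
# Usage: download_audio <video_url>
download_audio() {
    if [ -z "$1" ]; then
        echo "Usage: download_audio <video_url>" >&2
        return 1
    fi

    # aria2c: -x 16 allows up to 16 connections per server, -s 16 splits the download into 16 pieces.
    # Audio_Only is a format ID, and format IDs can differ by site;
    # `yt-dlp -F <video_url>` lists the IDs available for a given URL.
    yt-dlp --external-downloader aria2c --external-downloader-args "-x 16 -s 16" -f Audio_Only "$1"
}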
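
If a source does not offer a format called Audio_Only, a fallback might look like the sketch below. It relies on yt-dlp's `/` syntax for format preference and its generic `bestaudio` selector. The function name and the `YTDLP_BIN` variable are my own hypothetical additions (the variable only exists so the command can be dry-run with `echo`); they are not part of yt-dlp.

```shell
# Sketch: prefer the site-specific "Audio_Only" format ID, and fall back to
# yt-dlp's generic "bestaudio" selector when that ID is not offered.
# YTDLP_BIN is a hypothetical override for dry runs, e.g. YTDLP_BIN=echo.
download_audio_fallback() {
    if [ -z "$1" ]; then
        echo "Usage: download_audio_fallback <video_url>" >&2
        return 1
    fi

    "${YTDLP_BIN:-yt-dlp}" \
        --external-downloader aria2c \
        --external-downloader-args "-x 16 -s 16" \
        -f "Audio_Only/bestaudio" \
        "$1"
}
```

Setting `YTDLP_BIN=echo` before calling the function prints the arguments instead of downloading anything, which is handy for checking the command first.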

The result was an m4a audio file. Perfect! I can AirDrop it to my iPhone and play it in Vox [3], my preferred application for playing local, offline audio files.

This particular stream series contains varying lengths of silence that I didn't want to listen to. Could I remove them? Yes!

I used SoX (Sound eXchange) [4], another command-line utility.

SoX is the Swiss Army Knife of sound processing utilities. It can convert audio files to other popular audio file types and also apply sound effects and filters during the conversion.

SoX can't handle m4a (maybe a licensing issue?), so I used ffmpeg [5] to convert from m4a to Opus [6], a totally open, royalty-free, highly versatile audio codec.

ffmpeg -i input.m4a output.opus
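
For more than one file, the same conversion could be wrapped in a loop. This is only a sketch: the function names are hypothetical, the explicit `-c:a libopus -b:a 64k` settings are my assumption (the one-liner above lets ffmpeg pick its defaults), and `-n` tells ffmpeg not to overwrite an existing output file.

```shell
# Derive the .opus output name from an .m4a input name.
opus_name() {
    printf '%s\n' "${1%.m4a}.opus"
}

# Convert every .m4a file in the current directory to Opus.
# The 64k bitrate is an assumed starting point, not a recommendation from the post.
convert_all_to_opus() {
    for src in ./*.m4a; do
        # When nothing matches, the glob stays literal; skip it.
        [ -e "$src" ] || continue
        ffmpeg -n -i "$src" -c:a libopus -b:a 64k "$(opus_name "$src")"
    done
}
```

Running `convert_all_to_opus` in a folder of downloads would leave each m4a in place and write a matching .opus file next to it.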

Removing silence is possible [7] with Audacity [8], the proudly open source and cross-platform application, but I wanted to use the CLI if I could.

I had no clue how to do this with SoX, but I was confident a certain LLM would.

Here's what it churned out:

sox "input.wav" "output.wav" silence 1 0.1 1% -1 1.0 1%

In a script it might look like this:

sox "$file_path" "$output_path" silence 1 0.1 "$volume_thresh"% -1 "$min_duration" "$volume_thresh"%

The silence volume threshold and minimum silence duration required trial-and-error before I landed on a satisfying result.
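
To make the trial-and-error less tedious, a small sweep can write one output per threshold so the results can be compared by ear. As I read the SoX manual, the first group (`1 0.1 T%`) trims leading silence until 0.1 seconds of sound above the threshold is heard, and the negative `-1` in the second group makes the effect restart, so silences of at least the given duration are removed throughout the file. The function name, the three threshold values, and the `SOX_BIN` override are hypothetical choices for this sketch.

```shell
# Sketch: run the silence effect at several thresholds on one input file.
# SOX_BIN is a hypothetical override for dry runs, e.g. SOX_BIN=echo.
# Variables are not declared local so the function stays POSIX sh friendly.
sweep_silence() {
    if [ "$#" -lt 1 ]; then
        echo "Usage: sweep_silence <input.wav> [min_duration_seconds]" >&2
        return 1
    fi

    input=$1
    min_duration=${2:-1.0}
    base=${input%.*}   # input name without its extension

    # Thresholds are percentages of full volume; these values are just examples.
    for thresh in 0.5 1 2; do
        "${SOX_BIN:-sox}" "$input" "${base}_t${thresh}.wav" \
            silence 1 0.1 "${thresh}%" -1 "$min_duration" "${thresh}%"
    done
}
```

For example, `sweep_silence talk.wav 1.5` would write talk_t0.5.wav, talk_t1.wav, and talk_t2.wav, each with silences of 1.5 seconds or longer removed at a different threshold.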

The final product was an Opus audio file, significantly smaller in file size and shorter in duration than the original m4a that yt-dlp produced.
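
To put numbers on "smaller" and "shorter", something like the following could work. It is a rough sketch with hypothetical function names: sizes come from `wc -c`, durations from ffprobe (which ships with ffmpeg), and the percentage is truncated to a whole number. It assumes the first file is not empty.

```shell
# Print a file's size in bytes. The arithmetic expansion strips the
# leading spaces that some wc implementations emit.
file_size_bytes() {
    echo $(( $(wc -c < "$1") ))
}

# Print how much smaller file $2 is than file $1, as a whole-number percentage.
percent_smaller() {
    before=$(file_size_bytes "$1")
    after=$(file_size_bytes "$2")
    echo $(( (before - after) * 100 / before ))
}

# Print a media file's duration in seconds using ffprobe.
duration_seconds() {
    ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}
```

Calling `percent_smaller original.m4a final.opus` and `duration_seconds` on each file gives a quick before-and-after comparison.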

As mentioned earlier, this was only an "I wonder if…" educational endeavor. It was fun to unravel a solution.

I hope you enjoyed! Happy New Year! 🥳

  1. https://github.com/yt-dlp/yt-dlp

  2. https://aria2.github.io

  3. https://vox.rocks/iphone-music-player

  4. https://sourceforge.net/projects/sox/

  5. https://ffmpeg.org

  6. https://opus-codec.org

  7. https://manual.audacityteam.org/man/truncate_silence.html

  8. https://www.audacityteam.org

#aria2c #audacity #audio #ffmpeg #opus #sox #video #vox #yt-dlp