[How To] Set up YouTube Downloader with metadata.

Post by CasaP »

In this wonderful era of censorship / memory holing, a few channels I had subscribed to for years were "removed" for various reasons (beyond the scope of this discussion).

Anyway, I decided to look into ways of PRESERVING some of these channels, ideally with specific playlists, and at the same time produce files that are usable with Plex, my go-to media buddy.

Eventually I found a handy script called youtube-dl. It's a Swiss Army Knife of download managers, not limited to YouTube, but this is what we will concentrate on here.

Tools required;
  • 1 - A folder structure (from trial & error)
  • 2 - youtube-dl - Does the heavy lifting of downloading
  • 3 - AtomicParsley - Handles Poster art / metadata embedding
  • 4 - A channel list, preferably with playlists rather than a full channel
  • 5 - A custom script
  • 6 - Clean Up - Seasons glorious Seasons
  • 7 - Launch / Run Script

Instructions;
1 - Desired Folder Structure
The benefit of a "How To" is often the fact that the writer has done all of the trial and error (well, most anyway).
I would suggest a folder structure as follows;

Code: Select all

/share/Video/YouTube
		/+Scripts
			/ChannelList/channel_list.txt
			/DownloadedList/downloaded.txt
		/Channels
Basically, you have a root folder YouTube which contains two sub-folders: +Scripts and Channels. +Scripts holds ChannelList (your list of subscriptions, as a text file) and DownloadedList (where an auto-generated list is kept of the videos already downloaded). Channels is where the downloaded videos end up.
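
If you would rather not create the folders by hand, the tree can be built in one go. This is only a sketch: make_yt_tree is a made-up helper name, and it assumes you want the two text files created empty up front.

```shell
#!/bin/sh
# Sketch: build the folder tree from this guide under a given root.
# make_yt_tree is a hypothetical helper name, not part of youtube-dl.
make_yt_tree() {
    root="$1"
    mkdir -p "$root/+Scripts/ChannelList" \
             "$root/+Scripts/DownloadedList" \
             "$root/Channels" || return 1
    # touch leaves existing files alone, so re-running is harmless.
    touch "$root/+Scripts/ChannelList/channel_list.txt" \
          "$root/+Scripts/DownloadedList/downloaded.txt"
}

# Example call, using the root from this guide:
# make_yt_tree /share/Video/YouTube
```

Passing the root as an argument means you can try it on a scratch folder first before touching /share.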

Now that that is set up, let's start to get the scripts;

2 - Install YouTube-DL
The easiest way is to use pip; its use is described in multiple threads and is beyond the scope of this how-to.

Code: Select all

sudo pip install --upgrade youtube_dl
This will pull the latest version of the script (it can be re-run to update the script also).

3 - Install AtomicParsley
AtomicParsley is essential for proper metadata handling; it is designed for embedding thumbnails etc.
I found the easiest way is to install it in the same directory as ffmpeg (which does the conversions), so you need to find where that is installed;

Code: Select all

which ffmpeg
This outputs the full path to the ffmpeg binary; the folder part of that path is where ffmpeg resides. Download AtomicParsley, unzip it and copy the AtomicParsley binary into this folder.

That should be that.
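
To double check the install, something along these lines should work. It is a rough sketch: ffmpeg_dir and check_tools are names I made up, and the copy command in the comments assumes the unzipped AtomicParsley binary is sitting in your current folder.

```shell
#!/bin/sh
# Print the folder that holds the ffmpeg binary (fails if ffmpeg is not on PATH).
ffmpeg_dir() {
    bin=$(command -v ffmpeg) || return 1
    dirname "$bin"
}

# Report which of the three tools are visible on PATH.
# Returns 0 when all are found, 1 when at least one is missing.
check_tools() {
    missing=0
    for tool in youtube-dl ffmpeg AtomicParsley; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example, assuming the unzipped binary is in the current folder:
# cp ./AtomicParsley "$(ffmpeg_dir)/" && chmod +x "$(ffmpeg_dir)/AtomicParsley"
# check_tools
```

If check_tools reports AtomicParsley as missing after the copy, the folder ffmpeg lives in is probably not the one your shell searches first.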

4 - A Channel List
We need to tell the script what we want to download. To do this we create a simple text file called channel_list.txt located in the previously created YouTube/+Scripts/ChannelList folder.

This file looks like this;

Code: Select all

################################################################################################
#### /share/Video/YouTube/AtomicParsley/AtomicParsley
#### /share/Video/YouTube/+Scripts/youtube-dl.sh
################################################################################################


################################################################################################
#### Still It
#### A channel dedicated to the craft of home distillation
#### 
####
################################################################################################

#Playlists;

#Still It - S02 - Grain to Glass
https://www.youtube.com/playlist?list=PLZEGsXzw_6OBjjNO0hiNum6k8T846Rkf9
#Still It - S03 - T500
https://www.youtube.com/playlist?list=PLZEGsXzw_6ODTihSL9Ibr5OOXBLKJf_80
#Still It - S04 - Base Malt Co-Lab
https://www.youtube.com/playlist?list=PLZEGsXzw_6OD_pIRIXHspN4AVGEhvbA6u


################################################################################################
################################################################################################
#### Brothers of the Serpent
#### A podcast in which two brothers explore the mysteries of the ancients, the ages, and the modern day 
#### 
#### https://www.youtube.com/user/TsurugiMasamune/about
################################################################################################


#S00 - Podcast
https://www.youtube.com/playlist?list=PLBLpCRUwQvQr9m2PIE_UH3MV_jB6CCaiI
As you can see, each channel is added by playlist URL, with a comment line above each URL noting the Season number I want that playlist to have. If a channel has no playlists, just use the URL of its video list. More on this later.

To grab these, go to the channel, open either the Playlists or Videos tab, and copy / paste the URL into your channel_list.txt.
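
With all those comment banners it is easy to lose track of what will actually be downloaded. Here is a small sketch that prints only the URL lines; list_urls is a made-up name. As far as I know youtube-dl treats batch-file lines starting with #, ; or ] as comments, so the filter skips those plus blank lines.

```shell
#!/bin/sh
# Print only the lines of a channel list that youtube-dl would try to download,
# skipping blank lines and comment lines (those starting with #, ; or ]).
list_urls() {
    grep -v -e '^[[:space:]]*$' -e '^[]#;]' "$1"
}

# Example: count the playlists currently in the list from this guide.
# list_urls /share/Video/YouTube/+Scripts/ChannelList/channel_list.txt | wc -l
```

It only reads the file, so it is safe to run at any time.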

That's it. I would suggest starting with one small channel first to make sure all is good. If you do the Brothers of the Serpent first, you will never get anything done. MIND = BLOWN.
My channel list now contains 100+ channels and is growing. You can cull the downloads on review later. Next we set up the actual script that does all of the automation magic.

5 - A Custom Script
Next we will write a script that will handle the downloading (from our channel_list.txt), formatting the file name, grabbing subtitles (if available) and embedding poster art as well as grabbing thumbnails of the video files.

My script looks like;

Code: Select all

#!/bin/sh

# Downloads land here, in one folder per uploader and playlist.
# Stop if the folder is missing rather than downloading somewhere else.
cd /share/Video/YouTube/Channels || exit 1

# Only videos from the last 72 months that are not yet listed in
# downloaded.txt are fetched; errors on single videos are skipped (-i).
youtube-dl --yes-playlist \
    --dateafter now-72months \
    --download-archive /share/Video/YouTube/+Scripts/DownloadedList/downloaded.txt \
    -f 'best[ext=mp4]/best' \
    -i \
    -o "%(uploader)s/%(playlist)s/%(uploader)s - S00E%(playlist_index)s - %(title)s [%(id)s].%(ext)s" \
    --add-metadata \
    --write-description \
    --write-thumbnail \
    --embed-thumbnail \
    --write-info-json \
    --sub-lang en \
    --sub-format srt/vtt \
    --write-auto-sub \
    --batch-file=/share/Video/YouTube/+Scripts/ChannelList/channel_list.txt
What happens is this;
  • Script "youtube-dl.sh" is run
  • changes the active directory to /share/Video/YouTube/Channels - where the downloads are stored
  • reads the playlist URLs from /share/Video/YouTube/+Scripts/ChannelList/channel_list.txt
  • only considers videos uploaded within the last 72 months
  • checks available videos against /share/Video/YouTube/+Scripts/DownloadedList/downloaded.txt to make sure they have not already been downloaded
  • if not already downloaded, it pulls the best mp4, falling back to the best available format if there is no mp4
Next metadata;
  • The script adds metadata to the video file
  • saves the video description and an info JSON file
  • gets & embeds the thumbnail (using AtomicParsley)
  • gets auto-generated subtitles (according to language choice, English here)
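So each download is saved in an Uploader/Playlist folder as;

Uploader - S00E<playlist index> - Title [video ID].EXT

e.g. files created (these examples show the playlist name in the file name and no video ID, so they do not exactly match the template in the script);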

Brothers of The Serpent - S00E203 - Podcast - Imposiblocks 101.mp4 - Best MP4 Video File
Brothers of The Serpent - S00E203 - Podcast - Imposiblocks 101.vtt - Subtitle file
Brothers of The Serpent - S00E203 - Podcast - Imposiblocks 101.json - Description of video
Brothers of The Serpent - S00E203 - Podcast - Imposiblocks 101.jpeg - Thumbnail (also embedded)
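
Before letting the script loose on a big playlist, it can help to see what names the template will produce. A rough sketch: preview_names and the YTDL variable are my own names, while the template and format string are copied from the script. It relies on youtube-dl's --get-filename (which simulates and prints the name rather than downloading) and --playlist-end options.

```shell
#!/bin/sh
# Print the file names youtube-dl would use for the first few playlist
# entries, without downloading anything.
# preview_names and YTDL are hypothetical names chosen for this sketch.
YTDL="${YTDL:-youtube-dl}"
TEMPLATE='%(uploader)s/%(playlist)s/%(uploader)s - S00E%(playlist_index)s - %(title)s [%(id)s].%(ext)s'

preview_names() {
    url="$1"
    count="${2:-3}"   # how many playlist entries to check, 3 by default
    "$YTDL" --yes-playlist \
            --playlist-end "$count" \
            --get-filename \
            -f 'best[ext=mp4]/best' \
            -o "$TEMPLATE" \
            "$url"
}

# Example, using one of the playlists from the channel list:
# preview_names "https://www.youtube.com/playlist?list=PLZEGsXzw_6OBjjNO0hiNum6k8T846Rkf9"
```

If the names are not what you want, adjust the template here first and copy it back into the script once you are happy.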

6 - Clean Up
You may notice that some playlists pull videos from other channels. You have a choice: manually rename the files to the channel you want, or add the other channel to your channel_list.txt file.
Also, you will have to manually rename the Season number according to the playlist for proper indexing in Plex. Look at the "Still It" channel above: each playlist is given a different Season number, but the script as of yet cannot do this automatically.
Until you manually do this, Plex will just add a massive amount of "Series", because every file is named S00 and lacks a proper season number. This will make sense once you start downloading!!


7 - Launch / Run Script
My script is located at /share/Video/YouTube/+Scripts/youtube-dl.sh. I have included this file location in the header of my channel_list.txt so I can find it easily.
To run the script, just run /share/Video/YouTube/+Scripts/youtube-dl.sh in Terminal (make it executable first with chmod +x, or run it with sh).
It will go through downloaded.txt to see whether a video has already been downloaded; if not, it will download away, format the file name and store it in the relevant channel directory.
That's it.
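
With 100+ channels a run can take a long time, so keeping a log is handy. A minimal sketch, assuming you are happy to keep the log next to the script; run_logged and the log file name are my own choices, not something youtube-dl provides.

```shell
#!/bin/sh
# Run a script and append its output, with start and end timestamps, to a log.
# run_logged is a hypothetical helper name.
run_logged() {
    script="$1"
    log="$2"
    echo "=== started  $(date) ===" >> "$log"
    sh "$script" >> "$log" 2>&1
    status=$?
    echo "=== finished $(date), exit status $status ===" >> "$log"
    return "$status"
}

# Example, using the script location from this guide:
# run_logged /share/Video/YouTube/+Scripts/youtube-dl.sh /share/Video/YouTube/+Scripts/youtube-dl.log
```

The log is appended to rather than overwritten, so you can look back over earlier runs.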


In summary, Enjoy.


Bedtime Reading

youtube-dl is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
for more details... https://github.com/ytdl-org/youtube-dl

AtomicParsley is a lightweight command line program for reading, parsing and setting metadata into MPEG-4 files, in particular, iTunes-style metadata.
for more information https://github.com/wez/atomicparsley