SoundsDownloadScript.ps1 + genRSS.ps1

Updated: 18-Nov-2024 22:21 GMT

These instructions are very crude and messy. Sorry for the windbag word vomit. I hope they help, though. If someone wants to convert this file into a markdown README.md for github, be my guest. The scripts are a little tedious to set up the first time, but once you get them working they're pretty reliable and easy to maintain.

I recommend you check the repo on github for the latest version of this README before continuing.

If you find a bug in the scripts or something is incorrect or incomplete in the documentation, feel free to open an issue on github.


ABOUT:

SoundsDownloadScript.ps1 and genRSS.ps1 are Powershell scripts that can work together to download episodes from the BBC Sounds website and then publish a podcast feed.

SoundsDownloadScript.ps1 can work without genRSS.ps1 if you just want to download the audio files, but genRSS.ps1 won't really work with audio files tagged with other tools because they won't be tagged properly to build a podcast feed (see this note).


GETTING STARTED:

Package (latest from github)

Prerequisites

Optional

I believe there are Linux versions of all of these packages, but I've only ever used this on Windows. The script may work on Powershell for Linux, but it will probably take a lot of tweaking. I'm sure there's another language that's more appropriate. If you're up for the challenge, feel free to use the script logic as a guide and go for it!


HOW IT WORKS:

When called, SoundsDownloadScript.ps1 checks the program page for the BBC program you are requesting. It gets the name of the most recent episode and checks whether it has downloaded it already. If it hasn't already been downloaded, it calls yt-dlp to download it and then gets the metadata and cover art from the episode's BBC Sounds page. The script calls kid3 to set id3 tags on the audio file. If configured, the script will then clean up old episodes it has downloaded. After that, it can upload the file to a remote location using rclone. If the episode has already been downloaded, the script exits with no action.

genRSS.ps1 uses profile config files to create an RSS file. It scans your download directory of the program and uses kid3 to pull the id3 tags of each episode. It checks the date and time of the most recent file and then checks the date and time of the RSS file to decide whether it needs to update the RSS. If needed, it uses the tags to build an RSS file for a podcast feed. It does not append; it rewrites the whole file each time. It uses rclone to upload the RSS file to a remote location, if configured. If the uploaded feed is publicly accessible, you can paste its url into a podcast app to subscribe.
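The "does the RSS need updating?" decision can be sketched roughly like this. This is not the code from genRSS.ps1, just a minimal illustration of comparing timestamps. The function name Test-RssNeedsUpdate and its parameters are made up for the example, and treating a missing feed as "needs update" is my assumption.

```powershell
# Hypothetical helper (not part of genRSS.ps1): returns $true when the feed should be rebuilt.
function Test-RssNeedsUpdate {
    param(
        [Parameter(Mandatory)] [string] $MediaDirectory,
        [Parameter(Mandatory)] [string] $RssFile
    )

    # Assumption: if there is no feed yet, it has to be built.
    if (-not (Test-Path -LiteralPath $RssFile)) { return $true }

    $rss = Get-Item -LiteralPath $RssFile

    # Newest file in the download directory, ignoring the feed itself.
    $newest = Get-ChildItem -LiteralPath $MediaDirectory -File |
        Where-Object { $_.FullName -ne $rss.FullName } |
        Sort-Object LastWriteTimeUtc -Descending |
        Select-Object -First 1

    # Nothing downloaded means nothing new to publish.
    if ($null -eq $newest) { return $false }

    # Rebuild only when an episode is newer than the feed.
    return ($newest.LastWriteTimeUtc -gt $rss.LastWriteTimeUtc)
}
```

Calling it with the download directory and the path to the RSS file returns $true or $false, which is enough to decide whether to rewrite the whole feed.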


INSTALLATION:

SoundsDownloadScript.ps1

  1. Copy SoundsDownloadScript.ps1 to a directory.
  2. Unpack ffmpeg to a directory (recommend a subdirectory inside the directory SoundsDownloadScript.ps1 is in).
  3. Unpack kid3 to a directory (recommend a subdirectory inside the directory SoundsDownloadScript.ps1 is in).
  4. Copy yt-dlp.exe to a directory (recommend inside the directory SoundsDownloadScript.ps1 is in).
  5. If using rclone, unpack rclone to a directory (recommend a subdirectory inside the directory SoundsDownloadScript.ps1 is in).
  6. If using OpenVPN, install it using the default options (should be everything except OpenSSL Utilities). Information and support can be found on OpenVPN's website. Note: This must be the Community edition, not OpenVPN Connect.
  7. Edit SoundsDownloadScript.ps1 before using and set the following variables:
  8. If using rclone to upload the files to a remote location, configure it by setting up remotes: rclone.exe config
    You'll need to know where the rclone configuration file is saved. You can run rclone.exe config file to get that location. On Windows, it's usually stored in the user's AppData folder. You'll probably want to move or copy it to the same directory that rclone is in, especially if you're going to run the script as a different user. When running the scripts, you'll need to specify the location by setting the $rcloneConfig variable even if the file is in the default location. You can have multiple remotes in the same config file, or you can have a different config file for each remote. Either option works.
  9. If using OpenVPN, you'll need to download or create .ovpn config files. This will be different for each VPN provider. I use one called Private Internet Access (PIA), which offers multiple UK servers and makes pre-made .ovpn files available for download. Other providers are available. In the OpenVPN config directory, you'll also need to create a text file with your VPN username in the first line and your password in the second line. Then, in each .ovpn file you'll need to add:
    auth-user-pass "C:\\Program Files\\OpenVPN\\config\\[YourPasswordFile].txt"
    This will let OpenVPN connect without having to enter the credentials every time. Note the double backslashes (\\) in the path.
  10. If using rclone, edit SoundsDownloadScript.ps1 and configure the script blocks to format the rclone.exe command line parameters to your needs. Script blocks must start with $remote_ and must be named the same as the specified remote config in order to be run. For example, if you have an rclone remote called poop, you should have a script block named $remote_poop. This is how the script will know which rclone command to use. You will need to be a little familiar with Powershell. Something like:
    $remote_poop = {
        & $rcloneExe sync $SaveDir $rcloneSyncDir --create-empty-src-dirs --progress --config $rcloneConfig -v $rcloneLoggingArgs
    }
    Your rclone commands should include all of the following parameters and variables:
    Different providers will need different rclone commands and options. Sometimes you'll need to include additional things like headers, depending on the remote. Read the rclone documentation for the remote you're using and be ready to troubleshoot. The rclone support forums are very helpful.
    If you are using rclone.exe copy or copyto, then you will need to call the Confirm-DownloadedMediaFile function at the top of your script block like below:
    $remote_poop2 = {
        Confirm-DownloadedMediaFile
        & $rcloneExe copyto $MediaFile $rcloneSyncDir --check-first --metadata --config $rcloneConfig --progress -v --dump headers $rcloneLoggingArgs
    }
    Calling Confirm-DownloadedMediaFile will exit the script block if there was no file downloaded. Otherwise if rclone runs and tries to copy a file that doesn't exist, it will throw a fit. It is not needed when using sync. Note: the way the script is set up, rclone will not delete files from a remote unless the sync option is used.
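If the $remote_ naming rule seems like magic, here is a rough sketch of how a script could find and run a script block by name. This is only an illustration of the naming convention, not the actual lookup code in SoundsDownloadScript.ps1. The function Invoke-RemoteBlocks and the remote names example and backup are hypothetical, and the stand-in blocks just return a string instead of calling rclone.exe.

```powershell
# Stand-in script blocks; real ones would call rclone.exe like the ones above.
$remote_example = { "would run rclone for 'example'" }
$remote_backup  = { "would run rclone for 'backup'" }

# Hypothetical dispatcher: runs the $remote_<name> block for each requested remote.
function Invoke-RemoteBlocks {
    param([string[]] $RemoteNames)

    foreach ($name in $RemoteNames) {
        # Look the variable up by its constructed name instead of hard-coding it.
        $var = Get-Variable -Name "remote_$name" -ErrorAction SilentlyContinue
        if ($var -and $var.Value -is [scriptblock]) {
            & $var.Value
        } else {
            Write-Warning "No script block named `$remote_$name; skipping."
        }
    }
}

# 'missing' has no matching block, so it only produces a warning.
Invoke-RemoteBlocks -RemoteNames 'example', 'missing', 'backup'
```

The point is that a remote without a matching $remote_ script block simply never runs, which is why the names have to line up exactly.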

genRSS.ps1 (if using)

Note: If you're using the files from a release, the animal name in the first line of genRSS should match the animal name in the first line of SoundsDownloadScript. This will ensure compatibility (the big issue is metatags in the media files). The main branch on github will not have animal name indicators. If you're updating from the main branch, you should update both files at the same time to avoid issues.

  1. Copy genRSS.ps1 to a directory (recommend inside the directory SoundsDownloadScript.ps1 is in).
  2. Edit genRSS.ps1 before use by setting the following variables:
  3. Copy the ProfileTemplate file to the directory that genRSS.ps1 is in. You should rename it to something descriptive.
  4. Edit the options in the copied file. Paths with a backslash (\) need to be escaped by using two (\\). Values may be surrounded by single, double, or no quotes.
  5. Profit! (actually, don't profit because I want this project to stay off the BBC's radar)

HOW TO RUN:

The scripts can be called manually from a Powershell console, but I use the Windows Task Scheduler unless I'm doing testing or downloading a one-off episode. Each show has its own task. For most weekly shows, I have it check for new episodes twice a day. Sometimes shows will do out-of-cycle specials around the holidays and this will catch those. For daily shows, I have it check around the time the show is done airing on the BBC schedule and then keep checking every 20 to 30 minutes. Caution: if you set the frequency to run too often, I've noticed that it can get hung up if the same task is still running while it keeps trying. Basically, make sure you allow enough of an interval to let the script finish downloading and uploading an audio file before setting it to retry. I blame Task Scheduler, but that's the way it is.

SoundsDownloadScript.ps1

Command line parameters:

TerminatingError(Invoke-WebRequest):

Sometimes the BBC updates the program page before it releases the episode on Sounds. This will cause SoundsDownloadScript.ps1 to try to download the episode that isn't available yet. In the Console+Vars log file, you'll see:

PS>TerminatingError(Invoke-WebRequest):"


BBC Sounds

followed by a lot of html and the message Sorry, the page you are looking for cannot be found!. Don't freak out if this happens. Just wait and run the script again later after the episode has been fully released.

genRSS.ps1 (if using)

Command line parameters:

Windows Task Scheduler Examples

If you're using Windows Task Scheduler, here are some examples of how to format the actions:

Basic download:
Uploading to CloudFlare R2 and generating an RSS:
Using VPN and uploading to archive.org:

DOCUMENTATION:

Feel free to mess around in the code if you're comfortable with Powershell.

SoundsDownloadScript.ps1

I tried to make SoundsDownloadScript.ps1 fairly modular and include comments to help, but there are a few things to keep in mind:

Audio file name:

The naming format for the finished audio file is [ShortTitle]-[YYYYMMDD]-[Program ID]. If you decide to change that format, you could break some things unintentionally. For example, the script specifically looks for the 8-character Program ID in the file name to decide whether it's already been downloaded. genRSS.ps1 also relies on it for certain options. The script also uses a regex pattern to match files to delete if the -Archive parameter is set. Just be sure you update all of those things if you make changes to the file name format.
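To make the dependency on the file name concrete, here is a small sketch of building a name in that format and checking a list of file names for a Program ID. The function names, the sample title, and the ID m0000000 are all hypothetical, and the pattern (ID as the last dash-separated field before the extension) is my assumption, not the regex the script actually uses.

```powershell
# Hypothetical helper: builds [ShortTitle]-[YYYYMMDD]-[Program ID].
function Get-EpisodeBaseName {
    param([string] $ShortTitle, [datetime] $ReleaseDate, [string] $ProgramId)
    '{0}-{1:yyyyMMdd}-{2}' -f $ShortTitle, $ReleaseDate, $ProgramId
}

# Hypothetical helper: $true if any file name ends with -<ProgramId> (plus an optional extension).
function Test-EpisodeDownloaded {
    param([string[]] $FileNames, [string] $ProgramId)
    $pattern = '-{0}(\.[^.]+)?$' -f [regex]::Escape($ProgramId)
    [bool]($FileNames | Where-Object { $_ -match $pattern })
}

$name = Get-EpisodeBaseName -ShortTitle 'ExampleShow' -ReleaseDate ([datetime]'2024-11-18') -ProgramId 'm0000000'
$name                                                                      # ExampleShow-20241118-m0000000
Test-EpisodeDownloaded -FileNames "$name.m4a" -ProgramId 'm0000000'        # True
Test-EpisodeDownloaded -FileNames "$name.m4a" -ProgramId 'm0000001'        # False
```

If the ID moved to a different position in the name, a check like this would silently stop matching, which is the kind of breakage to watch for.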

RCLONE script blocks:

If rclone is enabled, the script will take a hash of the -SaveDir and save it to a user-level environment variable with the same name as the SaveDir. This will track folder changes between instances. It will then take a hash of the SaveDir after running and compare the two hashes to decide whether to run through the script blocks that call rclone.

You can test whether the environment variable is working properly by checking the $SaveDirHashIsFromEnvVar variable in the Console+Vars log. If it's $true then the hash was successfully pulled from the environment variable. If it's $false, then it couldn't find the variable and it generated the hash on the fly.
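Here is a rough sketch of the idea, for anyone who wants to experiment with it outside the script. I'm assuming a "hash of the directory" built from file names, sizes, and timestamps. The real script may compute it differently. The function names and the -Target parameter are hypothetical, and the second function only mimics what $SaveDirHashIsFromEnvVar reports.

```powershell
# Hypothetical: fingerprint a directory by hashing its file names, sizes, and timestamps.
function Get-DirectoryFingerprint {
    param([Parameter(Mandatory)] [string] $Path)

    $listing = Get-ChildItem -LiteralPath $Path -File -Recurse |
        Sort-Object FullName |
        ForEach-Object { '{0}|{1}|{2}' -f $_.FullName, $_.Length, $_.LastWriteTimeUtc.Ticks }

    $bytes  = [System.Text.Encoding]::UTF8.GetBytes(($listing -join "`n"))
    $stream = [System.IO.MemoryStream]::new($bytes)
    try { (Get-FileHash -InputStream $stream -Algorithm SHA256).Hash }
    finally { $stream.Dispose() }
}

# Hypothetical: read the stored hash, or fall back to computing one on the fly.
function Get-StartingHash {
    param(
        [string] $Path,
        [string] $VariableName,
        [System.EnvironmentVariableTarget] $Target = 'User'
    )
    $stored = [Environment]::GetEnvironmentVariable($VariableName, $Target)
    if ($stored) {
        [pscustomobject]@{ Hash = $stored; FromEnvVar = $true }
    } else {
        [pscustomobject]@{ Hash = (Get-DirectoryFingerprint -Path $Path); FromEnvVar = $false }
    }
}
```

After a run, you would call Get-DirectoryFingerprint again, compare it to the starting hash, and store the new value with [Environment]::SetEnvironmentVariable so the next instance can pick it up.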

Tags with special characters:

Certain special characters have to be escaped with a backslash (\) in kid3 or kid3 will choke. I don't think a complete list exists. SoundsDownloadScript escapes single quotes, double quotes, and pipes in a function called Format-kid3CommandString. If you find another character that needs to be escaped, you can add it to the Return line using the replace method.
    Function Format-kid3CommandString ($StringToFormat) {
        Return $StringToFormat.replace("'","\'").replace("\`"","`"").replace("`"","\`"").replace("|","\|")
    }
Note: You may also need to escape the special characters in Powershell for it to pass them correctly, especially quotes. Use a backtick (`) for that.
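To see what the function does to a string, you can paste it into a console and try it. The sample titles here are made up. The second call shows why the chain first un-escapes \" before escaping double quotes again: a double quote that is already escaped doesn't end up escaped twice.

```powershell
# Same function as in the script, repeated here so the example runs on its own.
Function Format-kid3CommandString ($StringToFormat) {
    Return $StringToFormat.replace("'","\'").replace("\`"","`"").replace("`"","\`"").replace("|","\|")
}

# Made-up title containing a single quote, double quotes, and a pipe.
Format-kid3CommandString 'It''s the "Late" Show | Part 1'
# It\'s the \"Late\" Show \| Part 1

# Double quotes that are already escaped come out unchanged.
Format-kid3CommandString 'Already \"escaped\"'
# Already \"escaped\"
```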

TitleFormat:

SoundsDownloadScript uses the format operator (-f) to let you build dynamic title formats. The available options are stored in an array called $TitleFormatArray. The first one starts at {0}.
    $TitleFormatArray = $TitleTable.'primary', $TitleTable.'secondary', $TitleTable.'tertiary', $ReleaseDate.ToUniversalTime(), [System.TimeZoneInfo]::ConvertTimeBySystemTimeZoneId($ReleaseDate, 'GMT Standard Time')

You can add more variables to the array to suit your title needs, but you should append them to the end so that they become {5}, {6}, etc. Otherwise it will scoot the numbering down the line and potentially screw up your previous settings. You can use this to format the dates a different way, or add any other values from the json metadata. See this page and/or this page for more information on the format operator.
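Here is a quick sketch of how a title format string plays against that array. The values in $TitleTable and $ReleaseDate are made-up stand-ins for what the script pulls from the BBC metadata, and $ExtraValue is a hypothetical addition. The array line itself is the one from the script.

```powershell
# Made-up stand-ins for the values the script fills in from the metadata.
$TitleTable  = @{ primary = 'Example Show'; secondary = 'Episode Name'; tertiary = 'Guest Mix' }
$ReleaseDate = [datetime]::new(2024, 11, 18, 22, 0, 0, [System.DateTimeKind]::Utc)

$TitleFormatArray = $TitleTable.'primary', $TitleTable.'secondary', $TitleTable.'tertiary', $ReleaseDate.ToUniversalTime(), [System.TimeZoneInfo]::ConvertTimeBySystemTimeZoneId($ReleaseDate, 'GMT Standard Time')

# {0} and {1} are titles; {3} is the UTC release date with a custom date format.
'{0}: {1} ({3:yyyy-MM-dd})' -f $TitleFormatArray
# Example Show: Episode Name (2024-11-18)

# Appending keeps the existing numbering intact; the new value becomes {5}.
$ExtraValue = 'Series 2'            # hypothetical extra field
$TitleFormatArray += $ExtraValue
'{0} [{5}]' -f $TitleFormatArray
# Example Show [Series 2]
```

Inserting a value in the middle of the array instead would shift every index after it, so any format string already using those indexes would change meaning.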

To add other values, I recommend you set $Printjson to $true to view the data and paste it into an online json viewer to help you find a reference point to the value you're looking for. The script parses the metadata from json format into the $jsonData variable. Any of these fields can be used. Recalling a json value with powershell will look something like these examples:

Once you have the reference point, add it to the end of $TitleFormatArray.
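If you haven't dug through json in Powershell before, this sketch shows the general shape of it. The json here is completely made up for illustration and does not match the real BBC Sounds field names, so use $Printjson to find the real paths.

```powershell
# Made-up json for illustration only; the real metadata has its own structure and field names.
$json = @'
{
  "episode": {
    "titles": { "primary": "Example Show", "secondary": "Episode Name" },
    "tags": [ { "label": "Music" }, { "label": "Live" } ]
  }
}
'@
$jsonData = $json | ConvertFrom-Json

$jsonData.episode.titles.secondary          # nested property: Episode Name
$jsonData.episode.tags[0].label             # first array element: Music
$jsonData.episode.tags.label -join ', '     # every label in the array: Music, Live
```

Whatever expression gets you the value is what you would append to $TitleFormatArray.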

genRSS.ps1

I heavily modified this script to create genRSS.ps1. I don't really understand the XML writer part of it, only that it just seems to work. The whole script is messy and very inefficient. Be sure to test THOROUGHLY before putting changes to genRSS into production because you can screw up your subscribers' feeds. Good luck!

PSP-1: The Podcast RSS Standard:

I am trying to make the RSS output as PSP-1 compliant as I can. If a required element is missing or something isn't implemented according to the standard, open an issue on github.

Global logging:

The logging variable can be set in each profile template or through the command line with the -Logging switch. It can also be enabled or disabled globally by adding $Logging = $true or $Logging = $false to the configuration options section in genRSS.ps1. To force logging on for all instances, set to $true. To disable it on all instances, set to $false. If set to $false, it will override any Logging=yes entries in the profile templates and -Debug command line parameters. It effectively forces no genRSS logging. To not force logging on or off and restore the option back to the profiles or command line, just remove or comment out the $Logging variable.

With the way the script is laid out, I couldn't figure out how to gracefully determine if the $Logging switch was explicitly set to $false in the script. I got stubborn and chose to use Select-String to test a regex pattern:
    Select-String -Path $PSCommandPath -Pattern '^[\s*]*(\$Logging)[\s*]*=[\s*]*(\$false)(\s*$|\s*#.*)'
Basically it searches the script for a $Logging = $false string. It works, but it's not efficient, goes against how powershell should work, and I don't like it. I know there's a better way. If you know what it is, open an issue. I'm all ears.
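If you want to see what that pattern does and doesn't catch, you can test it against a few strings with -match instead of scanning the whole script. The sample lines are made up. The pattern is the one from the script.

```powershell
$pattern = '^[\s*]*(\$Logging)[\s*]*=[\s*]*(\$false)(\s*$|\s*#.*)'

$samples = @(
    '$Logging = $false',                  # matches
    '    $Logging=$false   # forced off', # matches: indentation and trailing comment are allowed
    '$Logging = $true',                   # no match: wrong value
    '# $Logging = $false',                # no match: the line is commented out
    '$Logging = $falsey'                  # no match: extra characters after $false
)

$results = foreach ($line in $samples) { $line -match $pattern }

# Print each sample next to its result.
for ($i = 0; $i -lt $samples.Count; $i++) {
    '{0,-40} {1}' -f $samples[$i], $results[$i]
}
```

The commented-out case is the important one: commenting out the $Logging line hands control back to the profiles and the command line.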

MP4/ID3 tags

Having correct MP4/ID3 tags is crucial for both scripts to "work" together. SoundsDownloadScript.ps1 sets way more tags than this, but these are the important ones for genRSS.ps1 to work properly:

SoundsDownloadScript.ps1 uses kid3 to set tags. If you want to change them or add additional ones, The Kid3 Handbook has a breakout of what tags it supports. genRSS.ps1 also uses kid3 to read tags. kid3 spits out the tags in json format and then the script parses it. If you're having trouble, you can run
    .\kid3-cli.exe -c '{\"method\":\"get\"}' '[FILE]'
to see exactly what kid3 is reading from the audio file and passing to the script. Sometimes special characters can trip it up and it might be necessary to add to the $StringToFormat variable in the Format-kid3CommandString function. It will probably take some troubleshooting. Turn on the logging to help.


SUPPORT:

For help with SoundsDownloadScript.ps1 or genRSS.ps1:


MIT LICENSE:

Copyright © 2024 endkb (https://github.com/endkb)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.