Updated: 18-Nov-2024 22:21 GMT
These instructions are very crude and messy. Sorry for the windbag word vomit. I hope they help, though. If someone wants to convert this file into a markdown README.md for github, be my guest. The scripts are a little tedious to set up the first time, but once you get them working they're pretty reliable and easy to maintain.
I recommend you check the repo on github for the latest version of this README before continuing.
If you find a bug in the scripts or something is incorrect or incomplete in the documentation, feel free to open an issue on github.
SoundsDownloadScript.ps1 and genRSS.ps1 are Powershell scripts that can work together to download episodes from the BBC Sounds website and then publish a podcast feed.
SoundsDownloadScript.ps1 can work without genRSS.ps1 if you just want to download the audio files, but genRSS.ps1 won't really work with audio files tagged with other tools because they won't be tagged properly to build a podcast feed (see this note).
I believe there are Linux versions of all of these packages, but I've only ever used this on Windows. The script may work on Powershell for Linux, but it will probably take a lot of tweaking. I'm sure there's another language that's more appropriate. If you're up for the challenge, feel free to use the script logic as a guide and go for it!
When called, SoundsDownloadScript.ps1 checks the program page for the BBC program you are requesting. It gets the name of the most recent episode and checks whether it has already been downloaded. If it hasn't, it calls yt-dlp to download it and then gets the metadata and cover art from the episode's BBC Sounds page. The script calls kid3 to set id3 tags on the audio file. If configured, the script will then clean up old episodes it has downloaded. After that, it can upload the file to a remote location using rclone. If the episode has already been downloaded, the script exits with no action.
genRSS.ps1 uses profile config files to create an RSS file. It scans your download directory of the program and uses kid3 to pull the id3 tags of each episode. It checks the date and time of the most recent file and then checks the date and time of the RSS file to decide whether it needs to update the RSS. If needed, it uses the tags to build an RSS file for a podcast feed. It does not append, it rewrites the whole file each time. It uses rclone to upload the RSS file to a remote location, if configured. If accessible, the url of the RSS feed can be put into a podcast app to be subscribed to.
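The update check described above can be pictured with a rough sketch like the one below. It is not the code from genRSS.ps1; the folder, file names, and variable names are made up for illustration, and it compares plain file timestamps rather than reading anything out of the feed.

```powershell
# Sketch only: the folder, file names, and variable names are hypothetical.
$mediaDir = Join-Path ([System.IO.Path]::GetTempPath()) 'genRSS-sketch'
New-Item -ItemType Directory -Path $mediaDir -Force | Out-Null

# Stand-ins for an existing feed and a newer episode.
$rssFile = Join-Path $mediaDir 'feed.rss'
$episode = Join-Path $mediaDir 'ExampleShow-20241118.m4a'
Set-Content -Path $rssFile -Value '<rss/>'
Set-Content -Path $episode -Value 'placeholder audio'
(Get-Item $rssFile).LastWriteTimeUtc = [datetime]::UtcNow.AddHours(-2)
(Get-Item $episode).LastWriteTimeUtc = [datetime]::UtcNow.AddHours(-1)

# Find the most recently written media file (m4a or mp3).
$newestMedia = Get-ChildItem -Path $mediaDir -File |
    Where-Object { $_.Extension -in '.m4a', '.mp3' } |
    Sort-Object -Property LastWriteTimeUtc -Descending |
    Select-Object -First 1

# Rebuild the feed only when the newest episode is newer than the RSS file.
$needsUpdate = $newestMedia.LastWriteTimeUtc -gt (Get-Item $rssFile).LastWriteTimeUtc
"RSS needs update: $needsUpdate"
```

Because the whole RSS file is rewritten each time, a cheap check like this keeps the script from doing any work when nothing new has been downloaded.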
$DefaultTrackNoFormat
: [String] DateTime formatted string (see this guide for DateTime formatting help) to set the track number if the -TrackNoFormat parameter is not set. Setting this to 'c' will count up the track number from the last episode; 'c(r)' does the same but searches subdirectories recursively. Note: The 'c' option will call kid3 on each track in the directory to determine the next track number. This can be slow as the directory fills up with more and more audio files. If you're not using the -Archive parameter, consider using a DateTime format instead. Also, o can be included in a DateTime as a one digit year, and jjj can be included in a DateTime as a Julian date.
$DefaultTitleFormat
: [Format string] Default format string to set the episode title to if -TitleFormat is not set. The string may contain the following variables in curly brackets:
{0}
= The BBC's primary title. This is usually the show's title.
{1}
= The BBC's secondary title. This is usually the title of the episode or the series number.
{2}
= The BBC's tertiary title, usually an episode subtitle or episode number in the series. It's often blank.
{3}
= The release date and/or time in UTC. A DateTime format should follow, in the form {3:[DateTimeFormat]} (ex: {3:HH:mm}).
{4}
= The release date and/or time in UK time. See above.
$DefaultBitrate
: [Number of kilobits] Specify the bitrate stream that yt-dlp should download if -Bitrate is not set in the command line. Set to 0 to have yt-dlp download the highest bitrate available. It must be the number of kilobits per second (kbps), and only the number. The higher the bitrate, the higher the audio quality and the bigger the file size. Available bitrates are generally 48, 96, 128, and 320. If the specified bitrate is not available, yt-dlp will fail to download the program and throw an error that says Requested format is not available. You can view the bitrates that are available for a particular stream by running:
yt-dlp.exe --list-formats [URL...]
The BBC only makes its content available worldwide in 48 and 96 kbps. If you are outside the UK and want to download a higher bitrate, you will need to combine this option with -VPNConfig and have a VPN provider with UK servers that can access those streams.
$GenreTag
: [$true,$false] Will add the genre(s) to the metadata. If set to $true, the script will pull the genre from the 'Similar programmes' section at the bottom of the program page. The BBC's genres do not comply with the <itunes:category> tag in podcast feeds. It will not affect any genre information in genRSS.ps1. Some media tools are weird about displaying multiple values in m4a tags, but the values are there.
$DumpDirectory
: [Directory path] Directory to save the stream files and artwork to while the script is working on them. To use the Windows temp directory, use $env:TEMP.
$VPNAdapter
: [String] Name of the adapter used by OpenVPN. Run Get-NetAdapter in Powershell to get your list of adapter names. You'll want the Name, not the InterfaceDescription. Mine is 'OpenVPN TAP-Windows6'. The script uses this to determine when the VPN has connected or disconnected so it can continue. Only needed if using VPN.
$VPNTimeout
: [Number of seconds] Number of seconds to wait before giving up on the VPN if it doesn't connect. Remember that while it's waiting, the script will pause and could tie up other instances that are also waiting to run, so set it to a reasonable number of seconds. Only needed if using VPN.
$ScriptInstanceControl
: [$true,$false] This controls the instances of the script that can download at a time. If ANY of your instances are using VPN, you should set this to $true. If it's enabled, it works like this: If an instance is not configured to use VPN, other instances that are also not using the VPN can download at the same time. If an instance that needs VPN wants to download, it must wait until all other instances are done. If another script is downloading, the current one will wait. If multiple scripts are connecting and disconnecting the VPN, it will screw up downloads. You really only need this if you're using VPN.
$LockFileDirectory
: [Directory path] Directory to save lock files for $ScriptInstanceControl. Specify a non-environment directory (i.e., not the user's temp dir) if the script is running under different user accounts. The path will need to be accessible by all accounts, with read and write permissions, for it to work properly. Only needed if using $ScriptInstanceControl.
$LockFileMaxDuration
: [Number of seconds] This is the maximum age in seconds before lock files are deleted. This keeps the script from getting hung up by orphaned lock files. It's rare, but it can happen if the script is interrupted during a download. To disable (not recommended), set to 0. Only needed if using $ScriptInstanceControl.
$ytdlpUpdate
: [$true,$false] Set to $true to update yt-dlp before the script runs.
$rcloneUpdate
: [$true,$false] Set to $true to update rclone to the latest stable version before uploading files.
$Logging
: [$true,$false,$Logging] Can be used to force log files on or off for all instances. If $false, then logging will be turned off regardless of whether -Debug is specified in the command line. Console output and script variables are saved to a log file, and rclone and OpenVPN logs are saved to separate files. Logs will be saved in the $LogDirectory specified below.
$Logging = $true
= Save logs for all downloads
$Logging = $false
= Don't save logs for any downloads
$Logging > $null
= Use whatever is set in the command line parameter per instance ($Logging = $Logging will also have the same effect)
$Printjson
: [$true,$false] Prints the raw json data from $jsonResult to the console. It will also be saved to the log if $Logging is enabled. The text can be copied and pasted into an online json viewer for better readability. Use this for troubleshooting things like parsing issues, special character issues, or finding a reference point to retrieve a json value (see TitleFormat under the Documentation section). The Console+Vars log files will be significantly larger if this setting is left on.
$LogDirectory
: [Directory path] This is the directory to move logs to when the -Logging switch is present or when $Logging is set to $true. You'll want to check this directory once in a while because the logs can get unwieldy.
$LogFileNameFormat
: [Format string] This is a format string to set the file name of the log files (ex: $LogFileNameFormat = "{0}-{1}-{2}-{3}.log"). The string may contain the following variables in curly brackets:
{0}
= ShortTitle
{1}
= Hash of task scheduler GUID
{2}
= PID of the script that is running
{3}
= The type of log (Console+Vars, rclone, vpn)
{4}
= Current date and time (must only include legal file name chars; {4:yyyyMMdd_HHmmss} is a good format)
$ffmpegExe
: [File path] Path to ffmpeg.exe. You can also use the Get-ChildItem cmdlet:
(Get-ChildItem -Path $PSScriptRoot -Filter "ffmpeg.exe" -Recurse | Sort-Object -Descending -Property LastWriteTime | Select-Object -First 1 | % {$_.FullName })
to recursively search for the most recent executable in the directory.
$ffprobeExe
: [File path] Path or Get-ChildItem cmdlet to ffprobe.exe.
$kid3Exe
: [File path] Path or Get-ChildItem cmdlet to kid3-cli.exe.
$rcloneExe
: [File path] Path or Get-ChildItem cmdlet to rclone.exe (you can comment this line out with # if not using).
$vpnExe
: [File path] Path to openvpn.exe. Since OpenVPN is usually installed into the Program Files directory, you can use the Get-ChildItem cmdlet to find openvpn.exe:
(Get-ChildItem -Path $env:Programfiles -Filter 'openvpn.exe' -Recurse -ErrorAction SilentlyContinue | Sort-Object -Descending -Property LastWriteTime | Select-Object -First 1 | % { $_.FullName })
This must be the command line executable, not the GUI. You can comment this line out with # if not using a VPN.
$ytdlpExe
: [File path] Path or Get-ChildItem cmdlet to yt-dlp.exe.
$SortArticles
: [Delimited string] This isn't used much. This is a string of definite articles in various languages separated by a pipe (|). The script will strip these to fill tags specifically used for sorting. For example, 'The Beatles' sort tags will become 'Beatles' and will get sorted in the Bs instead of the Ts. You can add your own for other languages, if needed. The caret (^) denotes the beginning of the string. Without it, the script will also strip characters from the middle. Put a space after the definite article if it's its own word. For contractions like l', don't put a space. The defaults should be fine for most people, especially since this isn't music.
Run rclone.exe config to set up your remote, then run rclone.exe config file to get the location of the config file. On Windows, it's usually stored in the user's AppData folder. You'll probably want to move or copy it to the same directory that rclone is in, especially if you're going to run the script as a different user. When running the scripts, you'll need to specify the location by setting $rcloneConfig even if it's in the default location. You can have multiple remotes in the same config file, or you can have a different config file for each remote. Either option works.
auth-user-pass "C:\\Program Files\\OpenVPN\\config\\[YourPasswordFile].txt"
This will let OpenVPN connect without having to enter the credentials every time. Note the double backslashes (\\) in the path.
Adjust the rclone.exe command line parameters to your needs. Script blocks must start with $remote_ and must be named the same as the specified remote config in order to be run. For example, if you have an rclone remote called poop, you should have a script block named $remote_poop. This is how the script will know which rclone command to use. You will need to be a little familiar with Powershell. Something like:
$remote_poop = {
& $rcloneExe sync $SaveDir $rcloneSyncDir --create-empty-src-dirs --progress --config $rcloneConfig -v $rcloneLoggingArgs
}
Your rclone commands should include all of the following parameters and variables:
$rcloneExe is the path to rclone.exe
$SaveDir is the local directory (for syncing) or $MediaFile is the downloaded media file (for copying)
$rcloneSyncDir is remote:path
--config $rcloneConfig is the location of the rclone configuration file
-v $rcloneLoggingArgs is the location to save the log file if enabled
If you use rclone.exe copy or copyto, then you will need to call the Confirm-DownloadedMediaFile function at the top of your script block like below:
$remote_poop2 = {
Confirm-DownloadedMediaFile
& $rcloneExe copyto $MediaFile $rcloneSyncDir --check-first --metadata --config $rcloneConfig --progress -v --dump headers $rcloneLoggingArgs
}
Calling Confirm-DownloadedMediaFile will exit the script block if there was no file downloaded. Otherwise, if rclone runs and tries to copy a file that doesn't exist, it will throw a fit. It is not needed when using sync. Note: the way the script is set up, rclone will not delete files from a remote unless the sync option is used.
Note: If you're using the files from a release, the animal name in the first line of genRSS should match the animal name in the first line of SoundsDownloadScript. This will ensure compatibility (the big issue is metatags in the media files). The main branch on github will not have animal name indicators. If you're updating from the main branch, you should update both files at the same time to avoid issues.
$kid3Exe
: [File path] Path to kid3-cli.exe. You can also use the Get-ChildItem cmdlet:
(Get-ChildItem -Path $PSScriptRoot -Filter "kid3-cli.exe" -Recurse | Sort-Object -Descending -Property LastWriteTime | Select-Object -First 1 | % {$_.FullName })
to recursively search for the most recent executable in the directory.
$rcloneExe
: [File path] Path or Get-ChildItem cmdlet to rclone.exe (you can comment this line out with # if not using).
Backslashes (\) need to be escaped by using two (\\). Values may be surrounded by single, double, or no quotes.
MediaDirectory
: [Directory path] The directory with your downloaded SoundsDownloadScript files to scan.
Recursive
: [yes,no] Search subdirectories of MediaDirectory.
MediaExtension
: [String] File extensions to search for without the leading period, separated by a comma with no space. Default is m4a,mp3.
Directory
: [Directory path] Local directory to save the RSS file.
RSSFileName
: [File name] Name of the RSS file to save locally. This is the file that will get uploaded.
CheckMediaDirectoryHash
: ['contents','filenames','no'] Uses a hash to determine whether the media files have changed and whether to republish the RSS. If CheckMediaDirectoryHash is set to contents or filenames, then each time the script is called, it will compute the MD5 hash value of the MediaDirectory and save it into the RSS. When it is called again, it will pull the hash from the RSS and compare the values. The hash will be saved in <MediaDirectoryHash> as a child of the <rss> element outside of the <channel> element of the RSS. If CheckMediaDirectoryHash is no or is not set, genRSS compares the file with the latest LastWriteTimeUtc to the time the RSS was last generated (<lastBuildDate>) to know whether to republish the RSS. Possible values are:
contents
= Scans each file in the MediaDirectory and calculates a hash of the contents using ReadAllBytes and ComputeHash. This will also detect changes to metadata. Use this if you're using the -RecheckMetadata option in SoundsDownloadScript. This will be pretty slow if there are a lot of files or if the files are large.
filenames
= Computes the MD5 hash value of the MediaDirectory filenames only using MD5CryptoServiceProvider. This will detect file additions or deletions, but not changes to metadata.
no
= Don't take a hash. Use LastWriteTime of the latest file in MediaDirectory to determine whether the RSS needs to be updated. This is the fastest option, but it won't detect a change if a file was only deleted and nothing was added. This is the default value if CheckMediaDirectoryHash is not set.
CheckProfileHash
: [yes,no] Updates the RSS file if there are changes to the podcast profile. Normally, genRSS will only update the RSS file when there is a new episode. If CheckProfileHash is set to yes, then the next time the script is called, it will put the MD5 hash value of the profile into the RSS using Get-FileHash. When it is called again, it will pull the hash from the RSS and compare the values. If they are different, then it knows that the profile was updated and the RSS needs to be regenerated and republished. The hash will be put into <ProfileHash> as a child of the <rss> element outside of the <channel> element in the RSS.
rcloneConfig
: [File path] Path to the rclone ini config file.
RemotePublishDirectory
: [String] This is the remote and directory that rclone should upload the RSS file to. It should be in the form of 'Remote:Directory'. Remote is the name of the appropriate config found in the config file specified in -rcloneConfig. For services that use the S3 API, use Remote:BucketName\Directory.
RemoteRSSFileName
: [File name] Name of the RSS file to publish remotely.
PodcastFeedURL
: [URL] The publicly accessible URL of the RSS feed. This populates <atom:link>, which the PSP-1 requires. If it is empty or missing, the element will be omitted from the RSS file for backwards compatibility with older profile templates.
PodcastTitle
: [String] The title of the podcast. Just use the program title.
PodcastDescription
: [String] Populates the <description> and <itunes:summary> tags at the <channel> (podcast) level. The value will be marked as CDATA and can technically contain html tags, but many podcast apps only support plain text in the description field. You can insert a new line with \n. See the XmlWriterSettings.NewLineHandling Property page for more information on handling new lines with XmlWriter.
PodcastAuthor
: [String] It makes sense to use the name of the BBC station here.
OwnerName
: [String] You. I don't want my name out there, so I just put my domain here. This populates the <itunes:owner> element. If left empty, this will be omitted from the RSS.
OwnerEmail
: [E-mail address] An e-mail address. This populates the <itunes:email> element. If left empty, this will be omitted from the RSS.
PodcastURL
: [URL] Populates the <link> tag at the <channel> (podcast) level. It's supposed to be the website or web page associated with the podcast. I set it to the BBC program page of the show.
PodcastLanguage
: [ISO 639 language code] This should be a two letter code, optionally with a country code (ex: en-US).
PodcastCopyright
: [String] I put the BBC's info here (ex: (C) BBC 2024) since I don't own the copyright for the media. If this is empty, the element will be omitted from the RSS.
Category
: [String array] Specify one or more categories to define the podcast. Categories may be separated by a comma with no space in between. Subcategories can be specified with a > angle bracket. The first value will be the category. An unlimited number of subcategories can be specified. Take the example:
Category=News,Science>Chemistry>Physics
News and Science will be categories, and Chemistry and Physics will be subcategories under Science. I recommend limiting them to Apple Podcasts categories.
PodcastImage
: [URL] The <channel> (podcast) level cover art that will display in podcast apps. You could host the image wherever you want, but I just point it to the image on the BBC's site. To find the URL, open the program page and view the page source. Search for .jpg and you should find the cover image easily. The URL will look like this: https://ichef.bbci.co.uk/images/ic/128x128/p0c5ydny.jpg. Copy the link you find, plug in different size dimensions, and choose the largest one that works. Sizes that seem to work are: 128x128, 512x512, 1048x1048, 1792x1792, 1920x1920, and 3000x3000. It should be a square, of course.
Block
: [yes,no,String array] Populates the <itunes:block> and <podcast:block> elements. If this is set to yes, podcast indexes like Podcast Addict should not publish the feed in their directories, sort of like noindex. You might want to use this for testing when you first create a podcast feed before it's ready to be published. Note: yes and no can be complete values. The <podcast:block> element is included with the PSP-1 standard and can also be made granular by adding specific platform IDs (here's a list of IDs) into an array separated by commas with no spaces. See the two examples below:
Block=yes,podcastaddict:no,podcastindex:no
Block=no,amazon:yes,audible:yes,itunes:yes
In the first example, all platforms would be blocked except Podcast Addict and The Podcast Index (works like a whitelist). In the second example, all platforms would be allowed except Amazon, Audible, and iTunes (works like a blacklist). Unfortunately, <podcast:block> is not supported by very many platforms yet. If Block is absent, no is assumed.
MediaRootURL
: [URL] This should be the publicly accessible URL path that podcast apps can download the episodes from. It should normally be an HTTP directory, and the script will add the file name to the end. I think it could also be a file hosted locally on a Windows or SMB share by using file:// instead, but I haven't tested it.
RerunLabel
: [String] A label to prepend to episode titles when the episode is a rerun according to the criteria in RerunFiles or RerunTitles. Include a delimiting character like a colon or dash if you'd like. A space is NOT automatically included. To include a space between the label and the title, surround the value in single or double quotes ("Repeat: ").
AutoDetectReruns
: [yes,no,Number of days greater than 0] Specify the number of days after the original air date that an episode should be considered a rerun. In other words, try to determine whether the episode is a rerun by comparing the original date (Original Date or TDOR) to the most recent release date (Release Date or TDRL) of the episode. If it's over the number of days specified, the script marks it as a rerun. If the value is yes, then a default of 90 days is used. If the value is no or is not present, the feature is disabled.
RerunFiles
: [String] Searches the episode file names for this text to decide whether it's a rerun. Separate entries by a comma with no space. I recommend using the BBC program ID.
RerunTitles
: [String] Searches the episode titles for this text to decide whether it's a rerun. Separate entries by a comma with no space. I suggest putting the Series numbers in here (Ex: Series 21,Series 22).
SkipFiles
: [String] Searches the episode file names for this text to decide whether to skip the file. Separate entries by a comma. I recommend using the BBC program ID. These episodes will not be included in the RSS.
SkipTitles
: [String] Searches the episode titles for this text to decide whether to skip the file. Separate entries by a comma. These episodes will not be included in the RSS.
[Program ID]=Desired Title
: You can force it to use a custom title on specific episodes. One line for each episode. genRSS will use that title for the episode instead of pulling it from the metadata. Examples:
m001ts8t=Seasonal Trimmings
m002tc9v=Year in Review
Logging
: [yes,no] Outputs the console and variables to a text file in the LogDirectory. This can also be set by using the -Logging command line parameter. Finally, you can set it globally by adding $Logging = $true or $Logging = $false to the configurable options in the script.
LogDirectory
: [Directory path] The directory to save log files if Logging is enabled. It also can be set with a command line parameter (-LogDirectory) or globally as a configurable option in the script ($LogDirectory).
LogFileNameFormat
: [Format string] A format string to set the name of the log files (ex: LogFileNameFormat = "{0}-{1}-{2}-genRSS_{3}.log"). The string may contain the following variables in curly brackets:
{0}
= Profile name
{1}
= Hash of task scheduler GUID
{2}
= PID of the script that is running
{3}
= The type of log (Console+Vars, rclone)
{4}
= Current date and time (must only include legal file name chars; {4:yyyyMMdd_HHmmss} is a good format)
It also can be set with a command line parameter (-LogFileNameFormat) or globally as a configurable option in the script ($LogFileNameFormat), instead.
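To make the CheckProfileHash behaviour more concrete, here is a minimal sketch of that kind of comparison. It is only an illustration under assumptions: the file names, the placeholder hash, and the variable names are invented, and genRSS itself may organise this differently. The only pieces taken from the description above are Get-FileHash with MD5 and a <ProfileHash> element that is a child of <rss>.

```powershell
# Sketch only: throwaway files with hypothetical names so the example runs on its own.
$tempDir     = [System.IO.Path]::GetTempPath()
$profilePath = Join-Path $tempDir 'ExampleShow.profile'
$rssPath     = Join-Path $tempDir 'ExampleShow.rss'

Set-Content -Path $profilePath -Value 'PodcastTitle=Example Show'
# '0000' is a placeholder for the hash stored by a previous run.
Set-Content -Path $rssPath -Value '<rss version="2.0"><channel><title>Example Show</title></channel><ProfileHash>0000</ProfileHash></rss>'

# Hash the profile as it is now.
$currentHash = (Get-FileHash -Path $profilePath -Algorithm MD5).Hash

# Pull the previously stored hash out of the RSS file.
[xml]$rss   = Get-Content -Path $rssPath -Raw
$storedHash = $rss.rss.ProfileHash

# A mismatch means the profile was edited since the feed was last generated.
$profileChanged = $currentHash -ne $storedHash
"Profile changed: $profileChanged"
```

Storing the hash inside the RSS file itself means no extra state file has to be kept next to the feed.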
The scripts can be called manually from a Powershell console, but I use the Windows Task Scheduler unless I'm doing testing or downloading a one-off episode. Each show has its own task. For most weekly shows, I have it check for new episodes twice a day. Sometimes shows will do out-of-cycle specials around the holidays and this will catch those. For daily shows, I have it check around the time the show is done airing on the BBC schedule and then keep checking every 20 to 30 minutes. Caution: if you set the frequency to run too often, I've noticed that it can get hung up if the same task is still running while it keeps trying. Basically, make sure you allow enough of an interval to let the script finish downloading and uploading an audio file before setting it to retry. I blame Task Scheduler, but that's the way it is.
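For reference, a manual call from a Powershell console might look something like the sketch below. The script location, save directory, and show are placeholders to swap for your own; the parameters themselves are the ones documented under Command line parameters.

```powershell
# Hypothetical paths and show; substitute your own values.
& 'C:\Scripts\SoundsDownloadScript.ps1' `
    -ProgramURL 'https://www.bbc.co.uk/programmes/[Program ID]/episodes/player' `
    -SaveDir 'D:\Podcasts\TheNowShow' `
    -ShortTitle 'TheNowShow' `
    -TitleFormat '{1} - {2}' `
    -Bitrate 96 `
    -Archive 10
```

The same argument list can be pasted into the action of a Task Scheduler task that launches powershell.exe.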
Command line parameters:
-ProgramURL
: [URL] This is the bbc.co.uk/programmes URL of the show to download the latest episode from. Sometimes https://www.bbc.co.uk/programmes/[Program ID]/episodes/player works better, especially if the program releases special features. You can get that link by clicking on the program page and selecting available episodes. You can also use the https://bbc.co.uk/sounds/play/[Program ID] link if you want to download only a specific episode.
-SaveDir
: [Directory path] The local directory to move the finished audio file to. The directory will be created if it does not exist.
-ShortTitle
: [String] A short reference for the filename. It shouldn't have spaces (Ex: TheNowShow).
-TrackNoFormat
: [String] Set the track number format. It can be a DateTime string (see this guide for help formatting DateTime). Additionally, o can be included in a DateTime as a one digit year, and jjj can be included in a DateTime as a Julian date. There are a few other options: 'c' counts up from the track number of the most recent file in the save directory; 'c(r)' does the same but searches subdirectories recursively. Note: The 'c' option will call kid3 on each track in the directory to determine the next track number. This can be slow as the directory fills up with more and more audio files. If you're not using the -Archive parameter, consider using a DateTime format instead. If this is omitted, it will use the $DefaultTrackNoFormat.
-TitleFormat
: [Format string] Set the format of the Title tag. It should be a string. There are several variables that can be used, surrounded by curly brackets:
{0}
= The BBC's primary title. This is usually the show's title.
{1}
= The BBC's secondary title. This is usually the title of the episode or the series number.
{2}
= The BBC's tertiary title, usually an episode subtitle or episode number in the series. It's often blank.
{3}
= The release date and/or time in UTC. A DateTime format should follow, in the form {3:[DateTimeFormat]} (ex: {3:HH:mm}).
{4}
= The release date and/or time in UK time. See above.
For example, '{1} - {2}' will usually set the series and episode numbers: Series 15 - Episode 4.
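If you want to try a title format before pointing the script at a real show, the placeholders behave like ordinary .NET composite format items, so Powershell's -f operator gives a quick preview. This is a sketch that assumes the script applies the format string this way; the titles and dates are made-up stand-ins for the values it pulls from the BBC.

```powershell
# Hypothetical metadata values in the same order as the {0}..{4} variables.
$titleParts = @(
    'The Now Show',                       # {0} primary title
    'Series 15',                          # {1} secondary title
    'Episode 4',                          # {2} tertiary title
    [datetime]'2024-11-18 18:30',         # {3} release date/time, UTC
    [datetime]'2024-11-18 18:30'          # {4} release date/time, UK
)

# Series and episode numbers only.
$episodeTitle = '{1} - {2}' -f $titleParts
$episodeTitle                             # Series 15 - Episode 4

# A DateTime format follows the colon inside the curly brackets.
$datedTitle = '{0}: {2} ({3:yyyy-MM-dd})' -f $titleParts
$datedTitle
```

Running the script with -NoDL is the way to see what the real {0}, {1}, and {2} values look like for a given show.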
-UseOrigRelease
: [Switch] Sets the release date of the episode to the original date the episode aired. Without this parameter, it uses the most recent availability date. When this parameter is set, the RELEASEDATE/TDRL and ORIGINALDATE/TDOR tags will have the same value. This will affect the file name and also the <pubDate> tag in genRSS.
-Bitrate
: [Number of kilobits] Specify the bitrate stream that yt-dlp should download. Set to 0 to have yt-dlp download the highest bitrate available. It must be the number of kilobits per second (kbps), and only the number. The higher the bitrate, the higher the audio quality and the bigger the file size. Available bitrates are generally 48, 96, 128, and 320. If the specified bitrate is not available, yt-dlp will fail to download the program and will throw an error that says Requested format is not available. You can view the bitrates that are available for a particular stream by running:
yt-dlp.exe --list-formats [URL]
The BBC only makes its content available worldwide in 48 and 96 kbps. If you are outside the UK and want to download a higher bitrate, you will need to combine this option with -VPNConfig and have a VPN provider with UK servers that can access those streams.
-mp3
: [Switch] Transcode the audio file to mp3 using ffmpeg after downloading. The default file type is m4a. Uses the libmp3lame codec and mirrors the bitrate of the m4a file. Note: I don't do as much testing on the mp3 option (e.g., m4a and mp3 use different tags). If you find a bug with it, open an issue in the repo.
-Archive
: [Number] The number of episodes to keep. Set to 0 to disable and keep all episodes. After the script downloads the latest one, it will delete excess ones. It searches for episodes using the ShortTitle. If multiple shows are saved in the same folder for some reason, the other shows will not be deleted.
-Days
: [Switch] Bases the -Archive parameter on the number of days to keep instead of the number of episodes. This option reads the date from each filename (which is set from the episode's GMT release date in the metadata). If Archive is 0 or is not set, this parameter has no effect.
-RecheckMetadata
: [Switch] Refreshes the metadata of all media files to pull any changes from the BBC. It searches recursively for files matching the -ShortTitle and fetches the latest metadata from the episode page on the BBC's website. If there are discrepancies, it updates the media file's title and comment. I would only use it with the -Archive option because it can be slow if you have a lot of files, since it has to pull the web page of each file individually. Use case: Sometimes the BBC doesn't set the real title or description of a show until after it has already been published to Sounds. If you're also using genRSS to build a podcast feed, you'll want to specify CheckMediaDirectoryHash=contents in the podcast profile so that genRSS detects the changes and updates the feed with the new metadata.
-VPNConfig
: [String array of file paths] If using a VPN, this is the path to the OpenVPN .ovpn config file. It can also be an array of file locations separated by a comma. Since the BBC tries to block VPNs in a cat and mouse game, the script will run through each config in order until it finds one that can download the episode. If you're using VPN and running the script from the task scheduler, I would set the task to run more often so it gets more chances to work. I use PIA for VPN. Also be sure to create and set an auth-user-pass file if using. See OpenVPN support for that.
-rcloneConfig
: [File path] If using rclone to upload the episode somewhere, this is the path to the rclone config file. You'll need to use:
rclone.exe config create
to create a config. You can copy the config text to another file and specify it here.
-rcloneSyncDir
: [String array of file paths] This is the remote and directory that rclone should upload to. It should be in the form of 'Remote:Directory'. Remote is the name of the appropriate config found in the config file specified in rcloneConfig. The value can be an array separated by a comma if you want to upload to multiple locations. For S3 API services, use Remote:BucketName\Directory.
-DotSrcConfig
: [File path] The path to an external script file (.ps1) that contains configuration options. At a minimum, the file must contain values for $DumpDirectory, $ffmpegExe, $ffprobeExe, $kid3Exe, and $ytdlpExe. The easiest thing to do is copy the entire 'inline' configuration options section from SoundsDownloadScript.ps1 and paste it into a new .ps1 file. The file is loaded using dot sourcing, and its values override any settings that are also set in the inline configuration options. This was implemented to make it easier to upgrade SoundsDownloadScript without having to transcribe the settings to a new file each time. Dot sourcing was selected over 'real' config files (ini, xml, json, etc.) to account for the multi-line remote script blocks, which would have been hard to implement reliably any other way. It's not a great option, but it's there and it works. More information on dot sourcing can be found at this Microsoft Learn page.
-Logging
: [Switch] Output the console and variables to text files in the specified -LogDirectory. If rclone and OpenVPN are used, it will create separate log files for those in the same directory with the same name structure.
-LogDirectory
: [Directory path] The directory to save log files in if -Logging is enabled.
-LogFileNameFormat
: [Format string] A format string to set the name of the log files (ex: -LogFileNameFormat "{0}-{1}-{2}-{3}.log"). Available format variables can be found above.
-NoDL
: [Switch] Use this option to skip downloading the episode. It's useful for viewing the metadata to set the TitleFormat and for troubleshooting escape character and encoding problems.
-Force
: [Switch] Download the episode even if it's already downloaded. It will append an underscore and a number to the end of the file name so as not to overwrite an existing file. This can be useful for testing.
TerminatingError(Invoke-WebRequest):
Sometimes the BBC updates the program page before it releases the episode on Sounds. This will cause SoundsDownloadScript.ps1 to try to download an episode that isn't available yet. In the Console+Vars log file, you'll see:
PS>TerminatingError(Invoke-WebRequest):"
BBC Sounds
followed by a lot of html and the message "Sorry, the page you are looking for cannot be found!" Don't freak out if this happens. Just wait and run the script again later after the episode has been fully released.
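To illustrate how the -DotSrcConfig override described in the parameter list behaves, here is a minimal sketch of dot sourcing. It is not lifted from SoundsDownloadScript.ps1: the paths, the temp file name SoundsConfigDemo.ps1, and the one-line config are all made up for the demo, and a real config file would need all five required variables.

```powershell
# Inline defaults, standing in for the script's own configuration section
# (hypothetical paths).
$DumpDirectory = 'C:\Temp\Dump'
$ytdlpExe      = 'C:\Tools\yt-dlp.exe'

# Write a tiny external config file. In real use you would create this by hand
# and pass its path with -DotSrcConfig.
$DotSrcConfig = Join-Path ([System.IO.Path]::GetTempPath()) 'SoundsConfigDemo.ps1'
Set-Content -LiteralPath $DotSrcConfig -Value '$DumpDirectory = ''D:\Sounds\Dump'''

# Dot sourcing runs the file in the current scope, so anything it sets
# replaces the inline value; anything it leaves out keeps the inline value.
. $DotSrcConfig

# Clean up the demo file.
Remove-Item -LiteralPath $DotSrcConfig
```

After this runs, $DumpDirectory holds the value from the external file while $ytdlpExe still holds the inline default, which is the behavior you want when upgrading the script and keeping your settings.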
Command line parameters:
-Profile
: [File path] The path to the profile config file to use.
-Test
: [File path] A local file name to write a test RSS file to. This option will not upload anything remotely. It is useful for testing without messing up a public RSS feed.
-Force
: [Switch] Force the script to update the RSS file, even if it doesn't need to be updated. This is useful for testing or pushing changes.
-Logging
: [Switch] Output the console and variables to a text file in the $LogDirectory.
-LogDirectory
: [Directory path] The directory to save log files in if -Logging is enabled.
-LogFileNameFormat
: [Format string] A format string to set the name of the log files (ex: -LogFileNameFormat "{0}-{1}-{2}-genRSS_{3}.log"). See the available variables above.
If you're using Windows Task Scheduler, here are some examples of how to format the actions:
Basic download:
"C:\Program Files\PowerShell\7\pwsh.exe"
Arguments: -Command "& 'C:\Program Files\VideoLAN\VLC\SoundsDownloadScript.ps1' -ProgramURL https://www.bbc.co.uk/programmes/b006r9yq -SaveDir 'F:\Audio\The News Quiz' -ShortTitle TheNewsQuiz -TitleFormat '{1} - {2}' -Archive 0 -Logging -LogDirectory 'C:\Logs'"
"C:\Program Files\PowerShell\7\pwsh.exe"
Arguments: -Command "& 'C:\Program Files\VideoLAN\VLC\SoundsDownloadScript.ps1' -ProgramURL https://www.bbc.co.uk/programmes/b006qfvv -SaveDir 'F:\Audio\Shipping Forecast' -ShortTitle ShippingForecast -TitleFormat '{1} {4:HH:mm}' -rcloneConfig 'C:\Program Files\VideoLAN\VLC\rclone\rclone.conf' -rcloneSyncDir 'bbcsoundsrss_r2:bbcsoundsrss\ShippingForecast\media' -Archive 7 -Days -Logging"
"C:\Program Files\PowerShell\7\pwsh.exe"
Arguments: -Command "& 'C:\Program Files\VideoLAN\VLC\genRSS.ps1' -Profile 'E:\FeedProfiles\ShippingForecast' -Logging -LogDirectory 'C:\Logs'"
"C:\Program Files\PowerShell\7\pwsh.exe"
Arguments: -Command "& 'C:\Program Files\VideoLAN\VLC\SoundsDownloadScript.ps1' -ProgramURL https://www.bbc.co.uk/programmes/b0100rp6 -SaveDir 'F:\Audio\Radcliffe and Maconie' -ShortTitle RadMac -VPNConfig 'C:\Program Files\OpenVPN\config\uk_streaming.ovpn,C:\Program Files\OpenVPN\config\uk_southampton.ovpn,C:\Program Files\OpenVPN\config\uk_manchester.ovpn,C:\Program Files\OpenVPN\config\uk_london.ovpn' -rcloneConfig 'C:\Program Files\VideoLAN\VLC\rclone\rclone.conf' -rcloneSyncDir 'internetarchive_config:' -Archive 0"
Feel free to mess around in the code if you're comfortable with Powershell.
I tried to make SoundsDownloadScript.ps1 fairly modular and include comments to help, but there are a few things to keep in mind:
Audio file name: The naming format for the finished audio file is [ShortTitle]-[YYYYMMDD]-[Program ID]. If you decide to change that format, you could break some things unintentionally. For example, the script specifically looks for the 8-character Program ID in the file name to decide whether it's already been downloaded. genRSS.ps1 also relies on it for certain options. The script also uses a regex pattern to match files to delete if the -Archive parameter is set. Just be sure you update all of those things if you make changes to the file name format.
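To show why the file name format matters, here is a rough sketch of checking for an already-downloaded episode by its ID. The pattern and the Test-EpisodeDownloaded function are my own illustration, not the script's actual code, and the file names and IDs are made up. It assumes the ID is 8 lowercase letters or digits and allows for the underscore number that -Force appends.

```powershell
# Hypothetical pattern for [ShortTitle]-[YYYYMMDD]-[Program ID], with an
# optional _N suffix. This is an assumption, not the regex the script uses.
$FileNamePattern = '^(?<ShortTitle>.+)-(?<Date>\d{8})-(?<ProgramId>[0-9a-z]{8})(?:_\d+)?$'

function Test-EpisodeDownloaded ([string[]]$FileNames, [string]$ProgramId) {
    foreach ($Name in $FileNames) {
        # Compare against the name without its extension (.m4a, .mp3, ...)
        $BaseName = [System.IO.Path]::GetFileNameWithoutExtension($Name)
        if ($BaseName -match $FileNamePattern -and $Matches.ProgramId -eq $ProgramId) {
            return $true
        }
    }
    return $false
}

# Made-up file names in the documented format
$Existing = 'TheNewsQuiz-20241115-m0024xyz.m4a', 'TheNewsQuiz-20241108-m0024abc_1.m4a'

$AlreadyHave   = Test-EpisodeDownloaded $Existing 'm0024xyz'
$NeedsDownload = -not (Test-EpisodeDownloaded $Existing 'm0025def')
```

If you rename files to something that no longer ends with the ID, a check like this stops matching and the episode gets downloaded again.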
If rclone is enabled, the script will take a hash of the -SaveDir and save it to a user-level environment variable with the same name as the SaveDir. This tracks folder changes between runs. After running, it takes a hash of the SaveDir again and compares the two hashes to decide whether to run through the script blocks that call rclone.
You can test whether the environment variable is working properly by checking the $SaveDirHashIsFromEnvVar variable in the Console+Vars log. If it's $true, then the hash was successfully pulled from the environment variable. If it's $false, then it couldn't find the variable and it generated the hash on the fly.
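The before-and-after comparison can be sketched roughly like this. I'm assuming a hash built from each file's path, size, and last write time; the real script may hash the folder differently. Get-DirectoryHash and the demo folder are hypothetical, and saving the value to the user-level environment variable is left out.

```powershell
function Get-DirectoryHash ([string]$Path) {
    # Describe every file as "path|size|last write ticks", in a stable order
    $Listing = Get-ChildItem -LiteralPath $Path -File -Recurse |
        Sort-Object FullName |
        ForEach-Object { '{0}|{1}|{2}' -f $_.FullName, $_.Length, $_.LastWriteTimeUtc.Ticks }

    # Hash the combined listing with SHA-256 and return it as a hex string
    $Bytes = [System.Text.Encoding]::UTF8.GetBytes(($Listing -join "`n"))
    $Sha = [System.Security.Cryptography.SHA256]::Create()
    try { [System.BitConverter]::ToString($Sha.ComputeHash($Bytes)) -replace '-', '' }
    finally { $Sha.Dispose() }
}

# Demo folder standing in for -SaveDir
$SaveDir = Join-Path ([System.IO.Path]::GetTempPath()) ('SaveDirHashDemo-' + [guid]::NewGuid())
New-Item -ItemType Directory -Path $SaveDir | Out-Null

$HashBefore = Get-DirectoryHash $SaveDir
Set-Content -LiteralPath (Join-Path $SaveDir 'Demo-20241118-m0000000.m4a') -Value 'placeholder'
$HashAfter = Get-DirectoryHash $SaveDir

# Only bother running rclone when the folder actually changed
$RunRclone = $HashBefore -ne $HashAfter

Remove-Item -LiteralPath $SaveDir -Recurse -Force
```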
Certain special characters have to be escaped with a backslash (\) in kid3 or kid3 will choke. I don't think a complete list exists. SoundsDownloadScript escapes single quotes, double quotes, and pipes in a function called Format-kid3CommandString. If you find another character that needs to be escaped, you can add it to the Return line using the replace method.
Function Format-kid3CommandString ($StringToFormat) {
    Return $StringToFormat.replace("'","\'").replace("\`"","`"").replace("`"","\`"").replace("|","\|")
}
Note: You may also need to escape the special characters in Powershell for it to pass them correctly, especially quotes. Use a backtick (`) for that.
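Here is a quick way to see what the function does to a string. The function body is copied from above; the sample strings are made up. The second sample shows why the function first un-escapes \" before escaping double quotes: a string that is already escaped doesn't end up with doubled backslashes.

```powershell
Function Format-kid3CommandString ($StringToFormat) {
    Return $StringToFormat.replace("'","\'").replace("\`"","`"").replace("`"","\`"").replace("|","\|")
}

# A raw title with a single quote, double quotes, and a pipe
$Raw = 'It''s "live" | now'
$Escaped = Format-kid3CommandString $Raw

# A title whose double quotes were already escaped comes back unchanged
$AlreadyEscaped = 'say \"hi\"'
$EscapedAgain = Format-kid3CommandString $AlreadyEscaped
```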
SoundsDownloadScript uses the format operator (-f) to let you build dynamic title formats. The available options are stored in an array called $TitleFormatArray. The first one starts at {0}.
$TitleFormatArray = $TitleTable.'primary', $TitleTable.'secondary', $TitleTable.'tertiary', $ReleaseDate.ToUniversalTime(), [System.TimeZoneInfo]::ConvertTimeBySystemTimeZoneId($ReleaseDate, 'GMT Standard Time')
You can add more variables to the array to suit your title needs, but you should append them to the end so that they become {5}, {6}, etc. Otherwise it will shift the numbering down the line and potentially break your previous settings. You can use this to format the dates a different way, or add any other values from the json metadata. See this page and/or this page for more information on the format operator.
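As a small sketch of how the format operator picks values out of the array, the example below uses made-up title values and a fixed date in place of what the script reads from the BBC page. It also fills {4} with the UTC time to keep the demo free of a time zone lookup, which is a simplification on my part, and the 'Radio 4' entry is just a placeholder for whatever you decide to append.

```powershell
# Hypothetical stand-ins for values the script reads from the episode page
$TitleTable  = @{ primary = 'Shipping Forecast'; secondary = '18/11/2024'; tertiary = 'Early morning' }
$ReleaseDate = [datetime]::new(2024, 11, 18, 5, 20, 0, [System.DateTimeKind]::Utc)

# Same shape as the script's array: {0} to {2} are titles, {3} and {4} are dates
$TitleFormatArray = $TitleTable.'primary', $TitleTable.'secondary', $TitleTable.'tertiary', $ReleaseDate.ToUniversalTime(), $ReleaseDate.ToUniversalTime()

# New values go on the end so {0} through {4} keep their meaning
$TitleFormatArray += 'Radio 4'   # becomes {5}

# A plain format and one with a date format string and the appended value
$TitleA = '{0} - {1}' -f $TitleFormatArray
$TitleB = '{0} {4:HH:mm} ({5})' -f $TitleFormatArray
```

The same strings you test this way are what you would pass to -TitleFormat.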
To add other values, I recommend you set $Printjson to $true to view the data and paste it into an online json viewer to help you find a reference point to the value you're looking for. The script parses the metadata from json format into the $jsonData variable. Any of these fields can be used. Recalling a json value with powershell will look something like these examples:
$jsonData.modules.data[0].data.network.short_title
$jsonData.modules.data[0].data.duration.label
Once you've found the value you want, add it to the end of $TitleFormatArray.
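If it helps, here is a sketch of how those lookups work once the json has been parsed. The sample json is a trimmed-down shape I invented so the two example paths resolve. The real BBC metadata is much larger and may be laid out differently, so check it with $Printjson first.

```powershell
# Invented sample json, only detailed enough for the two example paths
$SampleJson = @'
{
  "modules": {
    "data": [
      {
        "data": {
          "network":  { "short_title": "Radio 4" },
          "duration": { "label": "28 mins" }
        }
      }
    ]
  }
}
'@

# ConvertFrom-Json turns the text into nested objects and arrays
$jsonData = $SampleJson | ConvertFrom-Json

# Walk down the structure with dots, and index into arrays with [n]
$NetworkName   = $jsonData.modules.data[0].data.network.short_title
$DurationLabel = $jsonData.modules.data[0].data.duration.label
```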
I heavily modified this script to create genRSS.ps1. I don't really understand the XML writer part of it, only that it seems to work. The whole script is messy and very inefficient. Be sure to test THOROUGHLY before putting changes to genRSS into production because you can screw up your subscribers' feeds. Good luck!
PSP-1: The Podcast RSS Standard: I am trying to make the RSS output as PSP-1 compliant as I can. If a required element is missing or something isn't implemented according to the standard, open an issue on github.
Global logging: Logging can be enabled in each profile template or on the command line with the -Logging switch. It can also be forced on or off globally by adding $Logging = $true or $Logging = $false to the configuration options section in genRSS.ps1. To force logging on for all instances, set it to $true. To disable it for all instances, set it to $false. If set to $false, it overrides any Logging=yes entries in the profile templates and any -Logging command line parameters, which effectively forces genRSS not to log. To stop forcing logging on or off and hand the choice back to the profiles or command line, just remove or comment out the $Logging variable.
With the way the script is laid out, I couldn't figure out how to gracefully determine if the $Logging switch was explicitly set to $false in the script. I got stubborn and chose to use Select-String to test a regex pattern:
Select-String -Path $PSCommandPath -Pattern '^[\s*]*(\$Logging)[\s*]*=[\s*]*(\$false)(\s*$|\s*#.*)'
Basically it searches the script for a $Logging = $false string. It works, but it's not efficient, goes against how powershell should work, and I don't like it. I know there's a better way. If you know what it is, open an issue. I'm all ears.
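To see which lines that pattern catches, you can try it against a few sample strings. This sketch uses the -match operator on in-memory strings instead of Select-String on the script file, purely so it is easy to run. Both use the same regex engine and ignore case by default. The sample lines are made up.

```powershell
# The same pattern used with Select-String above
$Pattern = '^[\s*]*(\$Logging)[\s*]*=[\s*]*(\$false)(\s*$|\s*#.*)'

$SampleLines = @(
    '$Logging = $false',                           # plain assignment
    '    $Logging=$false   # force logging off',   # indented, with a trailing comment
    '#$Logging = $false',                          # commented out
    '$Logging = $true'                             # forced on, not off
)

# One True/False result per sample line
$Results = foreach ($Line in $SampleLines) { $Line -match $Pattern }
```

Only the first two lines match, so a commented-out $Logging = $false doesn't count as forcing logging off.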
Having correct MP4/ID3 tags is crucial for both scripts to "work" together. SoundsDownloadScript.ps1 sets way more tags than this, but these are the important ones for genRSS.ps1 to work properly:
SoundsDownloadScript.ps1 uses kid3 to set tags. If you want to change them or add additional ones, The Kid3 Handbook has a breakdown of the tags it supports. genRSS.ps1 also uses kid3 to read tags. kid3 outputs the tags in json format and the script then parses them. If you're having trouble, you can run
.\kid3-cli.exe -c '{\"method\":\"get\"}' '[FILE]'
to see exactly what kid3 is reading from the audio file and passing to the script. Sometimes special characters can trip it up and it might be necessary to add another replace for the $StringToFormat variable in the Format-kid3CommandString function. It will probably take some troubleshooting. Turn on logging to help.
For help with SoundsDownloadScript.ps1 or genRSS.ps1:
Copyright © 2024 endkb (https://github.com/endkb)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.