Quick journey with ffmpeg: improving audio stream with loudnorm
Hey hey! Tonight I feel like trying a very quick script generation: the goal will be to have an NVP command that I can use simply to renormalize the sound stream in a given input video file. This could be very handy to post-process my devlog recordings for instance.
Creating a skeleton script
- As usual we start with the creation of a dummy script command doing nothing, that will be:
nvp norm-sound -i my_video.mkv
- We will place this new command in the nvp/media/movie_handler.py component.
- So here is the initial command parser descriptor:
    psr = context.build_parser("norm-sound")
    psr.add_str("-i", "--input", dest="input_file")("input video file to normalize")
    psr.add_str("-o", "--output", dest="output_file")("Output file to generate.")
    psr.add_float("-g", "--gain", dest="gain")("Volume gain factor")
- And here is the initial handling for this command:
    if cmd == "norm-sound":
        file = self.get_param("input_file")
        out_file = self.get_param("output_file", None)
        # return self.concat_media(files, out_file)
        logger.info("Should normalize sound for %s", file)
        return True
- And trying to run this command works as expected:
    D:\Temp\Videos\youtube>nvp norm-sound -i 0010_rendernode_impl_session01.mkv
    2023/02/20 21:11:49 [__main__] INFO: Should normalize sound for 0010_rendernode_impl_session01.mkv
    D:\Temp\Videos\youtube>
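For anyone reading along without the NVP framework, the parser descriptor above maps fairly directly onto the standard argparse module. This is only a rough sketch: build_norm_sound_parser is a hypothetical name, and I'm assuming that add_str/add_float behave like add_argument with type=str/type=float.

```python
import argparse


def build_norm_sound_parser() -> argparse.ArgumentParser:
    """Standalone approximation of the NVP 'norm-sound' parser (hypothetical helper)."""
    psr = argparse.ArgumentParser(prog="norm-sound")
    psr.add_argument("-i", "--input", dest="input_file", type=str,
                     help="input video file to normalize")
    psr.add_argument("-o", "--output", dest="output_file", type=str,
                     help="Output file to generate.")
    # No default yet at this stage, so the gain is None when -g is omitted.
    psr.add_argument("-g", "--gain", dest="gain", type=float,
                     help="Volume gain factor")
    return psr


# Parse the same arguments as in the example command line.
args = build_norm_sound_parser().parse_args(["-i", "my_video.mkv"])
print(args.input_file, args.output_file, args.gain)
```

Passing an explicit argument list to parse_args() makes it easy to try out without touching sys.argv.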
Adding some meat to the handler
- Next we should implement the actual handling system for this command, so we create a dedicated norm_sound function.
- So, now I'm providing a default value for the volume gain:
psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
- And we also pass this gain value to our new norm_sound() method:
    if cmd == "norm-sound":
        file = self.get_param("input_file")
        out_file = self.get_param("output_file", None)
        gain = self.get_param("gain")
        return self.norm_sound(file, out_file, gain)
- The first version of the norm_sound() method only sets up the default output file for now:

    def norm_sound(self, input_file, out_file, gain):
        """Normalize the audio stream from a given input media file using the user provided gain"""
        if out_file is None:
            # folder = self.get_parent_folder(input_file)
            # fname = self.get_filename(input_file)
            ext = self.get_path_extension(input_file)
            out_file = self.set_path_extension(input_file, f"_normed{ext}")

        logger.info("Should norm sound in %s and write %s", input_file, out_file)
        return True
- And finally, we can add the correct processing:
    def norm_sound(self, input_file, out_file, gain):
        """Normalize the audio stream from a given input media file using the user provided gain"""
        if out_file is None:
            # folder = self.get_parent_folder(input_file)
            # fname = self.get_filename(input_file)
            ext = self.get_path_extension(input_file)
            out_file = self.set_path_extension(input_file, f"_normed{ext}")

        logger.info("Should norm sound in %s and write %s", input_file, out_file)

        tools: ToolsManager = self.get_component("tools")
        ffmpeg_path = tools.get_tool_path("ffmpeg")

        # Force a known audio format, apply the gain, then run loudnorm.
        filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
        cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
        cmd += ["-filter_complex", f"{filter_str}"]
        # Copy the video stream as is, and overwrite any existing output file.
        cmd += ["-c:v", "copy", "-y", out_file]

        logger.info("Executing command: %s", cmd)
        res, rcode, outs = self.execute(cmd)
        if not res:
            logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
            return False

        logger.info("Done writing file.")
        return True
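The same command line can also be assembled outside of NVP with only the standard library. Here is a minimal sketch, assuming ffmpeg is reachable on the PATH; default_output_path, build_norm_command and run_norm are hypothetical names that are not part of NVP, and the loudnorm values are the ones used in the handler above.

```python
import subprocess
from pathlib import Path


def default_output_path(input_file: str, suffix: str = "_normed") -> str:
    """Insert a suffix before the extension: video.mkv -> video_normed.mkv."""
    src = Path(input_file)
    return str(src.with_name(f"{src.stem}{suffix}{src.suffix}"))


def build_norm_command(input_file, out_file=None, gain=1.0, ffmpeg_path="ffmpeg",
                       loudness=-18.0, true_peak=-1.7, lra=10.0):
    """Build the ffmpeg argument list without running anything."""
    if out_file is None:
        out_file = default_output_path(input_file)
    filter_str = (
        f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},"
        f"loudnorm=I={loudness:g}:TP={true_peak:g}:LRA={lra:g}"
    )
    return [ffmpeg_path, "-threads", "8", "-i", input_file,
            "-filter_complex", filter_str,
            "-c:v", "copy", "-y", out_file]


def run_norm(cmd) -> bool:
    """Run the command and report success based on the process return code."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return proc.returncode == 0


# Only print the command here; call run_norm(cmd) to actually execute it.
cmd = build_norm_command("0010_rendernode_impl_session01.mkv", gain=2.0)
print(cmd)
```

Keeping the command construction separate from the execution makes it possible to check the generated arguments without having ffmpeg installed.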
Testing the handler
- Now time to run a first test of that handler:
nvp norm-sound -i 0010_rendernode_impl_session01.mkv -g 2.0
- This does generate a file called 0010_rendernode_impl_session01_normed.mkv, and the volume there is significantly higher.
- Now, I just found this other page on the loudnorm filter: http://johnriselvato.com/ffmpeg-how-to-normalize-audio/
- So the suggestion there is to use instead the settings:
$ ffmpeg -i input.mp3 -af loudnorm=I=-16:LRA=11:TP=-1.5 output.mp3
- At the same time, I think I should also change the naming convention to something like .lnorm.mkv, so let's update accordingly:
    def norm_sound(self, input_file, out_file, gain):
        """Normalize the audio stream from a given input media file using the user provided gain"""
        if out_file is None:
            # folder = self.get_parent_folder(input_file)
            # fname = self.get_filename(input_file)
            ext = self.get_path_extension(input_file)
            out_file = self.set_path_extension(input_file, f".lnorm{ext}")

        logger.info("Should norm sound in %s and write %s", input_file, out_file)

        tools: ToolsManager = self.get_component("tools")
        ffmpeg_path = tools.get_tool_path("ffmpeg")

        # filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
        filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-16:LRA=11:TP=-1.5"
        cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
        cmd += ["-filter_complex", f"{filter_str}"]
        cmd += ["-c:v", "copy", "-y", out_file]

        logger.info("Executing command: %s", cmd)
        res, rcode, outs = self.execute(cmd)
        if not res:
            logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
            return False

        logger.info("Done writing file.")
        return True
- And now we are good
This is working just fine for me 😁
- Side note: after some additional tests I simply decided to reduce the volume gain a bit, from 2.0 to 1.8
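To get a feel for what those gain factors mean: a plain number given to the ffmpeg volume filter is a linear amplitude multiplier, so it can be converted to decibels with 20·log10(gain). A tiny sketch (gain_to_db is a hypothetical helper name; this only describes the volume stage, the loudnorm stage that follows adjusts the levels further):

```python
import math


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude gain factor to decibels."""
    if gain <= 0.0:
        raise ValueError("gain must be strictly positive")
    return 20.0 * math.log10(gain)


for g in (1.0, 1.8, 2.0):
    print(f"gain {g:.1f} -> {gain_to_db(g):+.2f} dB")
# gain 1.0 -> +0.00 dB
# gain 1.8 -> +5.11 dB
# gain 2.0 -> +6.02 dB
```

So going from 2.0 down to 1.8 is a bit less than 1 dB of difference before the loudnorm stage.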
Adding support for video compression
- In fact, I'm now wondering if I should not also add additional video compression on my devlog video files. So in effect, this means I want to use my existing convert_video script (I think?). Let's see how I could do that…
- So, I updated my convert command to support preset/gain/crf arguments:
    psr = context.build_parser("convert")
    psr.add_flag("-c", "--concat-only", dest="concat_only")("Only perform the GOPRO concatenation part.")
    psr.add_flag("-s", "--stabilize", dest="stabilize")("Stabilize the video")
    psr.add_str("-l", "--lens", dest="correct_lens", default="auto")("Lens correction to apply")
    psr.add_flag("-n", "--norm-audio", dest="normalize_audio")("Normalize audio stream")
    psr.add_flag("-p", "--sharpen", dest="sharpen")("Sharpen the image")
    psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
    psr.add_int("--crf", dest="crf", default=21)(
        "Video CRF value (select the highest value that provides acceptable quality. 29 <=> x264 value 23)"
    )
    psr.add_str(
        "--preset",
        dest="preset",
        default="slow",
        choices=["ultrafast", "superfast", "veryfast", "faster", "fast", "medium", "slow", "slower", "veryslow"],
    )("Video compression preset")
- And then I started to use those parameters, now also using libvorbis for the audio stream:
cmd += f"-ignore_unknown -map 0 -dn -c:v libx265 -crf {crf} -preset {preset} -c:a libvorbis -qscale:a 5 -scodec copy".split()
- First test was with the command line:
nvp vconv convert --crf 25 --preset slower -g 1.8 -n
- ⇒ This is way too slow.
- Testing:
nvp vconv convert --crf 28 --preset medium -g 1.8 -n
- ⇒ This is a lot faster, and the quality is good enough for devlogs, and the file is smaller, so that's what I will now use by default for “non high quality” needs 👍!
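The way the crf/preset values end up in the ffmpeg arguments can be sketched as a small standalone helper. build_encode_args is a hypothetical name (the real logic lives inside my convert script), and the preset list is simply the one from the parser choices above:

```python
PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
           "medium", "slow", "slower", "veryslow"]


def build_encode_args(crf: int = 21, preset: str = "slow") -> list:
    """Return the encoding part of the ffmpeg command line as a list of arguments."""
    if preset not in PRESETS:
        raise ValueError(f"Unsupported preset: {preset}")
    args = (
        f"-ignore_unknown -map 0 -dn -c:v libx265 -crf {crf} -preset {preset} "
        "-c:a libvorbis -qscale:a 5 -scodec copy"
    )
    return args.split()


# Settings retained for the "non high quality" use case.
print(build_encode_args(crf=28, preset="medium"))
```

Splitting on whitespace only works here because none of the arguments contains a space; file paths should be appended to the list separately rather than embedded in the string.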
Conclusion
- I told you this would be a quick one, but it was still longer than anticipated since I added the video compression part too lol, but anyway, now I'm done and I can use the vconv convert script to efficiently compress my devlog videos, all good 😉.