====== Quick journey with ffmpeg: improving audio stream with loudnorm ======

{{tag>dev python ffmpeg audio loudnorm}}

Hey hey! Tonight I feel like trying a very quick script generation: the goal will be to have an NVP command that I can use simply to renormalize the sound stream in a given input video file. This could be very handy to post-process my devlog recordings for instance.

====== ======

===== Creating a skeleton script =====

  * As usual we start with the creation of a dummy script command doing nothing, that will be: ''nvp norm-sound -i my_video.mkv''
  * We will place this new command in the **nvp/media/movie_handler.py** component.
  * So here is the initial command parser descriptor: <code python>
psr = context.build_parser("norm-sound")
psr.add_str("-i", "--input", dest="input_file")("input video file to normalize")
psr.add_str("-o", "--output", dest="output_file")("Output file to generate.")
psr.add_float("-g", "--gain", dest="gain")("Volume gain factor")
</code>
  * And here is the initial handling for this command: <code python>
if cmd == "norm-sound":
    file = self.get_param("input_file")
    out_file = self.get_param("output_file", None)
    # return self.concat_media(files, out_file)
    logger.info("Should normalize sound for %s", file)
    return True
</code>
  * And trying to run this command works as expected: <code>
D:\Temp\Videos\youtube>nvp norm-sound -i 0010_rendernode_impl_session01.mkv
2023/02/20 21:11:49 [__main__] INFO: Should normalize sound for 0010_rendernode_impl_session01.mkv
D:\Temp\Videos\youtube>
</code>

===== Adding some meat to the handler =====

  * Next we should implement the actual handling system for this command, so we create a dedicated **norm_sound** function.
  * So, now I'm providing a default value for the volume gain: <code python>
psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
</code>
  * And we also pass this gain value to our new ''norm_sound()'' method: <code python>
if cmd == "norm-sound":
    file = self.get_param("input_file")
    out_file = self.get_param("output_file", None)
    gain = self.get_param("gain")
    return self.norm_sound(file, out_file, gain)
</code>
  * The first version of the ''norm_sound()'' method only sets up the default output file for now: <code python>
def norm_sound(self, input_file, out_file, gain):
    """Normalize the audio stream from a given input media file using the user provided gain"""
    if out_file is None:
        # folder = self.get_parent_folder(input_file)
        # fname = self.get_filename(input_file)
        ext = self.get_path_extension(input_file)
        out_file = self.set_path_extension(input_file, f"_normed{ext}")

    logger.info("Should norm sound in %s and write %s", input_file, out_file)
    return True
</code>
  * And finally, we can add the actual processing: <code python>
def norm_sound(self, input_file, out_file, gain):
    """Normalize the audio stream from a given input media file using the user provided gain"""
    if out_file is None:
        # folder = self.get_parent_folder(input_file)
        # fname = self.get_filename(input_file)
        ext = self.get_path_extension(input_file)
        out_file = self.set_path_extension(input_file, f"_normed{ext}")

    logger.info("Should norm sound in %s and write %s", input_file, out_file)

    tools: ToolsManager = self.get_component("tools")
    ffmpeg_path = tools.get_tool_path("ffmpeg")

    # Convert the first audio stream to float planar / 44.1kHz / stereo,
    # apply the user gain, then run the loudnorm filter:
    filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"

    cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
    cmd += ["-filter_complex", filter_str]
    # The video stream is copied as is, and any existing output file is overwritten:
    cmd += ["-c:v", "copy", "-y", out_file]

    logger.info("Executing command: %s", cmd)
    res, rcode, outs = self.execute(cmd)

    if not res:
        logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
        return False

    logger.info("Done writing file.")
    return True
</code>

===== Testing the handler =====

  * Now time to run a first test of that handler: <code>
nvp norm-sound -i 0010_rendernode_impl_session01.mkv -g 2.0
</code>
  * This does generate a file called **0010_rendernode_impl_session01_normed.mkv**, and the volume there is indeed significantly higher.
  * Now, I just found this other page on the loudnorm filter: http://johnriselvato.com/ffmpeg-how-to-normalize-audio/
  * So the suggestion there is to use these settings instead (''I'' is the integrated loudness target in LUFS, ''LRA'' the loudness range target in LU, and ''TP'' the maximum true peak in dBTP): <code>
$ ffmpeg -i input.mp3 -af loudnorm=I=-16:LRA=11:TP=-1.5 output.mp3
</code>
  * At the same time I think I should also change the naming convention to something like **.lnorm.mkv**, so let's update accordingly: <code python>
def norm_sound(self, input_file, out_file, gain):
    """Normalize the audio stream from a given input media file using the user provided gain"""
    if out_file is None:
        # folder = self.get_parent_folder(input_file)
        # fname = self.get_filename(input_file)
        ext = self.get_path_extension(input_file)
        out_file = self.set_path_extension(input_file, f".lnorm{ext}")

    logger.info("Should norm sound in %s and write %s", input_file, out_file)

    tools: ToolsManager = self.get_component("tools")
    ffmpeg_path = tools.get_tool_path("ffmpeg")

    # filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
    filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-16:LRA=11:TP=-1.5"

    cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
    cmd += ["-filter_complex", filter_str]
    cmd += ["-c:v", "copy", "-y", out_file]

    logger.info("Executing command: %s", cmd)
    res, rcode, outs = self.execute(cmd)

    if not res:
        logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
        return False

    logger.info("Done writing file.")
    return True
</code>
  * And now we are good :-) This is working just fine for me 😁
  * **Side note**: After some additional tests I simply decided to reduce the volume gain a bit, **from 2.0 to 1.8**.

===== Adding support for video compression =====

  * In fact, I'm now wondering if I should not also add some video compression for my devlog video files. In effect, this means I want to use my existing **convert_video** script (I think?). Let's see how I could do that...
  * So, I updated my convert command to support preset/gain/crf arguments: <code python>
psr = context.build_parser("convert")
psr.add_flag("-c", "--concat-only", dest="concat_only")("Only perform the GOPRO concatenation part.")
psr.add_flag("-s", "--stabilize", dest="stabilize")("Stabilize the video")
psr.add_str("-l", "--lens", dest="correct_lens", default="auto")("Lens correction to apply")
psr.add_flag("-n", "--norm-audio", dest="normalize_audio")("Normalize audio stream")
psr.add_flag("-p", "--sharpen", dest="sharpen")("Sharpen the image")
psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
psr.add_int("--crf", dest="crf", default=21)(
    "Video CRF value (select the highest value that provides acceptable quality). 29 <=> x264 value 23"
)
psr.add_str(
    "--preset",
    dest="preset",
    default="slow",
    choices=["ultrafast", "superfast", "veryfast", "faster", "fast", "medium", "slow", "slower", "veryslow"],
)("Video compression preset")
</code>
  * And then I started to use those parameters, now also encoding the audio stream with libvorbis: <code python>
cmd += f"-ignore_unknown -map 0 -dn -c:v libx265 -crf {crf} -preset {preset} -c:a libvorbis -qscale:a 5 -scodec copy".split()
</code>
  * First test was with the command line: <code>
nvp vconv convert --crf 25 --preset slower -g 1.8 -n
</code>
  * => This is way too slow.
  * Testing: <code>
nvp vconv convert --crf 28 --preset medium -g 1.8 -n
</code>
  * => This is a lot faster, the quality is good enough for devlogs, and the file is smaller, so that's what I will now use by default for "non high quality" needs 👍!

===== Conclusion =====

  * I told you this would be a quick one, but it was still longer than anticipated since I added the video compression part too lol. Anyway, now I'm done and I can use the ''vconv convert'' script to efficiently compress my devlog videos, all good 😉.
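
For anyone who wants to try the loudnorm step without the NVP framework, here is a minimal standalone sketch of the same idea using only the Python standard library. The function names (''default_output_path'', ''build_loudnorm_cmd'', ''norm_sound'') are made up for this illustration and are not part of NVP, and the sketch assumes an ''ffmpeg'' executable is available on the PATH. The filter string and the ''.lnorm'' naming follow the final version of the handler described in this post.

```python
import subprocess
from pathlib import Path


def default_output_path(input_file, tag="lnorm"):
    """Hypothetical helper: 'video.mkv' becomes 'video.lnorm.mkv'."""
    path = Path(input_file)
    return str(path.with_name(f"{path.stem}.{tag}{path.suffix}"))


def build_loudnorm_cmd(input_file, out_file=None, gain=1.0,
                       ffmpeg_path="ffmpeg", i=-16.0, lra=11.0, tp=-1.5):
    """Build the ffmpeg argument list without executing anything."""
    if out_file is None:
        out_file = default_output_path(input_file)

    # Same chain as in the post: format conversion, gain, then loudnorm.
    filter_str = (
        f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},"
        f"loudnorm=I={i:g}:LRA={lra:g}:TP={tp:g}"
    )

    # The video stream is copied untouched; -y overwrites an existing output.
    return [ffmpeg_path, "-threads", "8", "-i", input_file,
            "-filter_complex", filter_str,
            "-c:v", "copy", "-y", out_file]


def norm_sound(input_file, out_file=None, gain=1.0):
    """Run ffmpeg and report success (assumes ffmpeg is on the PATH)."""
    cmd = build_loudnorm_cmd(input_file, out_file, gain)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        print(f"Sound normalization failed ({result.returncode}):\n{result.stderr}")
        return False
    return True
```

Calling ''norm_sound("my_video.mkv", gain=1.8)'' should then behave roughly like ''nvp norm-sound -i my_video.mkv -g 1.8''. Keeping the command construction separate from the execution makes it easy to print or check the generated command line before actually running ffmpeg.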