====== Quick journey with ffmpeg: improving audio stream with loudnorm ======
{{tag>dev python ffmpeg audio loudnorm}}
Hey hey! Tonight I feel like writing a very quick script: the goal is to have an NVP command that I can use to simply renormalize the audio stream of a given input video file. This could be very handy to post-process my devlog recordings, for instance.
===== Creating a skeleton script =====
* As usual we start with the creation of a dummy script command doing nothing, that will be: ''nvp norm-sound -i my_video.mkv''
* We will place this new command in the **nvp/media/movie_handler.py** component.
* So here is the initial command parser descriptor:
  psr = context.build_parser("norm-sound")
  psr.add_str("-i", "--input", dest="input_file")("input video file to normalize")
  psr.add_str("-o", "--output", dest="output_file")("Output file to generate.")
  psr.add_float("-g", "--gain", dest="gain")("Volume gain factor")
* And here is the initial handling for this command:
  if cmd == "norm-sound":
      file = self.get_param("input_file")
      out_file = self.get_param("output_file", None)
      # return self.concat_media(files, out_file)
      logger.info("Should normalize sound for %s", file)
      return True
* And trying to run this command works as expected:
  D:\Temp\Videos\youtube>nvp norm-sound -i 0010_rendernode_impl_session01.mkv
  2023/02/20 21:11:49 [__main__] INFO: Should normalize sound for 0010_rendernode_impl_session01.mkv
  D:\Temp\Videos\youtube>
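For readers without the NVP framework at hand, the same three options could be approximated with the standard ''argparse'' module. This is only a sketch: ''build_norm_sound_parser'' is a hypothetical name, the default gain of 1.0 is an assumption, and NVP's own ''context.build_parser'' may well behave differently.

```python
import argparse


def build_norm_sound_parser():
    """Hypothetical argparse stand-in for the NVP 'norm-sound' parser."""
    psr = argparse.ArgumentParser(prog="norm-sound")
    psr.add_argument("-i", "--input", dest="input_file", type=str,
                     help="Input video file to normalize")
    psr.add_argument("-o", "--output", dest="output_file", type=str, default=None,
                     help="Output file to generate")
    # Assumption: a neutral gain of 1.0 when the user does not provide one.
    psr.add_argument("-g", "--gain", dest="gain", type=float, default=1.0,
                     help="Volume gain factor")
    return psr


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_norm_sound_parser().parse_args(["-i", "my_video.mkv", "-g", "2.0"])
print(args.input_file, args.output_file, args.gain)
```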
===== Adding some meat to the handler =====
* Next we should implement the actual handling system for this command, so we create a dedicated **norm_sound** function.
* So, now I'm providing a default value for the volume gain:
  psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
* And we also pass this gain value to our new norm_sound() method:
  if cmd == "norm-sound":
      file = self.get_param("input_file")
      out_file = self.get_param("output_file", None)
      gain = self.get_param("gain")
      return self.norm_sound(file, out_file, gain)
* The first version of the ''norm_sound()'' method only sets up the default output file for now:
  def norm_sound(self, input_file, out_file, gain):
      """Normalize the audio stream from a given input media file using the user provided gain"""
      if out_file is None:
          # folder = self.get_parent_folder(input_file)
          # fname = self.get_filename(input_file)
          ext = self.get_path_extension(input_file)
          out_file = self.set_path_extension(input_file, f"_normed{ext}")
      logger.info("Should norm sound in %s and write %s", input_file, out_file)
      return True
* And finally, we can add the correct processing:
  def norm_sound(self, input_file, out_file, gain):
      """Normalize the audio stream from a given input media file using the user provided gain"""
      if out_file is None:
          # folder = self.get_parent_folder(input_file)
          # fname = self.get_filename(input_file)
          ext = self.get_path_extension(input_file)
          out_file = self.set_path_extension(input_file, f"_normed{ext}")
      logger.info("Should norm sound in %s and write %s", input_file, out_file)
      tools: ToolsManager = self.get_component("tools")
      ffmpeg_path = tools.get_tool_path("ffmpeg")
      filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
      cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
      cmd += ["-filter_complex", f"{filter_str}"]
      cmd += ["-c:v", "copy", "-y", out_file]
      logger.info("Executing command: %s", cmd)
      res, rcode, outs = self.execute(cmd)
      if not res:
          logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
          return False
      logger.info("Done writing file.")
      return True
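Outside of NVP, the same handler logic could be sketched with only the standard library. The helper names below (''default_output_path'', ''build_norm_command'', ''run_norm'') are hypothetical, and ''subprocess.run'' stands in for the NVP ''self.execute()'' call, whose exact behavior is not shown here. Building the argument list is kept separate from running it, so the command can be inspected without ffmpeg installed.

```python
import os
import subprocess


def default_output_path(input_file, tag="_normed"):
    """Insert a tag before the extension: video.mkv -> video_normed.mkv."""
    base, ext = os.path.splitext(input_file)
    return f"{base}{tag}{ext}"


def build_norm_command(ffmpeg_path, input_file, out_file=None, gain=1.0):
    """Build the ffmpeg argument list without running anything."""
    if out_file is None:
        out_file = default_output_path(input_file)
    # Same filter chain as in the handler: format, linear gain, then loudnorm.
    filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
    cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
    cmd += ["-filter_complex", filter_str]
    # Copy the video stream untouched and overwrite any existing output file.
    cmd += ["-c:v", "copy", "-y", out_file]
    return cmd


def run_norm(ffmpeg_path, input_file, out_file=None, gain=1.0):
    """Run ffmpeg and return True on success; stderr is captured for reporting."""
    cmd = build_norm_command(ffmpeg_path, input_file, out_file, gain)
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if proc.returncode != 0:
        print(f"Sound normalization failed with return code {proc.returncode}:\n{proc.stderr}")
        return False
    return True


# Only print the command here; calling run_norm() requires ffmpeg and a real input file.
print(build_norm_command("ffmpeg", "session01.mkv", gain=2.0))
```

Passing the command as a list rather than a single string avoids shell quoting issues with file names that contain spaces.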
===== Testing the handler =====
* Now time to run a first test of that handler:
  nvp norm-sound -i 0010_rendernode_impl_session01.mkv -g 2.0
* This generates a file called **0010_rendernode_impl_session01_normed.mkv** as expected, and the volume there is indeed significantly higher.
* Now, I just found this other page on the loudnorm filter: http://johnriselvato.com/ffmpeg-how-to-normalize-audio/
* So the suggestion there is to use these settings instead:
  $ ffmpeg -i input.mp3 -af loudnorm=I=-16:LRA=11:TP=-1.5 output.mp3
* At the same time I think I should also change the naming convention to something like **.lnorm.mkv**, so let's update accordingly:
  def norm_sound(self, input_file, out_file, gain):
      """Normalize the audio stream from a given input media file using the user provided gain"""
      if out_file is None:
          # folder = self.get_parent_folder(input_file)
          # fname = self.get_filename(input_file)
          ext = self.get_path_extension(input_file)
          out_file = self.set_path_extension(input_file, f".lnorm{ext}")
      logger.info("Should norm sound in %s and write %s", input_file, out_file)
      tools: ToolsManager = self.get_component("tools")
      ffmpeg_path = tools.get_tool_path("ffmpeg")
      # filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
      filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-16:LRA=11:TP=-1.5"
      cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
      cmd += ["-filter_complex", f"{filter_str}"]
      cmd += ["-c:v", "copy", "-y", out_file]
      logger.info("Executing command: %s", cmd)
      res, rcode, outs = self.execute(cmd)
      if not res:
          logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
          return False
      logger.info("Done writing file.")
      return True
* And now we are good :-) This is working just fine for me 😁
* **Side note**: After some additional tests I simply decided to reduce the volume gain a bit, **from 2.0 to 1.8**
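To get a feel for what a gain of 2.0 versus 1.8 means, the linear factor given to the ffmpeg ''volume'' filter can be converted to decibels with the usual amplitude formula, 20·log10(gain). The ''gain_to_db'' helper below is a hypothetical name used only for this illustration.

```python
import math


def gain_to_db(gain):
    """Convert a linear amplitude gain factor to decibels."""
    if gain <= 0:
        raise ValueError("gain must be strictly positive")
    return 20.0 * math.log10(gain)


# Compare the neutral gain with the two values tried in this post.
for factor in (1.0, 1.8, 2.0):
    print(f"gain {factor:.1f} -> {gain_to_db(factor):+.2f} dB")
```

So going from 2.0 down to 1.8 removes a bit less than 1 dB of boost ahead of the ''loudnorm'' stage. The ''volume'' filter also accepts a value with a ''dB'' suffix if that notation is more convenient.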
===== Adding support for video compression =====
* In fact, I'm now wondering if I should not also add additional video compression on my devlog video files. So in effect, this means I want to use my existing **convert_video** script (I think?). Let's see how I could do that...
* So, I updated my convert command to support preset/gain/crf arguments:
  psr = context.build_parser("convert")
  psr.add_flag("-c", "--concat-only", dest="concat_only")("Only perform the GOPRO concatenation part.")
  psr.add_flag("-s", "--stabilize", dest="stabilize")("Stabilize the video")
  psr.add_str("-l", "--lens", dest="correct_lens", default="auto")("Lens correction to apply")
  psr.add_flag("-n", "--norm-audio", dest="normalize_audio")("Normalize audio stream")
  psr.add_flag("-p", "--sharpen", dest="sharpen")("Sharpen the image")
  psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
  psr.add_int("--crf", dest="crf", default=21)(
      "Video CRF value (select the highest value that provides acceptable quality. 29 <=> x264 value 23)"
  )
  psr.add_str(
      "--preset",
      dest="preset",
      default="slow",
      choices=["ultrafast", "superfast", "veryfast", "faster", "fast", "medium", "slow", "slower", "veryslow"],
  )("Video compression preset")
* And then I started to use those parameters, now also using libvorbis for the audio stream:
  cmd += f"-ignore_unknown -map 0 -dn -c:v libx265 -crf {crf} -preset {preset} -c:a libvorbis -qscale:a 5 -scodec copy".split()
* First test was with the command line:
  nvp vconv convert --crf 25 --preset slower -g 1.8 -n
* => This is way too slow.
* Testing:
  nvp vconv convert --crf 28 --preset medium -g 1.8 -n
* => This is a lot faster, the quality is good enough for devlogs, and the file is smaller, so that's what I will now use by default for "non high quality" needs 👍!
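The encoding arguments used above can be sketched as a small standalone helper, which makes it easy to compare the two tested configurations side by side. ''build_codec_args'' is a hypothetical name, and the CRF range check (0 to 51) is an assumption based on the range libx265 commonly accepts.

```python
PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
           "medium", "slow", "slower", "veryslow"]


def build_codec_args(crf=21, preset="slow"):
    """Return the ffmpeg codec arguments for an x265 video / vorbis audio encode."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    # Assumption: valid CRF values lie in the 0..51 range.
    if not 0 <= crf <= 51:
        raise ValueError(f"crf out of range: {crf}")
    args = f"-ignore_unknown -map 0 -dn -c:v libx265 -crf {crf} -preset {preset}"
    args += " -c:a libvorbis -qscale:a 5 -scodec copy"
    return args.split()


# The two configurations tested in this post.
print(build_codec_args(25, "slower"))
print(build_codec_args(28, "medium"))
```

A higher CRF trades quality for size, while a slower preset trades encoding time for better compression at the same CRF, which matches the speed difference observed between the two runs.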
===== Conclusion =====
* I told you this would be a quick one, but it still took longer than anticipated since I added the video compression part too, lol. Anyway, now I'm done and I can use the ''vconv convert'' script to efficiently compress my devlog videos. All good 😉.