Quick journey with ffmpeg: improving audio stream with loudnorm

Hey hey! Tonight I feel like trying a very quick script generation: the goal is to have an NVP command that I can use to renormalize the audio stream of a given input video file. This could be very handy for post-processing my devlog recordings, for instance.

  • As usual we start with the creation of a dummy script command doing nothing, that will be: nvp norm-sound -i my_video.mkv
  • We will place this new command in the nvp/media/movie_handler.py component.
  • So here is the initial command parser descriptor:
        psr = context.build_parser("norm-sound")
        psr.add_str("-i", "--input", dest="input_file")("input video file to normalize")
        psr.add_str("-o", "--output", dest="output_file")("Output file to generate.")
        psr.add_float("-g", "--gain", dest="gain")("Volume gain factor")
  • And here is the initial handling for this command:
            if cmd == "norm-sound":
                file = self.get_param("input_file")
                out_file = self.get_param("output_file", None)
                # return self.concat_media(files, out_file)
                logger.info("Should normalize sound for %s", file)
                return True
  • And trying to run this command works as expected:
    D:\Temp\Videos\youtube>nvp norm-sound -i 0010_rendernode_impl_session01.mkv
    2023/02/20 21:11:49 [__main__] INFO: Should normalize sound for 0010_rendernode_impl_session01.mkv
    
    D:\Temp\Videos\youtube>
  • Next we should implement the actual handling system for this command, so we create a dedicated norm_sound function.
  • So, now I'm providing a default value for the volume gain:
        psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
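  • For readers without the NVP framework at hand, here is a minimal sketch of the same three options using the standard argparse module. This is only an approximation of what context.build_parser() provides, and the build_norm_sound_parser name is made up for this example:

```python
import argparse


def build_norm_sound_parser() -> argparse.ArgumentParser:
    """Plain argparse stand-in for the NVP parser descriptor (illustration only)."""
    psr = argparse.ArgumentParser(prog="norm-sound")
    psr.add_argument("-i", "--input", dest="input_file", type=str,
                     help="Input video file to normalize")
    psr.add_argument("-o", "--output", dest="output_file", type=str, default=None,
                     help="Output file to generate")
    # A default of 1.0 means "leave the volume untouched" when -g is omitted.
    psr.add_argument("-g", "--gain", dest="gain", type=float, default=1.0,
                     help="Volume gain factor")
    return psr


args = build_norm_sound_parser().parse_args(["-i", "my_video.mkv", "-g", "2.0"])
print(args.input_file, args.output_file, args.gain)
```

    With the default in place, the handler can read the gain without checking for a missing value first.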
  • And we also pass this gain value to our new norm_sound() method:
            if cmd == "norm-sound":
                file = self.get_param("input_file")
                out_file = self.get_param("output_file", None)
                gain = self.get_param("gain")
    
                return self.norm_sound(file, out_file, gain)
  • The first version of the norm_sound() method only sets up the default output file for now:
        def norm_sound(self, input_file, out_file, gain):
            """Normalize the audio stream from a given input media file using the user provided gain"""
            if out_file is None:
                # folder = self.get_parent_folder(input_file)
                # fname = self.get_filename(input_file)
                ext = self.get_path_extension(input_file)
                out_file = self.set_path_extension(input_file, f"_normed{ext}")
    
            logger.info("Should norm sound in %s and write %s", input_file, out_file)
    
            return True
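  • The default output naming above relies on NVP path helpers. As a rough standalone equivalent, assuming only the standard library (default_output_path is a hypothetical name, not part of NVP):

```python
import os


def default_output_path(input_file: str, suffix: str = "_normed") -> str:
    """Insert `suffix` between the file stem and its extension."""
    # splitext("video.mkv") gives ("video", ".mkv"); the extension keeps its dot.
    root, ext = os.path.splitext(input_file)
    return f"{root}{suffix}{ext}"


print(default_output_path("0010_rendernode_impl_session01.mkv"))
```

    The input file is never overwritten this way, since the generated name always differs from the original one.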
  • And finally, we can add the correct processing:
        def norm_sound(self, input_file, out_file, gain):
            """Normalize the audio stream from a given input media file using the user provided gain"""
            if out_file is None:
                # folder = self.get_parent_folder(input_file)
                # fname = self.get_filename(input_file)
                ext = self.get_path_extension(input_file)
                out_file = self.set_path_extension(input_file, f"_normed{ext}")
    
            logger.info("Should norm sound in %s and write %s", input_file, out_file)
    
            tools: ToolsManager = self.get_component("tools")
            ffmpeg_path = tools.get_tool_path("ffmpeg")
    
            filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
    
            cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
            cmd += ["-filter_complex", f"{filter_str}"]
            cmd += ["-c:v", "copy", "-y", out_file]
    
            logger.info("Executing command: %s", cmd)
            res, rcode, outs = self.execute(cmd)
    
            if not res:
                logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
                return False
    
            logger.info("Done writing file.")
            return True
  • Now time to run a first test of that handler:
    nvp norm-sound -i 0010_rendernode_impl_session01.mkv -g 2.0
  • This will indeed generate a file called 0010_rendernode_impl_session01_normed.mkv and the volume there is indeed significantly higher.
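  • To try the same ffmpeg invocation outside of NVP, the command can be assembled as a plain argument list. This is a sketch under a few assumptions: ffmpeg is reachable on the PATH, and the build_loudnorm_cmd / run_cmd names are made up for the example (NVP uses its own self.execute() helper instead):

```python
import subprocess


def build_loudnorm_cmd(input_file, out_file, gain=1.0,
                       integrated=-18.0, true_peak=-1.7, lra=10.0,
                       ffmpeg="ffmpeg"):
    """Build the ffmpeg argument list: apply the gain, then loudnorm."""
    filter_str = (
        f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},"
        f"loudnorm=I={integrated:g}:TP={true_peak:g}:LRA={lra:g}"
    )
    # The video stream is copied as-is; only the audio goes through the filters.
    return [ffmpeg, "-i", input_file, "-filter_complex", filter_str,
            "-c:v", "copy", "-y", out_file]


def run_cmd(cmd):
    """Run the command and return (success, return code, combined output)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          text=True, check=False)
    return proc.returncode == 0, proc.returncode, proc.stdout


cmd = build_loudnorm_cmd("input.mkv", "input_normed.mkv", gain=2.0)
print(" ".join(cmd))
```

    Passing the arguments as a list (rather than one shell string) avoids quoting issues with the brackets and commas in the filter description.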
  • So the suggestion there is to use these settings instead:
     $ ffmpeg -i input.mp3 -af loudnorm=I=-16:LRA=11:TP=-1.5 output.mp3
  • At the same time, I think I should also change the naming convention to something like .lnorm.mkv, so let's update accordingly:
        def norm_sound(self, input_file, out_file, gain):
            """Normalize the audio stream from a given input media file using the user provided gain"""
            if out_file is None:
                # folder = self.get_parent_folder(input_file)
                # fname = self.get_filename(input_file)
                ext = self.get_path_extension(input_file)
                out_file = self.set_path_extension(input_file, f".lnorm{ext}")
    
            logger.info("Should norm sound in %s and write %s", input_file, out_file)
    
            tools: ToolsManager = self.get_component("tools")
            ffmpeg_path = tools.get_tool_path("ffmpeg")
    
            # filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-18:TP=-1.7:LRA=10"
            filter_str = f"[0:a]aformat=fltp:44100:stereo,volume={gain:.2f},loudnorm=I=-16:LRA=11:TP=-1.5"
    
            cmd = [ffmpeg_path, "-threads", "8", "-i", input_file]
            cmd += ["-filter_complex", f"{filter_str}"]
            cmd += ["-c:v", "copy", "-y", out_file]
    
            logger.info("Executing command: %s", cmd)
            res, rcode, outs = self.execute(cmd)
    
            if not res:
                logger.error("Sound normalization failed with return code %d:\n%s", rcode, outs)
                return False
    
            logger.info("Done writing file.")
            return True
  • And now we are good :-) This is working just fine for me 😁
  • Side note: After some additional tests I simply decided to reduce the volume gain a bit, from 2.0 to 1.8.
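  • To put those gain values in perspective: the ffmpeg volume filter treats a plain number as a linear amplitude factor, so the equivalent change in decibels is 20·log10(gain). A tiny sketch of that arithmetic (gain_to_db is just a helper name for this example):

```python
import math


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude factor into decibels."""
    if gain <= 0:
        raise ValueError("gain must be strictly positive")
    return 20.0 * math.log10(gain)


for g in (1.0, 1.8, 2.0):
    print(f"gain {g:.1f} -> {gain_to_db(g):+.2f} dB")
```

    So going from 2.0 to 1.8 is a reduction of a bit less than 1 dB (roughly +6.0 dB down to +5.1 dB).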
  • In fact, I'm now wondering whether I should also apply additional video compression to my devlog video files. In effect, this means I want to use my existing convert_video script (I think?). Let's see how I could do that…
  • So, I updated my convert command to support preset/gain/crf arguments:
        psr = context.build_parser("convert")
        psr.add_flag("-c", "--concat-only", dest="concat_only")("Only perform the GOPRO concatenation part.")
        psr.add_flag("-s", "--stabilize", dest="stabilize")("Stabilize the video")
        psr.add_str("-l", "--lens", dest="correct_lens", default="auto")("Lens correction to apply")
        psr.add_flag("-n", "--norm-audio", dest="normalize_audio")("Normalize audio stream")
        psr.add_flag("-p", "--sharpen", dest="sharpen")("Sharpen the image")
        psr.add_float("-g", "--gain", dest="gain", default=1.0)("Volume gain factor")
        psr.add_int("--crf", dest="crf", default=21)(
            "Video CRF value (select the highest value that provides acceptable quality; 29 <=> x264 value 23)"
        )
        psr.add_str(
            "--preset",
            dest="preset",
            default="slow",
            choices=["ultrafast", "superfast", "veryfast", "faster", "fast", "medium", "slow", "slower", "veryslow"],
        )("Video compression preset")
  • And then I started to use those parameters, now also using libvorbis for the audio stream:
            cmd += f"-ignore_unknown -map 0 -dn -c:v libx265 -crf {crf} -preset {preset} -c:a libvorbis -qscale:a 5 -scodec copy".split()
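  • As a standalone illustration of how the crf/preset values end up in the ffmpeg arguments, here is a small sketch. The build_encode_args name and the explicit preset check are additions for this example only (in the real command the parser's choices list already does the validation):

```python
PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
           "medium", "slow", "slower", "veryslow"]


def build_encode_args(crf: int = 21, preset: str = "slow") -> list:
    """Return the encoding part of the ffmpeg command as an argument list."""
    if preset not in PRESETS:
        raise ValueError(f"Unknown preset: {preset}")
    # None of the values contain spaces, so a simple split() is safe here.
    return (
        f"-ignore_unknown -map 0 -dn -c:v libx265 -crf {crf} -preset {preset} "
        "-c:a libvorbis -qscale:a 5 -scodec copy"
    ).split()


encode_args = build_encode_args(crf=28, preset="medium")
print(encode_args)
```

    A higher CRF gives smaller files at lower quality, and a slower preset trades encoding time for better compression, which is why both are exposed on the command line.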
    
  • First test was with the command line:
    nvp vconv convert --crf 25 --preset slower -g 1.8 -n
    • ⇒ This is way too slow.
  • Testing:
    nvp vconv convert --crf 28 --preset medium -g 1.8 -n
    • ⇒ This is a lot faster, and the quality is good enough for devlogs, and the file is smaller, so that's what I will now use by default for “non high quality” needs 👍!
  • I told you this would be a quick one, but it still took longer than anticipated since I added the video compression part too lol. Anyway, now I'm done and I can use the vconv convert script to efficiently compress my devlog videos, all good 😉.
  • Last modified: 2023/02/21 13:41