Incredible Render Performance In Kdenlive With NVENC - But 1 Big Problem

johnar1 johnar1 at protonmail.com
Sat Jul 28 16:05:09 BST 2018


Let's render the 1st minute of the h.264 version and post the results.

I am using melt 6.11, a 6600K at 4.5 GHz, a 1050 Ti (no OC), the Kdenlive 18.08 refactoring build, my NVENC profile that I sent you a couple of days ago (use that too if you can), nvidia-390, and Lubuntu 18.04 with the 4.15 kernel, plus residual Kdenlive environment binaries from Dan Dennedy's melt-build.sh.

Regards!

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On July 28, 2018 5:02 PM, johnar1 <johnar1 at protonmail.com> wrote:

> Hey Jean!
>
> So you are saying that the reason for the very low CPU usage and the lack of NVENC usage with the Sintel clip is that it's not exactly 1920x1080 resolution?
> I shall test different clips then.
>
> What version of kdenlive are you using?
> Also what versions of ffmpeg, ffprobe, ffplay, and melt?
>
> Some info on your CPU, general hardware setup and platform would be nice too.
>
> I would like to link up with you and run tests on the same footage.
>
> I have a 1050TI lying around here and a 6600K, let's compare our results in a similar testing environment.
>
> Just reinstalled Lubuntu 18.04 and working on the kdenlive 18.08 refactoring release.
>
> Let's use this as a sample: https://peach.blender.org/download/
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On July 27, 2018 2:29 PM, Jean-Baptiste Mardelle jb at kdenlive.org wrote:
>
> > On Thursday, July 26, 2018 6:17:52 PM CEST, johnar1 wrote:
> >
> > > I use 16.04 and 18.04 and I have found that using any kernel
> > > newer than 4.15 severely breaks the nvidia driver.
> > > I am currently compiling 4.18 rc3 with the fixed modules to
> > > bypass the error caused with the driver but it's a pain.
> > > NVENC is a nice feature, but if you use a lot of transitions,
> > > effects and title clips in kdenlive, then it's almost pointless,
> > > because gpu utilization is basically halted during these
> > > operations.
> > > Tell me if you got it working.
> > > And definitely try the Shotcut melt binary, it has given me
> > > even better nvenc performance than my self-compiled melt 6.11
> > > from the latest git.
> >
> > Hello!
> > So I got nvenc working and have some interesting results. First, regarding
> > the slowdown with the high quality track compositing:
> > The transition used for tracks high quality compositing (qtblend) has some
> > code to detect if a compositing is necessary. For example, if the video on
> > the top track uses a pixel format that does not use alpha, like RGB, we
> > don't try to perform the composition, and directly return the top frame as
> > a result. There are several checks to make sure we don't need to do the
> > compositing, and one of them is a check of the aspect ratio. If the frames'
> > aspect ratios don't match, we do perform the compositing. This is useful, for
> > example, if you put a small image on a track above a video track: you usually
> > want to be able to see the video in the area not covered by the image.
> > This is the reason for the slowdown on rendering with high quality, because
> > your sample sintel movie is in fact a 1920x818 video, with a DAR (display
> > aspect ratio) of 960:409.
> > So it does not match the 1920x1080, 16:9 DAR project property, and so we do
> > perform the compositing, slowing down the process. Not sure what is the
> > best way to handle this, but we can probably find a solution to prevent
> > compositing in some cases.
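
The aspect-ratio comparison described above can be sketched roughly as follows. This is only an illustration of the arithmetic, assuming square pixels; the helper names `gcd`, `dar` and `same_dar` are hypothetical and are not part of MLT or Kdenlive.

```shell
# Greatest common divisor of two positive integers (Euclid's algorithm).
gcd() {
    local a=$1 b=$2 t
    while [ "$b" -ne 0 ]; do
        t=$((a % b))
        a=$b
        b=$t
    done
    echo "$a"
}

# Reduce WIDTH HEIGHT to the simplest W:H ratio (square pixels assumed).
dar() {
    local w=$1 h=$2 g
    g=$(gcd "$w" "$h")
    echo "$((w / g)):$((h / g))"
}

# Hypothetical check: succeeds when clip and project share the same ratio.
same_dar() {
    [ "$(dar "$1" "$2")" = "$(dar "$3" "$4")" ]
}

dar 1920 818     # the Sintel clip: prints 960:409
dar 1920 1080    # the project profile: prints 16:9

if same_dar 1920 818 1920 1080; then
    echo "ratios match: compositing could be skipped"
else
    echo "ratios differ: compositing is performed"
fi
```

With the Sintel dimensions the two ratios differ, which is the case where the compositing is not bypassed.
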
> > Now regarding the nvenc performance, I also got some very encouraging
> > results. Using a 1920x1080 mp4 sample clip of 46 seconds, I got the
> > following results with my GTX 1050 TI card (all tests use the default "high
> > quality" track compositing):
> > Test 1: No effects, one clip in timeline. Render times:
> > libxvid: 1m30s
> > nvenc: 7s (!!!)
> > Test 2: one clip in timeline, with lift/gamma/gain color correction. Render
> > times:
> > libxvid: 1m29s
> > nvenc: 36s
> > Test 3: one clip in timeline, with sepia effect. Render times:
> > libxvid: 1m29s
> > nvenc: 8s
> > (sepia filter works in yuv422, while lift/gamma/gain requires an rgb
> > colorspace, which explains the performance diff).
> > Test 4: 1 extra image clip composited over the video. Render times:
> > libxvid: 1m37s
> > nvenc: 1m02s
> > So even with effects and transitions we still get a comfortable time gain.
> > It will also be interesting to integrate this in Kdenlive's internal uses,
> > like timeline preview and creating proxy clips which will make everything
> > faster (I successfully created I frame only clips with nvenc so we can get
> > a frame accurate seeking). I also made some tests with the scale_npp
> > rescale filter:
> > ffmpeg -y -hwaccel cuvid -c:v h264_cuvid -i Sintel.2010.1080p.mkv -filter:v scale_npp=320:200 -vcodec h264_nvenc result.mp4
> > transcoding (using ffmpeg only) the full movie to 320x200 proxy:
> > using nvenc and scale_npp: 43s.
> > Using nvenc and ffmpeg's normal scale filter: 1m18s
> > Using libxvid and normal scale filter: 1m31s
> > Compared to no resize:
> > using nvenc without resizing: 1m11s
> > using libxvid without resizing: 6m08s
> > So that means we should be able to create proxy clips twice as fast,
> > and timeline preview probably also.
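
To put the timings quoted above on one scale, the speedup factors can be computed with a small sketch like this one. The helpers `to_seconds` and `speedup` are hypothetical names introduced for illustration; the durations are the ones reported in the message.

```shell
# Convert a duration such as "1m30s" or "7s" to whole seconds.
to_seconds() {
    local d=$1 m=0 s
    case $d in
        *m*) m=${d%%m*}; d=${d#*m} ;;
    esac
    s=${d%s}
    s=${s#0}    # drop one leading zero so "08" is not read as octal
    echo $(( m * 60 + ${s:-0} ))
}

# Ratio of a baseline time to a faster time, to one decimal place.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

speedup 1m30s 7s      # Test 1: libxvid vs nvenc, no effects
speedup 1m29s 36s     # Test 2: lift/gamma/gain
speedup 1m37s 1m02s   # Test 4: image composited over the video
speedup 1m31s 43s     # proxy: libxvid + scale vs nvenc + scale_npp
speedup 6m08s 1m11s   # full movie without resizing
```

The proxy case comes out at roughly a factor of two, in line with the "twice as fast" estimate, while the unfiltered render gains far more than the composited one.
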
> > The only sad part is that it requires the NVIDIA proprietary drivers, but
> > maybe FFmpeg's other free hwaccel methods will be usable as well in the
> > future. If anybody wants to test other hardware, you are welcome.
> > In any case, I will integrate these nvenc speedups in the big December 2018
> > refactoring release.
> > Best regards
> > Jean-Baptiste
> >
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On July 26, 2018 10:47 AM, jb jb at kdenlive.org wrote:
> > > On 26.07.18 at 10:02, johnar1 wrote:
> > > Hello Jean,
> > > yes that is correct, switching from High Quality compositing
> > > had a major impact on performance and I am not sure why.
> > > I can help you out getting NVENC to work.
> > > What platform are you on?
> > > Thanks. My main issue is the NVidia driver, since I currently
> > > use an Ubuntu 16.04 based distro, and FFmpeg complains about my
> > > NVIDIA driver version < 390. I will upgrade my distro and give
> > > some feedback tomorrow.
> > > After that I guess a page on setting up nvenc would be nice on our wiki:
> > > https://community.kde.org/Kdenlive/Development/KF5
> > > Regards
> > > Jean-Baptiste
> > > There are a couple things to be mindful of.
> > > .)After installing the graphics driver you need to generate an xorg.conf
> > > .)Install the NVENC Headers from here:
> > > https://github.com/lutris/ffmpeg-nvenc/issues/22
> > > .)Use the correct NVENC parameter in your render profile, as
> > > the current one has been deprecated.
> > > Here's my profile:
> > > f=mp4 vcodec=h264_nvenc gb=21 vq=21 acodec=aac ab=384k r=60 preset=slow g=120 bf=2
> > > .)You need to compile ffmpeg and mlt with the following flags:
> > > ./configure --enable-nvenc --enable-cuvid --enable-nonfree
> > > Or if you don't want to compile yourself you can simply use the
> > > melt binary + libs from the most recent Shotcut build.
> > > https://github.com/mltframework/shotcut/releases/download/v18.07/shotcut-linux-x86_64-180702.tar.bz2
> > > Instructions here:
> > > https://www.youtube.com/watch?v=X14GvmBpq08&t=314s
> > > This will get NVENC working 100%.
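
Before pointing Kdenlive at a freshly built ffmpeg, it may be worth confirming that the build actually lists the NVENC encoder. The sketch below assumes `ffmpeg -hide_banner -encoders` prints one encoder per line; `has_encoder` is a hypothetical helper name, and the check is skipped when no ffmpeg is on the PATH.

```shell
# Succeeds when the encoder list read from stdin contains the named
# encoder as a whole word (so "nvenc_h264" does not match "h264_nvenc").
has_encoder() {
    grep -qw -- "$1"
}

# Only query ffmpeg if it is installed on this machine.
if command -v ffmpeg >/dev/null 2>&1; then
    if ffmpeg -hide_banner -encoders 2>/dev/null | has_encoder h264_nvenc; then
        echo "h264_nvenc is available"
    else
        echo "h264_nvenc is missing from this ffmpeg build"
    fi
fi
```

A missing encoder would point at the configure step or the NVENC headers rather than at the render profile.
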
> > > When you're done, you need to track the GPU utilization in the
> > > driver and make sure it is working. How many frames your GPU can
> > > push depends on the power of your CPU. I use the
> > > kdenlive_multirender script to ensure 100% utilization on all
> > > cores and subsequently higher GPU utilization.
> > > https://github.com/unfa/kdenlive-multirender
> > > Let's keep in touch!
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On July 26, 2018 9:30 AM, Jean-Baptiste Mardelle jb at kdenlive.org wrote:
> > > On 25.07.2018 21:38, johnar1 wrote:
> > > Dear Mr. Vincent Pinon,
> > > if that is in fact your real name, my first born son shall
> > > henceforth be known as Vincent.
> > > Your suggestion was spot on and according to my tests so far I
> > > believe it works.
> > > Here are my findings:
> > > Rendering 1st minute of Sintel 1080p/60FPS , 4 Threads, NVENC
> > > enabled, Kdenlive 18.04 AppImage
> > > Hello Johnar,
> > > I am myself trying to setup a working nvenc environment and
> > > hope to make some more tests.
> > > .)Track Composition - "None"
> > > [CPU Utilization: 70% - 67% - 64% - 80%] [GPU Utilization: 70%]
> > > [Render Time: 15s]
> > > I noticed however, that transitions such as for example Slide
> > > or Composite are rendered improperly, with certain interference
> > > patterns.
> > > After consulting the documentation, I realized that this should
> > > be fixed by disabling any surrounding empty tracks, but so far I
> > > have not been able to achieve that.
> > > .)Track Composition - "Preview"
> > > [CPU Utilization: 67% - 65% - 69% - 75%] [GPU Utilization: 65%]
> > > [Render Time: 20s]
> > > I conclude that the best of both worlds comes into play with
> > > this option enabled.
> > > Both the GPU and CPU are almost fully utilized, while it
> > > appears that transitions are rendered correctly.
> > > So if I understand correctly, rendering the same project with
> > > Track compositing set to "High Quality" has a major impact and
> > > you get this result:
> > > [CPU Utilization: 100% - 7% - 10% - 18%] [Video Engine
> > > Utilization (NVENC): 8%] [Render Time: 2m54s]
> > > This seems strange to me since Kdenlive's "high quality" track
> > > compositing uses the qtblend transition that should
> > > automatically be bypassed when there is no transparency in the
> > > video. If you can confirm that and that this simple change in
> > > track compositing has such an impact this definitely has to be
> > > checked... Also, Dan recently fixed many of the "affine"
> > > transition issues, so it should give results similar to the
> > > "qtblend" transition but may be faster.
> > > Thanks for all your investigations, I hope to come back with
> > > more information once I successfully achieve my setup.
> > > Best regards
> > > Jean-Baptiste
> > > I will do more thorough testing and read any documentation
> > > available, as I absolutely want to understand what exactly these
> > > options do.
> > > I think it's a bit counterintuitive to have "High Quality"
> > > enabled by default, which so heavily impacts performance, while
> > > in my opinion not making enough of an effort to alert the user
> > > to the extreme effects it may have on render times.
> > > I literally spent 72 hours straight, compiling every single
> > > version of melt and kdenlive, documenting and testing every
> > > possible compilation parameter variation, performance reviews
> > > with every available version of Ubuntu, corresponding kernels
> > > and nvidia drivers, and every remotely related kdenlive option
> > > or workarounds.
> > > This has definitely shortened my life span by about 3 - 4 years.
> > > I would like to extend my gratitude to you, good sir.
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On July 25, 2018 2:23 AM, Vincent Pinon vpinon at kde.org wrote:
> > > Hello,
> > > I don't know how precisely you do the job in kdenlive
> > > (you could share the .kdenlive, the .sh.mlt, a screenshot),
> > > one thing I suspect is the track composition (automatic transparency):
> > > if you keep the default high quality choice,
> > > kdenlive adds "composite & transform" transitions that are based on Qt.
> > > So without gpl / qt module, MLT skips these transitions.
> > > Could you run your test switching to "no transparency"?
> > > (toolbar just above timeline)
> > > Thanks for your enthusiastic investigations :)
> > > Vincent
> > > On Tuesday, 24 July 2018, 23:26:01 CEST, johnar1 wrote:
> > > Hey guys, I have some more info.
> > > Hey Eugen, I have some more info.
> > > For this test I used mlt 6.11, successfully compiled by Dan
> > > Dennedy's build-melt.sh
> > > The test file that I am using is the 1080p version of Sintel.
> > > https://durian.blender.org/download/
> > > CPU: i5 6600K, GPU:GTX 750 TI nvidia-390 driver , Platform: Kubuntu 18.04
> > > In order to check melt and isolate the problem I simply
> > > rendered the first minute of the Sintel short film with the
> > > following command:
> > > (This is not the /bin/melt, but the script which launches it
> > > with the correct libs)
> > > /home/frank/melt/20180724/melt -profile atsc_1080p_60
> > > sintel.mkv out=3600 -consumer avformat:result-60.mp4 f=mp4
> > > vcodec=h264_nvenc preset=slow
> > > [CPU Utilization: 67% - 70% - 68% - 74%] [Video Engine
> > > Utilization (NVENC): 80%] [Render Time: 20s]
> > > It obviously works perfectly.
> > > Now, selecting this melt in the Kdenlive environment, along with
> > > ffmpeg, ffplay, ffprobe and the profiles path from Dan's melt
> > > folder, yields the following results when rendering the
> > > first minute of Sintel.
> > > [CPU Utilization: 100% - 7% - 10% - 18%] [Video Engine
> > > Utilization (NVENC): 8%] [Render Time: 2m54s]
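
The two render times above can be turned into an average frame rate with a rough sketch like the following. It assumes both runs cover the same `out=3600` range, that `out` is inclusive (so 3601 frames), and that 2m54s equals 174 seconds; `throughput` is a hypothetical helper name.

```shell
# Average frames rendered per second for frames 0..OUT in SECS seconds.
throughput() {
    awk -v out="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", (out + 1) / secs }'
}

throughput 3600 20     # standalone melt render (20s)
throughput 3600 174    # same minute rendered through Kdenlive (2m54s)
```

Under those assumptions the standalone render runs at about three times real time for a 60 fps profile, while the Kdenlive render falls to roughly a third of real time.
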
> > > Something in kdenlive breaks parallel processing, only allowing
> > > 1 single core to be fully utilized.
> > > And I have tested every single version of kdenlive available on this earth.
> > > Every app image, including the refactoring version and every
> > > single ppa version, including stable, dev and master.
> > > Also generating and launching the render script from the
> > > terminal yields the same result.
> > > RENDERER="/home/frank/kdenlive/bin/kdenlive_render"
> > > MELT="/home/frank/melt/20180724/melt"
> > > SOURCE_0="file:///home/frank/Documents/scripts/script001.sh.mlt"
> > > TARGET_0="file:///home/frank/Documents/untitled.mkv"
> > > PARAMETERS_0="-pid:2664 in=0 out=3052 $MELT atsc_1080p_60
> > > avformat - $SOURCE_0 $TARGET_0 vcodec=nvenc_h264 threads=4
> > > real_time=-1"
> > > $RENDERER $PARAMETERS_0
> > > I have also tested different kdenlive_render executables/libs
> > > with the same result.
> > > I should note, that using the kdenlive_multirender script in
> > > conjunction with the generated render script by kdenlive, while
> > > specifying 4 threads, the CPU uses 2 cores at 100%.
> > > https://github.com/unfa/kdenlive-multirender
> > > Now as I have described before, when I compile melt without
> > > enabling gpl, the 1 minute of Sintel renders perfectly again,
> > > with full utilization on both the CPU and GPU, but from within
> > > Kdenlive this time.
> > > I conclude that this problem is somehow caused by Kdenlive and
> > > related to qt, but I do not possess the knowhow to further
> > > analyze it.
> > > With the latest 18.08 Beta18 and the most recent QT version, 2
> > > cores instead of 1 are now being utilized at 100% with the NVENC
> > > profile and 100% on all 4 cores using the MP4 h264 profile.
> > > So this is 100% a QT issue with NVENC, but I need further
> > > insight from professionals like yourselves.
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On July 16, 2018 7:51 AM, johnar1 johnar1 at protonmail.com wrote:
> > > System: i5 6600K, 1050TI, Ubuntu 18.04, Kernel 4.16
> > > I have successfully compiled mlt and ffmpeg with nvenc support
> > > using the official nvenc headers stripped from the Nvidia SDK.
> > > Rendering the first minute of the 1080p Sintel version, with 4
> > > threads specified and my nvenc profile, finishes in 10 seconds.
> > > Sintel can be downloaded here: https://durian.blender.org/download/
> > > Nvenc Profile: (compatible with recent mlt versions that are
> > > nvenc-enabled by default)
> > > f=mp4 vcodec=nvenc_h264 global_quality=21 vq=21 preset=slow bf=2 ab=384k
> > > Now here is the problem that I do not understand:
> > > Using the latest version of kdenlive from the kdenlive-master
> > > ppa combined with the newly compiled versions of ffmpeg and mlt
> > > works perfectly, but only under very specific circumstances.
> > > I have only been able to get rendering with nvenc to work
> > > properly when I use and open this specific kdenlive save
> > > file which I made of the first minute of the Sintel short
> > > film with the Appimage Version of Kdenlive. After launching the
> > > ppa/installed version of kdenlive and opening this save file,
> > > rendering with nvenc works flawlessly.
> > > If I simply start a new project, adding the whole Sintel short
> > > film to the project bin, cutting the first minute and render it,
> > > nvenc simply does not work and the render time is tripled,
> > > despite having changed nothing else, including the nvenc render
> > > profile.
> > > If I create a save file of the first minute of Sintel with the
> > > installed version and open it on the Appimage version, nvenc
> > > does not work again.
> > > Conclusion: There must be something in this save file, maybe a
> > > parameter, additional settings or any type of code not present in
> > > the default kdenlive project profiles, which enables NVENC.
> > > I would greatly appreciate it if we could find out the source
> > > of this problem together.
> > > Kdenlive Appimage Save File with which NVENC works:
> > > https://pastebin.com/rzjR57DJ
> > > PPA/Installed Version of Kdenlive created Save File which breaks NVENC:
> > > https://pastebin.com/3uQ8sP0C



