
NVENC on different hardware comparison

This was my first high-end laptop (M.2 SSD + 8-core i9 + 32GB RAM + high end GPU + high end cooling). I bought it after ragequitting my plans to purchase new desktop components during the COVID price gouging days. I have the laptop plugged in to my LG 48" OLED that sits on my desk where I also watch some movies and TV shows. You'd be surprised. The laptop experience is hardly a compromise these days.
Sounds nice ... I'm currently in the process of gathering the components for my new machine ... It won't be an i9, more like an i7-12700K, but the rest will be similar.
And since I need a mouse and keyboard anyway, I will not be convinced to buy a Laptop :cool:

Looks like my GPU is stealing 64GB's of my memory though...lol
Yup, that seems to be half of the computer's memory ... but I've never seen it utilized

I just tried the 4k Encode with my favourite tool StaxRip where I have a preset with pretty much everything set to max quality and ended up after 6m15s with a filesize of 2.5GB (QC set to 22 might be overkill :whistle:)
My guess is it's faster because the decode is also done by the Nvidia card with the DGSource (paid) filter.
This is confirmed by the CPU utilization, which went down from 50% to 10%
 
Good lord that is a lot of memory. Good enough disks there too? :p

It is interesting that video encoding is at 100% for all three of us but cartman0208's 3D usage spikes heavily. It is also interesting that all of our runtimes are different, yet nothing too drastic. I assume CPU grunt shouldn't really come into play?

CPU was running about 6-8%
Yeah...I do a lot of Photoshop and Premiere stuff for work, so disk space and memory are a must. BTN and PTP take up a lot of disk space also.
 
I just tried the 4k Encode with my favourite tool StaxRip where I have a preset with pretty much everything set to max quality and ended up after 6m15s with a filesize of 2.5GB (QC set to 22 might be overkill :whistle:)
My guess is it's faster because the decode is also done by the Nvidia card with the DGSource (paid) filter.
This is confirmed by the CPU utilization, which went down from 50% to 10%

20-23 is optimal for HD....so you're right in the ballpark at 22

I've never tried StaxRip...it looks interesting....
 
20-23 is optimal for HD....so you're right in the ballpark at 22
Thought so... but 2.5GB for a 12 minute video seems a bit ... much ... that would add up to about 12.5GB per hour.
I have very few videos in my collection which beat that rate ...
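
As a quick sanity check of that estimate, here is a small Python sketch that scales the 2.5GB / 12 minute figures to an hourly rate and an average bitrate. It assumes decimal gigabytes (1 GB = 10^9 bytes) and a clip of exactly 12 minutes; the function names are made up for illustration only.

```python
def gb_per_hour(size_gb: float, minutes: float) -> float:
    """Scale a file size measured over `minutes` to one hour of video."""
    return size_gb * 60.0 / minutes


def avg_mbps(size_gb: float, minutes: float) -> float:
    """Average bitrate in megabits per second (assumes 1 GB = 1e9 bytes)."""
    bits = size_gb * 1e9 * 8
    return bits / (minutes * 60.0) / 1e6


if __name__ == "__main__":
    print(f"{gb_per_hour(2.5, 12):.1f} GB per hour")
    print(f"{avg_mbps(2.5, 12):.1f} Mbit/s on average")
```

Under those assumptions the 12 minute clip works out to 12.5GB per hour, or a bit under 28 Mbit/s on average.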

I've never tried StaxRip...it looks interesting....
You bet ... one main reason for switching from VidCoder/HandBrake ... I failed to find an option there to simply remux video and audio, e.g. from MP4 to MKV.
StaxRip can do that ;)
 
I'm a sucker for Blu-Ray remuxes for TV series though....7-8GB's for an hour episode (45min.) is normal for me...12GB's for 12 minutes...yeah, that's excessive.
I didn't even look at the length of the movie.
 
Sooo, I got my hands on a 3070ti (and other new Hardware) and tried the 4K file again ... interestingly ended up after 7:55 at slowest
And I don't know what I'm doing differently, but my 3D indicator is around 90% during the whole encode...
 
That's really strange...and mine stays at 1
I suppose it could be a firmware difference between brands. I don't remember what my time was...I posted in the older thread, but you actually went up I think with the 3070. I know a new driver has come out since our original tests too.
I know on my ASRock board...if I choose between Power Saver, Standard, and Performance, it totally throws off the results of the card also, which I wouldn't think it would since I had Handbrake set to use 100% GPU.

I did notice prices are dropping faster than I thought...The 3080ti is now in the $800's...I would have waited if I knew.

I know there are differences in the ASUS KO Edition also...haven't really jumped in to see exactly what they are...but they were originally only meant to be sold in South Korea after they had questioned users on their expectations and desires in the card operation. Not sure if there was any performance difference, but my card is completely silent even under 100% load. You can definitely feel the heat from it though as it exhausts out the end of the card.

I know most of the new cards have a built in limitation making them basically worthless for bitcoin mining....just a few seconds slower doing some part of the processing was enough to accomplish it.
 
Ok, since I have an RTX 3080 and the RTX 3060 in the same machine I did some tests. Driver is
Code:
GEFORCE GAME READY DRIVER version 516.59
GPU            #0: NVIDIA GeForce RTX 3080 (8704 cores, 1725 MHz)[PCIe4x16][516.59]
GPU            #1: NVIDIA GeForce RTX 3060 (3584 cores, 1777 MHz)[PCIe4x16][516.59]

Used Video is an 11 minute cartoon from
Code:
 Lego on AP
Original file is: 1920 x 1080 pixels
And here are the results for HEVC fastest encoding, no quality improvement tweaks.

Code:
RTX 3080 same HDD
encoded 15888 frames, 505.67 fps, 1946.55 kbps, 153.62 MB
encode time 0:00:31, CPU: 4.0%, GPU: 88.9%, VE: 99.5%, VD: 64.1%, GPUClock: 1924MHz, VEClock: 1704MHz

RTX 3080 other HDD
encoded 15888 frames, 501.29 fps, 1946.55 kbps, 153.62 MB
encode time 0:00:31, CPU: 4.1%, GPU: 81.6%, VE: 99.5%, VD: 64.3%, GPUClock: 1917MHz, VEClock: 1692MHz

RTX 3060 same HDD
encoded 15888 frames, 501.77 fps, 1946.55 kbps, 153.62 MB
encode time 0:00:31, CPU: 3.4%, VE: 99.4%, VD: 72.1%, GPUClock: 1913MHz, VEClock: 1694MHz

RTX 3060 other HDD
encoded 15888 frames, 501.77 fps, 1946.55 kbps, 153.62 MB
encode time 0:00:31, CPU: 3.6%, VE: 99.2%, VD: 72.1%, GPUClock: 1868MHz, VEClock: 1660MHz


10bit HEVC
RTX 3080 same HDD
encoded 15888 frames, 496.81 fps, 1890.26 kbps, 149.17 MB
encode time 0:00:31, CPU: 3.9%, GPU: 87.5%, VE: 99.6%, VD: 63.2%, GPUClock: 1894MHz, VEClock: 1675MHz

RTX 3060 same HDD
encoded 15888 frames, 497.67 fps, 1890.26 kbps, 149.17 MB
encode time 0:00:31, CPU: 3.5%, VE: 99.3%, VD: 71.6%, GPUClock: 1907MHz, VEClock: 1689MHz


FYI I am using NVENC to encode via command line, not a GUI.
You can find NVENC here:
Code:
 https://github.com/rigaya/NVEnc

My settings used for this test saved as an option file with name hevc.txt. I removed device selection as that is only needed if there is more than one Nvidia GPU and you want to select a certain one.
Code:
--avhw
--codec hevc
--preset P1
#remove the hash in the line below to enable 10 bit when/if HW support is available
#--output-depth 10
--audio-copy

command line
Code:
 "NVEncC64.exe" -i "input.mp4" --option-file "hevc.txt" -o "output.mkv"
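
If you want to run that same command over a whole folder, a minimal Python sketch could look like the following. It only reuses the arguments shown above (`-i`, `--option-file`, `-o`); the helper names, the `*.mp4` pattern and the MKV output naming are my own assumptions, and NVEncC64.exe is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def build_nvenc_command(src, dst, option_file="hevc.txt", exe="NVEncC64.exe"):
    """Return the argument list for one encode, mirroring the command above."""
    return [exe, "-i", str(src), "--option-file", str(option_file), "-o", str(dst)]


def encode_folder(folder, run=False):
    """Build (and optionally run) one command per .mp4 file in `folder`.

    With run=False nothing is executed, so the commands can be inspected first.
    """
    commands = []
    for src in sorted(Path(folder).glob("*.mp4")):
        dst = src.with_suffix(".mkv")  # assumed naming: same name, MKV container
        cmd = build_nvenc_command(src, dst)
        commands.append(cmd)
        if run:
            subprocess.run(cmd, check=True)  # raises if the encoder reports an error
    return commands


if __name__ == "__main__":
    for cmd in encode_folder(".", run=False):
        print(" ".join(cmd))
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.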



Bitrate allocation for original file and encoded into HEVC 10 bit.

upload_2022-8-12_10-50-13.png

upload_2022-8-12_10-51-9.png

upload_2022-8-12_10-53-5.png
 
Here is a test with another system and a 1660 Ti

Code:
GEFORCE GAME READY DRIVER version 516.59
GPU            #0: NVIDIA GeForce GTX 1660 Ti (1536 cores, 1770 MHz)[PCIe3x16][516.59]

GTX 1660 Ti same HDD
encoded 15888 frames, 471.52 fps, 1946.55 kbps, 153.62 MB
encode time 0:00:33, CPU: 10.3%, GPU: 72.4%, VE: 87.1%, VD: 56.5%, GPUClock: 1829MHz, VEClock: 1717MHz

GTX 1660 Ti same HDD - 10 bit
encoded 15888 frames, 505.79 fps, 1890.26 kbps, 149.17 MB
encode time 0:00:31, CPU: 12.0%, GPU: 86.1%, VE: 99.5%, VD: 64.2%, GPUClock: 1848MHz, VEClock: 1732MHz

upload_2022-8-12_13-46-16.png
upload_2022-8-12_13-46-55.png


FYI - I only compared the speed, not the quality of the encoded results. A quality comparison would be possible, but it increases the time it takes to encode the file.
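
To put the result lines side by side without reading them by eye, here is a rough Python sketch that parses the "encoded ... frames, ... fps" summary lines posted above and derives the encode duration from frames / fps. The regular expression is written against the lines as pasted in this thread, not against any documented log format, so treat it as an assumption.

```python
import re

# Pattern matched against the summary lines pasted in this thread (assumed format).
RESULT_RE = re.compile(
    r"encoded (?P<frames>\d+) frames, (?P<fps>[\d.]+) fps, "
    r"(?P<kbps>[\d.]+) kbps, (?P<mb>[\d.]+) MB"
)


def parse_result(line):
    """Extract frames, fps, bitrate and size from one summary line."""
    m = RESULT_RE.search(line)
    if m is None:
        raise ValueError(f"not a result line: {line!r}")
    return {
        "frames": int(m["frames"]),
        "fps": float(m["fps"]),
        "kbps": float(m["kbps"]),
        "mb": float(m["mb"]),
    }


# "Same HDD" runs copied from the posts above.
results = {
    "RTX 3080 8 bit": "encoded 15888 frames, 505.67 fps, 1946.55 kbps, 153.62 MB",
    "RTX 3060 8 bit": "encoded 15888 frames, 501.77 fps, 1946.55 kbps, 153.62 MB",
    "GTX 1660 Ti 8 bit": "encoded 15888 frames, 471.52 fps, 1946.55 kbps, 153.62 MB",
    "RTX 3080 10 bit": "encoded 15888 frames, 496.81 fps, 1890.26 kbps, 149.17 MB",
    "RTX 3060 10 bit": "encoded 15888 frames, 497.67 fps, 1890.26 kbps, 149.17 MB",
    "GTX 1660 Ti 10 bit": "encoded 15888 frames, 505.79 fps, 1890.26 kbps, 149.17 MB",
}

if __name__ == "__main__":
    for label, line in results.items():
        r = parse_result(line)
        seconds = r["frames"] / r["fps"]  # duration implied by the frame rate
        print(f"{label:20s} {r['fps']:7.2f} fps  {seconds:5.1f} s")
```

The derived durations all land between roughly 31 and 34 seconds, which matches the reported encode times.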
 
Thanks.
We mostly figured that the speed does not depend on the card's generation, since there obviously seems to be the same encoding hardware inside. Differences in encoding time can come from different clock speeds and the surrounding hardware. You could probably check that by having the 30 and 16 series cards in the same machine.

Interesting thing ... on the 1660 Ti the 10bit encode is faster than the 8bit ... I never tried that and would have thought the opposite.

Could you try with the VidCoder Portable I linked and check if the 3D graph shows action during the encode?
 
When I run VidCoder Portable I see this in the task manager under the 3080.
upload_2022-8-12_15-1-53.png

The systems used for the testing above are:
1. i9 12900k with the two RTX 30xx
2. i7 9700 with the GTX 1660 Ti

What I noticed is that when I use VidCoder the CPU usage goes up to 25%.
Even with the QSV decoder, CPU usage goes up to 14% and the iGPU to around 50% for decoding.
With the NVENC command line tool it's 4% at most on the same machine.
 
Yup, same with me ... I guess Vidcoder has some other CPU based tasks running.
Good to see, I'm not the only one with 3D utilization though (y)
Thanks for that.
 
In case anyone wants to know the quality results: I redid the tests and measured the quality during the encode.

Code:
3080 hevc without 10 bit
ssim/psnr/vmaf: SSIM YUV: 0.991365 (20.637568), 0.991767 (20.844583), 0.991365 (20.637191), All: 0.991432 (20.671329), (Frames: 15888)
ssim/psnr/vmaf: PSNR YUV: 47.839443, 49.225046, 48.959433, Avg: 48.217705, (Frames: 15888)
ssim/psnr/vmaf: VMAF Score 95.783808

3080 hevc with 10 bit
ssim/psnr/vmaf: SSIM YUV: 0.992850 (21.456788), 0.992906 (21.490856), 0.992525 (21.263609), All: 0.992805 (21.429606), (Frames: 15888)
ssim/psnr/vmaf: PSNR YUV: 48.484537, 49.751860, 49.468985, Avg: 48.827750, (Frames: 15888)
ssim/psnr/vmaf: VMAF Score 96.140075

3060 hevc without 10 bit
ssim/psnr/vmaf: SSIM YUV: 0.991365 (20.637568), 0.991767 (20.844583), 0.991365 (20.637191), All: 0.991432 (20.671329), (Frames: 15888)
ssim/psnr/vmaf: PSNR YUV: 47.839443, 49.225046, 48.959433, Avg: 48.217705, (Frames: 15888)
ssim/psnr/vmaf: VMAF Score 95.783808

3060 hevc with 10 bit
ssim/psnr/vmaf: SSIM YUV: 0.992850 (21.456788), 0.992906 (21.490856), 0.992525 (21.263609), All: 0.992805 (21.429606), (Frames: 15888)
ssim/psnr/vmaf: PSNR YUV: 48.484537, 49.751860, 49.468985, Avg: 48.827750, (Frames: 15888)
ssim/psnr/vmaf: VMAF Score 96.140075

1660 Ti hevc without 10 bit
ssim/psnr/vmaf: SSIM YUV: 0.991365 (20.637568), 0.991767 (20.844583), 0.991365 (20.637191), All: 0.991432 (20.671329), (Frames: 15888)
ssim/psnr/vmaf: PSNR YUV: 47.839443, 49.225046, 48.959433, Avg: 48.217705, (Frames: 15888)
ssim/psnr/vmaf: VMAF Score 95.783808

1660 Ti hevc with 10 bit
ssim/psnr/vmaf: SSIM YUV: 0.992850 (21.456788), 0.992906 (21.490856), 0.992525 (21.263609), All: 0.992805 (21.429606), (Frames: 15888)
ssim/psnr/vmaf: PSNR YUV: 48.484537, 49.751860, 49.468985, Avg: 48.827750, (Frames: 15888)
ssim/psnr/vmaf: VMAF Score 96.140075
 
Very interesting ... that means the encoding hardware does exactly the same on all three cards.

From the score it seems not even a pixel is different:eek:
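
A side note on the numbers in parentheses after each SSIM value: they look like the SSIM expressed in decibels, i.e. -10 * log10(1 - SSIM). That formula is my assumption, not something taken from the tool's documentation, but a short Python check reproduces the posted figures closely:

```python
import math


def ssim_to_db(ssim: float) -> float:
    """Convert an SSIM value (0..1) to decibels, assuming -10 * log10(1 - SSIM)."""
    return -10.0 * math.log10(1.0 - ssim)


if __name__ == "__main__":
    # Y-plane SSIM values from the 8 bit and 10 bit results above.
    for ssim in (0.991365, 0.992850):
        print(f"SSIM {ssim:.6f} -> {ssim_to_db(ssim):.2f} dB")
```

The small remaining difference to the posted values would be expected, because the SSIM figures in the log are already rounded to six decimals.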
 
Here is the NVENC feature comparison of the 3080, 3060 and the 1660 Ti

RTX 3080
Code:
#0: NVIDIA GeForce RTX 3080 (8704 cores, 1725 MHz)[PCIe4x16][516.59]
NVEnc features
Codec: H.264/AVC
Encoder Engines           1
Max Bframes               4
B Ref Mode                3 (each + only middle)
RC Modes                  63
Field Encoding            0 (no)
MonoChrome                no
FMO                       no
Quater-Pel MV             yes
B Direct Mode             yes
CABAC                     yes
Adaptive Transform        yes
Max Temporal Layers       4
Hierarchial P Frames      yes
Hierarchial B Frames      yes
Max Level                 62 (6.2)
Min Level                 10 (1)
4:4:4                     yes
Min Width                 145
Max Width                 4096
Min Height                49
Max Height                4096
Multiple Refs             yes
Max LTR Frames            8
Dynamic Resolution Change yes
Dynamic Bitrate Change    yes
Forced constant QP        yes
Dynamic RC Mode Change    no
Subframe Readback         yes
Constrained Encoding      yes
Intra Refresh             yes
Custom VBV Bufsize        yes
Dynamic Slice Mode        yes
Ref Pic Invalidiation     yes
PreProcess                no
Async Encoding            yes
Max MBs                   65536
Lossless                  yes
SAO                       no
Me Only Mode              1 (I,P frames)
Lookahead                 yes
AQ (temporal)             yes
Weighted Prediction       yes
10bit depth               no
Codec: H.265/HEVC
Encoder Engines           1
Max Bframes               5
B Ref Mode                3 (each + only middle)
RC Modes                  63
Field Encoding            0 (no)
MonoChrome                no
Quater-Pel MV             yes
B Direct Mode             no
Max Temporal Layers       0
Hierarchial P Frames      no
Hierarchial B Frames      no
Max Level                 186 (6.2)
Min Level                 30 (1)
4:4:4                     yes
Min Width                 129
Max Width                 8192
Min Height                33
Max Height                8192
Multiple Refs             yes
Max LTR Frames            7
Dynamic Resolution Change yes
Dynamic Bitrate Change    yes
Forced constant QP        yes
Dynamic RC Mode Change    no
Subframe Readback         yes
Constrained Encoding      yes
Intra Refresh             yes
Custom VBV Bufsize        yes
Dynamic Slice Mode        yes
Ref Pic Invalidiation     yes
PreProcess                no
Async Encoding            yes
Max MBs                   262144
Lossless                  yes
SAO                       yes
Me Only Mode              1 (I,P frames)
Lookahead                 yes
AQ (temporal)             yes
Weighted Prediction       yes
10bit depth               yes
NVDec features
  H.264/AVC:  nv12, yv12
  H.265/HEVC: nv12, yv12, yv12(10bit), yv12(12bit), yuv444, yuv444(10bit), yuv444(12bit)
  MPEG1:      nv12, yv12
  MPEG2:      nv12, yv12
  MPEG4:      nv12, yv12
  VP8:        nv12, yv12
  VP9:        nv12, yv12, yv12(10bit), yv12(12bit)
  VC-1:       nv12, yv12
  AV1:        nv12, yv12, yv12(10bit)

RTX 3060
Code:
#1: NVIDIA GeForce RTX 3060 (3584 cores, 1777 MHz)[PCIe4x16][516.59]
NVEnc features
Codec: H.264/AVC
Encoder Engines           1
Max Bframes               4
B Ref Mode                3 (each + only middle)
RC Modes                  63
Field Encoding            0 (no)
MonoChrome                no
FMO                       no
Quater-Pel MV             yes
B Direct Mode             yes
CABAC                     yes
Adaptive Transform        yes
Max Temporal Layers       4
Hierarchial P Frames      yes
Hierarchial B Frames      yes
Max Level                 62 (6.2)
Min Level                 10 (1)
4:4:4                     yes
Min Width                 145
Max Width                 4096
Min Height                49
Max Height                4096
Multiple Refs             yes
Max LTR Frames            8
Dynamic Resolution Change yes
Dynamic Bitrate Change    yes
Forced constant QP        yes
Dynamic RC Mode Change    no
Subframe Readback         yes
Constrained Encoding      yes
Intra Refresh             yes
Custom VBV Bufsize        yes
Dynamic Slice Mode        yes
Ref Pic Invalidiation     yes
PreProcess                no
Async Encoding            yes
Max MBs                   65536
Lossless                  yes
SAO                       no
Me Only Mode              1 (I,P frames)
Lookahead                 yes
AQ (temporal)             yes
Weighted Prediction       yes
10bit depth               no
Codec: H.265/HEVC
Encoder Engines           1
Max Bframes               5
B Ref Mode                3 (each + only middle)
RC Modes                  63
Field Encoding            0 (no)
MonoChrome                no
Quater-Pel MV             yes
B Direct Mode             no
Max Temporal Layers       0
Hierarchial P Frames      no
Hierarchial B Frames      no
Max Level                 186 (6.2)
Min Level                 30 (1)
4:4:4                     yes
Min Width                 129
Max Width                 8192
Min Height                33
Max Height                8192
Multiple Refs             yes
Max LTR Frames            7
Dynamic Resolution Change yes
Dynamic Bitrate Change    yes
Forced constant QP        yes
Dynamic RC Mode Change    no
Subframe Readback         yes
Constrained Encoding      yes
Intra Refresh             yes
Custom VBV Bufsize        yes
Dynamic Slice Mode        yes
Ref Pic Invalidiation     yes
PreProcess                no
Async Encoding            yes
Max MBs                   262144
Lossless                  yes
SAO                       yes
Me Only Mode              1 (I,P frames)
Lookahead                 yes
AQ (temporal)             yes
Weighted Prediction       yes
10bit depth               yes
NVDec features
  H.264/AVC:  nv12, yv12
  H.265/HEVC: nv12, yv12, yv12(10bit), yv12(12bit), yuv444, yuv444(10bit), yuv444(12bit)
  MPEG1:      nv12, yv12
  MPEG2:      nv12, yv12
  MPEG4:      nv12, yv12
  VP8:        nv12, yv12
  VP9:        nv12, yv12, yv12(10bit), yv12(12bit)
  VC-1:       nv12, yv12
  AV1:        nv12, yv12, yv12(10bit)

GTX 1660 Ti
Code:
#0: NVIDIA GeForce GTX 1660 Ti (1536 cores, 1770 MHz)[PCIe3x16][516.59]
NVEnc features
Codec: H.264/AVC
Encoder Engines           1
Max Bframes               4
B Ref Mode                3 (each + only middle)
RC Modes                  63
Field Encoding            0 (no)
MonoChrome                no
FMO                       no
Quater-Pel MV             yes
B Direct Mode             yes
CABAC                     yes
Adaptive Transform        yes
Max Temporal Layers       4
Hierarchial P Frames      yes
Hierarchial B Frames      yes
Max Level                 62 (6.2)
Min Level                 10 (1)
4:4:4                     yes
Min Width                 145
Max Width                 4096
Min Height                49
Max Height                4096
Multiple Refs             yes
Max LTR Frames            8
Dynamic Resolution Change yes
Dynamic Bitrate Change    yes
Forced constant QP        yes
Dynamic RC Mode Change    no
Subframe Readback         yes
Constrained Encoding      yes
Intra Refresh             yes
Custom VBV Bufsize        yes
Dynamic Slice Mode        yes
Ref Pic Invalidiation     yes
PreProcess                no
Async Encoding            yes
Max MBs                   65536
Lossless                  yes
SAO                       no
Me Only Mode              1 (I,P frames)
Lookahead                 yes
AQ (temporal)             yes
Weighted Prediction       yes
10bit depth               no
Codec: H.265/HEVC
Encoder Engines           1
Max Bframes               5
B Ref Mode                3 (each + only middle)
RC Modes                  63
Field Encoding            0 (no)
MonoChrome                no
Quater-Pel MV             yes
B Direct Mode             no
Max Temporal Layers       0
Hierarchial P Frames      no
Hierarchial B Frames      no
Max Level                 186 (6.2)
Min Level                 30 (1)
4:4:4                     yes
Min Width                 129
Max Width                 8192
Min Height                33
Max Height                8192
Multiple Refs             yes
Max LTR Frames            7
Dynamic Resolution Change yes
Dynamic Bitrate Change    yes
Forced constant QP        yes
Dynamic RC Mode Change    no
Subframe Readback         yes
Constrained Encoding      yes
Intra Refresh             yes
Custom VBV Bufsize        yes
Dynamic Slice Mode        yes
Ref Pic Invalidiation     yes
PreProcess                no
Async Encoding            yes
Max MBs                   262144
Lossless                  yes
SAO                       yes
Me Only Mode              1 (I,P frames)
Lookahead                 yes
AQ (temporal)             yes
Weighted Prediction       yes
10bit depth               yes
NVDec features
  H.264/AVC:  nv12, yv12
  H.265/HEVC: nv12, yv12, yv12(10bit), yv12(12bit), yuv444, yuv444(10bit), yuv444(12bit)
  MPEG1:      nv12, yv12
  MPEG2:      nv12, yv12
  MPEG4:      nv12, yv12
  VP8:        nv12, yv12
  VP9:        nv12, yv12, yv12(10bit), yv12(12bit)
  VC-1:       nv12, yv12

It seems there is no difference in the features except for one NVDec entry, which is not listed for the 1660 Ti
Code:
   AV1:        nv12, yv12, yv12(10bit)

@cartman0208 I think some of the things in performance and quality have been improved with Nvidia's driver updates.
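
For anyone who wants to repeat the feature comparison on other cards without comparing the listings by eye, here is a minimal Python sketch that parses the NVDec part of such a listing and reports codecs present on one card but not the other. The two text blocks are shortened excerpts of the listings above, and the parsing assumes the "codec: format, format" layout exactly as pasted here.

```python
def parse_nvdec(block):
    """Map each codec name to its list of supported pixel formats."""
    features = {}
    for line in block.splitlines():
        if ":" not in line:
            continue  # skip blank lines and headings
        codec, formats = line.split(":", 1)
        features[codec.strip()] = [f.strip() for f in formats.split(",")]
    return features


# Shortened excerpts of the NVDec listings posted above.
rtx_3080 = """\
  H.264/AVC:  nv12, yv12
  VP9:        nv12, yv12, yv12(10bit), yv12(12bit)
  VC-1:       nv12, yv12
  AV1:        nv12, yv12, yv12(10bit)
"""
gtx_1660ti = """\
  H.264/AVC:  nv12, yv12
  VP9:        nv12, yv12, yv12(10bit), yv12(12bit)
  VC-1:       nv12, yv12
"""

a = parse_nvdec(rtx_3080)
b = parse_nvdec(gtx_1660ti)
only_3080 = sorted(set(a) - set(b))  # codecs decoded by the 3080 only

if __name__ == "__main__":
    print("Only on RTX 3080:", only_3080)
```

With these excerpts the only codec reported as missing on the 1660 Ti is AV1, in line with the manual comparison.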
 
It seems there is no difference in the features except for one NVDec entry, which is not listed for the 1660 Ti
That is expected
Code:
https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new
I would really like to have AV1 Encoding Hardware support ... that one is superior to HEVC regarding quality :dance:
 
That is expected
Code:
https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new
I would really like to have AV1 Encoding Hardware support ... that one is superior to HEVC regarding quality :dance:
Soon Nvidia will bring out support for the VVC (H.266) implementation. You will see a 50% reduction in the size of your file and no noticeable quality loss. I have been one of the lucky ones chosen to test this. I would post it, but I am under an NDA.
 
I have been one of the lucky ones chosen to test this, I would post it but I am under an NDA.
I thought you didn't want to waste your time with testing :whistle::D
Anyway I'm curious about the quality/encoding-time ratio
 
I thought you didn't want to waste your time with testing :whistle::D
Anyway I'm curious about the quality/encoding-time ratio
Quality is excellent, but encoding time is still a work in progress with the new .dll. The new codec is being held up for various reasons: it has to be compatible with many devices, new technologies in TVs, and other electronics. The H.266 codec and the Nvidia .dll have to work seamlessly, and remember AMD is also in the mix. It's quite complicated. My part is just a small part of the process.
 