[gnome-remote-desktop] hwaccel-nvidia: Use a block dim of 16x16x1 for BGRX_TO_YUV420 kernel



commit ddf05636351862c53b770d090ed4fb76b541cc45
Author: Pascal Nowack <Pascal Nowack gmx de>
Date:   Mon Jan 31 21:19:58 2022 +0100

    hwaccel-nvidia: Use a block dim of 16x16x1 for BGRX_TO_YUV420 kernel
    
    Since H.264 works on 16x16 tiles, the NV12 buffer for the AVC encoder
    is created with additional padding if the width or height of the
    frame is not a multiple of 16.
    The kernel launch call currently uses a block dimension of 32x8.
    However, 16x16 would be the ideal block dimension, as in that case no
    thread in a block would ever be idle.
    
    So, change the block dimension to 16x16. The block size (the number
    of threads per block) remains the same with this change: 256.

 src/grd-hwaccel-nvidia.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
---
diff --git a/src/grd-hwaccel-nvidia.c b/src/grd-hwaccel-nvidia.c
index 6708fb0f..bbc53cae 100644
--- a/src/grd-hwaccel-nvidia.c
+++ b/src/grd-hwaccel-nvidia.c
@@ -432,8 +432,8 @@ grd_hwaccel_nvidia_avc420_encode_bgrx_frame (GrdHwAccelNvidia  *hwaccel_nvidia,
     return FALSE;
 
   /* Threads per blocks */
-  block_dim_x = 32;
-  block_dim_y = 8;
+  block_dim_x = 16;
+  block_dim_y = 16;
   block_dim_z = 1;
   /* Amount of blocks per grid */
   grid_dim_x = aligned_width / 2 / block_dim_x +
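
The effect of the block shape on idle threads can be checked with a little arithmetic. The following is a minimal sketch in plain C, not code from gnome-remote-desktop: the helper names div_round_up() and idle_threads() are hypothetical, and it assumes that the grid is sized by rounding up in each dimension, which the truncated hunk above suggests but does not show in full. The "domain" is the number of threads that actually have work to do in x and y; how that maps to pixels is up to the kernel.

```c
#include <stdint.h>

/* Hypothetical helper: integer division, rounded up. */
static uint32_t
div_round_up (uint32_t value, uint32_t divisor)
{
  return value / divisor + (value % divisor ? 1 : 0);
}

/*
 * Hypothetical helper: number of launched threads that fall outside a
 * domain_w x domain_h thread domain when the grid is built from blocks
 * of block_dim_x x block_dim_y threads (block_dim_z is assumed to be 1).
 */
static uint32_t
idle_threads (uint32_t domain_w,
              uint32_t domain_h,
              uint32_t block_dim_x,
              uint32_t block_dim_y)
{
  /* Amount of blocks per grid, rounded up so the domain is covered */
  uint32_t grid_dim_x = div_round_up (domain_w, block_dim_x);
  uint32_t grid_dim_y = div_round_up (domain_h, block_dim_y);
  uint32_t launched = grid_dim_x * block_dim_x * grid_dim_y * block_dim_y;

  return launched - domain_w * domain_h;
}
```

For a 16x16 thread domain, idle_threads (16, 16, 32, 8) gives 256, because two 32x8 blocks are needed and half of each lies outside the domain, while idle_threads (16, 16, 16, 16) gives 0. Both shapes have 256 threads per block, so only the distribution of the threads changes, not the block size.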

