Hi yongjia,
Right, the size is indeed uniform for all work items in the same group. I guess people may just be a little concerned about y, which is derived
from get_local_id(1).
You may consider adding the following attribute to your fast kernel to eliminate those concerns:
__attribute__((reqd_work_group_size(256, 1, 1)))
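To spell out why this helps: with a required work-group size of (256, 1, 1), the group has extent 1 in dimension 1, so get_local_id(1) is always 0 and every work item in a group sees the same y. The following is only a rough host-side sketch in plain C, not kernel code. The helper names are made up for illustration, and it assumes a zero global work offset.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the OpenCL identity
 *   get_global_id(d) == get_group_id(d) * get_local_size(d) + get_local_id(d)
 * under the assumption of a zero global work offset.
 * Hypothetical helper, not part of any OpenCL API. */
size_t global_id(size_t group_id, size_t local_size, size_t local_id)
{
    return group_id * local_size + local_id;
}

/* Returns true when every work item of one work group gets the same y,
 * i.e. when a branch on y is taken uniformly by the whole group.
 * local_size_y is the work-group extent in dimension 1.
 * Hypothetical helper for this sketch only. */
bool y_is_uniform_in_group(size_t group_id_y, size_t local_size_y)
{
    size_t first = global_id(group_id_y, local_size_y, 0);
    for (size_t ly = 1; ly < local_size_y; ++ly) {
        if (global_id(group_id_y, local_size_y, ly) != first)
            return false;
    }
    return true;
}
```

With local_size_y == 1, as the attribute enforces, y_is_uniform_in_group() holds for any group. With a 16x16 work group it does not, which is the case where a break on y in front of a barrier could diverge.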
From: 张勇加 [mailto:zhang_yong_jia 126 com]
Sent: Thursday, January 23, 2014 9:48 AM
To: Victor Oliveira; tom stellard net
Cc: gegl-developer-list gnome org; Zou, Nanhai; Gong, Zhigang; Zhang, Yongjia
Subject: Re:Re: [Gegl-developer] [PATCH] Optimize operation box-blur opencl kernel
Although the barrier you were talking about is not guaranteed to be executed by all threads, all work items in a work group behave the same way at this barrier: if one work item executes the barrier, then the others in the same work group execute
it as well, and vice versa. There are two reasons. First, the 'size' variable has the same value in all work items and it decreases unconditionally. Second, if the 'break' is taken because of 'y==height-1', that is because we are dealing
with the last row of the buffer; the work items of a work group are all in the same situation, so every work item will pass this barrier and return.