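Some of the formulas in this post come from: cuda 线程索引ID的计算公式_blockidx.x_奕星星奕的博客-CSDN博客
My contribution is adding figures that illustrate the layouts more concretely, and extending the formulas.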
Formulas for computing thread indices
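A Grid can contain multiple Blocks, and the Blocks can be organized in one, two, or three dimensions. Each Block contains multiple Threads, which can likewise be organized in one, two, or three dimensions.
Every CUDA thread has a built-in index, threadIdx, but it is unique only within its own Block. A globally unique thread ID has to be computed from threadIdx, blockIdx, blockDim, and gridDim, and the formula depends on how the Grid and Block are laid out. The formulas for each combination of Grid and Block layout are given below.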
1. 1D grid, 1D blocks
int threadId = blockIdx.x * blockDim.x + threadIdx.x;
Special case: when the grid is (1,1,1), that is, a single block, blockIdx.x is 0 and the formula reduces to:
int threadId = threadIdx.x;
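As a quick sanity check of case 1, here is a minimal host-side sketch in plain C++ (not a CUDA kernel, so it compiles without a GPU). The function name `globalId1D` and its integer parameters are hypothetical stand-ins for the CUDA built-ins `blockIdx.x`, `blockDim.x`, and `threadIdx.x`; the launch shape of 4 blocks with 8 threads each is just an assumed example.

```cpp
// Host-side stand-in for the case 1 formula; the parameters mirror
// the CUDA built-ins blockIdx.x, blockDim.x and threadIdx.x.
constexpr int globalId1D(int blockIdxX, int blockDimX, int threadIdxX) {
    return blockIdxX * blockDimX + threadIdxX;
}

// Assumed launch: 4 blocks of 8 threads each (32 threads in total).
static_assert(globalId1D(0, 8, 0) == 0,  "first thread of block 0");
static_assert(globalId1D(2, 8, 3) == 19, "thread 3 of block 2");
static_assert(globalId1D(3, 8, 7) == 31, "last thread of the last block");

// Single-block special case: blockIdx.x is 0, so the ID is just threadIdx.x.
static_assert(globalId1D(0, 8, 5) == 5, "grid of (1,1,1)");
```

The `static_assert` lines are evaluated at compile time, so a successful compile already confirms the worked values.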
2. 1D grid, 2D blocks
int threadId = blockIdx.x * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x;
2.1. 1D grid, 2D blocks (alternative ordering: row-major across the whole grid)
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
int y = threadIdx.y;
int threadId = y * (gridDim.x * blockDim.x) + x;
Special case: when the grid is (1,1,1), gridDim.x is 1 and blockIdx.x is 0, so this becomes:
int x = threadIdx.x;
int y = threadIdx.y;
int threadId = y * blockDim.x + x;
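To make the difference between case 2 and case 2.1 concrete, the following host-side C++ sketch evaluates both formulas for the same threads. The function names `blockMajorId` and `rowMajorId` are my own labels, not CUDA API names, and the launch shape (gridDim.x = 2, blockDim = (4, 2)) is an assumed example.

```cpp
// Case 2: number every thread of block 0 first, then every thread of block 1, ...
constexpr int blockMajorId(int blockIdxX, int threadIdxX, int threadIdxY,
                           int blockDimX, int blockDimY) {
    return blockIdxX * blockDimX * blockDimY + threadIdxY * blockDimX + threadIdxX;
}

// Case 2.1: treat the grid as one wide 2D array and number it row by row.
constexpr int rowMajorId(int blockIdxX, int threadIdxX, int threadIdxY,
                         int gridDimX, int blockDimX) {
    return threadIdxY * (gridDimX * blockDimX) + (blockIdxX * blockDimX + threadIdxX);
}

// Assumed launch: gridDim.x = 2, blockDim = (4, 2), i.e. 16 threads.
// Block 1, thread (1, 0): the two orderings disagree.
static_assert(blockMajorId(1, 1, 0, 4, 2) == 9, "case 2");
static_assert(rowMajorId(1, 1, 0, 2, 4) == 5,   "case 2.1");
// Block 0, thread (0, 1): they disagree here as well.
static_assert(blockMajorId(0, 0, 1, 4, 2) == 4, "case 2");
static_assert(rowMajorId(0, 0, 1, 2, 4) == 8,   "case 2.1");
// The very last thread gets ID 15 under both orderings.
static_assert(blockMajorId(1, 3, 1, 4, 2) == 15, "case 2");
static_assert(rowMajorId(1, 3, 1, 2, 4) == 15,   "case 2.1");
```

Both orderings assign each of the 16 threads a distinct ID in 0..15; they only differ in which thread receives which ID. Which one is preferable depends on how the data being indexed is laid out in memory.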
3. 1D grid, 3D blocks (hard to draw, so only the formula is given)
int threadId = blockIdx.x * blockDim.x * blockDim.y * blockDim.z
             + threadIdx.z * blockDim.y * blockDim.x
             + threadIdx.y * blockDim.x + threadIdx.x;
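Since this case has no figure, a small host-side C++ sketch may help show the ordering it implies: within a block, x varies fastest, then y, then z. The names `case3` and `consecutive` are hypothetical helpers, and the block shape (4, 3, 2) is an assumed example. The loop in a `constexpr` function needs C++14 or later.

```cpp
// Case 3 formula with plain integers standing in for the CUDA built-ins.
constexpr int case3(int blockIdxX, int tx, int ty, int tz,
                    int blockDimX, int blockDimY, int blockDimZ) {
    return blockIdxX * blockDimX * blockDimY * blockDimZ
         + tz * blockDimY * blockDimX
         + ty * blockDimX + tx;
}

// Walk one block with x as the innermost loop; the IDs should count up
// from the block's first ID without gaps or repeats.
constexpr bool consecutive(int blockIdxX, int bdx, int bdy, int bdz) {
    int expected = blockIdxX * bdx * bdy * bdz;
    for (int tz = 0; tz < bdz; ++tz)
        for (int ty = 0; ty < bdy; ++ty)
            for (int tx = 0; tx < bdx; ++tx)
                if (case3(blockIdxX, tx, ty, tz, bdx, bdy, bdz) != expected++)
                    return false;
    return true;
}

// Assumed blockDim = (4, 3, 2): 24 threads per block.
// Block 1, thread (1, 2, 1): 24 + 12 + 8 + 1 = 45.
static_assert(case3(1, 1, 2, 1, 4, 3, 2) == 45, "worked example");
static_assert(consecutive(1, 4, 3, 2), "x fastest, then y, then z");
```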
4. 2D grid, 1D blocks
int blockId = blockIdx.y * gridDim.x + blockIdx.x;
int threadId = blockId * blockDim.x + threadIdx.x;
4.1. 2D grid, 1D blocks (written in terms of x and y coordinates; expanding the expression gives exactly the same ID as case 4)
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
int y = blockIdx.y;
int threadId = y * (gridDim.x * blockDim.x) + x;
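Expanding case 4 gives blockIdx.y * gridDim.x * blockDim.x + blockIdx.x * blockDim.x + threadIdx.x, which is term for term what case 4.1 computes. The host-side C++ sketch below checks this over every thread of an assumed launch. The function names are hypothetical, and the loop in a `constexpr` function needs C++14 or later.

```cpp
// Case 4: linear block ID first, then offset within the block.
constexpr int case4(int bx, int by, int tx, int gridDimX, int blockDimX) {
    return (by * gridDimX + bx) * blockDimX + tx;
}

// Case 4.1: column x and row y, then row-major offset.
constexpr int case4_1(int bx, int by, int tx, int gridDimX, int blockDimX) {
    return by * (gridDimX * blockDimX) + (bx * blockDimX + tx);
}

// Compare the two formulas for every thread of a gridDimX x gridDimY grid
// of 1D blocks with blockDimX threads each.
constexpr bool sameEverywhere(int gridDimX, int gridDimY, int blockDimX) {
    for (int by = 0; by < gridDimY; ++by)
        for (int bx = 0; bx < gridDimX; ++bx)
            for (int tx = 0; tx < blockDimX; ++tx)
                if (case4(bx, by, tx, gridDimX, blockDimX) !=
                    case4_1(bx, by, tx, gridDimX, blockDimX))
                    return false;
    return true;
}

// Assumed launch: gridDim = (3, 2), blockDim.x = 4.
static_assert(case4(1, 1, 2, 3, 4) == 18, "(1*3 + 1)*4 + 2");
static_assert(case4_1(1, 1, 2, 3, 4) == 18, "1*12 + (4 + 2)");
static_assert(sameEverywhere(3, 2, 4), "cases 4 and 4.1 agree for every thread");
```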
5. 2D grid, 2D blocks
int blockId = blockIdx.x + blockIdx.y * gridDim.x;
int threadId = blockId * (blockDim.x * blockDim.y)
             + (threadIdx.y * blockDim.x) + threadIdx.x;
5.1. 2D grid, 2D blocks (alternative ordering: row-major across the whole grid)
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
int y = (blockIdx.y * blockDim.y) + threadIdx.y;
int threadId = y * (gridDim.x * blockDim.x) + x;
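Unlike case 4 and 4.1, the two 2D/2D formulas really do number the threads differently. The host-side C++ sketch below evaluates both for the same threads. `Idx2` is a hypothetical struct standing in for the .x and .y fields of the CUDA built-ins, the function names are my own, and the launch shape (gridDim = (2, 2), blockDim = (4, 2)) is an assumed example.

```cpp
// Hypothetical stand-in for the .x/.y fields of blockIdx, threadIdx, etc.
struct Idx2 { int x, y; };

// Case 5: all threads of one block are numbered before the next block starts.
constexpr int blockMajorId(Idx2 blockIdx, Idx2 threadIdx, Idx2 gridDim, Idx2 blockDim) {
    return (blockIdx.x + blockIdx.y * gridDim.x) * (blockDim.x * blockDim.y)
         + threadIdx.y * blockDim.x + threadIdx.x;
}

// Case 5.1: global column x and global row y, then row-major offset
// into an array that is gridDim.x * blockDim.x elements wide.
constexpr int rowMajorId(Idx2 blockIdx, Idx2 threadIdx, Idx2 gridDim, Idx2 blockDim) {
    return (blockIdx.y * blockDim.y + threadIdx.y) * (gridDim.x * blockDim.x)
         + (blockIdx.x * blockDim.x + threadIdx.x);
}

// Assumed launch: gridDim = (2, 2), blockDim = (4, 2): an 8 x 4 array of threads.
// Block (1, 0), thread (2, 0).
static_assert(blockMajorId(Idx2{1, 0}, Idx2{2, 0}, Idx2{2, 2}, Idx2{4, 2}) == 10, "case 5");
static_assert(rowMajorId(Idx2{1, 0}, Idx2{2, 0}, Idx2{2, 2}, Idx2{4, 2}) == 6, "case 5.1");
// Block (0, 1), thread (0, 1).
static_assert(blockMajorId(Idx2{0, 1}, Idx2{0, 1}, Idx2{2, 2}, Idx2{4, 2}) == 20, "case 5");
static_assert(rowMajorId(Idx2{0, 1}, Idx2{0, 1}, Idx2{2, 2}, Idx2{4, 2}) == 24, "case 5.1");
```

The row-major form is the natural fit when each thread handles one element of a 2D array stored row by row (for example a pixel at column x, row y), provided the array width equals gridDim.x * blockDim.x.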
6. 2D grid, 3D blocks (hard to draw, so only the formula is given)
int blockId = blockIdx.x + blockIdx.y * gridDim.x;
int threadId = blockId * (blockDim.x * blockDim.y * blockDim.z)
             + (threadIdx.z * (blockDim.x * blockDim.y))
             + (threadIdx.y * blockDim.x) + threadIdx.x;
7. 3D grid, 1D blocks (hard to draw, so only the formula is given)
int blockId = blockIdx.x + blockIdx.y * gridDim.x
            + gridDim.x * gridDim.y * blockIdx.z;
int threadId = blockId * blockDim.x + threadIdx.x;
8. 3D grid, 2D blocks (hard to draw, so only the formula is given)
int blockId = blockIdx.x + blockIdx.y * gridDim.x
            + gridDim.x * gridDim.y * blockIdx.z;
int threadId = blockId * (blockDim.x * blockDim.y)
             + (threadIdx.y * blockDim.x) + threadIdx.x;
9. 3D grid, 3D blocks (hard to draw, so only the formula is given)
int blockId = blockIdx.x + blockIdx.y * gridDim.x
            + gridDim.x * gridDim.y * blockIdx.z;
int threadId = blockId * (blockDim.x * blockDim.y * blockDim.z)
             + (threadIdx.z * (blockDim.x * blockDim.y))
             + (threadIdx.y * blockDim.x) + threadIdx.x;
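The 3D/3D formula is the most general of the block-major formulas (cases 1 to 9, not the alternative orderings): an unused dimension has extent 1 and index 0, so its terms drop out and the simpler cases fall out automatically. The host-side C++ sketch below illustrates this. `Idx3` and the three function names are hypothetical stand-ins, not CUDA API names, and the launch shapes are assumed examples.

```cpp
// Hypothetical stand-in for the .x/.y/.z fields of the CUDA built-ins.
struct Idx3 { int x, y, z; };

// Linear ID of a block within the grid (x fastest, then y, then z).
constexpr int blockId(Idx3 blockIdx, Idx3 gridDim) {
    return blockIdx.x + blockIdx.y * gridDim.x
         + gridDim.x * gridDim.y * blockIdx.z;
}

// Linear ID of a thread within its block (x fastest, then y, then z).
constexpr int threadInBlock(Idx3 threadIdx, Idx3 blockDim) {
    return threadIdx.z * (blockDim.x * blockDim.y)
         + threadIdx.y * blockDim.x + threadIdx.x;
}

// Case 9: global thread ID for a 3D grid of 3D blocks.
constexpr int globalId(Idx3 blockIdx, Idx3 threadIdx, Idx3 gridDim, Idx3 blockDim) {
    return blockId(blockIdx, gridDim) * (blockDim.x * blockDim.y * blockDim.z)
         + threadInBlock(threadIdx, blockDim);
}

// Assumed launch: gridDim = (2, 3, 4), blockDim = (4, 2, 2), i.e. 24 * 16 = 384 threads.
// The last thread of the last block gets ID 383.
static_assert(globalId(Idx3{1, 2, 3}, Idx3{3, 1, 1}, Idx3{2, 3, 4}, Idx3{4, 2, 2}) == 383,
              "last thread");

// With y and z collapsed to extent 1 this is case 1: 2 * 8 + 3 = 19.
static_assert(globalId(Idx3{2, 0, 0}, Idx3{3, 0, 0}, Idx3{4, 1, 1}, Idx3{8, 1, 1}) == 19,
              "reduces to case 1");

// With z collapsed to extent 1 this is case 5: 1 * 8 + 0 * 4 + 2 = 10.
static_assert(globalId(Idx3{1, 0, 0}, Idx3{2, 0, 0}, Idx3{2, 2, 1}, Idx3{4, 2, 1}) == 10,
              "reduces to case 5");
```

In practice this means a single helper written for the 3D/3D case can be reused for any launch shape, at the cost of a few multiplications by 1.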