Compatibility of selected CUDA library functions: shuffle, math library, atomics

2024/11/28 19:00:52

Compatibility here means compatibility across different CUDA versions and device compute capabilities.
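Both quantities can be queried at run time. The following is a minimal sketch, assuming a single-GPU setup where device 0 is the one of interest; the helper names `compute_capability`, `runtime_version`, and `print_compat_info` are hypothetical and chosen only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns major * 10 + minor for the given device
// (e.g. 75 for compute capability 7.5), or -1 if the query fails.
int compute_capability(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    return prop.major * 10 + prop.minor;
}

// Hypothetical helper: returns the CUDA runtime version encoded as
// 1000 * major + 10 * minor (e.g. 12000 for CUDA 12.0), or -1 on failure.
int runtime_version() {
    int version = 0;
    if (cudaRuntimeGetVersion(&version) != cudaSuccess) return -1;
    return version;
}

// Prints both values for device 0 (an assumption; adjust the index as needed).
void print_compat_info() {
    std::printf("runtime version: %d, compute capability: %d\n",
                runtime_version(), compute_capability(0));
}
```

At compile time the same information is exposed through the `CUDART_VERSION` macro (toolkit version) and, in device code, the `__CUDA_ARCH__` macro (target compute capability times 10, e.g. 700 for 7.0).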

shuffle

Documented in the C/C++ language extensions section of the CUDA programming guide.

For the new functions, see the CUDA 12.0 documentation:

__shfl_sync, __shfl_up_sync, __shfl_down_sync, and __shfl_xor_sync exchange a variable between threads within a warp.
Supported by devices of compute capability 5.0 or higher.
That is, the functions with the _sync suffix require compute capability 5.0 or higher.
Deprecation Notice: __shfl, __shfl_up, __shfl_down, and __shfl_xor have been deprecated in CUDA 9.0 for all devices.
That is, the old functions without the _sync suffix are deprecated (not yet removed) as of CUDA 9.0.
Removal Notice: When targeting devices with compute capability 7.x or higher, __shfl, __shfl_up, __shfl_down, and __shfl_xor are no longer available and their sync variants should be used instead.
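That is, devices of compute capability 7.x or higher do not support the old functions; switch to the variants with the _sync suffix.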

For the old functions, see the CUDA 8.0 documentation:

__shfl, __shfl_up, __shfl_down, __shfl_xor exchange a variable between threads within a warp.
Supported by devices of compute capability 3.x or higher.
That is, the old functions are supported on devices of compute capability 3.x or higher.
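One way to keep a single code base working across both generations is to select the shuffle variant by toolkit version. The sketch below is an illustration under assumptions, not an official recipe: the `SHFL_DOWN` macro and the `warp_reduce_sum`, `sum_one_warp`, and `run_warp_sum` names are hypothetical, and the full mask `0xffffffff` assumes that all 32 lanes of the warp take part in the call.

```cuda
#include <cuda_runtime.h>

// Hypothetical wrapper (not part of CUDA): use the _sync variant on CUDA 9.0+
// and fall back to the old intrinsic on older toolkits.
#if defined(CUDART_VERSION) && CUDART_VERSION >= 9000
#define SHFL_DOWN(val, offset) __shfl_down_sync(0xffffffffu, (val), (offset))
#else
#define SHFL_DOWN(val, offset) __shfl_down((val), (offset))
#endif

// Tree reduction within one warp; lane 0 ends up holding the sum of all lanes.
__device__ int warp_reduce_sum(int val) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        val += SHFL_DOWN(val, offset);
    return val;
}

// Expects to be launched with exactly one warp (32 threads).
__global__ void sum_one_warp(const int* in, int* out) {
    int total = warp_reduce_sum(in[threadIdx.x]);
    if (threadIdx.x == 0) *out = total;
}

// Host helper: sums 1..32 on the GPU. Returns -1 if a CUDA call fails.
int run_warp_sum() {
    int h_in[32];
    for (int i = 0; i < 32; ++i) h_in[i] = i + 1;

    int* d_in = NULL;
    int* d_out = NULL;
    int h_out = -1;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) { cudaFree(d_in); return -1; }

    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    sum_one_warp<<<1, 32>>>(d_in, d_out);
    if (cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess)
        h_out = -1;

    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

On a working device, `run_warp_sum()` returns 528, the sum of 1 through 32.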

Math library functions

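Just check the documentation when you need a function: look it up in the Math API reference, and if it is not listed there, it is not supported.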

[Figure: CUDA 12.0 math function/intrinsic list]

[Figure: CUDA 8.0 math function/intrinsic list]

Atomics

Documented in the C/C++ language extensions section of the CUDA programming guide.

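CUDA 12.0 documentation

CUDA 8.0 documentation

Reading the latest documentation is enough. As an aside, the layout of the new documentation is less convenient to read than the old one.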

atomicAdd()
The 32-bit floating-point version of atomicAdd() is only supported by devices of compute capability 2.x and higher.
That is, atomicAdd() on float requires compute capability 2.x or higher.
The 64-bit floating-point version of atomicAdd() is only supported by devices of compute capability 6.x and higher.
That is, atomicAdd() on double requires compute capability 6.x or higher.
The 32-bit __half2 floating-point version of atomicAdd() is only supported by devices of compute capability 6.x and higher.
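That is, atomicAdd() on __half2 (two packed half-precision values, 32 bits in total) requires compute capability 6.x or higher.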
The 16-bit __half floating-point version of atomicAdd() is only supported by devices of compute capability 7.x and higher.
That is, atomicAdd() on __half (16-bit half precision) requires compute capability 7.x or higher.
The 16-bit __nv_bfloat16 floating-point version of atomicAdd() is only supported by devices of compute capability 8.x and higher.
That is, atomicAdd() on __nv_bfloat16 (16-bit bfloat16) requires compute capability 8.x or higher.
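When code has to run on devices below the required compute capability, a common workaround is to emulate the missing atomicAdd() with a compare-and-swap loop built on atomicCAS(). The sketch below illustrates this for double; the names `atomic_add_double`, `accumulate`, and `run_atomic_add_double` are hypothetical, and the fallback branch assumes the 64-bit atomicCAS() is available on the target device.

```cuda
#include <cuda_runtime.h>

// Hypothetical wrapper: native double atomicAdd() on compute capability 6.x+,
// otherwise a compare-and-swap loop. Returns the value stored before the add.
__device__ double atomic_add_double(double* address, double val) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
    return atomicAdd(address, val);
#else
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull;
    unsigned long long int assumed;
    do {
        assumed = old;
        // Try to publish assumed + val; retry if another thread got there first.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);  // integer comparison, so a NaN cannot hang the loop
    return __longlong_as_double(old);
#endif
}

// Every thread adds 0.5 to the shared accumulator.
__global__ void accumulate(double* sum) {
    atomic_add_double(sum, 0.5);
}

// Host helper: 2 blocks x 128 threads, each adding 0.5.
// Returns -1.0 if a CUDA call fails.
double run_atomic_add_double() {
    double* d_sum = NULL;
    double h_sum = 0.0;
    if (cudaMalloc(&d_sum, sizeof(double)) != cudaSuccess) return -1.0;

    cudaMemcpy(d_sum, &h_sum, sizeof(double), cudaMemcpyHostToDevice);
    accumulate<<<2, 128>>>(d_sum);
    if (cudaMemcpy(&h_sum, d_sum, sizeof(double), cudaMemcpyDeviceToHost) != cudaSuccess)
        h_sum = -1.0;

    cudaFree(d_sum);
    return h_sum;
}
```

With 256 threads each adding 0.5, `run_atomic_add_double()` returns 128.0 on a working device. The wrapper has its own name so that it does not clash with the built-in atomicAdd() overloads.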

atomicMin(), atomicMax(), atomicAnd(), atomicOr(), atomicXor()
The 64-bit version of atomicMin() / atomicMax() / atomicAnd() / atomicOr() / atomicXor() is only supported by devices of compute capability 5.0 and higher.
That is, the 64-bit versions require compute capability 5.0 or higher.
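The same guard pattern applies to the 64-bit variants. The sketch below uses atomicMax() on unsigned long long and falls back to an atomicCAS() loop below the 5.0 threshold quoted above; the names `atomic_max_u64`, `find_max`, and `run_atomic_max_u64` are hypothetical, and the fallback assumes the 64-bit atomicCAS() is available on the target device.

```cuda
#include <cuda_runtime.h>

// Hypothetical wrapper: native 64-bit atomicMax() on compute capability 5.0+,
// otherwise a compare-and-swap loop. Returns the value seen before the update.
__device__ unsigned long long atomic_max_u64(unsigned long long* address,
                                             unsigned long long val) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 500
    return atomicMax(address, val);
#else
    unsigned long long old = *address;
    unsigned long long assumed;
    do {
        assumed = old;
        if (assumed >= val) break;               // stored value is already the max
        old = atomicCAS(address, assumed, val);  // try to install val
    } while (assumed != old);
    return old;
#endif
}

// Each thread proposes a value that does not fit in 32 bits.
__global__ void find_max(unsigned long long* result) {
    atomic_max_u64(result, (1ULL << 32) + threadIdx.x);
}

// Host helper: 128 threads propose 2^32 + 0 .. 2^32 + 127.
// Returns 0 if a CUDA call fails.
unsigned long long run_atomic_max_u64() {
    unsigned long long* d_result = NULL;
    unsigned long long h_result = 0;
    if (cudaMalloc(&d_result, sizeof(unsigned long long)) != cudaSuccess) return 0;

    cudaMemcpy(d_result, &h_result, sizeof(h_result), cudaMemcpyHostToDevice);
    find_max<<<1, 128>>>(d_result);
    if (cudaMemcpy(&h_result, d_result, sizeof(h_result), cudaMemcpyDeviceToHost) != cudaSuccess)
        h_result = 0;

    cudaFree(d_result);
    return h_result;
}
```

On a working device the result is 2^32 + 127 = 4294967423.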

atomicExch(), atomicInc(), atomicDec(), and atomicCAS() have no special requirements noted and are compatible across devices.
