I. Programming guide PDF download links (Chinese and English documents)
- 1. Metal Programming Guide PDF link
https://github.com/dennie-lee/ios_tech_record/raw/main/Metal学习PDF/Metal 编程指南.pdf
- 2. Metal Shading Language (MSL) guide PDF link
https://github.com/dennie-lee/ios_tech_record/raw/main/Metal学习PDF/Metal 着色语言指南.pdf
- 3. Supplement: official API documentation link
https://developer.apple.com/documentation/metal/performing_calculations_on_a_gpu
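II. Introduction
This article uses the element-wise addition of two arrays of equal length as a way into Metal (GPU) compute functions. Render functions (vertex functions and fragment functions) are not covered here and will be introduced in a separate article. The example walks through the basic tasks used in every Metal app. You will see how to convert a simple function written in C into Metal Shading Language (MSL) so that it can run on the GPU. You create a pipeline that prepares the MSL function to run, and you create data objects that the GPU can access. To execute the pipeline against your data, you create a command buffer, write commands into it, and commit the buffer to a command queue. Metal then sends the commands to the GPU for execution.
(Figure: flow diagram of a compute function pass, from the official documentation.)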
III. Comparing the C and MSL functions that add two arrays
- 1. C function: adding two arrays
void add_arrays(const float* inA,
                const float* inB,
                float* result,
                int length){
    for (int index = 0; index < length ; index++)
    {
        result[index] = inA[index] + inB[index];
    }
}
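- 2. MSL function: adding two arrays
kernel void add_arrays(device const float* inA,
                       device const float* inB,
                       device float* result,
                       uint index [[thread_position_in_grid]]){
    // The loop is gone: each GPU thread handles the single element at its own grid position.
    result[index] = inA[index] + inB[index];
}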
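Keyword notes (the keywords are documented in the MSL guide PDF, which can be downloaded from the link above)
[[thread_position_in_grid]] is described in that document. For a more detailed explanation, see this article: https://juejin.cn/post/7085633906501746724 , which also goes on to explain [[threadgroup_position_in_grid]] and [[threads_per_threadgroup]].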
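To make the relationship between these attributes concrete, here is a minimal CPU-side sketch in C of a one-dimensional dispatch with uniform threadgroups. It is only an illustration, not how Metal schedules threads: the names add_arrays_thread and emulate_dispatch are hypothetical, and the two loops run serially where the GPU would run threads in parallel. The point it shows is that, in one dimension, a thread's position in the grid equals its threadgroup's position times the threads per threadgroup, plus its position inside the threadgroup.

```c
#include <assert.h>
#include <stddef.h>

/* Per-thread body, mirroring the MSL kernel: one element per grid index. */
static void add_arrays_thread(const float *inA, const float *inB,
                              float *result, unsigned int index)
{
    result[index] = inA[index] + inB[index];
}

/* Hypothetical CPU stand-in for a 1D dispatch with uniform threadgroups.
   The arrays must hold threadgroups * threads_per_threadgroup elements. */
static void emulate_dispatch(const float *inA, const float *inB, float *result,
                             unsigned int threadgroups,
                             unsigned int threads_per_threadgroup)
{
    assert(inA != NULL && inB != NULL && result != NULL);
    for (unsigned int group = 0; group < threadgroups; group++) {
        for (unsigned int local = 0; local < threads_per_threadgroup; local++) {
            /* group -> threadgroup_position_in_grid
               local -> thread_position_in_threadgroup
               index -> thread_position_in_grid */
            unsigned int index = group * threads_per_threadgroup + local;
            add_arrays_thread(inA, inB, result, index);
        }
    }
}
```

Calling emulate_dispatch(a, b, out, 2, 4) on arrays of 8 elements visits indices 0 through 7 exactly once, which is the same work the C loop in item 1 performs.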
IV. Walkthrough of the key code
- 1. Initialization: load the .metal files and create the pipeline state object for the compute function
// _mDevice = MTLCreateSystemDefaultDevice();
// Load the shader files with a .metal file extension in the project
id<MTLLibrary> newDefaultLibrary = [_mDevice newDefaultLibrary];
if (newDefaultLibrary == nil){
    NSLog(@"Failed to find the default library");
    return nil;
}
id<MTLFunction> newFunction = [newDefaultLibrary newFunctionWithName:@"add_arrays"];
if (newFunction == nil){
    NSLog(@"Failed to find the adder function");
    return nil;
}
NSError *error = nil;
// Create a compute pipeline state object from the kernel function
// defined in the project's .metal file (the Add file).
_mAddFunctionPSO = [_mDevice newComputePipelineStateWithFunction:newFunction error:&error];
if (_mAddFunctionPSO == nil){
    // If the Metal API validation is enabled, you can find out more information about what
    // went wrong. (Metal API validation is enabled by default when a debug build is run
    // from Xcode)
    NSLog(@"Failed to create pipeline state object, error: %@", error);
    return nil;
}
// Command queue
_mCommandQueue = [_mDevice newCommandQueue];
if (_mCommandQueue == nil){
    NSLog(@"Failed to create the command queue");
    return nil;
}
- 2. Initializing the array data
// Allocate the three buffers and fill the two inputs with random data.
// arrayLength (element count) and bufferSize (byte length) are defined elsewhere in the project.
- (void)prepareData{
    _mBufferA = [_mDevice newBufferWithLength:bufferSize options:MTLResourceStorageModeShared];
    _mBufferB = [_mDevice newBufferWithLength:bufferSize options:MTLResourceStorageModeShared];
    _mBufferResult = [_mDevice newBufferWithLength:bufferSize options:MTLResourceStorageModeShared];
    [self generateRandomFloatData:_mBufferA];
    [self generateRandomFloatData:_mBufferB];
}
- (void)generateRandomFloatData:(id<MTLBuffer>)buffer{
    // With shared storage mode, the CPU can write directly through the contents pointer.
    float *dataPtr = buffer.contents;
    for (int index = 0; index < arrayLength; index++){
        dataPtr[index] = (float)rand() / (float)RAND_MAX;
    }
}
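As a plain-C illustration of what the two methods above do with the buffer memory, the sketch below computes a byte length for a given element count and fills a float array with values in [0, 1]. The names buffer_byte_length and fill_random_floats are hypothetical, an ordinary array stands in for the memory behind buffer.contents, and the assumption that bufferSize equals arrayLength * sizeof(float) is mine rather than something the snippet states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: bytes needed to hold `count` floats
   (the value one would expect bufferSize to have, by assumption). */
static size_t buffer_byte_length(size_t count)
{
    return count * sizeof(float);
}

/* Hypothetical helper: fill `count` floats with pseudo-random values in [0, 1],
   using the same rand() / RAND_MAX scaling as generateRandomFloatData. */
static void fill_random_floats(float *data, size_t count)
{
    assert(data != NULL);
    for (size_t index = 0; index < count; index++) {
        data[index] = (float)rand() / (float)RAND_MAX;
    }
}
```

If the byte length were smaller than the element count times sizeof(float), the fill loop would write past the end of the buffer, so the two constants have to be kept in step.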
- 3. Encoding the command and its arguments, then submitting to the GPU
- (void)sendComputeCommand{
    // Create a command buffer to hold commands
    id<MTLCommandBuffer> commandBuffer = [_mCommandQueue commandBuffer];
    assert(commandBuffer != nil);
    // Start encoding the compute command and its arguments
    id<MTLComputeCommandEncoder> computeEncoder = [commandBuffer computeCommandEncoder];
    assert(computeEncoder != nil);
    NSTimeInterval startTimeInterval = [[[NSDate alloc] init] timeIntervalSince1970];
    [self encodeAdderCommand:computeEncoder];
    // All arguments are encoded; end the compute pass
    [computeEncoder endEncoding];
    // Commit the command buffer for execution
    [commandBuffer commit];
    // Block until the GPU has finished the calculation
    [commandBuffer waitUntilCompleted];
    NSTimeInterval endTimeInterval = [[[NSDate alloc] init] timeIntervalSince1970];
    // The measured interval covers encoding, submission, and waiting for the GPU.
    NSLog(@"Adding two arrays of length %u (Metal) took %f seconds",arrayLength,(endTimeInterval - startTimeInterval));
    // Verify the results
    [self verifyResults];
}
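The verifyResults method is called above but its body is not shown in this article. A minimal sketch of what such a check could look like, written in plain C, is to recompute each sum on the CPU and compare it with the value the GPU wrote. The function name first_mismatch and its return convention are hypothetical; in the Objective-C project the three pointers would come from the contents of _mBufferA, _mBufferB and _mBufferResult.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: returns the index of the first element where
   result[i] differs from inA[i] + inB[i], or -1 if every element matches. */
static long first_mismatch(const float *inA, const float *inB,
                           const float *result, size_t length)
{
    assert(inA != NULL && inB != NULL && result != NULL);
    for (size_t index = 0; index < length; index++) {
        /* Store the sum in a float so the comparison uses single precision. */
        float expected = inA[index] + inB[index];
        if (result[index] != expected) {
            return (long)index;
        }
    }
    return -1;
}
```

Because the shared-mode result buffer is read by the CPU, a check like this is only meaningful after waitUntilCompleted has returned.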
- (void)encodeAdderCommand:(id<MTLComputeCommandEncoder>)computeEncoder{
    // Encode the pipeline state object and its parameters
    [computeEncoder setComputePipelineState:_mAddFunctionPSO];
    // atIndex 0 matches the first buffer parameter of the MSL function (the index can also be set explicitly with buffer(0); this can be ignored for now, see the shading language guide PDF for details)
    [computeEncoder setBuffer:_mBufferA offset:0 atIndex:0];
    // atIndex 1 matches the second buffer parameter of the MSL function (or buffer(1))
    [computeEncoder setBuffer:_mBufferB offset:0 atIndex:1];
    // atIndex 2 matches the third buffer parameter of the MSL function (or buffer(2))
    [computeEncoder setBuffer:_mBufferResult offset:0 atIndex:2];
    // Total number of threads in the grid; each thread receives its position through [[thread_position_in_grid]]. See the article linked in the keyword notes of section III, item 2.
    MTLSize gridSize = MTLSizeMake(arrayLength, 1, 1);
    // Threads per threadgroup: the pipeline's maximum, clamped to the array length
    NSUInteger threadGroupSize = _mAddFunctionPSO.maxTotalThreadsPerThreadgroup;
    if (threadGroupSize > arrayLength){
        threadGroupSize = arrayLength;
    }
    MTLSize threadsPerThreadgroup = MTLSizeMake(threadGroupSize, 1, 1);
    [computeEncoder dispatchThreads:gridSize threadsPerThreadgroup:threadsPerThreadgroup];
}
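With dispatchThreads:threadsPerThreadgroup:, Metal itself handles a last threadgroup that is only partly filled when the grid size is not a multiple of the threadgroup size. A commonly used alternative is dispatchThreadgroups:threadsPerThreadgroup:, where the caller passes the number of threadgroups instead, usually rounded up, and the kernel then needs a bounds check because some threads fall past the end of the array. The arithmetic for both the clamp used above and that rounding can be sketched in C as follows. The function names are hypothetical and the sketch only covers the one-dimensional case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the clamp applied to maxTotalThreadsPerThreadgroup,
   so a threadgroup is never larger than the array itself. */
static size_t clamp_threadgroup_size(size_t max_threads, size_t length)
{
    return max_threads > length ? length : max_threads;
}

/* Hypothetical helper: number of threadgroups needed to cover `length`
   elements, rounding up so the final partial group is included. */
static size_t threadgroup_count(size_t length, size_t threads_per_threadgroup)
{
    assert(threads_per_threadgroup > 0);
    return (length + threads_per_threadgroup - 1) / threads_per_threadgroup;
}
```

For example, 1000 elements with 256 threads per threadgroup need 4 threadgroups, which launches 1024 threads; the 24 extra threads are the ones a bounds check in the kernel would have to skip.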
V. Complete code
Example project on GitHub: https://github.com/dennie-lee/MetalArrayAddDemo