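This part records the APIs involved in getting started with D3D12, with brief usage notes for quick lookup later (the data follows the Dragon Book implementation).
First, the pipeline types available in DX12: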
- Raster Graphics Pipeline
- Compute Graphics Pipeline
- Ray Tracing Pipeline
- Mesh Geometry Pipeline
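The pipelines are illustrated in the figure below:
Although these pipelines differ, some basic APIs are shared by all of them, so this article walks through the DX12 API workflow in five parts. For the detailed usage rules of each API, consult the official documentation (Windows DirectX 12 API) or the relevant chapters of the Dragon Book.
1. DX12 Base API
1.1 Factory
The factory is the entry point to the DirectX 12 API. It is used to enumerate adapters, from which you then create the device and other important objects.
During this setup you can also create a debug controller (similar to the Vulkan validation layers), which validates API usage; it is normally enabled only in Debug builds.
Creation and usage look like this: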
// Declare the DirectX 12 handles
IDXGIFactory4* factory;
ID3D12Debug1* debugController;
// Create
UINT dxgiFactoryFlags = 0;
#if defined(_DEBUG)
// Create a debug controller to track errors
ID3D12Debug* dc;
ThrowIfFailed(D3D12GetDebugInterface(IID_PPV_ARGS(&dc)));
ThrowIfFailed(dc->QueryInterface(IID_PPV_ARGS(&debugController)));
debugController->EnableDebugLayer();
debugController->SetEnableGPUBasedValidation(true);
dxgiFactoryFlags |= DXGI_CREATE_FACTORY_DEBUG;
dc->Release();
dc = nullptr;
#endif
HRESULT result = CreateDXGIFactory2(dxgiFactoryFlags, IID_PPV_ARGS(&factory));
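1.2 Adapter
An adapter (in practice, a graphics card) provides information about the physical properties of a given DirectX device. You can use it to query the current GPU's name, vendor, memory size, and so on.
There are two kinds of adapters: software and hardware. A software-based DirectX implementation can be used when no dedicated hardware is available (for example, on a machine with only integrated graphics); enumerating the adapters through the API shows you what is present.
Creation and usage look like this:
// Declare the handle
IDXGIAdapter1* adapter = nullptr;
// Pick an adapter
for (UINT adapterIndex = 0;
DXGI_ERROR_NOT_FOUND != factory->EnumAdapters1(adapterIndex, &adapter);
++adapterIndex)
{
DXGI_ADAPTER_DESC1 desc;
adapter->GetDesc1(&desc);
// Skip the software adapter (Basic Render Driver); release it first so it does not leak
if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE)
{
adapter->Release();
adapter = nullptr;
continue;
}
// Check whether the adapter supports Direct3D 12 (no device is created yet); if so, use it for the rest of the application
if (SUCCEEDED(D3D12CreateDevice(adapter, D3D_FEATURE_LEVEL_12_0, __uuidof(ID3D12Device), nullptr)))
{
break;
}
// Release adapters we are not going to use
adapter->Release();
adapter = nullptr;
}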
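1.3 Device
The device lets you create the common objects: command queues, allocators, pipelines, buffers, buffer views, shader blobs, heaps, synchronization primitives, and so on. In fact, every creation operation goes through the device (as in DX11).
Creation and usage look like this:
// Declare the handle
ID3D12Device* device;
// Create
ThrowIfFailed(D3D12CreateDevice(adapter, D3D_FEATURE_LEVEL_12_0, IID_PPV_ARGS(&device)));
The debug device lets you rely on DirectX 12's debug mode. Tracking the objects created with DirectX is normally tedious, but with the debug device you can detect leaked objects and verify that the API is used correctly.
// Declare the handle
ID3D12DebugDevice* debugDevice;
#if defined(_DEBUG)
// Query the debug interface from the device
ThrowIfFailed(device->QueryInterface(IID_PPV_ARGS(&debugDevice)));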
#endif
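1.4 Command Queue
A command queue lets you submit groups of commands (called command lists) to be executed in order, which keeps the GPU busy and lets it optimize execution. Commands can specify resource barriers, wait for uploads to the GPU to finish, run raster or compute pipelines, and so on.
Using queues well is one of the advantages of DX12. DX12 defines three command queue types:
- Copy queue: issues commands that copy resource data (CPU -> GPU, GPU -> GPU, GPU -> CPU).
- Compute queue: can do everything a copy queue can, and additionally issues compute (dispatch) commands.
- Direct queue: can do everything the copy and compute queues can, and additionally issues draw commands.
Each type in this list is a superset of the one before it. See the enum below (a BUNDLE is a small reusable command list replayed from a direct command list, roughly comparable to a Vulkan secondary command buffer; look it up if you are interested):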
typedef enum D3D12_COMMAND_LIST_TYPE
{
D3D12_COMMAND_LIST_TYPE_DIRECT = 0,
D3D12_COMMAND_LIST_TYPE_BUNDLE = 1,
D3D12_COMMAND_LIST_TYPE_COMPUTE = 2,
D3D12_COMMAND_LIST_TYPE_COPY = 3,
D3D12_COMMAND_LIST_TYPE_VIDEO_DECODE = 4,
D3D12_COMMAND_LIST_TYPE_VIDEO_PROCESS = 5,
D3D12_COMMAND_LIST_TYPE_VIDEO_ENCODE = 6
} D3D12_COMMAND_LIST_TYPE;
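The superset relationship between the three queue types can be written down as a small capability check. This is only an illustrative sketch: `QueueType`, `Work`, and `queueSupports` are hypothetical names invented here, not part of D3D12, and the enum values merely mirror the DIRECT/COMPUTE/COPY entries above.

```cpp
#include <cassert>

// Hypothetical mirror of the three queue-relevant list types
// (values follow D3D12_COMMAND_LIST_TYPE: DIRECT = 0, COMPUTE = 2, COPY = 3).
enum class QueueType { Direct = 0, Compute = 2, Copy = 3 };

// Hypothetical classification of the work a command list contains.
enum class Work { Copy, Dispatch, Draw };

// Returns true if a queue of the given type can execute the given kind of work.
constexpr bool queueSupports(QueueType queue, Work work)
{
    return queue == QueueType::Direct    ? true                  // draw + dispatch + copy
           : queue == QueueType::Compute ? work != Work::Draw    // dispatch + copy
                                         : work == Work::Copy;   // copy only
}

// Usage example: checks the "superset" rule described in the text.
inline void demoQueueSupport()
{
    assert(queueSupports(QueueType::Direct, Work::Draw));
    assert(queueSupports(QueueType::Compute, Work::Copy));
    assert(!queueSupports(QueueType::Compute, Work::Draw));
    assert(!queueSupports(QueueType::Copy, Work::Dispatch));
}
```

A check like this can help decide which queue a given command list may be submitted to, for example sending pure upload work to a copy queue.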
Creation and usage look like this:
// Declare the handle
ID3D12CommandQueue* commandQueue;
// Create the queue
D3D12_COMMAND_QUEUE_DESC queueDesc = {};
queueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE;
queueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
ThrowIfFailed(device->CreateCommandQueue(&queueDesc, IID_PPV_ARGS(&commandQueue)));
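1.5 Command Allocator
A command allocator provides the backing memory for command lists, in which you record the work you want the GPU to execute (similar to a Vulkan CommandPool).
Creation and usage look like this:
// Declare the handle
ID3D12CommandAllocator* commandAllocator;
// Create the command allocator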
ThrowIfFailed(device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(&commandAllocator)));
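1.6 Synchronization
DirectX 12 has several basic synchronization primitives (fewer than Vulkan; synchronization of GPU shared memory is not covered here). They help the driver understand how resources will be used, tell you when the GPU has finished a task, and so on. Synchronization is something you must master for multithreaded rendering in DX12.
1.6.1 Fence
A fence lets your program know when the GPU has finished certain work, such as whether a resource has been uploaded to the GPU or whether a resource transition has completed. A fence stores a single value: the last value it was signaled with. Although the same fence object can be used with multiple command queues, it is hard to guarantee correct ordering of commands across queues that way, so it is recommended to create at least one fence object per command queue. Multiple command queues may wait for a fence to reach a given value, but only a single command queue should be allowed to signal it. Besides the fence object itself, the application must also track the value used to signal the fence.
Three operations are used most often (IsFenceComplete and WaitForFenceValue are helper names you write yourself, not D3D12 API calls):
- IsFenceComplete: checks whether the fence has reached a given value.
- WaitForFenceValue: blocks the CPU thread until the fence reaches a given value.
- Signal: inserts a fence value into the command queue. When the queue reaches that point, the fence's completed value is set to that value.
Creation and usage look like this:
// Declare the handles
UINT frameIndex;
HANDLE fenceEvent;
ID3D12Fence* fence;
UINT64 fenceValue;
// Create the fence
ThrowIfFailed(device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence)));
fenceValue = 1;
// Create the event that the CPU blocks on while waiting for the fence
fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);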
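Let's look at an example of GPU synchronization:
In the figure, the main thread issues several commands. The first frame in this example is Frame N. Its command lists are executed on the command queue, and immediately afterwards a Signal with value N is queued; when the command queue reaches that point, the fence is signaled with that value.
Right after the Signal there is a WaitForFenceValue() call that waits for the previous frame, Frame N-1. Because no commands were queued for Frame N-1, execution continues without stalling the CPU thread.
Frame N+1 is then built on the CPU thread and executed on the direct queue. Before the CPU can continue, the command queue must be finished with the resources used by Frame N, so the CPU has to wait until the fence reaches N, which indicates that the queue is done with those resources.
Once the queue is done with the resources of Frame N, Frame N+2 can be built and executed on the queue. If the queue is still processing commands from Frame N+1, the CPU must again wait for those resources to become available before continuing.
This example shows a typical double-buffering scheme. You might think that triple buffering would reduce the time the CPU spends waiting for the GPU, but that is a naive solution: whenever the CPU issues commands faster than the command queue can process them, the CPU will have to stall at some point so that the queue can catch up.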
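The double-buffering walk-through above can be modeled on the CPU alone, which makes the waiting rule easier to see. The following is a simplified sketch, not D3D12 code: `MockQueueFence` and `FramePacer` are hypothetical stand-ins for an `ID3D12Fence` plus its queue, and the "GPU" only makes progress when `gpuAdvanceTo` is called.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-only stand-in for a fence that is signaled from one queue.
struct MockQueueFence
{
    std::uint64_t completed = 0;    // plays the role of GetCompletedValue()
    std::uint64_t lastSignaled = 0; // last value queued via Signal

    // Queue a signal; the value becomes "completed" only once the GPU gets there.
    std::uint64_t signal() { return ++lastSignaled; }

    // Simulate the GPU reaching the signal with value v.
    void gpuAdvanceTo(std::uint64_t v)
    {
        if (v > completed && v <= lastSignaled) completed = v;
    }

    bool isComplete(std::uint64_t v) const { return completed >= v; }
};

constexpr int kFramesInFlight = 2; // double buffering

// Tracks which fence value guards each frame slot.
struct FramePacer
{
    MockQueueFence fence;
    std::uint64_t frameFenceValues[kFramesInFlight] = {0, 0};
    int frameIndex = 0;

    // True if the CPU would have to block before reusing the current slot.
    bool mustWaitBeforeRecording() const
    {
        return !fence.isComplete(frameFenceValues[frameIndex]);
    }

    // Called after submitting and presenting the current slot.
    void endFrame()
    {
        frameFenceValues[frameIndex] = fence.signal();
        frameIndex = (frameIndex + 1) % kFramesInFlight;
    }
};

// Usage example following the Frame N / N+1 / N+2 description.
inline void demoDoubleBuffering()
{
    FramePacer pacer;
    assert(!pacer.mustWaitBeforeRecording()); // Frame N: slot 0 was never used
    pacer.endFrame();                         // queues signal 1 for slot 0
    assert(!pacer.mustWaitBeforeRecording()); // Frame N+1: slot 1 was never used
    pacer.endFrame();                         // queues signal 2 for slot 1
    assert(pacer.mustWaitBeforeRecording());  // Frame N+2 reuses slot 0: GPU not done yet
    pacer.fence.gpuAdvanceTo(1);              // GPU finishes Frame N
    assert(!pacer.mustWaitBeforeRecording()); // now Frame N+2 can be recorded
}
```

Raising `kFramesInFlight` to 3 only postpones the first stall; it does not remove it when the CPU is consistently faster than the GPU.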
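1.6.2 Barrier
A barrier tells the driver how a resource will be used by upcoming commands. It is needed, for example, when you have been writing to a texture and now want to copy it to another texture (other cases include swapchain render targets and UAV read <=> write hazards).
Creation and usage look like this:
// Declare the handle
ID3D12GraphicsCommandList* commandList;
// Create the barrier
D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;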
barrier.Transition.pResource = texResource;
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_COPY_SOURCE;
barrier.Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
commandList->ResourceBarrier(1, &barrier);
1.7 Swapchain
The swapchain handles allocating and swapping the back buffers that display what you render to the window.
Creation and usage look like this:
unsigned width = 640;
unsigned height = 640;
// Declare the handles
static const UINT backbufferCount = 2;
UINT currentBuffer;
ID3D12DescriptorHeap* renderTargetViewHeap;
ID3D12Resource* renderTargets[backbufferCount];
UINT rtvDescriptorSize;
// Declare the swapchain (null until it is first created)
IDXGISwapChain3* swapchain = nullptr;
D3D12_VIEWPORT viewport;
D3D12_RECT surfaceSize;
surfaceSize.left = 0;
surfaceSize.top = 0;
surfaceSize.right = static_cast<LONG>(width);
surfaceSize.bottom = static_cast<LONG>(height);
viewport.TopLeftX = 0.0f;
viewport.TopLeftY = 0.0f;
viewport.Width = static_cast<float>(width);
viewport.Height = static_cast<float>(height);
viewport.MinDepth = 0.0f; // viewport depth must lie within [0, 1]
viewport.MaxDepth = 1.0f;
if (swapchain != nullptr)
{
// The swapchain already exists: resize its buffers
swapchain->ResizeBuffers(backbufferCount, width, height,
DXGI_FORMAT_R8G8B8A8_UNORM, 0);
}
else
{
// Create the swapchain
DXGI_SWAP_CHAIN_DESC1 swapchainDesc = {};
swapchainDesc.BufferCount = backbufferCount;
swapchainDesc.Width = width;
swapchainDesc.Height = height;
swapchainDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
swapchainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
swapchainDesc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;
swapchainDesc.SampleDesc.Count = 1;
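// xgfx::createSwapchain is a helper defined outside this article, not part of DXGI
IDXGISwapChain1* newSwapchain =
xgfx::createSwapchain(window, factory, commandQueue, &swapchainDesc);
// Query the IDXGISwapChain3 interface from the newly created swapchain
HRESULT swapchainSupport = newSwapchain->QueryInterface(
__uuidof(IDXGISwapChain3), (void**)&swapchain);
if (SUCCEEDED(swapchainSupport))
{
// swapchain now holds its own reference; drop the IDXGISwapChain1 one
newSwapchain->Release();
}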
}
frameIndex = swapchain->GetCurrentBackBufferIndex();
// Describe and create a render target view (RTV) descriptor heap.
D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {};
rtvHeapDesc.NumDescriptors = backbufferCount;
rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
ThrowIfFailed(device->CreateDescriptorHeap(
&rtvHeapDesc, IID_PPV_ARGS(&renderTargetViewHeap)));
rtvDescriptorSize =
device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);
// Create the frame resources
D3D12_CPU_DESCRIPTOR_HANDLE
rtvHandle(renderTargetViewHeap->GetCPUDescriptorHandleForHeapStart());
// Create an RTV for each frame
for (UINT n = 0; n < backbufferCount; n++)
{
ThrowIfFailed(swapchain->GetBuffer(n, IID_PPV_ARGS(&renderTargets[n])));
device->CreateRenderTargetView(renderTargets[n], nullptr, rtvHandle);
rtvHandle.ptr += (1 * rtvDescriptorSize);
}
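2. Raster Graphics Pipeline API
The raster pipeline is currently the most commonly used GPU rendering pipeline.
2.1 Descriptor Heaps
A descriptor heap is an object that manages the memory in which descriptors, the descriptions of the objects that shaders reference, are stored. That sounds convoluted, but as the figure shows it is essentially an array of resource views. Starting with DirectX 12, you must create a descriptor heap before creating resource views such as render target views (RTV), shader resource views (SRV), unordered access views (UAV), or constant buffer views (CBV). Some view (descriptor) types can share a heap: CBVs, SRVs, and UAVs can be stored in the same heap, whereas RTVs and samplers each need their own descriptor heap.
Here is an example that creates an RTV descriptor heap: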
ID3D12DescriptorHeap* renderTargetViewHeap;
D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {};
rtvHeapDesc.NumDescriptors = backbufferCount;
rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
ThrowIfFailed(device->CreateDescriptorHeap(
&rtvHeapDesc, IID_PPV_ARGS(&renderTargetViewHeap)));
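Because a descriptor heap behaves like an array, the handle of the i-th descriptor is simply the heap start plus i times the increment size reported by GetDescriptorHandleIncrementSize. The sketch below shows only that arithmetic; `CpuHandle` and `offsetHandle` are hypothetical stand-ins (the real type is D3D12_CPU_DESCRIPTOR_HANDLE), and the numbers in the usage example are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for D3D12_CPU_DESCRIPTOR_HANDLE, which wraps a single pointer-sized value.
struct CpuHandle
{
    std::size_t ptr;
};

// Returns the handle of descriptor `index`, given the heap start and the
// per-descriptor increment size for that heap type.
constexpr CpuHandle offsetHandle(CpuHandle heapStart, std::uint32_t index,
                                 std::uint32_t incrementSize)
{
    return CpuHandle{heapStart.ptr + static_cast<std::size_t>(index) * incrementSize};
}

// Usage example with invented values: heap starts at 0x1000, descriptors are 32 bytes apart.
inline void demoOffsetHandle()
{
    const CpuHandle start{0x1000};
    assert(offsetHandle(start, 0, 32).ptr == 0x1000);
    assert(offsetHandle(start, 1, 32).ptr == 0x1020);
    assert(offsetHandle(start, 2, 32).ptr == 0x1040);
}
```

The increment size differs between heap types and hardware, so it should always be queried from the device rather than hard-coded.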
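2.2 Root Signature
A root signature is an object that defines which kinds of resources shaders can access: constant buffers, structured buffers, samplers, textures, and so on.
A root signature can contain any number of parameters (subject to the 64-DWORD size limit of a root signature). Each parameter is one of the following types:
- D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS: 32-bit root constants
- D3D12_ROOT_PARAMETER_TYPE_CBV: inline constant buffer view descriptor
- D3D12_ROOT_PARAMETER_TYPE_SRV: inline shader resource view descriptor
- D3D12_ROOT_PARAMETER_TYPE_UAV: inline unordered access view descriptor
- D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE: descriptor table
Descriptor tables are the most commonly used of these:
The figure shows a root signature with a single descriptor table parameter (A). The descriptor table contains three descriptor ranges (B): 3 constant buffer views (CBV), 4 shader resource views (SRV), and 2 unordered access views (UAV). CBVs, SRVs, and UAVs can be referenced from the same descriptor table because all three descriptor types can be stored in the same descriptor heap. The GPU-visible descriptors (C) must be contiguous in a GPU-visible descriptor heap, but the resources they refer to (D) can live anywhere in GPU memory, even in different resource heaps. The figure is a simplified view of descriptor ranges: each range in a descriptor table must also specify the shader register slot and register space it binds to.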
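For the 3 CBV / 4 SRV / 2 UAV table described above, each range needs an offset from the start of the table. When ranges are packed back to back, those offsets are just a running sum, as the sketch below shows. `Range` and `packedOffsets` are hypothetical helpers written for this illustration; in real code the same packing can be requested by setting OffsetInDescriptorsFromTableStart to D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, reduced version of a descriptor range: only the descriptor count matters here.
struct Range
{
    std::uint32_t numDescriptors;
};

// Computes the table-relative offset of each range when ranges are packed back to back.
template <std::size_t N>
std::array<std::uint32_t, N> packedOffsets(const std::array<Range, N>& ranges)
{
    std::array<std::uint32_t, N> offsets{};
    std::uint32_t next = 0;
    for (std::size_t i = 0; i < N; ++i)
    {
        offsets[i] = next;                 // this range starts where the previous one ended
        next += ranges[i].numDescriptors;  // advance past this range's descriptors
    }
    return offsets;
}

// Usage example: 3 CBVs, 4 SRVs, 2 UAVs, as in the described figure.
inline void demoPackedOffsets()
{
    const std::array<Range, 3> ranges = {{{3}, {4}, {2}}};
    const std::array<std::uint32_t, 3> offsets = packedOffsets(ranges);
    assert(offsets[0] == 0); // CBVs occupy slots 0..2
    assert(offsets[1] == 3); // SRVs occupy slots 3..6
    assert(offsets[2] == 7); // UAVs occupy slots 7..8
}
```

The table therefore spans nine contiguous descriptors in the GPU-visible heap.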
// Declare the handle
ID3D12RootSignature* rootSignature;
// Query the highest supported root signature version
D3D12_FEATURE_DATA_ROOT_SIGNATURE featureData = {};
featureData.HighestVersion = D3D_ROOT_SIGNATURE_VERSION_1_1;
if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_ROOT_SIGNATURE,
&featureData, sizeof(featureData))))
{
featureData.HighestVersion = D3D_ROOT_SIGNATURE_VERSION_1_0;
}
// A single GPU resource (one CBV range)
D3D12_DESCRIPTOR_RANGE1 ranges[1];
ranges[0].BaseShaderRegister = 0;
ranges[0].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV;
ranges[0].NumDescriptors = 1;
ranges[0].RegisterSpace = 0;
ranges[0].OffsetInDescriptorsFromTableStart = 0;
ranges[0].Flags = D3D12_DESCRIPTOR_RANGE_FLAG_NONE;
// Group of GPU resources (root parameters)
D3D12_ROOT_PARAMETER1 rootParameters[1];
rootParameters[0].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
rootParameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_VERTEX;
rootParameters[0].DescriptorTable.NumDescriptorRanges = 1;
rootParameters[0].DescriptorTable.pDescriptorRanges = ranges;
// Overall layout
D3D12_VERSIONED_ROOT_SIGNATURE_DESC rootSignatureDesc;
rootSignatureDesc.Version = D3D_ROOT_SIGNATURE_VERSION_1_1;
rootSignatureDesc.Desc_1_1.Flags =
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT;
rootSignatureDesc.Desc_1_1.NumParameters = 1;
rootSignatureDesc.Desc_1_1.pParameters = rootParameters;
rootSignatureDesc.Desc_1_1.NumStaticSamplers = 0;
rootSignatureDesc.Desc_1_1.pStaticSamplers = nullptr;
ID3DBlob* signature = nullptr;
ID3DBlob* error = nullptr;
try
{
// Create the root signature
ThrowIfFailed(D3D12SerializeVersionedRootSignature(&rootSignatureDesc,
&signature, &error));
ThrowIfFailed(device->CreateRootSignature(0, signature->GetBufferPointer(),
signature->GetBufferSize(),
IID_PPV_ARGS(&rootSignature)));
rootSignature->SetName(L"Hello Triangle Root Signature");
}
catch (const std::exception&)
{
// The error blob is only filled in when serialization fails
if (error)
{
const char* errStr = (const char*)error->GetBufferPointer();
std::cout << errStr;
error->Release();
error = nullptr;
}
}
if (signature)
{
signature->Release();
signature = nullptr;
}
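2.3 Heaps
First, note that DX12 has three kinds of resources:
- ID3D12Device::CreateCommittedResource: committed resources. Well suited to large resources such as textures, or to statically sized resources (whose size never changes).
- ID3D12Device::CreatePlacedResource: placed resources. These are placed explicitly at a specific offset within a heap, which lets you manage space yourself in a memory-pool fashion and address problems such as video memory fragmentation. There are two ways to use them: a placed resource can sit at its own offset within a heap, or several placed resources can overlap (alias) the same region of a heap.
- ID3D12Device::CreateReservedResource: reserved resources. A reserved resource can be larger than a single heap can hold; portions of it are mapped (and unmapped) onto one or more heaps residing in physical GPU memory. With reserved resources you can create a large volume texture in virtual memory while mapping only its resident regions to physical memory. This resource type makes it possible to implement rendering techniques that use sparse voxel octrees without exceeding the GPU memory budget.
A heap is an object that contains GPU memory. Heaps can be used to upload resources such as vertex buffers or textures to GPU-exclusive memory.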
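The memory-pool idea behind placed resources comes down to handing out aligned offsets inside one heap. Below is a minimal linear (bump) allocator sketch under stated assumptions: `LinearHeapAllocator` is a hypothetical helper, it never frees, and it assumes every resource uses the default 64 KB placement alignment (D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT). The returned offset is what you would pass as the heap offset to CreatePlacedResource.

```cpp
#include <cassert>
#include <cstdint>

// 64 KB, the value of D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT.
constexpr std::uint64_t kPlacementAlignment = 65536;
// Sentinel returned when the heap has no room left (hypothetical convention).
constexpr std::uint64_t kInvalidOffset = ~0ull;

// Rounds value up to the next multiple of alignment (alignment must be a power of two).
constexpr std::uint64_t alignUpTo(std::uint64_t value, std::uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical bump allocator that hands out offsets within a single heap.
struct LinearHeapAllocator
{
    std::uint64_t heapSize;
    std::uint64_t cursor = 0; // first byte not yet handed out

    explicit LinearHeapAllocator(std::uint64_t size) : heapSize(size) {}

    // Returns the aligned offset for a resource of `size` bytes, or kInvalidOffset if it does not fit.
    std::uint64_t allocate(std::uint64_t size)
    {
        const std::uint64_t offset = alignUpTo(cursor, kPlacementAlignment);
        if (offset + size > heapSize) return kInvalidOffset;
        cursor = offset + size;
        return offset;
    }
};

// Usage example: a 256 KB heap shared by three resources.
inline void demoLinearHeapAllocator()
{
    LinearHeapAllocator heap(4 * kPlacementAlignment);
    assert(heap.allocate(1000) == 0);          // first resource at the heap start
    assert(heap.allocate(70000) == 65536);     // next 64 KB boundary after byte 1000
    assert(heap.allocate(65536) == 196608);    // 135536 rounded up to 3 * 64 KB
    assert(heap.allocate(1) == kInvalidOffset); // heap is full
}
```

A production allocator would also need freeing, per-resource alignment from GetResourceAllocationInfo, and fence-based lifetime tracking; those are deliberately left out here.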
Creation and usage look like this:
// Upload example
// Declare the handles
ID3D12Resource* uploadBuffer;
std::vector<unsigned char> sourceData;
D3D12_HEAP_PROPERTIES uploadHeapProps = {D3D12_HEAP_TYPE_UPLOAD,
D3D12_CPU_PAGE_PROPERTY_UNKNOWN,
D3D12_MEMORY_POOL_UNKNOWN, 1u, 1u};
D3D12_RESOURCE_DESC uploadBufferDesc = {D3D12_RESOURCE_DIMENSION_BUFFER,
65536ull,
65536ull,
1u,
1,
1,
DXGI_FORMAT_UNKNOWN,
{1u, 0u},
D3D12_TEXTURE_LAYOUT_ROW_MAJOR,
D3D12_RESOURCE_FLAG_NONE};
result = device->CreateCommittedResource(
&uploadHeapProps, D3D12_HEAP_FLAG_NONE, &uploadBufferDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, __uuidof(ID3D12Resource),
((void**)&uploadBuffer));
// Map the upload buffer; the empty read range tells the driver the CPU will not read from it
uint8_t* data = nullptr;
D3D12_RANGE range{0, 0};
ThrowIfFailed(uploadBuffer->Map(0, &range, reinterpret_cast<void**>(&data)));
// Copy the data (sourceData must fit within the 65536-byte buffer)
memcpy(data, sourceData.data(), sourceData.size());
uploadBuffer->Unmap(0, nullptr);
// Readback example
// Declare the handle
ID3D12Resource* readbackBuffer;
D3D12_HEAP_PROPERTIES heapPropsRead = {D3D12_HEAP_TYPE_READBACK,
D3D12_CPU_PAGE_PROPERTY_UNKNOWN,
D3D12_MEMORY_POOL_UNKNOWN, 1u, 1u};
D3D12_RESOURCE_DESC resourceDescDimBuffer = {
D3D12_RESOURCE_DIMENSION_BUFFER,
65536ull,
2725888ull,
1u,
1,
1,
DXGI_FORMAT_UNKNOWN,
{1u, 0u},
D3D12_TEXTURE_LAYOUT_ROW_MAJOR,
D3D12_RESOURCE_FLAG_DENY_SHADER_RESOURCE};
result = device->CreateCommittedResource(
&heapPropsRead, D3D12_HEAP_FLAG_NONE, &resourceDescDimBuffer,
D3D12_RESOURCE_STATE_COPY_DEST, nullptr, __uuidof(ID3D12Resource),
((void**)&readbackBuffer));
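By creating your own heaps you can manage video memory at a finer granularity.
2.4 Vertex Buffer
A vertex buffer stores per-vertex data, which is fed to the vertex shader as attributes. All buffers in DirectX 12, whether vertex, index, or constant buffers, are ID3D12Resource objects (they are no longer distinguished as sharply as in DX11).
Creation and usage look like this:
// Declare the vertex data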
struct Vertex
{
float position[3];
float color[3];
};
Vertex vertexBufferData[3] = {{{1.0f, -1.0f, 0.0f}, {1.0f, 0.0f, 0.0f}},
{{-1.0f, -1.0f, 0.0f}, {0.0f, 1.0f, 0.0f}},
{{0.0f, 1.0f, 0.0f}, {0.0f, 0.0f, 1.0f}}};
// Declare the handles
ID3D12Resource* vertexBuffer;
D3D12_VERTEX_BUFFER_VIEW vertexBufferView;
const UINT vertexBufferSize = sizeof(vertexBufferData);
D3D12_HEAP_PROPERTIES heapProps;
heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
heapProps.CreationNodeMask = 1;
heapProps.VisibleNodeMask = 1;
D3D12_RESOURCE_DESC vertexBufferResourceDesc;
vertexBufferResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
vertexBufferResourceDesc.Alignment = 0;
vertexBufferResourceDesc.Width = vertexBufferSize;
vertexBufferResourceDesc.Height = 1;
vertexBufferResourceDesc.DepthOrArraySize = 1;
vertexBufferResourceDesc.MipLevels = 1;
vertexBufferResourceDesc.Format = DXGI_FORMAT_UNKNOWN;
vertexBufferResourceDesc.SampleDesc.Count = 1;
vertexBufferResourceDesc.SampleDesc.Quality = 0;
vertexBufferResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
vertexBufferResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &vertexBufferResourceDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&vertexBuffer)));
// Copy the triangle data into the vertex buffer
UINT8* pVertexDataBegin;
// We do not intend to read this resource on the CPU.
D3D12_RANGE readRange;
readRange.Begin = 0;
readRange.End = 0;
ThrowIfFailed(vertexBuffer->Map(0, &readRange,
reinterpret_cast<void**>(&pVertexDataBegin)));
memcpy(pVertexDataBegin, vertexBufferData, sizeof(vertexBufferData));
vertexBuffer->Unmap(0, nullptr);
// Initialize the vertex buffer view.
vertexBufferView.BufferLocation = vertexBuffer->GetGPUVirtualAddress();
vertexBufferView.StrideInBytes = sizeof(Vertex);
vertexBufferView.SizeInBytes = vertexBufferSize;
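The stride and size stored in the vertex buffer view, and the byte offsets used later in the input layout (0 for POSITION, 12 for COLOR), all follow from the memory layout of `Vertex`. The sketch below checks those numbers at compile time; it reuses the `Vertex` struct from above, while `vertexBufferSizeInBytes` is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the Vertex struct used for the triangle.
struct Vertex
{
    float position[3];
    float color[3];
};

// The input layout hard-codes these offsets, so verify them against the struct.
static_assert(offsetof(Vertex, position) == 0, "POSITION must start at byte 0");
static_assert(offsetof(Vertex, color) == 12, "COLOR must start at byte 12");
static_assert(sizeof(Vertex) == 24, "stride must be 24 bytes (6 floats)");

// Hypothetical helper: total byte size of a vertex buffer holding vertexCount vertices.
constexpr std::size_t vertexBufferSizeInBytes(std::size_t vertexCount)
{
    return vertexCount * sizeof(Vertex);
}

// Usage example: the triangle has 3 vertices.
inline void demoVertexLayout()
{
    assert(vertexBufferSizeInBytes(3) == 72); // SizeInBytes for the triangle
    assert(sizeof(Vertex) == 24);             // StrideInBytes
}
```

Keeping such checks next to the struct catches mismatches between the C++ layout and the input element offsets early.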
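2.5 Index Buffer
An index buffer holds the indices of the vertices that make up each triangle/line/point to be drawn.
Creation and usage look like this:
// Declare the data
uint32_t indexBufferData[3] = {0, 1, 2};
// Declare the handles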
ID3D12Resource* indexBuffer;
D3D12_INDEX_BUFFER_VIEW indexBufferView;
const UINT indexBufferSize = sizeof(indexBufferData);
D3D12_HEAP_PROPERTIES heapProps;
heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
heapProps.CreationNodeMask = 1;
heapProps.VisibleNodeMask = 1;
D3D12_RESOURCE_DESC vertexBufferResourceDesc;
vertexBufferResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
vertexBufferResourceDesc.Alignment = 0;
vertexBufferResourceDesc.Width = indexBufferSize;
vertexBufferResourceDesc.Height = 1;
vertexBufferResourceDesc.DepthOrArraySize = 1;
vertexBufferResourceDesc.MipLevels = 1;
vertexBufferResourceDesc.Format = DXGI_FORMAT_UNKNOWN;
vertexBufferResourceDesc.SampleDesc.Count = 1;
vertexBufferResourceDesc.SampleDesc.Quality = 0;
vertexBufferResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
vertexBufferResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &vertexBufferResourceDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&indexBuffer)));
// Copy the data into the mapped upload memory:
UINT8* pVertexDataBegin;
D3D12_RANGE readRange;
readRange.Begin = 0;
readRange.End = 0;
ThrowIfFailed(indexBuffer->Map(0, &readRange,
reinterpret_cast<void**>(&pVertexDataBegin)));
memcpy(pVertexDataBegin, indexBufferData, sizeof(indexBufferData));
indexBuffer->Unmap(0, nullptr);
// Initialize the index buffer view
indexBufferView.BufferLocation = indexBuffer->GetGPUVirtualAddress();
indexBufferView.Format = DXGI_FORMAT_R32_UINT;
indexBufferView.SizeInBytes = indexBufferSize;
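2.6 Constant Buffer
A constant buffer describes the data sent to the shader stages at draw time. Typically you put the model, view, and projection matrices or other per-draw variables here.
Creation and usage look like this:
// Declare the data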
struct
{
glm::mat4 projectionMatrix;
glm::mat4 modelMatrix;
glm::mat4 viewMatrix;
} cbVS;
// Declare the handles
ID3D12Resource* constantBuffer;
ID3D12DescriptorHeap* constantBufferHeap;
UINT8* mappedConstantBuffer;
// Create the constant buffer
D3D12_HEAP_PROPERTIES heapProps;
heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
heapProps.CreationNodeMask = 1;
heapProps.VisibleNodeMask = 1;
D3D12_DESCRIPTOR_HEAP_DESC heapDesc = {};
heapDesc.NumDescriptors = 1;
heapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
heapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
ThrowIfFailed(device->CreateDescriptorHeap(&heapDesc,
IID_PPV_ARGS(&constantBufferHeap)));
D3D12_RESOURCE_DESC cbResourceDesc;
cbResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
cbResourceDesc.Alignment = 0;
cbResourceDesc.Width = (sizeof(cbVS) + 255) & ~255;
cbResourceDesc.Height = 1;
cbResourceDesc.DepthOrArraySize = 1;
cbResourceDesc.MipLevels = 1;
cbResourceDesc.Format = DXGI_FORMAT_UNKNOWN;
cbResourceDesc.SampleDesc.Count = 1;
cbResourceDesc.SampleDesc.Quality = 0;
cbResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
cbResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &cbResourceDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&constantBuffer)));
constantBufferHeap->SetName(L"Constant Buffer Upload Resource Heap");
// Create the constant buffer view
D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {};
cbvDesc.BufferLocation = constantBuffer->GetGPUVirtualAddress();
cbvDesc.SizeInBytes =
(sizeof(cbVS) + 255) & ~255; // round up to a multiple of 256 bytes
D3D12_CPU_DESCRIPTOR_HANDLE
cbvHandle(constantBufferHeap->GetCPUDescriptorHandleForHeapStart());
cbvHandle.ptr = cbvHandle.ptr + device->GetDescriptorHandleIncrementSize(
D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV) *
0;
device->CreateConstantBufferView(&cbvDesc, cbvHandle);
D3D12_RANGE readRange;
readRange.Begin = 0;
readRange.End = 0;
ThrowIfFailed(constantBuffer->Map(
0, &readRange, reinterpret_cast<void**>(&mappedConstantBuffer)));
memcpy(mappedConstantBuffer, &cbVS, sizeof(cbVS));
constantBuffer->Unmap(0, nullptr); // nullptr: the whole buffer may have been written
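The expression `(sizeof(cbVS) + 255) & ~255` used above rounds the buffer size up to the 256-byte granularity that constant buffer views require (D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT). A small sketch of that rounding follows; `alignConstantBufferSize` is a hypothetical helper name, not a D3D12 function.

```cpp
#include <cassert>
#include <cstddef>

// 256 bytes, the value of D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT.
constexpr std::size_t kConstantBufferAlignment = 256;

// Rounds size up to the next multiple of 256 bytes.
// Adding 255 pushes any partial block into the next one; masking clears the low 8 bits.
constexpr std::size_t alignConstantBufferSize(std::size_t size)
{
    return (size + kConstantBufferAlignment - 1) & ~(kConstantBufferAlignment - 1);
}

// Usage example: three 4x4 float matrices are 3 * 64 = 192 bytes.
inline void demoConstantBufferSize()
{
    assert(alignConstantBufferSize(192) == 256); // cbVS with three matrices
    assert(alignConstantBufferSize(256) == 256); // already aligned
    assert(alignConstantBufferSize(257) == 512); // spills into a second block
}
```

The bit trick only works because 256 is a power of two; for other alignments a division-based round-up would be needed.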
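2.7 Vertex Shader
The vertex shader runs once per vertex and performs the coordinate-space transforms; it can also drive per-vertex animation such as blend shapes and GPU skinning.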
cbuffer cb : register(b0)
{
row_major float4x4 projectionMatrix : packoffset(c0);
row_major float4x4 modelMatrix : packoffset(c4);
row_major float4x4 viewMatrix : packoffset(c8);
};
struct VertexInput
{
float3 inPos : POSITION;
float3 inColor : COLOR;
};
struct VertexOutput
{
float3 color : COLOR;
float4 position : SV_Position;
};
VertexOutput main(VertexInput vertexInput)
{
float3 inColor = vertexInput.inColor;
float3 inPos = vertexInput.inPos;
float3 outColor = inColor;
float4 position = mul(float4(inPos, 1.0f), mul(modelMatrix, mul(viewMatrix, projectionMatrix)));
VertexOutput output;
output.position = position;
output.color = outColor;
return output;
}
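You can compile shaders with the legacy DirectX shader compiler (FXC) that ships with the DirectX 11/12 tooling, but the newer official compiler, DirectXShaderCompiler (DXC), is preferable.
For example (vs_6_0 is the vertex shader profile; lib_* profiles are for shader libraries such as ray tracing):
dxc.exe -T vs_6_0 -E main -Fo assets/triangle.vert.dxil assets/triangle.vert.hlsl
You can then load the shader as a binary file:
// Read a file as binary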
inline std::vector<char> readFile(const std::string& filename)
{
std::ifstream file(filename, std::ios::ate | std::ios::binary);
bool exists = (bool)file;
if (!exists || !file.is_open())
{
throw std::runtime_error("failed to open file!");
}
size_t fileSize = (size_t)file.tellg();
std::vector<char> buffer(fileSize);
file.seekg(0);
file.read(buffer.data(), fileSize);
file.close();
return buffer;
}
// Declare the handles
D3D12_SHADER_BYTECODE vsBytecode;
std::string compiledPath;
std::vector<char> vsBytecodeData = readFile(compiledPath);
vsBytecode.pShaderBytecode = vsBytecodeData.data();
vsBytecode.BytecodeLength = vsBytecodeData.size();
2.8 Pixel Shader
The pixel (fragment) shader runs for every pixel that is output, writing to each attachment at that pixel's coordinates.
struct PixelInput
{
float3 color : COLOR;
};
struct PixelOutput
{
float4 attachment0 : SV_Target0;
};
PixelOutput main(PixelInput pixelInput)
{
float3 inColor = pixelInput.color;
PixelOutput output;
output.attachment0 = float4(inColor, 1.0f);
return output;
}
2.9 Pipeline State
The pipeline state describes everything needed to execute a given raster draw call.
// Declare the handle
ID3D12PipelineState* pipelineState;
// Define the PSO description
D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {};
// Input layout
D3D12_INPUT_ELEMENT_DESC inputElementDescs[] = {
{"POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0,
D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0},
{"COLOR", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12,
D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0}};
psoDesc.InputLayout = {inputElementDescs, _countof(inputElementDescs)};
// Resource layout (root signature)
psoDesc.pRootSignature = rootSignature;
// Vertex shader
D3D12_SHADER_BYTECODE vsBytecode;
vsBytecode.pShaderBytecode = vertexShaderBlob->GetBufferPointer();
vsBytecode.BytecodeLength = vertexShaderBlob->GetBufferSize();
psoDesc.VS = vsBytecode;
// Pixel shader
D3D12_SHADER_BYTECODE psBytecode;
psBytecode.pShaderBytecode = pixelShaderBlob->GetBufferPointer();
psBytecode.BytecodeLength = pixelShaderBlob->GetBufferSize();
psoDesc.PS = psBytecode;
// Rasterizer state
D3D12_RASTERIZER_DESC rasterDesc;
rasterDesc.FillMode = D3D12_FILL_MODE_SOLID;
rasterDesc.CullMode = D3D12_CULL_MODE_NONE;
rasterDesc.FrontCounterClockwise = FALSE;
rasterDesc.DepthBias = D3D12_DEFAULT_DEPTH_BIAS;
rasterDesc.DepthBiasClamp = D3D12_DEFAULT_DEPTH_BIAS_CLAMP;
rasterDesc.SlopeScaledDepthBias = D3D12_DEFAULT_SLOPE_SCALED_DEPTH_BIAS;
rasterDesc.DepthClipEnable = TRUE;
rasterDesc.MultisampleEnable = FALSE;
rasterDesc.AntialiasedLineEnable = FALSE;
rasterDesc.ForcedSampleCount = 0;
rasterDesc.ConservativeRaster = D3D12_CONSERVATIVE_RASTERIZATION_MODE_OFF;
psoDesc.RasterizerState = rasterDesc;
psoDesc.PrimitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE;
// Blend state
D3D12_BLEND_DESC blendDesc;
blendDesc.AlphaToCoverageEnable = FALSE;
blendDesc.IndependentBlendEnable = FALSE;
const D3D12_RENDER_TARGET_BLEND_DESC defaultRenderTargetBlendDesc = {
FALSE,
FALSE,
D3D12_BLEND_ONE,
D3D12_BLEND_ZERO,
D3D12_BLEND_OP_ADD,
D3D12_BLEND_ONE,
D3D12_BLEND_ZERO,
D3D12_BLEND_OP_ADD,
D3D12_LOGIC_OP_NOOP,
D3D12_COLOR_WRITE_ENABLE_ALL,
};
for (UINT i = 0; i < D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT; ++i)
blendDesc.RenderTarget[i] = defaultRenderTargetBlendDesc;
psoDesc.BlendState = blendDesc;
// Depth/stencil state
psoDesc.DepthStencilState.DepthEnable = FALSE;
psoDesc.DepthStencilState.StencilEnable = FALSE;
psoDesc.SampleMask = UINT_MAX;
// Output
psoDesc.NumRenderTargets = 1;
psoDesc.RTVFormats[0] = DXGI_FORMAT_R8G8B8A8_UNORM;
psoDesc.SampleDesc.Count = 1;
// Create the PSO
try
{
ThrowIfFailed(device->CreateGraphicsPipelineState(
&psoDesc, IID_PPV_ARGS(&pipelineState)));
}
catch (const std::exception&)
{
std::cout << "Failed to create Graphics Pipeline!";
}
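2.10 Commands API
To execute draw calls, you record commands that reference the pipeline and its resources into a command list, then submit the list to the appropriate queue for execution, as the figure shows.
// Declare the handles
ID3D12CommandAllocator* commandAllocator;
ID3D12PipelineState* initialPipelineState;
ID3D12GraphicsCommandList* commandList;
// Create the command list
ThrowIfFailed(device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
commandAllocator, initialPipelineState,
IID_PPV_ARGS(&commandList)));
// Command lists are created in the recording state; close it so the first Reset succeeds
ThrowIfFailed(commandList->Close());
Later, record the commands and submit them:
// Reset the command allocator and command list before recording new commands.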
ThrowIfFailed(commandAllocator->Reset());
// Start recording with the raster pipeline state
ThrowIfFailed(commandList->Reset(commandAllocator, pipelineState));
// Bind the resource layout
commandList->SetGraphicsRootSignature(rootSignature);
ID3D12DescriptorHeap* pDescriptorHeaps[] = {constantBufferHeap};
commandList->SetDescriptorHeaps(_countof(pDescriptorHeaps), pDescriptorHeaps);
D3D12_GPU_DESCRIPTOR_HANDLE
cbvHandle(constantBufferHeap->GetGPUDescriptorHandleForHeapStart());
commandList->SetGraphicsRootDescriptorTable(0, cbvHandle);
// Indicate that the back buffer will be used as a render target.
D3D12_RESOURCE_BARRIER renderTargetBarrier;
renderTargetBarrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
renderTargetBarrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
renderTargetBarrier.Transition.pResource = renderTargets[frameIndex];
renderTargetBarrier.Transition.StateBefore = D3D12_RESOURCE_STATE_PRESENT;
renderTargetBarrier.Transition.StateAfter = D3D12_RESOURCE_STATE_RENDER_TARGET;
renderTargetBarrier.Transition.Subresource =
D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
commandList->ResourceBarrier(1, &renderTargetBarrier);
D3D12_CPU_DESCRIPTOR_HANDLE
rtvHandle(renderTargetViewHeap->GetCPUDescriptorHandleForHeapStart());
rtvHandle.ptr = rtvHandle.ptr + (frameIndex * rtvDescriptorSize);
commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, nullptr);
// Record the raster commands.
const float clearColor[] = {0.2f, 0.2f, 0.2f, 1.0f};
commandList->RSSetViewports(1, &viewport);
commandList->RSSetScissorRects(1, &surfaceSize);
commandList->ClearRenderTargetView(rtvHandle, clearColor, 0, nullptr);
commandList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
commandList->IASetVertexBuffers(0, 1, &vertexBufferView);
commandList->IASetIndexBuffer(&indexBufferView);
commandList->DrawIndexedInstanced(3, 1, 0, 0, 0);
// Indicate that the back buffer will now be used to present.
D3D12_RESOURCE_BARRIER presentBarrier;
presentBarrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
presentBarrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
presentBarrier.Transition.pResource = renderTargets[frameIndex];
presentBarrier.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
presentBarrier.Transition.StateAfter = D3D12_RESOURCE_STATE_PRESENT;
presentBarrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
commandList->ResourceBarrier(1, &presentBarrier);
ThrowIfFailed(commandList->Close());
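2.11 Render
[Animation: triangle rendered by the raster pipeline]
Presenting from the swapchain in DirectX 12 is straightforward: update any constant buffer data that changed, submit the command lists for execution, present the swapchain to the window, and signal a fence so the application knows when rendering has finished.
Creation and usage look like this:
// Declare the handles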
std::chrono::time_point<std::chrono::steady_clock> tStart, tEnd;
float elapsedTime = 0.0f;
void render()
{
// Frame-rate limiting (cap at 60 FPS)
tEnd = std::chrono::steady_clock::now();
float time =
std::chrono::duration<float, std::milli>(tEnd - tStart).count();
if (time < (1000.0f / 60.0f))
{
return;
}
tStart = std::chrono::steady_clock::now();
// Update the CBV data
elapsedTime += 0.001f * time;
elapsedTime = fmodf(elapsedTime, 6.283185307179586f);
// Rotate about the Y axis (glm::rotate comes from <glm/gtc/matrix_transform.hpp>)
cbVS.modelMatrix = glm::rotate(glm::mat4(1.0f), elapsedTime, glm::vec3(0.0f, 1.0f, 0.0f));
D3D12_RANGE readRange;
readRange.Begin = 0;
readRange.End = 0;
ThrowIfFailed(constantBuffer->Map(
0, &readRange, reinterpret_cast<void**>(&mappedConstantBuffer)));
memcpy(mappedConstantBuffer, &cbVS, sizeof(cbVS));
constantBuffer->Unmap(0, nullptr);
setupCommands();
ID3D12CommandList* ppCommandLists[] = {commandList};
commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);
// Present
swapchain->Present(1, 0);
// Signal the fence, then wait until the GPU has finished this frame
const UINT64 currentFenceValue = fenceValue;
ThrowIfFailed(commandQueue->Signal(fence, currentFenceValue));
fenceValue++;
if (fence->GetCompletedValue() < currentFenceValue)
{
ThrowIfFailed(fence->SetEventOnCompletion(currentFenceValue, fenceEvent));
WaitForSingleObject(fenceEvent, INFINITE);
}
frameIndex = swapchain->GetCurrentBackBufferIndex();
}
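3. Compute Graphics Pipeline API
The compute pipeline in DirectX 12 is arguably much simpler than the graphics pipeline: there is far less to configure, and resources are more clearly separated from the computation. Let's review how the compute pipeline works, how to write a compute shader, and how to execute compute work with DirectX 12.
3.1 Resources
Like the raster or ray tracing pipelines, the compute pipeline needs resources, whether a structured buffer of triangle data, an unordered access view of the render target being read/written, or a constant buffer view of common data.
3.2 Root Signatures
As with the other pipelines, you define a root signature to give shaders access to the resources they need.
Creation and usage look like this:
// Create the root signature (same as for the raster pipeline).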
D3D12_FEATURE_DATA_ROOT_SIGNATURE featureData = {};
featureData.HighestVersion = D3D_ROOT_SIGNATURE_VERSION_1_1;
if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_ROOT_SIGNATURE,
&featureData,
sizeof(featureData))))
{
featureData.HighestVersion = D3D_ROOT_SIGNATURE_VERSION_1_0;
}
D3D12_DESCRIPTOR_RANGE1 ranges[1];
ranges[0].BaseShaderRegister = 0;
ranges[0].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_UAV;
ranges[0].NumDescriptors = 1;
ranges[0].RegisterSpace = 0;
ranges[0].OffsetInDescriptorsFromTableStart = 0;
ranges[0].Flags = D3D12_DESCRIPTOR_RANGE_FLAG_DATA_VOLATILE;
D3D12_ROOT_PARAMETER1 rootParameters[1];
rootParameters[0].ParameterType =
D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
rootParameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;
rootParameters[0].DescriptorTable.NumDescriptorRanges = 1;
rootParameters[0].DescriptorTable.pDescriptorRanges = ranges;
D3D12_VERSIONED_ROOT_SIGNATURE_DESC rootSignatureDesc;
rootSignatureDesc.Version = D3D_ROOT_SIGNATURE_VERSION_1_1;
rootSignatureDesc.Desc_1_1.Flags = D3D12_ROOT_SIGNATURE_FLAG_NONE;
rootSignatureDesc.Desc_1_1.NumParameters = 1;
rootSignatureDesc.Desc_1_1.pParameters = rootParameters;
rootSignatureDesc.Desc_1_1.NumStaticSamplers = 0;
rootSignatureDesc.Desc_1_1.pStaticSamplers = nullptr;
ID3DBlob* signatureBlob = nullptr;
ID3DBlob* error = nullptr;
try
{
ThrowIfFailed(D3D12SerializeVersionedRootSignature(
&rootSignatureDesc, &signatureBlob, &error));
ThrowIfFailed(mDevice->CreateRootSignature(
0, signatureBlob->GetBufferPointer(), signatureBlob->GetBufferSize(),
IID_PPV_ARGS(&rootSignature)));
rootSignature->SetName(L"Hello Compute Root Signature");
}
catch (const std::exception&)
{
// The error blob is only filled in when serialization fails
if (error)
{
const char* errStr = (const char*)error->GetBufferPointer();
std::cout << errStr;
error->Release();
error = nullptr;
}
}
if (signatureBlob)
{
signatureBlob->Release();
signatureBlob = nullptr;
}
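Compute Shader
Compiling a compute shader is much simpler than for the raster or ray tracing pipelines, because there is only one programmable stage.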
RWTexture2D<float4> tOutput : register(u0);
[numthreads(16, 16, 1)]
void main(uint3 groupThreadID : SV_GroupThreadID, // Index of this thread within its thread group, in the range defined by `numthreads`
uint3 groupID : SV_GroupID, // Index of this thread group within the grid passed to `Dispatch(x,y,z)`
uint groupIndex : SV_GroupIndex, // Flattened (linear) index of this thread within its group
uint3 dispatchThreadID: SV_DispatchThreadID) // Global thread index across the whole dispatch (here: the pixel coordinate)
{
tOutput[dispatchThreadID.xy] = float4( float(groupThreadID.x) / 16.0, float(groupThreadID.y) / 16.0, dispatchThreadID.x / 1280.0, 1.0);
}
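The four system values in the shader signature are related by simple arithmetic: the global thread ID is the group ID scaled by the group size plus the thread's position inside its group, and the group index is the flattened version of that position. The C++ sketch below mirrors this on the CPU for the `[numthreads(16, 16, 1)]` case; `UInt3`, `dispatchThreadId`, and `groupIndex` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-side equivalent of HLSL's uint3.
struct UInt3
{
    std::uint32_t x, y, z;
};

// Matches [numthreads(16, 16, 1)] in the shader.
constexpr UInt3 kNumThreads{16, 16, 1};

// SV_DispatchThreadID = SV_GroupID * numthreads + SV_GroupThreadID
constexpr UInt3 dispatchThreadId(UInt3 groupId, UInt3 groupThreadId)
{
    return UInt3{groupId.x * kNumThreads.x + groupThreadId.x,
                 groupId.y * kNumThreads.y + groupThreadId.y,
                 groupId.z * kNumThreads.z + groupThreadId.z};
}

// SV_GroupIndex = z * (X * Y) + y * X + x, where (X, Y, Z) = numthreads
constexpr std::uint32_t groupIndex(UInt3 groupThreadId)
{
    return groupThreadId.z * kNumThreads.x * kNumThreads.y +
           groupThreadId.y * kNumThreads.x + groupThreadId.x;
}

// Usage example: thread (3, 5, 0) of group (2, 1, 0).
inline void demoThreadIds()
{
    const UInt3 id = dispatchThreadId(UInt3{2, 1, 0}, UInt3{3, 5, 0});
    assert(id.x == 35 && id.y == 21 && id.z == 0); // pixel (35, 21)
    assert(groupIndex(UInt3{3, 5, 0}) == 83);      // 5 * 16 + 3
}
```

This is why the shader can use `dispatchThreadID.xy` directly as the pixel coordinate when one thread is launched per pixel.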
3.3 Pipeline State
A compute pipeline needs only a shader and a root signature.
Creation and usage look like this:
D3D12_COMPUTE_PIPELINE_STATE_DESC psoDesc = {};
psoDesc.pRootSignature = rootSignature;
D3D12_SHADER_BYTECODE csBytecode;
csBytecode.pShaderBytecode = compShader->GetBufferPointer();
csBytecode.BytecodeLength = compShader->GetBufferSize();
psoDesc.CS = csBytecode;
try
{
ThrowIfFailed(mDevice->CreateComputePipelineState(
&psoDesc, IID_PPV_ARGS(&pipelineState)));
}
catch (const std::exception&)
{
std::cout << "Failed to create Compute Pipeline!";
}
if (compShader)
{
compShader->Release();
compShader = nullptr;
}
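3.4 Unordered Access View
Let's create an unordered access view of the render target that we will write to:
Creation and usage look like this:
// Create a temporary texture (and the descriptor heap for its UAV)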
D3D12_DESCRIPTOR_HEAP_DESC heapDesc = {};
heapDesc.NumDescriptors = 1;
heapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
heapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
ThrowIfFailed(
mDevice->CreateDescriptorHeap(&heapDesc, IID_PPV_ARGS(&mUavHeap)));
D3D12_RESOURCE_DESC texResourceDesc = {};
texResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
texResourceDesc.Alignment = 0;
texResourceDesc.Width = mWidth;
texResourceDesc.Height = mHeight;
texResourceDesc.DepthOrArraySize = 1;
texResourceDesc.MipLevels = 1;
texResourceDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
texResourceDesc.SampleDesc.Count = 1;
texResourceDesc.SampleDesc.Quality = 0;
texResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN;
texResourceDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS;
D3D12_CLEAR_VALUE clearValue = {};
clearValue.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
clearValue.Color[0] = clearValue.Color[1] = clearValue.Color[2] =
clearValue.Color[3] = 1.f;
D3D12_HEAP_PROPERTIES heapProps;
heapProps.Type = D3D12_HEAP_TYPE_DEFAULT;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
heapProps.CreationNodeMask = 1;
heapProps.VisibleNodeMask = 1;
ThrowIfFailed(mDevice->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &texResourceDesc,
D3D12_RESOURCE_STATE_UNORDERED_ACCESS, nullptr,
IID_PPV_ARGS(&mTexResource)));
mTexResource->SetName(L"Compute Target");
mUAVDescriptorSize = mDevice->GetDescriptorHandleIncrementSize(
D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
auto AllocateDescriptor =
[&](D3D12_CPU_DESCRIPTOR_HANDLE* cpuDescriptor,
UINT descriptorIndexToUse)
{
auto descriptorHeapCpuBase =
mUavHeap->GetCPUDescriptorHandleForHeapStart();
if (descriptorIndexToUse >= mUavHeap->GetDesc().NumDescriptors)
{
descriptorIndexToUse = mDescriptorsAllocated++;
}
*cpuDescriptor = D3D12_CPU_DESCRIPTOR_HANDLE{
descriptorHeapCpuBase.ptr +
INT64(descriptorIndexToUse) * INT64(mUAVDescriptorSize)};
return descriptorIndexToUse;
};
heapIndex = AllocateDescriptor(&uavCPUHandle, heapIndex);
uavGPUHandle = D3D12_GPU_DESCRIPTOR_HANDLE{
mUavHeap->GetGPUDescriptorHandleForHeapStart().ptr +
INT64(0) * INT64(mUAVDescriptorSize)};
D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {};
uavDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
uavDesc.ViewDimension = D3D12_UAV_DIMENSION_TEXTURE2D;
mDevice->CreateUnorderedAccessView(mTexResource, nullptr, &uavDesc,
uavCPUHandle);
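The handle arithmetic in the snippet above boils down to "heap start plus index times increment size". Here is a small sketch of that calculation without the Windows SDK. `CpuHandle` and `GpuHandle` are hypothetical stand-ins for `D3D12_CPU_DESCRIPTOR_HANDLE` and `D3D12_GPU_DESCRIPTOR_HANDLE`, and the increment size of 32 used in the usage note is made up; the real value comes from `GetDescriptorHandleIncrementSize`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the D3D12 descriptor handle structs.
struct CpuHandle { std::size_t ptr; };
struct GpuHandle { std::uint64_t ptr; };

// CPU handle of descriptor `index` in a heap that starts at `base`.
constexpr CpuHandle offsetCpu(CpuHandle base, std::uint32_t index, std::uint32_t incrementSize)
{
    return { base.ptr + static_cast<std::size_t>(index) * incrementSize };
}

// GPU handle of the same descriptor; the same index must be used for both.
constexpr GpuHandle offsetGpu(GpuHandle base, std::uint32_t index, std::uint32_t incrementSize)
{
    return { base.ptr + static_cast<std::uint64_t>(index) * incrementSize };
}
```

With a heap starting at 0x1000 and an increment of 32 bytes, descriptor 3 sits at 0x1060. Index 0 returns the heap start unchanged, which is why the snippet above can use the heap start directly for its single UAV.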
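3.5 Compute Calls
The core of a compute call is a dispatch whose arguments are the number of thread groups to launch. The example runs a screen-space compute shader, so there is one group for every 16x16 block of the screen.
Creation and usage look like this: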
void setupCommands()
{
ThrowIfFailed(commandAllocator->Reset());
ThrowIfFailed(commandList->Reset(commandAllocator, mPipelineState));
// Bind resources
commandList->SetComputeRootSignature(rootSignature);
ID3D12DescriptorHeap* pDescriptorHeaps[] = {mUavHeap};
commandList->SetDescriptorHeaps(_countof(pDescriptorHeaps),
pDescriptorHeaps);
commandList->SetComputeRootDescriptorTable(0, uavGPUHandle);
// Round up so partially covered 16x16 tiles still get a thread group
auto divCeil = [](unsigned val, unsigned x) -> unsigned
{ return val / x + ((val % x) > 0 ? 1 : 0); };
commandList->Dispatch(divCeil(mWidth, 16), divCeil(mHeight, 16), 1);
// Transition the back buffer and the compute target for the copy
D3D12_RESOURCE_BARRIER preCopyBarriers[2];
preCopyBarriers[0] = CD3DX12_RESOURCE_BARRIER::Transition(
mRenderTargets[frameIndex], D3D12_RESOURCE_STATE_PRESENT,
D3D12_RESOURCE_STATE_COPY_DEST);
preCopyBarriers[1] = CD3DX12_RESOURCE_BARRIER::Transition(
mTexResource, D3D12_RESOURCE_STATE_UNORDERED_ACCESS,
D3D12_RESOURCE_STATE_COPY_SOURCE);
commandList->ResourceBarrier(ARRAYSIZE(preCopyBarriers), preCopyBarriers);
commandList->CopyResource(mRenderTargets[frameIndex], mTexResource);
// Return both resources to their original states
D3D12_RESOURCE_BARRIER postCopyBarriers[2];
postCopyBarriers[0] = CD3DX12_RESOURCE_BARRIER::Transition(
mRenderTargets[frameIndex], D3D12_RESOURCE_STATE_COPY_DEST,
D3D12_RESOURCE_STATE_PRESENT);
postCopyBarriers[1] = CD3DX12_RESOURCE_BARRIER::Transition(
mTexResource, D3D12_RESOURCE_STATE_COPY_SOURCE,
D3D12_RESOURCE_STATE_UNORDERED_ACCESS);
commandList->ResourceBarrier(ARRAYSIZE(postCopyBarriers),
postCopyBarriers);
ThrowIfFailed(commandList->Close());
}
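The thread-group count passed to `Dispatch` is a rounded-up division of the screen size by the group size. A minimal standalone sketch of that calculation follows. The 1280-pixel width appears in the shader above; the 720-pixel height is an assumption made only for this example.

```cpp
#include <cassert>

// Number of groups of size `divisor` needed to cover `value` items,
// rounding up so a partially filled last group is still launched.
constexpr unsigned divCeil(unsigned value, unsigned divisor)
{
    return value / divisor + (value % divisor != 0 ? 1u : 0u);
}
```

A 1280x720 target with 16x16 groups needs 80x45 groups. A width of 1281 would need 81 groups in X, which means the shader would then run some threads outside the texture and should guard against out-of-range writes.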
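That concludes the compute pipeline.
IV. Ray Tracing Pipeline API
Ray tracing in DirectX 12 exposes a hardware-accelerated ray query API, which makes common ray tracing algorithms more convenient to implement (much like the Vulkan ray tracing pipeline).
4.1 Compiling Shaders
DirectX 12 ray tracing introduces five new programmable shader stages: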
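- Ray Generation Shader: creates the initial rays, for example primary rays from the eye through the viewport, or shadow rays cast from a shading point.
- Intersection Shader: a custom way of computing hits. Rather than relying on the vendor's built-in intersection test, you report the intersections for a given ray programmatically.
- Any Hit Shader: runs for any hit along a given ray, including hits farther from the viewpoint. Useful for thickness tests or translucent objects.
- Closest Hit Shader: runs for the earliest hit along a given ray, the one closest to the viewpoint.
- Miss Shader: runs when the ray hits nothing in the scene. Useful for sampling a skybox.
Creation and usage look like this: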
// Global resources
RaytracingAccelerationStructure Scene : register(t0, space0);
RWTexture2D<float4> tOutput : register(u0);
cbuffer globalCB : register(b0)
{
row_major float4x4 projectionMatrix : packoffset(c0);
row_major float4x4 viewMatrix : packoffset(c4);
float3 origin : packoffset(c8);
float near : packoffset(c9.x);
float far : packoffset(c9.y);
};
struct RayPayload
{
float4 color;
};
/*
Ray Generation Shader
*/
[shader("raygeneration")]
void raygen()
{
float2 screenCoords =
(float2)DispatchRaysIndex() / (float2)DispatchRaysDimensions();
float2 ndc = screenCoords * 2.0 - float2(1.0, 1.0);
float3 rayDir =
normalize(mul(mul(viewMatrix, projectionMatrix),
float4(ndc * (far - near), far + near, far - near))).xyz;
RayDesc ray;
ray.Origin = origin.xyz;
ray.Direction = rayDir;
ray.TMin = 0.1;
ray.TMax = 300.0;
RayPayload payload = {float4(0, 0, 0, 0)};
TraceRay(Scene, RAY_FLAG_NONE, ~0, 0, 1, 0, ray, payload);
tOutput[DispatchRaysIndex().xy] = payload.color;
}
/*
Closest Hit Shader
*/
// Local Resources
struct LocalCB
{
float time;
};
ConstantBuffer<LocalCB> localCB : register(b1);
Texture2D<float4> localTex : register(t1);
SamplerState localSampler : register(s0);
[shader("closesthit")]
void closesthit(inout RayPayload payload,
in BuiltInTriangleIntersectionAttributes attr)
{
float3 barycentrics = float3(1 - attr.barycentrics.x - attr.barycentrics.y,
attr.barycentrics.x, attr.barycentrics.y);
float4 col = localTex.SampleLevel(localSampler, barycentrics.xy - barycentrics.yz * sin(localCB.time), 0.0);
payload.color = col;
}
/*
Miss Shader
*/
[shader("miss")]
void miss(inout RayPayload payload)
{
payload.color = float4(0.5, 0.5, 0.5, 1);
}
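The first two lines of the ray generation shader above map a pixel index to normalized device coordinates in [-1, 1]. The sketch below reproduces that mapping on the CPU. `Float2` and `toNdc` are illustrative names, and the 1280x720 resolution in the usage note is an assumption.

```cpp
#include <cassert>

// Plain stand-in for HLSL's float2.
struct Float2 { float x, y; };

// Mirrors the shader: screenCoords = index / dimensions, ndc = screenCoords * 2 - 1.
constexpr Float2 toNdc(unsigned px, unsigned py, unsigned width, unsigned height)
{
    return { static_cast<float>(px) / static_cast<float>(width) * 2.0f - 1.0f,
             static_cast<float>(py) / static_cast<float>(height) * 2.0f - 1.0f };
}
```

Pixel (0, 0) maps to (-1, -1), and the center pixel (640, 360) of a 1280x720 dispatch maps to (0, 0).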
Shaders can be compiled offline with the DirectX Shader Compiler:
// Run DirectXShaderCompiler
dxc.exe -T lib_6_3 -Fo assets/triangle.rt.dxil assets/triangle.rt.hlsl
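4.2 Root Signature
A ray tracing pipeline needs local and global root signatures to describe the resources its shaders can read.
The global root signature covers any data shared across the different ray tracing shaders, such as the acceleration structure or the radiance buffer.
Creation and usage look like this:
// Declare handles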
ID3D12RootSignature* globalRootSignature = nullptr;
// Global root signature
// These bindings can be set before DispatchRays
// Output radiance data
D3D12_DESCRIPTOR_RANGE1 ranges[2];
ranges[0].BaseShaderRegister = 0;
ranges[0].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_UAV;
ranges[0].NumDescriptors = 1;
ranges[0].RegisterSpace = 0;
ranges[0].OffsetInDescriptorsFromTableStart = 0;
ranges[0].Flags = D3D12_DESCRIPTOR_RANGE_FLAG_DATA_VOLATILE;
// Camera constant data
ranges[1].BaseShaderRegister = 0;
ranges[1].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV;
ranges[1].NumDescriptors = 1;
ranges[1].RegisterSpace = 0;
ranges[1].OffsetInDescriptorsFromTableStart =
D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND;
ranges[1].Flags = D3D12_DESCRIPTOR_RANGE_FLAG_DATA_VOLATILE |
D3D12_DESCRIPTOR_RANGE_FLAG_DESCRIPTORS_VOLATILE;
D3D12_ROOT_PARAMETER1 rootParameters[2];
// UAV, CBV
rootParameters[0].ParameterType =
D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
rootParameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;
rootParameters[0].DescriptorTable.NumDescriptorRanges =
_countof(ranges);
rootParameters[0].DescriptorTable.pDescriptorRanges = ranges;
// Acceleration Structures
rootParameters[1].ParameterType = D3D12_ROOT_PARAMETER_TYPE_SRV;
rootParameters[1].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;
rootParameters[1].Descriptor = {};
D3D12_VERSIONED_ROOT_SIGNATURE_DESC rootSignatureDesc;
rootSignatureDesc.Version = D3D_ROOT_SIGNATURE_VERSION_1_1;
rootSignatureDesc.Desc_1_1.Flags = D3D12_ROOT_SIGNATURE_FLAG_NONE;
rootSignatureDesc.Desc_1_1.NumParameters = _countof(rootParameters);
rootSignatureDesc.Desc_1_1.pParameters = rootParameters;
rootSignatureDesc.Desc_1_1.NumStaticSamplers = 0;
rootSignatureDesc.Desc_1_1.pStaticSamplers = nullptr;
ID3DBlob* signature = nullptr;
ID3DBlob* error = nullptr;
try
{
ThrowIfFailed(D3D12SerializeVersionedRootSignature(
&rootSignatureDesc, &signature, &error));
ThrowIfFailed(device->CreateRootSignature(
0, signature->GetBufferPointer(), signature->GetBufferSize(),
IID_PPV_ARGS(&globalRootSignature)));
globalRootSignature->SetName(L"Global Root Signature");
}
catch (const std::exception& e)
{
// The error blob is only filled in when serialization fails
if (error)
{
const char* errStr = (const char*)error->GetBufferPointer();
std::cout << errStr;
error->Release();
error = nullptr;
}
}
if (signature)
{
signature->Release();
signature = nullptr;
}
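What makes local root signatures distinct is that every programmable stage of the ray tracing pipeline can define its own.
// Declare handles
ID3D12RootSignature* localRootSignature = nullptr;
// Local root signature
// Shader tables supply the arguments for this root signature.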
D3D12_ROOT_PARAMETER1 rootParameters[2] = {};
// Viewport constant data
rootParameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;
rootParameters[0].ParameterType =
D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS;
rootParameters[0].Constants = {};
rootParameters[0].Constants.ShaderRegister = 1;
rootParameters[0].Constants.Num32BitValues =
((sizeof(mRayGenCB) - 1) / sizeof(UINT32) + 1);
// SRV descriptor table for the texture
D3D12_DESCRIPTOR_RANGE1 ranges[1] = {};
ranges[0].BaseShaderRegister = 1;
ranges[0].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV;
ranges[0].NumDescriptors = 1;
ranges[0].RegisterSpace = 0;
ranges[0].OffsetInDescriptorsFromTableStart = 2;
ranges[0].Flags = D3D12_DESCRIPTOR_RANGE_FLAG_DATA_STATIC;
rootParameters[1].ParameterType =
D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
rootParameters[1].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;
rootParameters[1].DescriptorTable.pDescriptorRanges = &ranges[0];
rootParameters[1].DescriptorTable.NumDescriptorRanges =
_countof(ranges);
D3D12_STATIC_SAMPLER_DESC sampler = {};
sampler.Filter = D3D12_FILTER_MIN_MAG_MIP_POINT;
sampler.AddressU = D3D12_TEXTURE_ADDRESS_MODE_BORDER;
sampler.AddressV = D3D12_TEXTURE_ADDRESS_MODE_BORDER;
sampler.AddressW = D3D12_TEXTURE_ADDRESS_MODE_BORDER;
sampler.MipLODBias = 0;
sampler.MaxAnisotropy = 0;
sampler.ComparisonFunc = D3D12_COMPARISON_FUNC_NEVER;
sampler.BorderColor = D3D12_STATIC_BORDER_COLOR_TRANSPARENT_BLACK;
sampler.MinLOD = 0.0f;
sampler.MaxLOD = D3D12_FLOAT32_MAX;
sampler.ShaderRegister = 0;
sampler.RegisterSpace = 0;
sampler.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;
D3D12_VERSIONED_ROOT_SIGNATURE_DESC rootSignatureDesc;
rootSignatureDesc.Version = D3D_ROOT_SIGNATURE_VERSION_1_1;
rootSignatureDesc.Desc_1_1.Flags =
D3D12_ROOT_SIGNATURE_FLAG_LOCAL_ROOT_SIGNATURE;
rootSignatureDesc.Desc_1_1.NumParameters = _countof(rootParameters);
rootSignatureDesc.Desc_1_1.pParameters = rootParameters;
rootSignatureDesc.Desc_1_1.NumStaticSamplers = 1;
rootSignatureDesc.Desc_1_1.pStaticSamplers = &sampler;
ID3DBlob* signature = nullptr;
ID3DBlob* error = nullptr;
try
{
ThrowIfFailed(D3D12SerializeVersionedRootSignature(
&rootSignatureDesc, &signature, &error));
ThrowIfFailed(device->CreateRootSignature(
0, signature->GetBufferPointer(), signature->GetBufferSize(),
IID_PPV_ARGS(&localRootSignature)));
localRootSignature->SetName(L"Local Root Signature");
}
catch (const std::exception& e)
{
// The error blob is only filled in when serialization fails
if (error)
{
const char* errStr = (const char*)error->GetBufferPointer();
std::cout << errStr;
error->Release();
error = nullptr;
}
}
if (signature)
{
signature->Release();
signature = nullptr;
}
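The `Num32BitValues` expression in the local root signature above rounds a byte size up to a whole number of 32-bit values. A standalone sketch of that rounding follows; `num32BitValues` is an illustrative helper, not a D3D12 function, and the size passed in must be greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of 32-bit root constants needed to hold `sizeInBytes` bytes.
// Same formula as the snippet above: (size - 1) / 4 + 1. Requires sizeInBytes > 0.
constexpr std::uint32_t num32BitValues(std::size_t sizeInBytes)
{
    return static_cast<std::uint32_t>((sizeInBytes - 1) / sizeof(std::uint32_t) + 1);
}
```

A single `float` needs one value, a 16-byte struct needs four, and a 5-byte struct is rounded up to two.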
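4.3 Shader Table
A shader table is a 64-byte-aligned array of shader records (see D3D12_RAYTRACING_SHADER_TABLE_BYTE_ALIGNMENT). Each record holds a shader identifier followed by the root arguments for that shader's local root signature. Those root arguments can be the full contents of a set of constants, or descriptor handles for textures, buffers, unordered access views, and so on.
Creation and usage look like this: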
struct LocalCB
{
float time;
float _padding[3];
};
struct RootArguments
{
LocalCB localCB;
D3D12_GPU_DESCRIPTOR_HANDLE localTex = {};
};
struct ShaderRecord
{
unsigned char shaderIdentifier[D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES];
RootArguments rootArguments;
};
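When the records above are written into a buffer, each record's stride has to respect the shader record alignment. The sketch below computes that stride for the `RootArguments` layout shown above. The three constants mirror values from `d3d12.h` so the example builds without the Windows SDK, a `std::uint64_t` stands in for `D3D12_GPU_DESCRIPTOR_HANDLE`, and `alignUp` and `recordStride` are illustrative helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Values mirrored from d3d12.h (assumed here so the sketch is self-contained).
constexpr std::size_t kShaderIdentifierSize = 32; // D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES
constexpr std::size_t kRecordAlignment = 32;      // D3D12_RAYTRACING_SHADER_RECORD_BYTE_ALIGNMENT
constexpr std::size_t kTableAlignment = 64;       // D3D12_RAYTRACING_SHADER_TABLE_BYTE_ALIGNMENT

// Round `value` up to the next multiple of `alignment`.
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

struct LocalCB { float time; float _padding[3]; };
struct RootArguments
{
    LocalCB localCB;
    std::uint64_t localTex; // stand-in for D3D12_GPU_DESCRIPTOR_HANDLE
};

// Stride of one record: identifier plus root arguments, padded to the record alignment.
constexpr std::size_t recordStride(std::size_t rootArgumentsSize)
{
    return alignUp(kShaderIdentifierSize + rootArgumentsSize, kRecordAlignment);
}
```

Here the identifier (32 bytes) plus the root arguments (24 bytes) gives 56 bytes, which pads to a 64-byte stride. The start address of the table itself should additionally be a multiple of `kTableAlignment`.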
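4.4 Pipeline State
The ray tracing pipeline is a hardware-accelerated way of traversing an acceleration structure to find the objects in a given scene. Instead of the rigid layout used by raster or compute pipeline descriptions, DirectX Raytracing uses a more declarative state description in which you define subobjects carrying whatever information is needed.
Creating a ray tracing pipeline requires at least the following:
- A DXIL shader library containing the compiled shaders you will use when dispatching rays.
- A shader config that defines the maximum sizes of the ray payload and of the hit attributes.
- A pipeline config that defines the maximum recursion depth for traced rays.
- A triangle hit group that groups the shaders run when a triangle is hit.
- A local root signature for local resources.
- A global root signature for global resources such as buffers, for example the RWTexture radiance output.
Creation and usage look like this:
// Declare handles
// CreateStateObject returns an ID3D12StateObject, not an ID3D12PipelineState
ID3D12StateObject* pipelineState;
ID3D12StateObjectProperties* stateObjectProperties;
// Pipeline state description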
D3D12_STATE_OBJECT_DESC stateObjectDesc = {};
stateObjectDesc.Type = D3D12_STATE_OBJECT_TYPE_RAYTRACING_PIPELINE;
D3D12_STATE_SUBOBJECT subObjects[7];
stateObjectDesc.NumSubobjects = _countof(subObjects);
stateObjectDesc.pSubobjects = subObjects;
D3D12_EXPORT_DESC* exportDesc = nullptr;
D3D12_SHADER_BYTECODE rtShaderByteCode = {rsBytecodeData.data(),
rsBytecodeData.size()};
// Shader library
D3D12_DXIL_LIBRARY_DESC dxilLibraryDesc = {rtShaderByteCode, 0u,
exportDesc};
subObjects[0u] = {D3D12_STATE_SUBOBJECT_TYPE_DXIL_LIBRARY,
&dxilLibraryDesc};
// Shaders in use, associated with the local root signature in subObjects[4]
static LPCWSTR shaderNames[3] = {
L"raygen", L"closesthit", L"miss"};
D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION exportAssociations = {
subObjects + 4, 3u, shaderNames};
subObjects[1] = {
D3D12_STATE_SUBOBJECT_TYPE_SUBOBJECT_TO_EXPORTS_ASSOCIATION,
&exportAssociations};
// Shader config: payload and attribute sizes shared by all shaders
D3D12_RAYTRACING_SHADER_CONFIG shaderConfig = {};
shaderConfig.MaxPayloadSizeInBytes = 4 * sizeof(float); // float4 color
shaderConfig.MaxAttributeSizeInBytes =
2 * sizeof(float); // float2 barycentrics
subObjects[2] = {D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_SHADER_CONFIG,
&shaderConfig};
// Hit group
D3D12_HIT_GROUP_DESC myClosestHitGroup = {
L"MyHitGroup", D3D12_HIT_GROUP_TYPE_TRIANGLES, nullptr,
L"closesthit", nullptr};
subObjects[3] = {D3D12_STATE_SUBOBJECT_TYPE_HIT_GROUP,
&myClosestHitGroup};
// Root signatures
subObjects[4] = {D3D12_STATE_SUBOBJECT_TYPE_LOCAL_ROOT_SIGNATURE,
&localRootSignature};
subObjects[5] = {D3D12_STATE_SUBOBJECT_TYPE_GLOBAL_ROOT_SIGNATURE,
&globalRootSignature};
// Recursion depth
D3D12_RAYTRACING_PIPELINE_CONFIG pipelineConfig = {1u};
pipelineConfig.MaxTraceRecursionDepth = 1;
subObjects[6] = {D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_PIPELINE_CONFIG,
&pipelineConfig};
// CreateStateObject is available on ID3D12Device5 and later
ThrowIfFailed(device->CreateStateObject(
&stateObjectDesc, IID_PPV_ARGS(&pipelineState)));
pipelineState->SetName(L"RT Pipeline State");
// Query the properties interface, which is used to look up shader identifiers
ThrowIfFailed(pipelineState->QueryInterface(
IID_PPV_ARGS(&stateObjectProperties)));
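The shader config values above have to be at least as large as the payload and attribute types the shaders actually use. The sketch below is one way to sanity-check that on the host side. `RayPayload` and `TriangleAttributes` are hypothetical host mirrors of the HLSL types, and `fitsShaderConfig` is an illustrative helper rather than part of D3D12.

```cpp
#include <cassert>
#include <cstddef>

// Host-side mirrors of the HLSL types; names are illustrative only.
struct RayPayload { float color[4]; };                // float4 color
struct TriangleAttributes { float barycentrics[2]; }; // float2 barycentrics

// Same values as the shader config in the snippet above.
constexpr std::size_t kMaxPayloadSizeInBytes = 4 * sizeof(float);
constexpr std::size_t kMaxAttributeSizeInBytes = 2 * sizeof(float);

// True when both types fit within the sizes declared in the shader config.
constexpr bool fitsShaderConfig(std::size_t payloadSize, std::size_t attributeSize)
{
    return payloadSize <= kMaxPayloadSizeInBytes
        && attributeSize <= kMaxAttributeSizeInBytes;
}
```

If a field were later added to the payload without raising `MaxPayloadSizeInBytes`, this check would fail, which is a cheaper signal than a pipeline creation error.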
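4.5 Acceleration Structures
An acceleration structure (AS) speeds up traversal of a given scene. It represents the vendor-specific internal data structure that the hardware uses to traverse a set of triangles or axis-aligned bounding boxes (AABBs).
A scratch buffer holds the intermediate data used while building an acceleration structure.
Creation and usage look like this:
// Declare handles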
ID3D12Device5* device;
ID3D12Resource* vertexBuffer;
ID3D12Resource* indexBuffer;
// Declare outputs
ID3D12Resource* tlasBuffer;
ID3D12Resource* blasBuffer;
ID3D12Resource* asScratchBuffer;
ID3D12Resource* instanceDescs;
HRESULT hr;
// Geometry description used by the BLAS
D3D12_RAYTRACING_GEOMETRY_DESC geomDescs[1] = {};
geomDescs[0].Type = D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES;
geomDescs[0].Triangles.IndexBuffer = indexBuffer->GetGPUVirtualAddress();
geomDescs[0].Triangles.IndexCount =
static_cast<UINT>(indexBuffer->GetDesc().Width) / sizeof(UINT16);
geomDescs[0].Triangles.IndexFormat = DXGI_FORMAT_R16_UINT;
geomDescs[0].Triangles.Transform3x4 = 0;
geomDescs[0].Triangles.VertexFormat = DXGI_FORMAT_R32G32B32_FLOAT;
geomDescs[0].Triangles.VertexCount =
static_cast<UINT>(vertexBuffer->GetDesc().Width) / sizeof(Vertex);
geomDescs[0].Triangles.VertexBuffer.StartAddress =
vertexBuffer->GetGPUVirtualAddress();
geomDescs[0].Triangles.VertexBuffer.StrideInBytes = sizeof(Vertex);
// Query prebuild info
D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS topLevelInputs = {};
topLevelInputs.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
topLevelInputs.Flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE;
topLevelInputs.NumDescs = 1;
topLevelInputs.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL;
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO topLevelPrebuildInfo = {};
device->GetRaytracingAccelerationStructurePrebuildInfo(&topLevelInputs, &topLevelPrebuildInfo);
D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS bottomLevelInputs = topLevelInputs;
bottomLevelInputs.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
bottomLevelInputs.pGeometryDescs = geomDescs;
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO bottomLevelPrebuildInfo = {};
device->GetRaytracingAccelerationStructurePrebuildInfo(&bottomLevelInputs, &bottomLevelPrebuildInfo);
// Create resources
D3D12_RESOURCE_DESC asResourceDesc;
asResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
asResourceDesc.Alignment = 0;
asResourceDesc.Width = max(topLevelPrebuildInfo.ScratchDataSizeInBytes,
bottomLevelPrebuildInfo.ScratchDataSizeInBytes);
asResourceDesc.Height = 1;
asResourceDesc.DepthOrArraySize = 1;
asResourceDesc.MipLevels = 1;
asResourceDesc.Format = DXGI_FORMAT_UNKNOWN;
asResourceDesc.SampleDesc.Count = 1;
asResourceDesc.SampleDesc.Quality = 0;
asResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
asResourceDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS;
// Create the scratch buffer, sized for the larger of the two builds.
// Scratch memory is used as a UAV during the build; heapProps describes a default heap.
hr = device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &asResourceDesc,
D3D12_RESOURCE_STATE_UNORDERED_ACCESS, nullptr, IID_PPV_ARGS(&asScratchBuffer));
// Create the TLAS buffer
asResourceDesc.Width = topLevelPrebuildInfo.ResultDataMaxSizeInBytes;
hr = device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &asResourceDesc,
D3D12_RESOURCE_STATE_RAYTRACING_ACCELERATION_STRUCTURE, nullptr, IID_PPV_ARGS(&tlasBuffer));
// Create the BLAS buffer
asResourceDesc.Width = bottomLevelPrebuildInfo.ResultDataMaxSizeInBytes;
hr = device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &asResourceDesc,
D3D12_RESOURCE_STATE_RAYTRACING_ACCELERATION_STRUCTURE, nullptr, IID_PPV_ARGS(&blasBuffer));
D3D12_RAYTRACING_INSTANCE_DESC instanceDesc = {};
instanceDesc.Transform[0][0] = instanceDesc.Transform[1][1] = instanceDesc.Transform[2][2] = 1;
instanceDesc.InstanceMask = 1;
instanceDesc.AccelerationStructure = blasBuffer->GetGPUVirtualAddress();
D3D12_HEAP_PROPERTIES uploadHeapProperties = {};
uploadHeapProperties.Type = D3D12_HEAP_TYPE_UPLOAD;
uploadHeapProperties.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
uploadHeapProperties.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
uploadHeapProperties.CreationNodeMask = 1;
uploadHeapProperties.VisibleNodeMask = 1;
D3D12_RESOURCE_DESC bufferDesc = {};
bufferDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
bufferDesc.Alignment = 0;
bufferDesc.Width = sizeof(instanceDesc);
bufferDesc.Height = 1;
bufferDesc.DepthOrArraySize = 1;
bufferDesc.MipLevels = 1;
bufferDesc.Format = DXGI_FORMAT_UNKNOWN;
bufferDesc.SampleDesc.Count = 1;
bufferDesc.SampleDesc.Quality = 0;
bufferDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
bufferDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(device->CreateCommittedResource(
&uploadHeapProperties, D3D12_HEAP_FLAG_NONE, &bufferDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&instanceDescs)));
// Copy the instance description into the upload buffer
void* pMappedData;
ThrowIfFailed(instanceDescs->Map(0, nullptr, &pMappedData));
memcpy(pMappedData, &instanceDesc, sizeof(instanceDesc));
instanceDescs->Unmap(0, nullptr);
// BLAS build description
D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC bottomLevelBuildDesc =
{};
{
bottomLevelBuildDesc.Inputs = bottomLevelInputs;
bottomLevelBuildDesc.ScratchAccelerationStructureData =
asScratchBuffer->GetGPUVirtualAddress();
bottomLevelBuildDesc.DestAccelerationStructureData =
blasBuffer->GetGPUVirtualAddress();
}
// TLAS build description
D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC topLevelBuildDesc = {};
{
topLevelInputs.InstanceDescs = instanceDescs->GetGPUVirtualAddress();
topLevelBuildDesc.Inputs = topLevelInputs;
topLevelBuildDesc.DestAccelerationStructureData =
tlasBuffer->GetGPUVirtualAddress();
topLevelBuildDesc.ScratchAccelerationStructureData =
asScratchBuffer->GetGPUVirtualAddress();
}
// Record the build commands (raytracingCommandList must be an ID3D12GraphicsCommandList4)
raytracingCommandList->BuildRaytracingAccelerationStructure(
&bottomLevelBuildDesc, 0, nullptr);
// The TLAS build reads the BLAS, so wait for the BLAS writes to finish first
D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
barrier.UAV.pResource = blasBuffer;
raytracingCommandList->ResourceBarrier(1, &barrier);
raytracingCommandList->BuildRaytracingAccelerationStructure(
&topLevelBuildDesc, 0, nullptr);
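Two small calculations recur in the build code above: element counts derived from buffer sizes, and a single scratch buffer sized for the larger of the two builds. The sketch below isolates both. The `Vertex` layout is hypothetical (the real one is defined elsewhere in the project), and the byte sizes in the usage note are made-up numbers rather than real prebuild results.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical vertex layout, used only to give sizeof(Vertex) a value.
struct Vertex { float position[3]; float color[3]; };

// Number of elements stored in a tightly packed buffer of the given byte width.
constexpr std::uint32_t elementCount(std::uint64_t bufferWidthInBytes, std::size_t elementSize)
{
    return static_cast<std::uint32_t>(bufferWidthInBytes / elementSize);
}

// One scratch buffer can serve both builds if it is as large as the bigger request.
constexpr std::uint64_t scratchSize(std::uint64_t topLevelScratch, std::uint64_t bottomLevelScratch)
{
    return topLevelScratch > bottomLevelScratch ? topLevelScratch : bottomLevelScratch;
}
```

A 6-byte index buffer of 16-bit indices holds 3 indices, and prebuild sizes of 4096 and 65536 bytes lead to a 65536-byte scratch buffer. Deriving counts from `GetDesc().Width` only works when the buffer has no trailing padding.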
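4.6 Ray Tracing Commands
The ray tracing pipeline resembles the compute pipeline, but instead of launching thread groups as a traditional compute pipeline does, it is dispatched over a grid of a given size, for example one ray per pixel of the radiance buffer.
Creation and usage look like this: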
ThrowIfFailed(commandAllocator->Reset());
ThrowIfFailed(commandList->Reset(commandAllocator, nullptr));
commandList->SetComputeRootSignature(globalRootSignature);
// Bind the descriptor data for the global root signature
commandList->SetDescriptorHeaps(1, &srvHeap);
commandList->SetComputeRootDescriptorTable(0, srvHeap->GetGPUDescriptorHandleForHeapStart());
// Bind the TLAS
commandList->SetComputeRootShaderResourceView(
1, tlasBuffer->GetGPUVirtualAddress());
// Fill in the dispatch description
D3D12_DISPATCH_RAYS_DESC dispatchDesc = {};
dispatchDesc.Width = width;
dispatchDesc.Height = height;
dispatchDesc.Depth = 1;
// Shader binding tables
dispatchDesc.RayGenerationShaderRecord.StartAddress =
rayGenShaderTable->GetGPUVirtualAddress();
dispatchDesc.RayGenerationShaderRecord.SizeInBytes =
rayGenShaderTable->GetDesc().Width;
dispatchDesc.MissShaderTable.StartAddress =
missShaderTable->GetGPUVirtualAddress();
dispatchDesc.MissShaderTable.SizeInBytes =
missShaderTable->GetDesc().Width;
dispatchDesc.MissShaderTable.StrideInBytes =
dispatchDesc.MissShaderTable.SizeInBytes;
dispatchDesc.HitGroupTable.StartAddress =
hitShaderTable->GetGPUVirtualAddress();
dispatchDesc.HitGroupTable.SizeInBytes = hitShaderTable->GetDesc().Width;
dispatchDesc.HitGroupTable.StrideInBytes =
dispatchDesc.HitGroupTable.SizeInBytes;
// Set the pipeline
commandList->SetPipelineState1(pipelineState);
// Dispatch the rays
commandList->DispatchRays(&dispatchDesc);
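The table addresses and strides set above are what the runtime uses to pick a shader record when a ray hits or misses. The sketch below follows the indexing rule described in the DXR Functional Spec. The function names are illustrative, and the addresses in the usage note are made up.

```cpp
#include <cassert>
#include <cstdint>

// Address of the hit group record selected for a hit.
// rayContribution and multiplier come from TraceRay, geometryIndex is the
// geometry's index within its BLAS, and instanceContribution comes from the
// instance description (InstanceContributionToHitGroupIndex).
constexpr std::uint64_t hitGroupRecordAddress(std::uint64_t tableStart, std::uint64_t stride,
                                              std::uint32_t rayContribution,
                                              std::uint32_t multiplier,
                                              std::uint32_t geometryIndex,
                                              std::uint32_t instanceContribution)
{
    return tableStart + stride * (rayContribution + multiplier * geometryIndex + instanceContribution);
}

// Address of the miss record selected by TraceRay's MissShaderIndex.
constexpr std::uint64_t missRecordAddress(std::uint64_t tableStart, std::uint64_t stride,
                                          std::uint32_t missShaderIndex)
{
    return tableStart + stride * missShaderIndex;
}
```

With the `TraceRay(Scene, RAY_FLAG_NONE, ~0, 0, 1, 0, ...)` call from section 4.1, a single geometry, and an instance contribution of 0, every hit resolves to the first record in the hit group table, which is why a table holding one record is enough for this sample.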
For a more detailed official reference, see the DirectX Raytracing (DXR) Functional Spec.
V. Mesh Geometry Pipeline API
For the Mesh Geometry Pipeline API, see the earlier article:
DX12_Amplification/Mesh Shaders and API