Mitsuba Rendering Basics

- 0. Abstract
- 1. Installing Mitsuba2
  - 1.1 Downloading the Mitsuba2 source
  - 1.2 Choosing variants (backends)
  - 1.3 Building
- 2. [Mitsuba2PointCloudRenderer](https://github.com/tolgabirdal/Mitsuba2PointCloudRenderer)
  - 2.1 Rendering an XML scene with Mitsuba2
  - 2.2 The scene XML file format
    - 2.2.1 `chair.npy` to XML
    - 2.2.2 Walking through the scene XML
- 3. Mitsuba2 summary
- 4. Mitsuba3
  - 4.1 Installing Mitsuba3
  - 4.2 Quickstart
    - 4.2.1 Dr.Jit Quickstart (Similarity with NumPy)
    - 4.2.2 Mitsuba3 Quickstart

0. Abstract

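Recently I came across several point-cloud papers whose figures show very good-looking point clouds, like this:

My own matplotlib plots looked ugly by comparison. I emailed one of the authors and learned that he used a rendering tool called Mitsuba2PointCloudRenderer, which is built on the Mitsuba2 renderer. Running: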

python render_mitsuba2_pc.py chair.npy

renders a point cloud. The script works as follows:

generates an XML file, which describes a 3D scene in the format used by Mitsuba;
calls Mitsuba2 to render the point cloud into an EXR file;
processes the EXR into a jpg file.
iterates for multiple point clouds present in the tensor (.npy)

To understand what it actually does and how to customize the rendering configuration, let's take a closer look at the repository.

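1. Installing Mitsuba2

Naturally, Mitsuba2 has to be installed first. This is a bit of a hassle: according to the official documentation it apparently has to be built from source, so let's follow the instructions.

1.1 Downloading the Mitsuba2 source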

git clone --recursive https://github.com/mitsuba-renderer/mitsuba2

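Mitsuba2 depends on many other libraries; --recursive makes sure those dependencies are cloned along with it.

1.2 Choosing variants (backends)

After cloning, you need to understand Mitsuba2's variants (referred to as backends in this post). Mitsuba2 is a retargetable system: the front-end interface is largely the same, while the code that actually does the work comes in many versions. Variant names follow this convention:

Taking the variant name above as an example: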

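Computational Backend: the compute device and the mode of computation. On the CPU you can choose scalar, which processes one value at a time, or packet, which processes several values in parallel; on the GPU the options are gpu and gpu_autodiff;

Roughly speaking, rendering traces a large number of rays. The scalar backend computes one light path at a time, packet computes a group of them in parallel, and gpu offers even more parallelism. The figure below shows the difference between scalar and packet.

Color Representation: how colors are represented: mono, rgb, or spectral. mono means monochrome (presumably grayscale), while the spectral representation can capture richer color. Note that in spectral mode the output is still an RGB image by default, and a scene that only contains RGB color information can still be rendered;
Polarization: omitted here;
Precision: the floating-point precision used in computation. The default is single precision; a _double suffix selects double precision. Note that the GPU variants do not support double precision.

The full list of variants is in Choosing Variants.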
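As a rough sketch of the naming convention described above, the helper below splits a variant name into its four parts. The function name `parse_variant` and the returned dictionary layout are my own invention for illustration; Mitsuba2 does not provide such a helper, and the backend and color lists contain only the names mentioned in this section.

```python
# Hypothetical helper (not part of Mitsuba2): split a variant name into parts.
BACKENDS = ("gpu_autodiff", "gpu", "packet", "scalar")  # longest prefix first
COLORS = ("mono", "rgb", "spectral")
FLAGS = ("polarized", "double")


def parse_variant(name):
    """Return the backend, color mode and optional flags encoded in `name`."""
    for backend in BACKENDS:
        if name.startswith(backend + "_"):
            rest = name[len(backend) + 1:].split("_")
            break
    else:
        raise ValueError(f"unknown computational backend in {name!r}")

    color, flags = rest[0], rest[1:]
    if color not in COLORS:
        raise ValueError(f"unknown color representation in {name!r}")
    unknown = [flag for flag in flags if flag not in FLAGS]
    if unknown:
        raise ValueError(f"unknown suffix {unknown} in {name!r}")

    return {
        "backend": backend,
        "color": color,
        "polarized": "polarized" in flags,
        "double": "double" in flags,  # _double selects double precision
    }


print(parse_variant("scalar_spectral"))
print(parse_variant("gpu_autodiff_rgb"))
```

The sketch only checks the spelling of a name; whether a given combination is actually available depends on what was enabled at build time.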

With the variants understood, configure the ones you need in the root of the repository:

cd <..mitsuba repository..>
cp resources/mitsuba.conf.template mitsuba.conf

Open mitsuba.conf and find (around line 70):

"enabled": [
    # The "scalar_rgb" variant *must* be included at the moment.
    "scalar_rgb",
    "scalar_spectral"
],

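These two are enabled by default; you can add the variants you need to this list.

Note: scalar_rgb must be present, because much of the basic functionality depends on it. Add as few others as possible: each extra variant lengthens the build and takes more disk space.

Setting the default variant (used when Mitsuba2 is run without specifying one):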

# If mitsuba is launched without any specific mode parameter,
# the configuration below will be used by default
"default": "scalar_spectral"

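The two rules above (scalar_rgb must be enabled, and the default should be one of the enabled variants) can be checked with a few lines of Python. This is only a sketch: it assumes the configuration is JSON plus whole-line `#` comments, as in the fragments shown here, and the helper names `load_conf` and `check_variants` are hypothetical. The real mitsuba.conf contains more keys than this sample.

```python
import json

# Sample text assembled from the two fragments quoted in this section.
SAMPLE_CONF = """{
    "enabled": [
        # The "scalar_rgb" variant *must* be included at the moment.
        "scalar_rgb",
        "scalar_spectral"
    ],
    # If mitsuba is launched without any specific mode parameter,
    # the configuration below will be used by default
    "default": "scalar_spectral"
}"""


def load_conf(text):
    """Drop whole-line '#' comments (an assumption), then parse as JSON."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("#")]
    return json.loads("\n".join(kept))


def check_variants(conf):
    """Return a list of human-readable problems; empty means the config looks fine."""
    problems = []
    if "scalar_rgb" not in conf["enabled"]:
        problems.append("scalar_rgb must be enabled")
    if conf["default"] not in conf["enabled"]:
        problems.append("default variant is not in the enabled list")
    return problems


print(check_variants(load_conf(SAMPLE_CONF)))  # prints []
```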
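1.3 Building

Build requirements and commands differ by operating system (Python >= 3.6 is required). On Windows you need Visual Studio, so I gave up on that right away; Linux also needs a pile of dependencies: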

# Install recent versions of build tools, including Clang and libc++ (Clang's C++ library)
sudo apt install -y clang-9 libc++-9-dev libc++abi-9-dev cmake ninja-build
# Install libraries for image I/O and the graphical user interface
sudo apt install -y libz-dev libpng-dev libjpeg-dev libxrandr-dev libxinerama-dev libxcursor-dev
# Install required Python packages
sudo apt install -y python3-dev python3-distutils python3-setuptools

# >>>>>>>>>>>>> Optional >>>>>>>>>>>>>>>>
# For running tests
sudo apt install -y python3-pytest python3-pytest-xdist python3-numpy
# For generating the documentation
sudo apt install -y python3-sphinx python3-guzzle-sphinx-theme python3-sphinxcontrib.bibtex

The documentation says to use clang as the compiler, because gcc causes various problems.

export CC=clang-9
export CXX=clang++-9

If you already have a different version of clang, adjust the suffix accordingly.

Then go to the repository root and start the build:

# Create a directory where build products are stored
mkdir build
cd build
cmake -GNinja ..
ninja

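After the build finishes, the build directory contains a folder named dist, which holds the compiled Mitsuba2:

Inside it, mitsuba is the executable.

Note: if you enabled a GPU variant, CUDA and OptiX are required for the build.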

2. Mitsuba2PointCloudRenderer

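As the Abstract explained, this repository first generates an .xml scene file and then calls Mitsuba2 to render it. Since I was eager to check that my Mitsuba2 build works, I cover the rendering part first.

2.1 Rendering an XML scene with Mitsuba2

Once the build is done, Mitsuba2 is ready to use. Before running anything, update the Mitsuba2 path in render_mitsuba2_pc.py: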

...
# mitsuba executable; change this to your own Mitsuba2 location
PATH_TO_MITSUBA2 = "/home/tolga/Codes/mitsuba2/build/dist/mitsuba"
...

Then run:

python render_mitsuba2_pc.py chair.npy

and you get the rendered point-cloud image.

Mitsuba2 is invoked by launching the executable as a child process through Python's subprocess module:

...
subprocess.run([PATH_TO_MITSUBA2, xmlFile])
...

Here xmlFile is the XML scene file that the Python code generated from the numpy array. This line is equivalent to the shell command:

/home/tolga/Codes/mitsuba2/build/dist/mitsuba chair_scene.xml

As section 1.2 showed, the default variant is scalar_spectral, which should be fairly slow: rendering this chair took about 42 s. Let's see how the GPU does:

subprocess.run([PATH_TO_MITSUBA2, xmlFile, '-m', 'gpu_rgb'])

Something went wrong:

2024-09-20 20:15:36 INFO  main  [optix_api.cpp:56] Dynamic loading of the Optix library ..
Caught a critical exception: [optix_api.cpp:144] optix_initialize(): libnvoptix.so.1 could not be loaded.

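Rendering manually from the command line gives the same error, and nothing I tried afterwards fixed it. A likely cause is that I was working in a Docker container on a server, where I have no control over the NVIDIA driver.

With the GPU out, let's try packet: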

subprocess.run([PATH_TO_MITSUBA2, xmlFile, '-m', 'packet_rgb'])

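It did not seem any faster, taking a full 49 s, and I first suspected some unset parameter such as the degree of parallelism. Later I realized the difference came from rgb versus spectral: the 42 s run used scalar_spectral, whereas scalar_rgb took a full 1.24 min. So packet_rgb is indeed faster than scalar_rgb.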
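The runs above differ only in the optional -m flag, so the comparison is easy to script. The sketch below builds the command list the same way the snippets above do and times a run with `time.perf_counter`. The helper names and the placeholder path are hypothetical; only the `-m <variant>` argument comes from the commands shown in this section.

```python
import subprocess
import time


def build_mitsuba_command(mitsuba_path, xml_file, variant=None):
    """Build the argument list; without a variant, Mitsuba2 uses its default."""
    cmd = [mitsuba_path, xml_file]
    if variant is not None:
        cmd += ["-m", variant]
    return cmd


def timed_run(cmd):
    """Run a command and return (exit code, elapsed seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd)
    return result.returncode, time.perf_counter() - start


# Placeholder path: replace with the location of your own build.
MITSUBA = "/path/to/mitsuba2/build/dist/mitsuba"

for variant in (None, "scalar_rgb", "packet_rgb"):
    # Only print the commands here; pass one to timed_run() to actually render.
    print(build_mitsuba_command(MITSUBA, "chair_scene.xml", variant))
```

Timing each variant on the same scene this way makes it harder to accidentally compare an rgb run against a spectral one.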

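2.2 The scene XML file format

You need to understand the scene file format to freely adjust the rendering properties. The official documentation describes it in detail, but this post only takes a brief look at the XML file generated from chair.npy.

2.2.1 `chair.npy` to XML

Reading render_mitsuba2_pc.py shows that the XML file is assembled from three parts: head, shape, and tail. The details come later; for now, let's see how the point cloud is turned into XML.

The middle part is the most important: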

xml_ball_segment = \
	"""
	<shape type="sphere">
		<float name="radius" value="0.015"/>
		<transform name="toWorld">
			<translate x="{}" y="{}" z="{}"/>
		</transform>
		<bsdf type="diffuse">
			<rgb name="reflectance" value="{},{},{}"/>
		</bsdf>
	</shape>
"""

xml_segments = [xml_head]
for i in range(pcl.shape[0]):
	color = colormap(pcl[i, 0] + 0.5, pcl[i, 1] + 0.5, pcl[i, 2] + 0.5 - 0.0125)
	xml_segments.append(xml_ball_segment.format(pcl[i, 0], pcl[i, 1], pcl[i, 2], *color))
xml_segments.append(xml_tail)

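A for loop maps each point in the point cloud pcl to a color (based on the point's position). The point and its color are then substituted into the format placeholders of xml_ball_segment, which correspond to shape -> transform -> translate and shape -> bsdf -> rgb respectively.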
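To make the mapping concrete, here is a self-contained sketch of the same idea using only the standard library. It is not the repository's code: the `colormap` below is an assumed position-to-color function (clamp, then normalize to unit length), the three points are made up, and a bare `<scene>` wrapper stands in for the real head and tail.

```python
import math
import xml.etree.ElementTree as ET

BALL_SEGMENT = """
<shape type="sphere">
    <float name="radius" value="0.015"/>
    <transform name="toWorld">
        <translate x="{}" y="{}" z="{}"/>
    </transform>
    <bsdf type="diffuse">
        <rgb name="reflectance" value="{},{},{}"/>
    </bsdf>
</shape>
"""


def colormap(x, y, z):
    """Assumed color map: clamp each coordinate to [0.001, 1], then normalize."""
    vec = [min(max(v, 0.001), 1.0) for v in (x, y, z)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def points_to_shapes(points):
    """Emit one small sphere per point, colored by its (shifted) position."""
    segments = []
    for x, y, z in points:
        color = colormap(x + 0.5, y + 0.5, z + 0.5 - 0.0125)
        segments.append(BALL_SEGMENT.format(x, y, z, *color))
    return "".join(segments)


# Made-up points, assumed to lie roughly in [-0.5, 0.5] on each axis.
points = [(-0.28, 0.35, -0.43), (0.0, 0.0, 0.0), (0.4, -0.2, 0.1)]
xml_text = "<scene>" + points_to_shapes(points) + "</scene>"

# Parsing the result confirms it is well-formed and has one shape per point.
root = ET.fromstring(xml_text)
print(len(root.findall("shape")))  # prints 3
```

Because every point becomes its own sphere shape, the scene file grows linearly with the number of points.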

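2.2.2 Walking through the scene XML

We now know how the point cloud becomes an XML file, but not what the tags mean. Reading through the official documentation is long and tedious, so I simply asked Tongyi Qianwen (an LLM) about the generated XML file and recorded the answers as comments: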

<scene version="0.6.0">
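	<!--
		The integrator is the method used to solve the light transport equation and simulate lighting effects;
			path: a path tracer, which simulates random walks of light through the scene (the light "paths") to produce more realistic lighting and shadows;
			maxDepth: the maximum path depth; -1 means no limit, i.e. light may bounce indefinitely.
	-->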
	<integrator type="path">
		<integer name="maxDepth" value="-1"/>
	</integrator>
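	<!-- Defines a camera (sensor), which captures the scene and produces the image; -->
	<sensor type="perspective">  <!-- A perspective camera, which projects 3D objects in the scene onto a 2D image, mimicking how the human eye sees. -->
		<float name="farClip" value="100"/>   <!-- The far clipping plane is 100 units away; anything beyond this distance is not rendered. -->
		<float name="nearClip" value="0.1"/>  <!-- The near clipping plane is 0.1 units away; anything closer than this is not rendered either. -->
		<!--
			Camera position and orientation:
				origin="3,3,3": the camera is located at (3, 3, 3).
				target="0,0,0": the camera looks at the origin (0, 0, 0).
				up="0,0,1": the up direction is the positive Z axis.
		-->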
		<transform name="toWorld">
			<lookat origin="3,3,3" target="0,0,0" up="0,0,1"/>
		</transform>
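		<float name="fov" value="25"/>  <!-- The camera's field of view is 25 degrees. The field of view determines how much of the scene the camera can see. -->
		<!--
			Independent sampler, which determines how samples are distributed within each pixel during rendering
				sampleCount: 256 samples per pixel. More samples reduce noise but increase rendering time.
		-->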
		<sampler type="independent">
			<integer name="sampleCount" value="256"/>
		</sampler>
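		<!-- The film that receives the rendered output is of the HDR (high dynamic range) type -->
		<film type="hdrfilm">
			<integer name="width" value="1920"/>  <!-- Image width: 1920 pixels -->
			<integer name="height" value="1080"/> <!-- Image height: 1080 pixels -->
			<rfilter type="gaussian"/>  <!-- Gaussian reconstruction filter: controls how samples are blended into pixels, mainly to reduce aliasing -->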
		</film>
	</sensor>

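	<!-- Surface reflection model of the material (BSDF) -->
	<bsdf type="roughplastic" id="surfaceMaterial">  <!-- BSDF (bidirectional scattering distribution function) of a rough plastic material -->
		<string name="distribution" value="ggx"/>
		<float name="alpha" value="0.05"/>  <!-- Roughness parameter alpha of the microfacet model, set to 0.05. Smaller values look smoother; larger values look rougher. -->
		<float name="intIOR" value="1.46"/> <!-- Interior index of refraction, which controls how strongly the glossy coating reflects. 1.46 is a plausible value for plastic -->
		<!--
			Diffuse reflectance (color) of the material. The value (1,1,1) overrides the default of 0.5 and makes the material look whiter.
			The diffuse color is the color of the light scattered diffusely by the surface.
		-->
		<rgb name="diffuseReflectance" value="1,1,1"/> <!-- default 0.5 -->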
	</bsdf>

	<shape type="sphere">
		<float name="radius" value="0.015"/>
		<transform name="toWorld">
			<translate x="-0.28334808349609375" y="0.3507939875125885" z="-0.43566787242889404"/>
		</transform>
		<bsdf type="diffuse">
			<rgb name="reflectance" value="0.24634252336480672,0.9673892625993827,0.05893535263050671"/>
		</bsdf>
	</shape>
	<!-- ... many more shapes ... -->
	<shape type="rectangle">
		<ref name="bsdf" id="surfaceMaterial"/>  <!-- References the bsdf defined above as the surface material of this rectangle -->
		<transform name="toWorld">
			<scale x="10" y="10" z="1"/>
			<translate x="0" y="0" z="-0.5"/>
		</transform>
	</shape>
	
	<shape type="rectangle">
		<transform name="toWorld">
			<scale x="10" y="10" z="1"/>
			<lookat origin="-4,4,20" target="0,0,0" up="0,0,1"/>
		</transform>
		<emitter type="area">
			<rgb name="radiance" value="6,6,6"/>
		</emitter>
	</shape>
</scene>

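There is no need to dig deeper. For now, a few key points are enough to adjust the viewpoint and look of the rendered point cloud:

sensor -> transform -> lookat controls the viewpoint: the viewing direction and the up direction determine how the image looks;
<float name="fov" value="25"/> is the field of view, which determines how much of the scene the camera sees;
sensor -> film controls the resolution of the output image;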
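These three settings can also be changed programmatically instead of by hand. The sketch below uses Python's standard `xml.etree.ElementTree` on a cut-down copy of the sensor block shown above; the function name `adjust_camera` and its parameters are hypothetical, and the new values are arbitrary examples.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the sensor block from the generated scene.
SCENE = """<scene version="0.6.0">
    <sensor type="perspective">
        <transform name="toWorld">
            <lookat origin="3,3,3" target="0,0,0" up="0,0,1"/>
        </transform>
        <float name="fov" value="25"/>
        <film type="hdrfilm">
            <integer name="width" value="1920"/>
            <integer name="height" value="1080"/>
        </film>
    </sensor>
</scene>"""


def adjust_camera(xml_text, origin=None, fov=None, size=None):
    """Return a copy of the scene XML with the viewpoint, fov or resolution changed."""
    root = ET.fromstring(xml_text)
    sensor = root.find("sensor")
    if origin is not None:
        lookat = sensor.find("transform/lookat")
        lookat.set("origin", ",".join(str(v) for v in origin))
    if fov is not None:
        sensor.find("float[@name='fov']").set("value", str(fov))
    if size is not None:
        film = sensor.find("film")
        film.find("integer[@name='width']").set("value", str(size[0]))
        film.find("integer[@name='height']").set("value", str(size[1]))
    return ET.tostring(root, encoding="unicode")


# Example: move the camera, widen the view, and render a smaller preview.
print(adjust_camera(SCENE, origin=(2, -3, 2), fov=35, size=(960, 540)))
```

Note that ElementTree discards XML comments when parsing by default, so annotations like the ones above would not survive a round trip through this function.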

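3. Mitsuba2 summary

This is enough to render point clouds and to change the various rendering properties. The remaining problems are:

Installation on Windows is a hassle, and I did not want to install Visual Studio because it is too large;
The GPU cannot be used for now;

So, on to: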

4. Mitsuba3

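Mitsuba2 has been deprecated and will no longer be updated or maintained. Mitsuba3 has been released and fixes many of the earlier problems, so let's see how it does.

4.1 Installing Mitsuba3

Mitsuba3 is officially packaged on PyPI, so it can be installed directly with pip (there is no conda channel for now, but installation on Windows is now easy too):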

pip install mitsuba

Requirements

Python >= 3.8
(optional) For computation on the GPU: NVidia driver >= 495.89
(optional) For vectorized / parallel computation on the CPU: LLVM >= 11.1

No manual build is needed; it can be called directly from a Python script:

import mitsuba as mi

mi.set_variant('scalar_rgb')      # select the variant
scene_dict = mi.cornell_box()     # a dictionary bundled with the package that describes a scene, equivalent to an XML file
scene = mi.load_dict(scene_dict)  # load the scene
img = mi.render(scene)            # render
mi.Bitmap(img).write('cbox.exr')  # save

It can of course also be called from the command line:

~# mitsuba -h
Mitsuba version 3.5.2 (master[29d6537], Linux, 64bit, 64 threads, 8-wide SIMD)
Copyright 2022, Realistic Graphics Lab, EPFL
Enabled processor features: cuda llvm avx f16c sse4.2 x86_64

The PyPI package ships with 4 prebuilt variants:

>>> mi.variants()
['scalar_rgb', 'scalar_spectral', 'cuda_ad_rgb', 'llvm_ad_rgb']
# Note: variant names differ in Mitsuba3: packet --> llvm, gpu --> cuda

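These are enough for most purposes. If you need other variants, you can configure and build from source as with Mitsuba2. That is more work, but one improvement over Mitsuba2 is that the Mitsuba3 repository contains a pyproject.toml file, so it may be possible to install with: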

cd <root of the repository>
pip install .

which would compile and install from source along the way. I tried both a plain cmake .. and pip install ., and both failed, so I gave up.

4.2 Quickstart

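Mitsuba3 is built on a computation library called Dr.Jit, which may come up in more advanced usage later. This section first goes through the Dr.Jit Quickstart and then the Mitsuba3 Quickstart.

Dr.Jit was designed specifically for Mitsuba3, but it can also be used for other computations.

Important: a fair amount of rendering follows, and both scalar and llvm are rather slow (I later found that even with scalar the CPU runs at 100%). So I tried Colab, was assigned a T4 GPU, and the render finished almost instantly. The result, however, was blurry…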

4.2.1 Dr.Jit Quickstart (Similarity with NumPy)

Dr.Jit's syntax closely resembles NumPy's, and the two libraries are interoperable:

import numpy as np
from drjit.llvm import Float, UInt32

# Create some floating-point arrays
a = Float([1.0, 2.0, 3.0, 4.0])
b = Float([4.0, 3.0, 2.0, 1.0])

# Perform simple arithmetic
c = a + 2.0 * b
print(f'c -> ({type(c)}) = {c}')

# Convert to NumPy array
d = np.array(c)
print(f'd -> ({type(d)}) = {d}')

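The line from drjit.llvm import Float, UInt32 is problematic: it runs, but the IDE flags an error while you write code, saying llvm cannot be found; other backends such as cuda behave the same way. The cause may be a problem in the package's .pyi stub files. Changing it to from drjit import llvm makes the error go away; the classes are then accessed as llvm.Float and llvm.UInt32.
Note: I tried editing the .pyi files, and the result was that the code no longer ran at all.

Unlike NumPy, Dr.Jit can also run computations on the GPU, for example by importing the array types from drjit.cuda instead of drjit.llvm.

I won't go into more detail; see the Dr.Jit Quickstart if you need it.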

4.2.2 Mitsuba3 Quickstart

The post-installation test above was already the Mitsuba3 Quickstart; it is repeated here with explanations:

>>> import mitsuba as mi
>>> mi.variants()
['scalar_rgb', 'scalar_spectral', 'cuda_ad_rgb', 'llvm_ad_rgb']

These are the four default variants; pick one of them:

>>> mi.set_variant("scalar_rgb")

Then load a scene file; both XML and dict formats are supported:

scene = mi.load_file("./scenes/cbox.xml")

The scenes can be downloaded here. Render the scene (we can, for example, pass the desired number of samples per pixel (SPP)):

image = mi.render(scene, spp=256)

The rendered result can be displayed with matplotlib:

import matplotlib.pyplot as plt

plt.axis("off")
plt.imshow(image ** (1.0 / 2.2)); # approximate sRGB tonemapping
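The exponent 1.0 / 2.2 above is only an approximation of the sRGB transfer curve, as the code comment says. For reference, the sketch below compares that power law with the piecewise sRGB encoding function on a few linear values. It is plain Python operating on single floats, not on a Mitsuba image, and the function names are my own.

```python
def gamma_22(c):
    """Power-law approximation used in the plotting snippet."""
    return c ** (1.0 / 2.2)


def srgb_encode(c):
    """Piecewise sRGB encoding of a linear value in [0, 1]."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055


# Compare the two curves on a few linear intensities.
for c in (0.001, 0.01, 0.18, 0.5, 1.0):
    print(f"linear={c:<6} gamma2.2={gamma_22(c):.4f}  srgb={srgb_encode(c):.4f}")
```

The two curves agree closely over most of the range and differ mainly in very dark regions, which is why the shortcut is usually fine for a quick preview.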

The image can be saved in various formats:

mi.util.write_bitmap("my_first_render.png", image)
mi.util.write_bitmap("my_first_render.exr", image)