[Paper reading] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

news2025/1/22 19:02:04

Table of contents

  • 1. What
  • 2. Why
    • 2.1 Introduction
    • 2.2 Related work and background
  • 3. How: Multiresolution hash encoding
    • 3.1 Structure
    • 3.2 Input coordinate
    • 3.3 Hash mapping
    • 3.4 Interpolation
    • 3.5 Performance vs. quality
    • 3.6 Hash collision
  • 4. Experiment on NeRF

1. What

To reduce the cost of evaluating a fully connected network, the paper introduces a multiresolution hash table of trainable feature vectors, which permits the use of a smaller network without sacrificing quality.

2. Why

2.1 Introduction

Computer graphics primitives are fundamentally represented by mathematical functions. MLPs used as neural graphics primitives have been shown to capture high-frequency, local detail. What these approaches have in common is an encoding that maps the network's inputs to a higher-dimensional space.

Most successful among these encodings are trainable, task-specific data structures. However, such data structures rely on heuristics and structural modifications (such as pruning, splitting, or merging), limit the method to a specific task, or limit performance on GPUs.

The proposed method uses a hash encoding whose structure is never updated during training and whose lookups take $O(1)$ time. These two properties account for its adaptivity and efficiency.

2.2 Related work and background

  1. Encoding: Frequency encodings built from sine and cosine functions are common. Recently, state-of-the-art results have been achieved by parametric encodings, which blur the line between classical data structures and neural approaches. Grids and trees are the usual data structures for such encodings, but trees incur a greater computational cost.
  2. When a grid is used, a dense grid is wasteful in two ways. First, the number of parameters grows as $O(N^3)$, while the visible surface of interest has an area that grows only as $O(N^2)$. Second, natural scenes exhibit smoothness, which motivates a multiresolution decomposition.

3. How: Multiresolution hash encoding


3.1 Structure

Given a fully connected neural network $m(\mathbf{y}; \Phi)$, we are interested in an encoding of its inputs $\mathbf{y} = \mathrm{enc}(\mathbf{x}; \theta)$. The parameters $\theta$ are trainable, and the encoding is arranged into $L$ levels, each containing up to $T$ feature vectors of dimensionality $F$.

Each level is independent and conceptually stores feature vectors at the vertices of a grid, the resolution of which is chosen to be a geometric progression between the coarsest and finest resolutions $[N_{\min}, N_{\max}]$:

$$N_l := \left\lfloor N_{\min} \cdot b^l \right\rfloor, \qquad b := \exp\left(\frac{\ln N_{\max} - \ln N_{\min}}{L - 1}\right).$$


With the hyperparameters used in the paper, $b \in [1.26, 2]$.
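As a quick check of the formulas above, the following sketch computes the growth factor $b$ and the per-level resolutions $N_l$. The function names are hypothetical, and the hyperparameter values in the usage lines ($L = 16$, $N_{\min} = 16$, $N_{\max}$ between 512 and 524288) are assumed here for illustration; they are not stated in this post.

```python
import math


def growth_factor(n_min: int, n_max: int, num_levels: int) -> float:
    """b = exp((ln N_max - ln N_min) / (L - 1))."""
    return math.exp((math.log(n_max) - math.log(n_min)) / (num_levels - 1))


def level_resolutions(n_min: int, n_max: int, num_levels: int) -> list:
    """N_l = floor(N_min * b^l) for l = 0 .. L-1."""
    b = growth_factor(n_min, n_max, num_levels)
    return [math.floor(n_min * b ** level) for level in range(num_levels)]


if __name__ == "__main__":
    # Assumed hyperparameters (illustrative only).
    for n_max in (512, 524288):
        print(n_max, round(growth_factor(16, n_max, 16), 4))
    print(level_resolutions(16, 512, 16))
```

Because of the floor and floating-point rounding, the finest level can land one below $N_{\max}$; the sketch follows the formula as written and does not correct for that.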

3.2 Input coordinate

Given a normalized input $\mathbf{x} \in \mathbb{R}^d$, it is scaled by the level's grid resolution before rounding down and up:

$$\lfloor \mathbf{x}_l \rfloor := \lfloor \mathbf{x} \cdot N_l \rfloor, \qquad \lceil \mathbf{x}_l \rceil := \lceil \mathbf{x} \cdot N_l \rceil.$$

$\lfloor \mathbf{x}_l \rfloor$ and $\lceil \mathbf{x}_l \rceil$ are integer grid coordinates; together they span a voxel with $2^d$ integer vertices.
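The scaling and rounding step can be sketched as follows. This is a minimal illustration under the definitions above; `voxel_corners` is a hypothetical helper name, and the input is assumed to lie in $[0, 1]^d$.

```python
import math
from itertools import product


def voxel_corners(x, resolution):
    """Scale x by the level resolution and return (lo, hi, corners).

    lo and hi are floor(x * N_l) and ceil(x * N_l); corners lists every
    combination of their per-axis coordinates (2^d vertices in general).
    """
    scaled = [x_i * resolution for x_i in x]
    lo = [math.floor(s) for s in scaled]
    hi = [math.ceil(s) for s in scaled]
    corners = list(product(*zip(lo, hi)))
    return lo, hi, corners


if __name__ == "__main__":
    lo, hi, corners = voxel_corners((0.3, 0.6), 16)
    print(lo, hi, corners)
```

Note that when a scaled coordinate is exactly an integer, floor and ceil coincide and the listed corners contain duplicates. An implementation may prefer to use `floor + 1` for the upper corner; that is a design choice assumed here, not something this post specifies.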

3.3 Hash mapping

Each integer vertex on the grid will correspond to a position in the hash table. It is calculated by:

$$h(\mathbf{x}) = \left( \bigoplus_{i=1}^{d} x_i \pi_i \right) \bmod T$$

where we choose $\pi_1 := 1$, $\pi_2 := 2654435761$, $\pi_3 := 805459861$, and $\bigoplus$ denotes the bit-wise XOR.

After this transformation, each integer vertex is mapped to an integer index in the hash table, and the entry at that index stores an $F$-dimensional feature vector ($F = 2$ here).

For coarse levels where a dense grid requires fewer than $T$ parameters, i.e. $(N_l + 1)^d \leq T$, this mapping is 1:1. At finer levels, we use a hash function $h: \mathbb{Z}^d \to \mathbb{Z}_T$ to index into the array, effectively treating it as a hash table, although there is no explicit collision handling.
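A minimal sketch of this indexing rule is shown below, using the primes $1$, $2654435761$ and $805459861$. Two details are assumptions made for illustration: the products are wrapped to 32 bits to mimic unsigned integer arithmetic, and the dense 1:1 mapping uses a simple row-major layout. The function names are hypothetical.

```python
PRIMES = (1, 2654435761, 805459861)


def spatial_hash(vertex, table_size):
    """XOR the per-axis products x_i * pi_i, then reduce modulo T."""
    h = 0
    for x_i, pi_i in zip(vertex, PRIMES):
        # Assumption: emulate 32-bit unsigned wraparound of the product.
        h ^= (x_i * pi_i) & 0xFFFFFFFF
    return h % table_size


def vertex_index(vertex, resolution, table_size):
    """Dense 1:1 index when (N_l + 1)^d <= T, otherwise the spatial hash."""
    side = resolution + 1
    if side ** len(vertex) <= table_size:
        # Assumption: row-major layout for the dense case.
        index, stride = 0, 1
        for x_i in vertex:
            index += x_i * stride
            stride *= side
        return index
    return spatial_hash(vertex, table_size)


if __name__ == "__main__":
    T = 2 ** 14
    print(vertex_index((1, 2, 3), 16, T))  # coarse level: dense index
    print(vertex_index((1, 2, 3), 64, T))  # fine level: hashed index
```

When $T$ is a power of two no larger than $2^{32}$, the 32-bit wraparound does not change the result of the modulo, so the masking only matters for other table sizes.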

3.4 Interpolation

Lastly, the feature vectors at each corner are $d$-linearly interpolated according to the relative position of $\mathbf{x}$ within its hypercube, i.e. the interpolation weight is $\mathbf{w}_l := \mathbf{x}_l - \lfloor \mathbf{x}_l \rfloor$.

Recall that this process takes place independently for each of the $L$ levels. The interpolated feature vectors of each level, as well as auxiliary inputs $\xi \in \mathbb{R}^E$ (such as the encoded view direction and textures in neural radiance caching), are concatenated to produce $\mathbf{y} \in \mathbb{R}^{LF+E}$, which is the encoded input $\mathrm{enc}(\mathbf{x}; \theta)$ to the MLP $m(\mathbf{y}; \Phi)$.
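Putting the steps together, the sketch below interpolates one level and concatenates all levels with the auxiliary inputs. It is a simplified illustration, not the paper's implementation: every level is hashed (the dense 1:1 case is skipped), the upper corner is taken as `floor + 1`, tables are plain Python lists, and all names are hypothetical.

```python
import math
from itertools import product

PRIMES = (1, 2654435761, 805459861)


def spatial_hash(vertex, table_size):
    h = 0
    for x_i, pi_i in zip(vertex, PRIMES):
        h ^= (x_i * pi_i) & 0xFFFFFFFF  # assumed 32-bit wraparound
    return h % table_size


def interpolate_level(x, resolution, table):
    """d-linear interpolation of the features at the 2^d voxel corners."""
    scaled = [x_i * resolution for x_i in x]
    lo = [math.floor(s) for s in scaled]
    w = [s - l for s, l in zip(scaled, lo)]  # w_l = x_l - floor(x_l)
    out = [0.0] * len(table[0])
    for offset in product((0, 1), repeat=len(x)):
        corner = tuple(l + o for l, o in zip(lo, offset))
        weight = 1.0
        for w_i, o in zip(w, offset):
            weight *= w_i if o else 1.0 - w_i
        feature = table[spatial_hash(corner, len(table))]
        for k, f_k in enumerate(feature):
            out[k] += weight * f_k
    return out


def encode(x, resolutions, tables, aux=()):
    """Concatenate per-level features and auxiliary inputs: length L*F + E."""
    y = []
    for resolution, table in zip(resolutions, tables):
        y.extend(interpolate_level(x, resolution, table))
    y.extend(aux)
    return y


if __name__ == "__main__":
    tables = [[[1.0, 2.0]] * 8, [[3.0, 4.0]] * 8]  # L = 2, T = 8, F = 2
    print(encode((0.3, 0.6, 0.9), [2, 4], tables, aux=(0.5,)))
```

Since the interpolation weights of a voxel sum to one, a table filled with a constant feature vector returns that same vector for any input, which makes a convenient sanity check.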

3.5 Performance vs. quality

The hyperparameters $L$ (number of levels), $F$ (number of feature dimensions), and $T$ (table size) trade off quality and performance.
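To get a feel for how these hyperparameters drive memory use, the sketch below counts the trainable encoding parameters. It assumes that a coarse level stores one feature vector per grid vertex, $(N_l + 1)^d$ in total, and that a finer level stores exactly $T$ of them, so the total is bounded by $L \cdot T \cdot F$. The function name and the example values are hypothetical.

```python
def encoding_param_count(resolutions, table_size, num_features, dim=3):
    """Sum over levels of min((N_l + 1)^d, T) * F."""
    return sum(
        min((n + 1) ** dim, table_size) * num_features
        for n in resolutions
    )


if __name__ == "__main__":
    resolutions = [16, 32]  # assumed example levels
    T, F = 2 ** 14, 2
    count = encoding_param_count(resolutions, T, F)
    upper_bound = len(resolutions) * T * F
    print(count, upper_bound)
```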

3.6 Hash collision

At finer resolutions, disparate points may hash to the same table entry; this is a hash collision.

When training samples collide, their gradients average. Samples rarely have equal importance to the final reconstruction. A point on a visible surface of a radiance field strongly contributes to the image, causing large changes to its table entries. In contrast, a point in empty space referring to the same entry has a smaller weight. Thus, the gradients of more important samples dominate, optimizing the aliased table entry to reflect the needs of the higher-weighted point.

4. Experiment on NeRF

  1. Model Architecture: Informed by the analysis in Figure 10, our results were generated with a 1-hidden-layer density MLP and a 2-hidden-layer color MLP, both 64 neurons wide.


  2. Accelerated ray marching: We concentrate samples near surfaces by maintaining an occupancy grid that coarsely marks empty vs. nonempty space.

    We utilize three techniques with imperceptible error to optimize our implementation:
    (1) exponential stepping for large scenes,
    (2) skipping of empty space and occluded regions, and
    (3) compaction of samples into dense buffers for efficient execution. For more information, please refer to Appendix E of the paper.
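The empty-space skipping idea can be illustrated with a toy ray marcher. This is only a sketch under strong assumptions, not the paper's implementation: the scene is the unit cube, the step size is fixed (no exponential stepping), the occupancy grid is a set of occupied integer cells, and all names are hypothetical.

```python
def march_ray(origin, direction, occupied, grid_res, step, num_steps):
    """Return the ray parameters t of samples that land in occupied cells."""
    kept = []
    for i in range(num_steps):
        t = i * step
        p = [o + t * d for o, d in zip(origin, direction)]
        if not all(0.0 <= c < 1.0 for c in p):
            continue  # outside the unit cube
        cell = tuple(int(c * grid_res) for c in p)
        if cell in occupied:
            kept.append(t)  # only these samples would be sent to the MLP
    return kept


if __name__ == "__main__":
    occupied = {(2, 2, 2)}  # a single non-empty cell in a 4^3 grid
    samples = march_ray((0.02, 0.5, 0.5), (1.0, 0.0, 0.0), occupied, 4, 0.1, 10)
    print(samples)
```

Samples in empty cells are dropped before any network evaluation, which is where the saving comes from; collecting the surviving samples into one list loosely mirrors the compaction step.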
