[Paper Reading] [Close Reading] 3D Gaussian Splatting for Real-Time Radiance Field Rendering

news2025/1/21 6:28:06

Table of Contents

    • 1. What:
    • 2. Why:
    • 3. How:
      • 3.1 Real-time rendering
      • 3.2 Adaptive Control of Gaussians
      • 3.3 Differentiable 3D Gaussian splatting
    • 4. Self-thoughts

1. What:

What does this paper set out to do? (From the abstract and conclusion; try to summarize it in one sentence.)

To achieve both efficiency and quality, the paper starts from the sparse points produced during camera calibration (SfM) and represents the scene with 3D Gaussians, which keep the desirable properties of continuous volumetric radiance fields while avoiding computation in empty space. It then optimizes the Gaussians, including their anisotropic covariance, to obtain an accurate representation. Finally, it introduces a fast, visibility-aware rendering algorithm, reaching state-of-the-art quality with real-time rendering.

2. Why:

Under what conditions or needs was this research proposed (Intro)? What core problems or deficiencies does it address, what have others done, and what are the innovations? (From the Introduction and Related Work.)

This may cover Background, Question, Others, and Innovation:

Three strands of related work explain this.

  1. Traditional reconstruction methods such as SfM and MVS re-project and
    blend the input images into the novel-view camera, and use the
    reconstructed geometry to guide this re-projection (from 2D to 3D).

    Drawback: they cannot completely recover from unreconstructed regions, or from "over-reconstruction", when MVS generates non-existent geometry.

  2. Neural Rendering and Radiance Fields

    Neural rendering is a broad category of techniques that use deep learning for image synthesis, while a radiance field is a specific technique within neural rendering that focuses on representing the light and color of a scene in 3D space.

  • Deep learning was first applied mainly on top of MVS-based geometry, and this reliance on MVS geometry is also the major drawback of those methods.

  • NeRF follows the volumetric-representation route and introduced positional encoding and importance sampling.

  • Faster training methods focus on spatial data structures that store (neural) features, which are subsequently interpolated during volumetric ray-marching, as well as on different encodings and MLP capacity.

  • Notable recent works include InstantNGP and Plenoxels, both of which rely on Spherical Harmonics.

    Think of Spherical Harmonics as a set of basis functions defined on the sphere, used here to fit a direction-dependent quantity such as view-dependent color.

    Introduction to Spherical Harmonics - Zhihu (zhihu.com)

  3. Point-Based Rendering and Radiance Fields
  • Methods from human performance capture inspired the choice of 3D Gaussians as the scene representation.
  • Point-based and sphere-based rendering had already been achieved in earlier work.

3. How:

[Figure: overview of the paper's pipeline]

Following the gradient flow in the paper's pipeline, we try to connect Sections 4, 5, and 6 of the paper.

First, start from the loss function, which combines an $\mathcal{L}_{1}$ loss with a D-SSIM term:

$$\mathcal{L}=(1-\lambda)\mathcal{L}_{1}+\lambda\mathcal{L}_{\mathrm{D-SSIM}}.\tag{1}$$
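As a rough illustration of Eq. (1), here is a minimal NumPy sketch. Several things in it are assumptions rather than the authors' code: D-SSIM is taken as `1 - SSIM`, the SSIM is computed over a single global window (real SSIM averages over local windows), images are assumed to lie in [0, 1], and the function names are hypothetical. The default `lam=0.2` is the value reported in the paper.

```python
import numpy as np

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM over the whole image.

    Simplification: standard SSIM averages this statistic over local windows.
    """
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

def photometric_loss(rendered, target, lam=0.2):
    """Eq. (1): (1 - lam) * L1 + lam * D-SSIM, with D-SSIM assumed to be 1 - SSIM."""
    l1 = np.abs(rendered - target).mean()
    d_ssim = 1.0 - global_ssim(rendered, target)
    return (1.0 - lam) * l1 + lam * d_ssim
```

The L1 term penalizes per-pixel color error, while the D-SSIM term penalizes structural differences; `lam` trades one off against the other.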

The loss relates the ground-truth image to the rendered image, so to carry out the optimization we need to dive into the rendering process. From the related work section, we know that point-based $\alpha$-blending and NeRF-style volumetric rendering share essentially the same image formation model. That is

$$C=\sum_{i=1}^{N}T_{i}(1-\exp(-\sigma_{i}\delta_{i}))c_{i}\quad\mathrm{with}\quad T_{i}=\exp\left(-\sum_{j=1}^{i-1}\sigma_{j}\delta_{j}\right).\tag{2}$$

This paper uses a typical neural point-based approach, which matches (2) once we set $\alpha_{i}=1-\exp(-\sigma_{i}\delta_{i})$, and can be written as:

$$C=\sum_{i\in N}c_{i}\alpha_{i}\prod_{j=1}^{i-1}(1-\alpha_{j}) \tag{3}$$
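To make the equivalence between Eq. (2) and Eq. (3) concrete, the following pure-Python sketch composites one pixel both ways. The function names and the per-sample inputs are illustrative assumptions; samples are assumed to be sorted from near to far.

```python
import math

def composite_front_to_back(colors, alphas):
    """Eq. (3): C = sum_i c_i * alpha_i * prod_{j<i} (1 - alpha_j)."""
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # running product of (1 - alpha_j) for j < i
    for color, alpha in zip(colors, alphas):
        weight = alpha * transmittance
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return pixel

def composite_volumetric(colors, sigmas, deltas):
    """Eq. (2): NeRF-style quadrature with densities sigma_i and step sizes delta_i."""
    pixel = [0.0, 0.0, 0.0]
    optical_depth = 0.0  # running sum of sigma_j * delta_j for j < i
    for color, sigma, delta in zip(colors, sigmas, deltas):
        t_i = math.exp(-optical_depth)
        weight = t_i * (1.0 - math.exp(-sigma * delta))
        for k in range(3):
            pixel[k] += weight * color[k]
        optical_depth += sigma * delta
    return pixel
```

Setting `alphas[i] = 1 - exp(-sigmas[i] * deltas[i])` makes the two functions return the same color up to floating-point error, since each factor `1 - alpha_j` then equals `exp(-sigma_j * delta_j)`.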

From this formulation, we can see that the scene representation must carry a color $c$ and an opacity $\alpha$. These are attached to each Gaussian, with Spherical Harmonics used to represent color, as in Plenoxels. The other attributes are the position and the covariance matrix. So we now have the four attributes that represent the scene: position $p$, opacity $\alpha$, covariance $\Sigma$, and the SH coefficients representing the color $c$ of each Gaussian.
Knowing the basic elements, let's now work backward, starting with rendering, which the authors had addressed in earlier work.
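As an illustration of how SH coefficients can encode view-dependent color, here is a small sketch that evaluates the real SH basis up to degree 1 for a viewing direction. The ordering and sign convention of the degree-1 terms and the function name are assumptions for illustration; implementations differ in these conventions and typically use higher degrees as well.

```python
import math

SH_C0 = 0.5 * math.sqrt(1.0 / math.pi)  # constant (degree-0) basis function
SH_C1 = 0.5 * math.sqrt(3.0 / math.pi)  # magnitude of the degree-1 basis functions

def sh_color(coeffs, direction):
    """Evaluate RGB color from 4 SH coefficients (one RGB triple per basis function).

    coeffs: list of 4 (r, g, b) tuples; direction: viewing direction (x, y, z).
    The sign and ordering convention (-y, z, -x) is an assumption.
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    basis = [SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x]
    return [sum(b * c[k] for b, c in zip(basis, coeffs)) for k in range(3)]
```

With only the degree-0 coefficient set, the color is the same from every direction; the degree-1 coefficients add a smooth variation with the viewing direction.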

3.1 Real-time rendering

This part is independent of how gradients propagate, but it is critical for real-time performance; it builds on the authors' earlier work.

Earlier video games had already tried modeling the world with ellipsoids and rendering them, which is the same idea as the rendering process of Gaussian splatting. The latter, however, relies on many techniques for exploiting GPU threads efficiently.

  • First, it splits the screen into tiles of 16×16 pixels, then culls the 3D Gaussians against the view frustum and each tile, keeping only Gaussians whose 99% confidence interval intersects the view frustum.
  • Next, it instantiates each Gaussian once per tile it overlaps and assigns each instance a key that combines view-space depth and tile ID.
  • It then sorts the Gaussian instances by these keys using a single fast GPU radix sort.
  • Finally, it launches one thread block per tile; for a given pixel, it accumulates color and $\alpha$ values by traversing the sorted list front-to-back, until the accumulated $\alpha$ saturates (approaches one).
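The key-and-sort idea from the steps above can be sketched on the CPU as follows. This is only an illustration under assumptions: each splat is approximated by a screen-space bounding square, the 64-bit key layout (tile ID in the high 32 bits, depth bits in the low 32 bits) and the function names are hypothetical, and Python's built-in sort stands in for the GPU radix sort. Depths are assumed positive, so their float bit patterns sort in the same order as the values.

```python
import struct

TILE = 16  # tile edge length in pixels

def tiles_touched(center, radius, width, height):
    """Return the IDs of the tiles overlapped by a splat's bounding square."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    cx, cy = center
    x0 = max(0, int((cx - radius) // TILE))
    x1 = min(tiles_x - 1, int((cx + radius) // TILE))
    y0 = max(0, int((cy - radius) // TILE))
    y1 = min(tiles_y - 1, int((cy + radius) // TILE))
    return [ty * tiles_x + tx
            for ty in range(y0, y1 + 1)
            for tx in range(x0, x1 + 1)]

def build_sorted_instances(splats, width, height):
    """splats: list of (center_xy, radius, depth) with depth > 0.

    Returns (key, splat_index) pairs sorted by tile, then by depth within a tile.
    """
    instances = []
    for idx, (center, radius, depth) in enumerate(splats):
        # Reinterpret the float32 depth as an unsigned 32-bit integer.
        depth_bits = struct.unpack("<I", struct.pack("<f", depth))[0]
        for tile_id in tiles_touched(center, radius, width, height):
            instances.append(((tile_id << 32) | depth_bits, idx))
    instances.sort()  # stand-in for the single GPU radix sort
    return instances
```

After sorting, all instances of one tile are contiguous and ordered near-to-far, so each tile can walk its own slice of the list and blend front-to-back.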

3.2 Adaptive Control of Gaussians

While fitting Gaussians to the scene, we adjust the number of Gaussians and the volume they cover to strengthen the scene representation. Two operations are used, clone and split, as shown below.

[Figure: the clone and split operations]

Both operations are triggered by the view-space positional gradients: under-reconstructed and over-reconstructed regions both show large view-space positional gradients. A small Gaussian in an under-reconstructed region is cloned, while a large Gaussian in an over-reconstructed region is split into smaller ones.
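The clone-or-split decision can be sketched as a small rule. The gradient threshold of 0.0002 and the split factor of 1.6 are the values reported in the paper; the scale threshold that separates "small" from "large" Gaussians and the function names are assumptions for illustration.

```python
def densify_action(grad_norm, max_scale, grad_threshold=0.0002, scale_threshold=0.01):
    """Decide what to do with one Gaussian during densification.

    grad_norm: average magnitude of its view-space positional gradient.
    max_scale: its largest scaling component (scale_threshold is an assumed cutoff).
    """
    if grad_norm < grad_threshold:
        return "keep"
    # Large gradient: large Gaussians are split, small ones are cloned.
    return "split" if max_scale > scale_threshold else "clone"

def split_child_scale(scale, phi=1.6):
    """Scale of each child Gaussian after a split: the parent's scale divided by phi."""
    return tuple(s / phi for s in scale)
```

For example, a Gaussian with a large gradient and a tiny extent is cloned so that more primitives cover the under-reconstructed region, while one with a large extent is replaced by smaller children.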

3.3 Differentiable 3D Gaussian splatting

We now know how rendering and the control of Gaussians work. Finally, we discuss how gradients are propagated back to the parameters we can optimize. This mainly concerns how the Gaussian function is handled.

The basic simplified formulation of a 3D Gaussian can be written as:

$$G(x)=e^{-\frac{1}{2}(x)^{T}\Sigma^{-1}(x)}.\tag{4}$$

We use $\alpha$-blending to combine the Gaussians into the rendered image, from which we compute the loss and run the optimization. So we now need to know how to optimize the Gaussians and compute their gradients.

When rasterizing, the 3D scene has to be projected into 2D. The authors want each 3D Gaussian to remain a Gaussian under this projection (otherwise the rasterized result would no longer relate to the Gaussians, and all the effort would be in vain). So we need a way to transform the covariance matrix into camera coordinates that preserves this affine relation. That is

$$\Sigma'=JW\Sigma W^{T}J^{T},\tag{5}$$

where $J$ is the Jacobian of the affine approximation of the projective transformation and $W$ is the viewing transformation.
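A minimal NumPy sketch of Eq. (5) follows. It assumes a pinhole camera with focal lengths `fx` and `fy`, takes `W` as the 3×3 rotation part of the world-to-camera transform, and uses only the first two rows of the Jacobian so that the result is directly the 2×2 image-space covariance. The function name and parameters are illustrative, not taken from the paper's code.

```python
import numpy as np

def project_covariance(cov3d, W, t, fx, fy):
    """Eq. (5): Sigma' = J W Sigma W^T J^T, returned as a 2x2 matrix.

    cov3d: 3x3 covariance in world space.
    W:     3x3 rotation part of the world-to-camera transform.
    t:     Gaussian centre in camera coordinates, with t[2] > 0.
    """
    tx, ty, tz = t
    # Jacobian of the pinhole projection (x, y, z) -> (fx * x / z, fy * y / z) at t.
    J = np.array([
        [fx / tz, 0.0, -fx * tx / tz ** 2],
        [0.0, fy / tz, -fy * ty / tz ** 2],
    ])
    return J @ W @ cov3d @ W.T @ J.T
```

For an isotropic Gaussian with standard deviation 0.1 placed on the optical axis at depth 2 and viewed with a focal length of 100, the projected standard deviation is 100 × 0.1 / 2 = 5 pixels, so the 2×2 covariance is 25 times the identity.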

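Another problem is that a covariance matrix is only meaningful if it is positive semi-definite, which unconstrained gradient updates cannot guarantee. So we build it from a scaling matrix $S$ and a rotation matrix $R$, which ensures this by construction. That is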

$$\Sigma=RSS^{T}R^{T}\tag{6}$$

We can then store a 3D vector $s$ for scaling and a quaternion $q$ for rotation, and the gradients are propagated back to them. This completes the optimization process.
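To see why the factorization in Eq. (6) always yields a valid covariance, the following NumPy sketch builds $\Sigma$ from a quaternion and a scale vector. The `(w, x, y, z)` quaternion ordering and the function names are assumptions for illustration.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix, normalizing it first."""
    q = np.asarray(q, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def build_covariance(q, s):
    """Eq. (6): Sigma = R S S^T R^T with S = diag(s)."""
    R = quat_to_rotmat(q)
    S = np.diag(np.asarray(s, dtype=float))
    return R @ S @ S.T @ R.T
```

Because $\Sigma = (RS)(RS)^{T}$, it is symmetric with eigenvalues equal to the squared scales, so it stays positive semi-definite no matter what values the optimizer assigns to `q` and `s`.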

4. Self-thoughts

  1. Summary of different representations
  • Explicit representation: Mesh, Point Cloud
  • Implicit representation
    • Volumetric representation: NeRF

      The density value returned by the sample points reflects whether there is geometric occupancy here.

    • Surface representation: SDF(Signed Distance Function)

      Outputs the distance to the nearest surface in the space from this point, where a positive value indicates outside the surface, and a negative value indicates inside the surface.
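As a tiny example of the surface representation described above, here is the signed distance function of a sphere. The function name and default parameters are illustrative assumptions.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere: positive outside, negative inside."""
    return math.dist(point, center) - radius
```

A point at distance 2 from the centre of the unit sphere gets the value 1, the centre itself gets -1, and points on the surface get 0.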

