[Paper Reading] [Skim] S-NERF: NEURAL RADIANCE FIELDS FOR STREET VIEWS

news2025/1/11 23:59:14

Table of contents

    • 0. Abstract
    • 1. Introduction
    • 2. Related work
    • 3. Methods-NERF FOR STREET VIEWS
      • 3.1 CAMERA POSE PROCESSING
      • 3.2 REPRESENTATION OF STREET SCENES
      • 3.3 DEPTH SUPERVISION
      • 3.4 Loss function
    • 4. EXPERIMENTS
    • 5. Conclusion
    • Reference

0. Abstract

Problem introduction:

NeRFs are usually trained on object-centric camera views with large overlaps. However, we conjecture that this paradigm does not fit the nature of street views, which are collected by many self-driving cars from large-scale unbounded scenes. Also, the onboard cameras perceive scenes with little overlap.

Solutions:

  • Consider novel view synthesis of both the large-scale background scenes and the foreground moving vehicles jointly.

  • Improve the scene parameterization function and the camera poses.

  • We also use the noisy and sparse LiDAR points to boost the training, and learn a robust geometry and reprojection-based confidence to address the depth outliers.

  • Extend our S-NeRF for reconstructing moving vehicles

Effect:

S-NeRF reduces the mean-squared error by 7∼40% in street-view synthesis and achieves a 45% PSNR gain for moving-vehicle rendering.

1. Introduction

S-NeRF addresses the shortcomings of prior work:

  • MipNeRF-360 (Barron et al., 2022) is designed for training in unbounded scenes. But it still needs enough intersecting camera rays.

  • BlockNeRF (Tancik et al., 2022) proposes a block-combination strategy with refined poses, appearances, and exposure on the MipNeRF (Barron et al., 2021) base model for processing large-scale outdoor scenes. But it needs a special platform to collect the data, and it is hard to apply to existing datasets.

  • Urban-NeRF (Rematas et al., 2022) takes accurate dense LiDAR depth as supervision for the reconstruction of urban scenes. But accurate dense LiDAR is too expensive. S-NeRF only needs noisy, sparse LiDAR signals.

2. Related work

There are many papers in the fields of large-scale NeRF and depth-supervised NeRF.

This paper belongs to both fields.

3. Methods-NERF FOR STREET VIEWS

3.1 CAMERA POSE PROCESSING

SfM, as used in previous NeRFs, fails on street views. Therefore, we propose two different methods to reconstruct the camera poses for the static background and the foreground moving vehicles.

  1. Background scenes

    For the static background, we use the camera parameters obtained by sensor-fusion SLAM and IMU of the self-driving cars (Caesar et al., 2019; Sun et al., 2020) and further reduce the inconsistency between multiple cameras with a learning-based pose refinement network.

  2. Moving vehicles

    We now transform the coordinate system by setting the target object’s center as the coordinate system’s origin.
    $$\hat{P}_i=(P_iP_b^{-1})^{-1}=P_bP_i^{-1},\quad P^{-1}=\begin{bmatrix}R^T&-R^TT\\\mathbf{0}^T&1\end{bmatrix}.$$

    Here, $P=\begin{bmatrix}R&T\\\mathbf{0}^T&1\end{bmatrix}$ represents the old position of the camera or the target object.

    After the transformation, only the camera is moving, which is favorable for training NeRFs.

    This is how the pose (parameter) matrices are converted.
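The object-centric pose conversion above can be sketched in a few lines of NumPy. This is only an illustration of the formula, not the authors' code: the helper names (`make_pose`, `invert_pose`, `object_centric_pose`, `rot_z`) and the example rotation angles and translations are assumptions made here.

```python
import numpy as np

def make_pose(R, T):
    """Build the 4x4 matrix P = [[R, T], [0^T, 1]] from a 3x3 rotation and a 3-vector."""
    P = np.eye(4)
    P[:3, :3] = R
    P[:3, 3] = T
    return P

def invert_pose(P):
    """Closed-form inverse of a rigid pose: [[R^T, -R^T T], [0^T, 1]]."""
    R, T = P[:3, :3], P[:3, 3]
    P_inv = np.eye(4)
    P_inv[:3, :3] = R.T
    P_inv[:3, 3] = -R.T @ T
    return P_inv

def object_centric_pose(P_i, P_b):
    """Return P_hat_i = (P_i P_b^{-1})^{-1} = P_b P_i^{-1}."""
    return P_b @ invert_pose(P_i)

def rot_z(theta):
    """Hypothetical helper: rotation about the z-axis, used only to build example poses."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Example poses (arbitrary values chosen for illustration).
P_i = make_pose(rot_z(0.3), np.array([1.0, 2.0, 0.5]))   # camera
P_b = make_pose(rot_z(-0.7), np.array([4.0, -1.0, 0.0]))  # target object
P_hat = object_centric_pose(P_i, P_b)
```

Using the closed-form inverse avoids a general matrix inversion and keeps the result an exact rigid transform up to floating-point error.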

3.2 REPRESENTATION OF STREET SCENES

  1. Background scenes

    Constrain the whole scene into a bounded range.

    This scene parameterization is taken from Mip-NeRF 360.

  2. Moving Vehicles

    Compute the dense depth maps for the moving cars as an extra supervision

    • We follow GeoSim (Chen et al., 2021b) to reconstruct coarse mesh from multi-view images and the sparse LiDAR points.

    • After that, a differentiable neural renderer (Liu et al., 2019) is used to render the corresponding depth map with the camera parameters (Section 3.1).

    • The backgrounds are masked during the training by an instance segmentation network (Wang et al., 2020).

    These three steps each build on a prior work cited above.
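For the background scenes, the bounded parameterization borrowed from Mip-NeRF 360 can be sketched as follows. The contraction keeps points inside the unit ball unchanged and squashes everything outside into a ball of radius 2. The radius parameter `r` (which rescales the scene before contracting) and the function name `contract` are assumptions for illustration; these notes do not state the exact form used in S-NeRF.

```python
import numpy as np

def contract(x, r=1.0, eps=1e-9):
    """Mip-NeRF 360 style contraction applied to points of shape (..., 3).

    Points with ||x|| <= r map to x / r; farther points map to
    (2 - 1/||x/r||) * (x/r) / ||x/r||, so the output norm is always < 2.
    `r` is an assumed rescaling radius (r = 1 gives the plain contraction).
    """
    x = np.asarray(x, dtype=float) / r
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    safe = np.maximum(norm, eps)  # avoid division by zero at the origin
    outside = (2.0 - 1.0 / safe) * (x / safe)
    return np.where(norm <= 1.0, x, outside)

near = contract([0.5, 0.0, 0.0])   # inside the unit ball: unchanged
far = contract([2.0, 0.0, 0.0])    # outside: pulled toward radius 2
very_far = contract([1e6, 0.0, 0.0])
```

Because distant points are compressed rather than clipped, an unbounded street scene fits in a finite volume while nearby geometry keeps its original scale.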

3.3 DEPTH SUPERVISION


To provide credible depth supervision from defective LiDAR depths, we first propagate the sparse depths and then construct a confidence map to address the depth outliers.

  1. LiDAR depth completion

    Use NLSPN (Park et al., 2020) to propagate the depth information from LiDAR points to surrounding pixels.

  2. Reprojection confidence

    Measure the accuracy of the depths and locate the outliers.

    The warping operation can be represented as:

    $$\mathbf{X}_t=\psi(\psi^{-1}(\mathbf{X}_s,P_s),P_t)$$

    And we use three features to measure the similarity:

    $$\mathcal{C}_{\mathrm{rgb}}=1-|\mathcal{I}_{s}-\hat{\mathcal{I}}_{s}|,\quad\mathcal{C}_{\mathrm{ssim}}=\mathrm{SSIM}(\mathcal{I}_{s},\hat{\mathcal{I}}_{s}),\quad\mathcal{C}_{\mathrm{vgg}}=1-\|\mathcal{F}_{s}-\hat{\mathcal{F}}_{s}\|.$$

  3. Geometry confidence

    Measure the geometry consistency of the depths and flows across different views.

    The depth:

    $$\mathcal{C}_{depth}=\gamma(|d_t-\hat{d}_t|/d_s),\quad\gamma(x)=\begin{cases}0,&\text{if }x\geq\tau,\\1-x/\tau,&\text{otherwise}.\end{cases}$$

    The flow:

    $$\mathcal{C}_{flow}=\gamma\left(\frac{\|\Delta_{x,y}-f_{s\rightarrow t}(x_{s},y_{s})\|}{\|\Delta_{x,y}\|}\right),\quad\Delta_{x,y}=(x_{t}-x_{s},y_{t}-y_{s}).$$

  4. Learnable confidence combination

    The final confidence map can be learned as $\hat{\mathcal{C}}=\sum_{i}\omega_{i}\mathcal{C}_{i}$, where $\sum_i\omega_i=1$ and $i \in \{rgb,ssim,vgg,depth,flow\}$.
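The geometry confidences and their weighted combination can be sketched in NumPy as below. This is a rough illustration under stated assumptions, not the paper's implementation: the function names, the small `eps` added to the flow denominator, and the use of a softmax over free parameters (`logits`) to enforce that the weights sum to 1 are all choices made here.

```python
import numpy as np

def gamma(x, tau):
    """Thresholded linear score: 0 if x >= tau, else 1 - x / tau."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= tau, 0.0, 1.0 - x / tau)

def depth_confidence(d_t, d_t_hat, d_s, tau):
    """C_depth = gamma(|d_t - d_t_hat| / d_s)."""
    return gamma(np.abs(d_t - d_t_hat) / d_s, tau)

def flow_confidence(delta_xy, flow_st, tau, eps=1e-8):
    """C_flow = gamma(||delta - f_{s->t}|| / ||delta||) for arrays of shape (..., 2).

    eps is an assumed safeguard against zero displacement.
    """
    num = np.linalg.norm(delta_xy - flow_st, axis=-1)
    den = np.linalg.norm(delta_xy, axis=-1) + eps
    return gamma(num / den, tau)

def combine_confidences(conf_maps, logits):
    """C_hat = sum_i w_i C_i, with w = softmax(logits) so the weights sum to 1.

    The softmax is an assumption about how the constraint is enforced.
    """
    logits = np.asarray(logits, dtype=float)
    w = np.exp(logits - np.max(logits))
    w = w / w.sum()
    stacked = np.stack(conf_maps, axis=0)      # (K, H, W)
    return np.tensordot(w, stacked, axes=1)    # (H, W)

# Two toy 2x2 confidence maps with equal weights.
c_a = np.ones((2, 2))
c_b = np.zeros((2, 2))
c_hat = combine_confidences([c_a, c_b], logits=[0.0, 0.0])
```

In a real training setup the weights would be optimized together with the network; here they are fixed only to show how the maps are blended.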

3.4 Loss function

The total loss combines an RGB loss, a depth loss, and an edge-aware smoothness constraint that penalizes large variances in depth.
$$\begin{aligned} &\mathcal{L}_{color}=\sum_{\mathbf{r}\in\mathcal{R}}\|I(\mathbf{r})-\hat{I}(\mathbf{r})\|_{2}^{2} \\ &\mathcal{L}_{depth}=\sum\hat{\mathcal{C}}\cdot|\mathcal{D}-\hat{\mathcal{D}}| \\ &\mathcal{L}_{smooth}=\sum|\partial_{x}\hat{\mathcal{D}}|\exp(-|\partial_{x}I|)+|\partial_{y}\hat{\mathcal{D}}|\exp(-|\partial_{y}I|) \\ &\mathcal{L}_{total}=\mathcal{L}_{color}+\lambda_{1}\mathcal{L}_{depth}+\lambda_{2}\mathcal{L}_{smooth} \end{aligned}$$
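The three loss terms can be sketched in NumPy as follows. This is a minimal illustration, assuming the losses are evaluated on an image patch of shape (H, W, 3) with depth maps of shape (H, W), that the partial derivatives are approximated by forward differences, and that the image gradient is averaged over color channels. The function names and the example values of `lambda1` and `lambda2` are hypothetical, not taken from the paper.

```python
import numpy as np

def color_loss(rgb_pred, rgb_gt):
    """L_color: sum of squared differences over all rays (pixels)."""
    return np.sum((rgb_gt - rgb_pred) ** 2)

def depth_loss(depth_pred, depth_gt, conf):
    """L_depth: confidence-weighted L1 difference between depths."""
    return np.sum(conf * np.abs(depth_gt - depth_pred))

def smooth_loss(depth_pred, image):
    """L_smooth: depth gradients, down-weighted where the image has edges."""
    dx_d = np.abs(depth_pred[:, 1:] - depth_pred[:, :-1])
    dy_d = np.abs(depth_pred[1:, :] - depth_pred[:-1, :])
    # Assumption: average the absolute image gradient over color channels.
    dx_i = np.mean(np.abs(image[:, 1:] - image[:, :-1]), axis=-1)
    dy_i = np.mean(np.abs(image[1:, :] - image[:-1, :]), axis=-1)
    return np.sum(dx_d * np.exp(-dx_i)) + np.sum(dy_d * np.exp(-dy_i))

def total_loss(rgb_pred, rgb_gt, depth_pred, depth_gt, conf, lambda1, lambda2):
    """L_total = L_color + lambda1 * L_depth + lambda2 * L_smooth."""
    return (color_loss(rgb_pred, rgb_gt)
            + lambda1 * depth_loss(depth_pred, depth_gt, conf)
            + lambda2 * smooth_loss(depth_pred, rgb_gt))

# Toy 2x2 patch: flat gray image, depth step along x.
image = np.zeros((2, 2, 3))
depth_pred = np.array([[0.0, 1.0], [0.0, 1.0]])
depth_gt = np.array([[0.0, 2.0], [0.0, 2.0]])
conf = np.full((2, 2), 0.5)
loss = total_loss(image, image, depth_pred, depth_gt, conf, lambda1=0.1, lambda2=0.01)
```

The exponential term means a depth discontinuity is penalized less where the image itself has a strong edge, which is what makes the smoothness constraint edge-aware.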

4. EXPERIMENTS

  Dataset: nuScenes and Waymo
  • For the foreground vehicles, we extract car crops from nuScenes and Waymo video sequences.
  • For the large-scale background scenes, we use scenes with 90∼180 images.
  • In each scene, the ego vehicle moves around 10∼40 meters, and the whole scene spans more than 200 m.
  1. Foreground Vehicles

    Separate experiments on static vehicles and moving vehicles, compared with the original NeRF and GeoSim (the latest non-NeRF car reconstruction method).

    There is large room for improvement in PSNR.

  2. Background scenes

    Compared with the state-of-the-art methods Mip-NeRF (Barron et al., 2021), Urban-NeRF (Rematas et al., 2022), and Mip-NeRF 360.
    The paper also shows a 360-degree panorama to emphasize some details.

  3. BACKGROUND AND FOREGROUND FUSION
    Depth-guided placement and inpainting (e.g. GeoSim Chen et al. (2021b)) and joint NeRF rendering (e.g. GIRAFFE Niemeyer & Geiger (2021)) heavily rely on accurate depth maps and 3D geometry information.
    Open question: does the paper use a new fusion method here without introducing it?

  4. Ablation study
    The ablation is split into the RGB loss, the depth confidence, and the smoothness loss.

5. Conclusion

In the future, we will use block merging, as proposed in Block-NeRF (Tancik et al., 2022), to learn a larger city-level neural representation.

Reference

[1] S-NeRF (ziyang-xie.github.io)
