[Survey] Diffusion Models: A Comprehensive Survey of Methods and Applications

2024/11/14 21:45:00

Paper: https://arxiv.org/abs/2209.00796

GitHub: https://github.com/YangLing0818/Diffusion-Models-Papers-Survey-Taxonomy

Contents

Diffusion Models: A Comprehensive Survey of Methods and Applications

Algorithm Taxonomy

1. Efficient Sampling

1.1 Learning-Free Sampling

1.1.1 SDE Solver

1.1.2 ODE Solver

1.2 Learning-Based Sampling

1.2.1 Optimized Discretization

1.2.2 Knowledge Distillation

1.2.3 Truncated Diffusion

2. Improved Likelihood

2.1 Noise Schedule Optimization

2.2 Reverse Variance Learning

2.3 Exact Likelihood Computation

3. Data with Special Structures

3.1 Data with Manifold Structures

3.1.1 Known Manifolds

3.1.2 Learned Manifolds

3.2 Data with Invariant Structures

3.3 Discrete Data

Application Taxonomy

1. Computer Vision

2. Natural Language Processing

3. Temporal Data Modeling

4. Multi-Modal Learning

5. Robust Learning

6. Molecular Graph Modeling

7. Material Design

8. Medical Image Reconstruction

Connections with Other Generative Models

1. Variational Autoencoder

2. Generative Adversarial Network

3. Normalizing Flow

4. Autoregressive Models

5. Energy-Based Models


Algorithm Taxonomy

1. Efficient Sampling

1.1 Learning-Free Sampling

1.1.1 SDE Solver

Score-Based Generative Modeling through Stochastic Differential Equations

Adversarial score matching and improved sampling for image generation

Come-closer-diffuse-faster: Accelerating conditional diffusion models for inverse problems through stochastic contraction

Score-Based Generative Modeling with Critically-Damped Langevin Diffusion

Gotta Go Fast When Generating Data with Score-Based Models

Elucidating the Design Space of Diffusion-Based Generative Models

Generative modeling by estimating gradients of the data distribution

1.1.2 ODE Solver

Denoising Diffusion Implicit Models

gDDIM: Generalized denoising diffusion implicit models

Elucidating the Design Space of Diffusion-Based Generative Models

DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps

Pseudo Numerical Methods for Diffusion Models on Manifolds

Fast Sampling of Diffusion Models with Exponential Integrator

Poisson flow generative models

1.2 Learning-Based Sampling

1.2.1 Optimized Discretization

Learning to Efficiently Sample from Diffusion Probabilistic Models

GENIE: Higher-Order Denoising Diffusion Solvers

Learning fast samplers for diffusion models by differentiating through sample quality

1.2.2 Knowledge Distillation

Progressive Distillation for Fast Sampling of Diffusion Models

Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed

1.2.3 Truncated Diffusion

Accelerating Diffusion Models via Early Stop of the Diffusion Process

Truncated Diffusion Probabilistic Models

2. Improved Likelihood

2.1 Noise Schedule Optimization

Improved denoising diffusion probabilistic models

Variational diffusion models

2.2 Reverse Variance Learning

Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models

Improved denoising diffusion probabilistic models

Stable Target Field for Reduced Variance Score Estimation in Diffusion Models

2.3 Exact Likelihood Computation

Score-Based Generative Modeling through Stochastic Differential Equations

Maximum likelihood training of score-based diffusion models

A variational perspective on diffusion-based generative models and score matching

Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching

Maximum Likelihood Training of Implicit Nonlinear Diffusion Models

3. Data with Special Structures

3.1 Data with Manifold Structures

3.1.1 Known Manifolds

Riemannian Score-Based Generative Modeling

Riemannian Diffusion Models

3.1.2 Learned Manifolds

Score-based generative modeling in latent space

Diffusion priors in variational autoencoders

Hierarchical text-conditional image generation with clip latents

High-resolution image synthesis with latent diffusion models

3.2 Data with Invariant Structures

GeoDiff: A Geometric Diffusion Model for Molecular Conformation Generation

Permutation invariant graph generation via score-based generative modeling

Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations

DiGress: Discrete Denoising diffusion for graph generation

Learning gradient fields for molecular conformation generation

GraphGDP: Generative diffusion processes for permutation invariant graph generation

SwinGNN: Rethinking Permutation Invariance in Diffusion Models for Graph Generation

3.3 Discrete Data

Vector quantized diffusion model for text-to-image synthesis

Structured Denoising Diffusion Models in Discrete State-Spaces

Vector Quantized Diffusion Model with CodeUnet for Text-to-Sign Pose Sequences Generation

Deep Unsupervised Learning using Nonequilibrium Thermodynamics

A Continuous Time Framework for Discrete Denoising Models

Application Taxonomy

1. Computer Vision

  • Conditional Image Generation (Image Super Resolution, Inpainting, Translation, Manipulation)

    • SRDiff: Single Image Super-Resolution with Diffusion Probabilistic Models

    • Image Super-Resolution via Iterative Refinement

    • High-Resolution Image Synthesis with Latent Diffusion Models

    • Repaint: Inpainting using denoising diffusion probabilistic models.

    • Palette: Image-to-image diffusion models.

    • Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models

    • Cascaded Diffusion Models for High Fidelity Image Generation.

    • Conditional image generation with score-based diffusion models

    • Unsupervised Medical Image Translation with Adversarial Diffusion Models

    • Score-based diffusion models for accelerated MRI

    • Solving Inverse Problems in Medical Imaging with Score-Based Generative Models

    • MR Image Denoising and Super-Resolution Using Regularized Reverse Diffusion

    • SDEdit: Guided image synthesis and editing with stochastic differential equations

    • Soft diffusion: Score matching for general corruptions

    • Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training

    • ControlNet: Adding Conditional Control to Text-to-Image Diffusion Models

    • Image Restoration with Mean-Reverting Stochastic Differential Equations

  • Semantic Segmentation
    • Label-Efficient Semantic Segmentation with Diffusion Models.
    • Decoder Denoising Pretraining for Semantic Segmentation.
    • Diffusion models as plug-and-play priors

  • Video Generation
    • Flexible Diffusion Modeling of Long Videos
    • Video diffusion models
    • Diffusion probabilistic modeling for video generation
    • MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model.

  • 3D Generation
    • 3d shape generation and completion through point-voxel diffusion
    • Diffusion probabilistic models for 3d point cloud generation
    • A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion
    • Let us Build Bridges: Understanding and Extending Diffusion Generative Models.
    • LION: Latent Point Diffusion Models for 3D Shape Generation
    • Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
    • Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
    • RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
    • HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images
    • Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
    • DiffRF: Rendering-Guided 3D Radiance Field Diffusion
    • DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models
    • 3D Neural Field Generation using Triplane Diffusion

  • Anomaly Detection
    • AnoDDPM: Anomaly Detection With Denoising Diffusion Probabilistic Models Using Simplex Noise
    • Remote Sensing Change Detection (Segmentation) using Denoising Diffusion Probabilistic Models.

  • Object Detection
    • DiffusionDet: Diffusion Model for Object Detection

2. Natural Language Processing

  • Structured denoising diffusion models in discrete state-spaces
  • Diffusion-LM Improves Controllable Text Generation.
  • Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
  • DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

3. Temporal Data Modeling

  • Time Series Imputation
    • CSDI: Conditional score-based diffusion models for probabilistic time series imputation
    • Diffusion-based Time Series Imputation and Forecasting with Structured State Space Models
    • Neural Markov Controlled SDE: Stochastic Optimization for Continuous-Time Data

  • Time Series Forecasting
    • Autoregressive denoising diffusion models for multivariate probabilistic time series forecasting
    • Diffusion-based Time Series Imputation and Forecasting with Structured State Space Models

  • Waveform Signal Processing
    • WaveGrad: Estimating Gradients for Waveform Generation.
    • DiffWave: A Versatile Diffusion Model for Audio Synthesis

4. Multi-Modal Learning

  • Text-to-Image Generation
    • Blended diffusion for text-driven editing of natural images
    • Hierarchical Text-Conditional Image Generation with CLIP Latents
    • Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
    • GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
    • Vector quantized diffusion model for text-to-image synthesis.
    • Frido: Feature Pyramid Diffusion for Complex Image Synthesis.
    • DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
    • Imagic: Text-Based Real Image Editing with Diffusion Models
    • UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image
    • DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation
    • One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
    • TextDiffuser: Diffusion Models as Text Painters

  • Text-to-3D Generation
    • Magic3D: High-Resolution Text-to-3D Content Creation
    • DreamFusion: Text-to-3D using 2D Diffusion
    • Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
    • Shap·E: Generating Conditional 3D Implicit Functions
    • Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
    • Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
    • ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

  • Scene Graph-to-Image Generation
    • Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training

  • Text-to-Audio Generation
    • Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
    • Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
    • Diffsound: Discrete Diffusion Model for Text-to-sound Generation
    • ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation
    • Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models
    • EdiTTS: Score-based Editing for Controllable Text-to-Speech.
    • ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech.
    • Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model

  • Text-to-Motion Generation
    • Human motion diffusion model
    • MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
    • FLAME: Free-form language-based motion synthesis & editing

  • Text-to-Video Generation/Editing
    • Make-A-Video: Text-to-video generation without text-video data
    • Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
    • FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
    • Imagen video: High definition video generation with diffusion models
    • Conditional Image-to-Video Generation with Latent Flow Diffusion Models
    • Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
    • Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
    • Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
    • ControlVideo: Training-free Controllable Text-to-Video Generation

5. Robust Learning

  • Data Purification
    • Diffusion Models for Adversarial Purification
    • Adversarial purification with score-based generative models
    • Threat Model-Agnostic Adversarial Defense using Diffusion Models
    • Guided Diffusion Model for Adversarial Purification
    • Guided Diffusion Model for Adversarial Purification from Random Noise
    • PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition.

  • Generating Synthetic Data for Robust Learning
    • Generating high fidelity data from low-density regions using diffusion models
    • Don’t Play Favorites: Minority Guidance for Diffusion Models
    • Better diffusion models further improve adversarial training

6. Molecular Graph Modeling

  • Torsional Diffusion for Molecular Conformer Generation.
  • Equivariant Diffusion for Molecule Generation in 3D
  • Protein Structure and Sequence Generation with Equivariant Denoising Diffusion Probabilistic Models
  • GeoDiff: A Geometric Diffusion Model for Molecular Conformation Generation
  • Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem
  • Diffusion-based Molecule Generation with Informative Prior Bridge
  • Learning gradient fields for molecular conformation generation
  • Predicting molecular conformation via dynamic graph score matching.
  • DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
  • 3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction
  • Learning Joint 2D & 3D Diffusion Models for Complete Molecule Generation

7. Material Design

  • Crystal Diffusion Variational Autoencoder for Periodic Material Generation
  • Antigen-specific antibody design and optimization with diffusion-based generative models

8. Medical Image Reconstruction

  • Solving Inverse Problems in Medical Imaging with Score-Based Generative Models
  • MR Image Denoising and Super-Resolution Using Regularized Reverse Diffusion
  • Score-based diffusion models for accelerated MRI
  • Towards performant and reliable undersampled MR reconstruction via diffusion model sampling
  • Come-closer-diffuse-faster: Accelerating conditional diffusion models for inverse problems through stochastic contraction

Connections with Other Generative Models

1. Variational Autoencoder

  • Understanding Diffusion Models: A Unified Perspective
  • A variational perspective on diffusion-based generative models and score matching
  • Score-based generative modeling in latent space

2. Generative Adversarial Network

  • Diffusion-GAN: Training GANs with Diffusion.
  • Tackling the generative learning trilemma with denoising diffusion gans

3. Normalizing Flow

  • Diffusion Normalizing Flow
  • Interpreting diffusion score matching using normalizing flow
  • Maximum Likelihood Training of Implicit Nonlinear Diffusion Models

4. Autoregressive Models

  • Autoregressive Diffusion Models.
  • Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting.

5. Energy-Based Models

  • Learning Energy-Based Models by Diffusion Recovery Likelihood
  • Latent Diffusion Energy-Based Model for Interpretable Text Modeling
