Apple researches diffusion model generalization; Hugging Face details Stable Diffusion tuning
By PulseAugur Editorial·
Summary by gemini-2.5-flash-lite
from 152 sources
Apple's research paper explores the mechanisms behind compositional generalization in conditional diffusion models, specifically focusing on how they handle combinations of conditions not seen during training. The study validates that models exhibiting local conditional scores are better at generalizing, and that enforcing this locality can improve performance. Separately, Hugging Face has released several blog posts detailing various methods for fine-tuning and optimizing Stable Diffusion models, including techniques like DDPO, LoRA, and optimizations for Intel CPUs, as well as instruction-tuning and Japanese language support.
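To make the LoRA workflow from the Hugging Face posts concrete, here is a minimal inference sketch assuming the `diffusers` library conventions; the base checkpoint id and the LoRA weights path are placeholders, not artifacts from the posts themselves.

```python
# Minimal sketch: load a Stable Diffusion pipeline, attach LoRA weights, generate.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # base model; substitute your own
    torch_dtype=torch.float16,
).to("cuda")

# Attach LoRA weights produced by a fine-tuning run (placeholder path).
pipe.load_lora_weights("path/to/lora_weights")

image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("lighthouse.png")
```

Because LoRA touches only small adapter matrices, the same base pipeline can swap between many styles by loading different weight files.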
AI
Conditional diffusion models appear capable of compositional generalization, i.e., generating convincing samples for out-of-distribution combinations of conditioners, but the mechanisms underlying this ability remain unclear. To make this concrete, we study length generalization,…
Drifting Models have emerged as a new paradigm for one-step generative modeling, achieving strong image quality without iterative inference. The premise is to replace the iterative denoising process in diffusion models with a single evaluation of a generator. However, this create…
Diffusion Language Models (DLMs) have recently emerged as a promising alternative to autoregressive language models, offering stronger global awareness and highly parallel generation. However, post-training DLMs with standard Negative Evidence Lower Bound (NELBO)-based supervised…
Diffusion models have achieved remarkable success, yet their training remains inefficient due to a severe optimization bottleneck, which we term Representation Degradation. As noise levels increase, the outputs of the trained model exhibit progressive structural distortion, which…
We propose kernel-gradient drifting, a one-step generative modeling framework that replaces the fixed Euclidean displacement direction in drifting models with directions induced by the kernel itself. Standard drifting is attractive because it enables fast, high-quality generation…
Pretrained diffusion models provide powerful learned priors, but in scientific sampling the target distribution often depends on physical context that is not fully represented by one generative model. We introduce Generative Gibbs for Physics-Aware Sampling (GG-PA), a training-fr…
Diffusion large language models (dLLMs) offer a promising route to parallel and efficient text generation, but improving their reasoning ability requires effective post-training. Reinforcement learning with verifiable rewards (RLVR) is a natural choice for this purpose, yet its a…
Erasing specific concepts from text-to-image diffusion models is essential for avoiding the generation of copyrighted and explicit content. Closed-form concept erasure methods offer a fast alternative to backpropagation-based techniques, but they become less effective when scalin…
Masked diffusion language models enable parallel token generation and offer improved decoding efficiency over autoregressive models. However, their performance degrades significantly when generating multiple tokens simultaneously, due to a mismatch between token-level training ob…
Diffusion models perform remarkably well on high-dimensional data such as images, often using only a modest number of reverse-time steps. Despite this practical success, existing convergence theory does not fully explain why such samplers remain efficient in high dimensions. Many…
Latent diffusion models offer an attractive alternative to discrete diffusion for non-autoregressive text generation by operating on continuous text representations and denoising entire sequences in parallel. The major challenge in latent diffusion modeling is constructing a suit…
Classifier-Free Guidance (CFG) is a widely used mechanism for controlling diffusion-based generative models, yet its guidance scale is typically treated as a fixed hyperparameter throughout generation. This static design yields a suboptimal controllability and quality tradeoff, a…
Inference-time controllable generation is essential for real-world applications of unconditional diffusion models. However, most existing techniques focus on individual samples, struggling in applications that require the sample population to follow specific attribute distributio…
arXiv cs.LG
TIER_1·Sankarshana Venugopal (Seoul National University), Mohammad Mostafavi (Seoul National University), Jonghyun Choi (Seoul National University)·
arXiv:2605.05889v1 Announce Type: cross Abstract: Diffusion-based image-to-image (I2I) translation excels in high-fidelity generation but suffers from slow sampling in state-of-the-art Diffusion Bridge Models (DBMs), often requiring dozens of function evaluations (NFEs). We intro…
arXiv:2605.05387v1 Announce Type: new Abstract: We study zero-shot conditional sampling with pretrained diffusion models for linear inverse problems, including inpainting and super-resolution. In these problems, the observation determines only part of the unknown signal. The rema…
arXiv:2605.06077v1 Announce Type: new Abstract: This position paper argues that understanding generalization in diffusion models requires fundamentally new theoretical frameworks that go beyond both classical statistical learning theory and the benign overfitting paradigm develop…
arXiv:2605.06507v1 Announce Type: cross Abstract: Reinforcement learning fine-tuning has become the dominant approach for aligning diffusion models with human preferences. However, assessing images is intrinsically a multi-dimensional task, and multiple evaluation criteria need t…
arXiv:2605.06169v1 Announce Type: new Abstract: Scaling Diffusion Transformers (DiTs) to hundreds of layers introduces a structural vulnerability: networks can enter a silent, mean-dominated collapse state that homogenizes token representations and suppresses centered variation. …
arXiv cs.LG
TIER_1·Benjamin Sterling, Yousef El-Laham, Mónica F. Bugallo·
arXiv:2509.14225v3 Announce Type: replace Abstract: Recent advances in generative artificial intelligence applications have raised new data security concerns. This paper focuses on defending diffusion models against membership inference attacks. This type of attack occurs when th…
arXiv:2602.00175v2 Announce Type: replace Abstract: Text-to-image diffusion models (DMs) are frequently abused to produce harmful or copyrighted content, violating public interests. Concept erasure (unlearning) is a promising paradigm to alleviate this issue. However, there exist…
arXiv cs.LG
TIER_1·Tongda Xu, Mingwei He, Shady Abu-Hussein, José Miguel Hernández-Lobato, Chunhang Zheng, Kai Zhao, Chao Zhou, Ya-Qin Zhang, Yan Wang·
arXiv:2603.05630v2 Announce Type: replace-cross Abstract: It is well known that the reconstruction FID (rFID) of a VAE is poorly correlated with the generation FID (gFID) of a latent diffusion model. We propose interpolated FID (iFID), a simple variant of rFID that exhibits a str…
arXiv:2605.06548v1 Announce Type: new Abstract: Large language models have achieved remarkable success under the autoregressive paradigm, yet high-quality text generation need not be tied to a fixed left-to-right order. Existing alternatives still struggle to jointly achieve gene…
arXiv:2605.06553v1 Announce Type: new Abstract: We present EDDY (Exact-marginal Diversification via Divergence-free dYnamics), a guidance mechanism for diffusion and flow matching models that promotes diversity among samples generated while maintaining quality. EDDY exploits symm…
arXiv:2605.06172v1 Announce Type: cross Abstract: Many normalizing flow architectures impose regularity constraints, yet their distributional approximation properties are not fully characterized. We study the expressivity of bi-Lipschitz normalizing flows through the lens of scor…
arXiv cs.LG
TIER_1·Eugenio Lomurno, Filippo Balzarini, Francesco Benelle, Francesca Pia Panaccione, Matteo Matteucci·
arXiv:2605.06261v1 Announce Type: new Abstract: Diffusion-based generators set the current state of the art for synthetic tabular data. These methods approach but rarely exceed real-data utility, and closing this synthetic-real gap has so far been pursued exclusively at training …
arXiv cs.LG
TIER_1·Alexander Conzelmann, Albert Catalan-Tatjer, Shiwei Liu·
arXiv:2605.06366v1 Announce Type: new Abstract: Diffusion language models (DLMs) have recently emerged as competitive alternatives to autoregressive (AR) language models, yet differences in their activation dynamics remain poorly understood. We characterize these dynamics in LLaD…
arXiv cs.LG
TIER_1·Matias G. Delgadino, Sebastien Motsch, Advait Parulekar, William Porteous, Sanjay Shakkottai·
arXiv:2605.06538v1 Announce Type: new Abstract: Diffusion-based posterior samplers use pretrained diffusion priors to sample from measurement- or reward-conditioned posteriors, and are widely used for inverse problems. Yet their theoretical behavior remains poorly understood: eve…
We present EDDY (Exact-marginal Diversification via Divergence-free dYnamics), a guidance mechanism for diffusion and flow matching models that promotes diversity among samples generated while maintaining quality. EDDY exploits symmetries of the Fokker-Planck equation, using drif…
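For intuition, the marginal-preservation identity that divergence-free guidance of this kind relies on follows from the continuity equation alone (a standard fact; EDDY's specific drift construction is not reproduced here). If samples follow a velocity field $v_t$ with marginals $p_t$, any extra drift $u_t$ satisfying $\nabla\cdot(p_t u_t) = 0$ leaves the marginals exactly unchanged:

$$\partial_t p_t + \nabla\cdot\big(p_t\,(v_t + u_t)\big) \;=\; \partial_t p_t + \nabla\cdot(p_t v_t) + \underbrace{\nabla\cdot(p_t u_t)}_{=\,0} \;=\; 0.$$

The extra drift can therefore push individual samples apart without biasing the distribution they collectively follow.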
Diffusion-based posterior samplers use pretrained diffusion priors to sample from measurement- or reward-conditioned posteriors, and are widely used for inverse problems. Yet their theoretical behavior remains poorly understood: even with exact prior scores, their outputs are bia…
arXiv:2603.16368v2 Announce Type: replace-cross Abstract: Striking a balance between efficiency and transparent motion is a core challenge in human-robot collaboration, as highly expressive movements often incur unnecessary time and energy costs. In collaborative environments, le…
arXiv:2605.04215v1 Announce Type: new Abstract: Diffusion-based Large Language Models (D-LLMs) represent a promising frontier in generative AI, offering fully parallel token generation that can lead to significant throughput advantages and superior GPU utilization over traditiona…
arXiv cs.LG
TIER_1·Arthur Gretton, Li Kevin Wenliang, Alexandre Galashov, James Thornton, Valentin De Bortoli, Arnaud Doucet·
arXiv:2605.05118v1 Announce Type: new Abstract: Recently, Deng et al. (2026) proposed Generative Modeling via Drifting (GMD), a novel framework for generative tasks. This note presents an analysis of GMD through the lens of Wasserstein Gradient Flows (WGF), i.e., the path of stee…
arXiv:2605.05024v1 Announce Type: cross Abstract: Hypergraphs model higher-order interactions, but realistic hypergraph generation remains difficult because incidence, hyperedge-size heterogeneity, and overlap structure are not faithfully captured by pairwise reductions. We propo…
arXiv:2605.05206v1 Announce Type: cross Abstract: We study outlier tokens in Diffusion Transformers (DiTs) for image generation. Prior work has shown that Vision Transformers (ViTs) can produce a small number of high-norm tokens that attract disproportionate attention while carry…
arXiv:2510.08431v3 Announce Type: replace-cross Abstract: Although continuous-time consistency models (e.g., sCM, MeanFlow) are theoretically principled and empirically powerful for fast academic-scale diffusion, their applicability to large-scale text-to-image and video tasks rema…
arXiv cs.LG
TIER_1·Riccardo de Lutio, Tobias Fischer, Yen-Yu Chang, Yuxuan Zhang, Jay Zhangjie Wu, Xuanchi Ren, Tianchang Shen, Katarina Tothova, Zan Gojcic, Haithem Turki·
arXiv:2603.00492v2 Announce Type: replace-cross Abstract: Per-scene optimization methods such as 3D Gaussian Splatting provide state-of-the-art novel view synthesis quality but extrapolate poorly to under-observed areas. Methods that leverage generative priors to correct artifact…
arXiv:2601.19312v2 Announce Type: replace Abstract: The Schrodinger Bridge and Bass (SBB) formulation, which jointly controls drift and volatility, is an established extension of the classical Schrodinger Bridge (SB). Building on this framework, we introduce LightSBB-M, an algori…
arXiv cs.LG
TIER_1·James Rowbottom, Elizabeth L. Baker, Nick Huang, Ben Adcock, Carola-Bibiane Schönlieb, Alexander Denker·
arXiv:2605.03497v1 Announce Type: new Abstract: Score-based diffusion models in infinite-dimensional function spaces provide a mathematically principled framework for modelling function-valued data, offering key advantages such as resolution invariance and the ability to handle i…
arXiv:2605.02973v1 Announce Type: new Abstract: Modality translation is inherently under-constrained, as multiple cross-modal mappings may yield the same marginals. Recent work has shown that diffusion bridges are effective for this task. However, most existing approaches rely on…
arXiv cs.LG
TIER_1·Francisco M. Castro-Macías, Pablo Morales-Álvarez, Saifuddin Syed, Daniel Hernández-Lobato, Rafael Molina, José Miguel Hernández-Lobato·
arXiv:2605.04013v1 Announce Type: cross Abstract: Sampling from unnormalized multimodal distributions with limited density evaluations remains a fundamental challenge in machine learning and natural sciences. Successful approaches construct a bridge between a tractable reference …
arXiv cs.LG
TIER_1·Andreas Makris, Paul Fearnhead, Chris Nemeth·
arXiv:2605.03712v1 Announce Type: cross Abstract: Training-free conditional diffusion provides a flexible alternative to task-specific conditional model training, but existing samplers often allocate computation inefficiently: independent guided trajectories can vary widely in qu…
arXiv cs.LG
TIER_1·Aaron Havens, Brian Karrer, Neta Shaul·
arXiv:2605.03984v1 Announce Type: new Abstract: Sampling from unnormalized densities is analogous to the generative modeling problem, but the target distribution is defined by a known energy function instead of data samples. Because evaluating the energy function is often costly,…
arXiv:2507.04330v2 Announce Type: replace-cross Abstract: We consider the problem of sampling from a probability distribution $\pi$ which admits a density w.r.t. a dominating measure. It is well known that this can be written as an optimisation problem over the space of probabili…
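The optimisation view referenced here is the standard one: the target $\pi$ is the unique minimiser of the KL divergence over probability measures,

$$\pi \;=\; \operatorname*{arg\,min}_{\mu}\ \mathrm{KL}(\mu\,\|\,\pi), \qquad \mathrm{KL}(\mu\,\|\,\pi) = \int \log\frac{\mathrm{d}\mu}{\mathrm{d}\pi}\,\mathrm{d}\mu \;\ge\; 0,$$

with equality iff $\mu = \pi$, so sampling algorithms can be derived as descent schemes on this functional.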
Sampling from unnormalized multimodal distributions with limited density evaluations remains a fundamental challenge in machine learning and natural sciences. Successful approaches construct a bridge between a tractable reference and the target distribution. Parallel Tempering (P…
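For context, the textbook Parallel Tempering swap rule (a standard fact, not specific to this paper): chains at inverse temperatures $\beta_i$ and $\beta_j$ exchange states $x_i, x_j$ with acceptance probability

$$A \;=\; \min\Big\{1,\ \exp\big[(\beta_i - \beta_j)\,\big(E(x_i) - E(x_j)\big)\big]\Big\},$$

which leaves the joint product distribution invariant while letting hot, flattened chains ferry mass between modes of the cold target.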
Sampling from unnormalized densities is analogous to the generative modeling problem, but the target distribution is defined by a known energy function instead of data samples. Because evaluating the energy function is often costly, a primary challenge is to learn an efficient sa…
Dataset distillation enables efficient training by distilling the information of large-scale datasets into significantly smaller synthetic datasets. Diffusion-based paradigms have emerged in recent years, offering novel perspectives for dataset distillation. However, they typical…
Training-free conditional diffusion provides a flexible alternative to task-specific conditional model training, but existing samplers often allocate computation inefficiently: independent guided trajectories can vary widely in quality, and additional function evaluations along a…
Score-based diffusion models in infinite-dimensional function spaces provide a mathematically principled framework for modelling function-valued data, offering key advantages such as resolution invariance and the ability to handle irregular discretisations. However, practical imp…
arXiv cs.LG
TIER_1·Tongzhen Dang, Weiyang Ding, Michael K. Ng·
arXiv:2605.01691v1 Announce Type: new Abstract: In this paper, we propose Complex Diffusion Maps (CDM), a novel diffusion mapping framework that aims to reveal the dominant complex harmonics of high-dimensional data. Inspired by the local Gaussian kernel relevant to the heat equa…
arXiv cs.LG
TIER_1·Phil Sidney Ostheimer, Mayank Nagda, Andriy Balinskyy, Gabriel Vicente Rodrigues, Jean Radig, Carl Herrmann, Stephan Mandt, Marius Kloft, Sophie Fellenz·
arXiv:2605.01817v1 Announce Type: new Abstract: Diffusion models (DMs) excel on dense continuous data, but are not designed for sparse continuous data. They do not model exact zeros that represent the deliberate absence of a signal. As a result, they erase sparsity patterns and p…
arXiv:2605.00161v1 Announce Type: new Abstract: Diffusion language models (DLMs) are an attractive alternative to autoregressive models because they promise sublinear-time, parallel generation, yet practical gains remain elusive as high-quality samples still demand hundreds of re…
arXiv cs.LG
TIER_1·Carles Domingo-Enrich, Yuanqi Du, Michael S. Albergo·
arXiv:2605.00229v1 Announce Type: cross Abstract: We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base density; a formulation that subsumes both sampling from unnormalized densities a…
arXiv:2509.20098v2 Announce Type: replace Abstract: Learning physical dynamics from data is a fundamental challenge in machine learning and scientific modeling. Real-world observational data are inherently incomplete and irregularly sampled, posing significant challenges for exis…
arXiv cs.AI
TIER_1·Yonggan Fu, Lexington Whalen, Zhifan Ye, Xin Dong, Shizhe Diao, Jingyu Liu, Chengyue Wu, Hao Zhang, Enze Xie, Song Han, Maksim Khadkevich, Jan Kautz, Yingyan Celine Lin, Pavlo Molchanov·
arXiv:2512.14067v2 Announce Type: replace-cross Abstract: Diffusion language models (dLMs) have emerged as a promising paradigm that enables parallel, non-autoregressive generation, but their learning efficiency lags behind that of autoregressive (AR) language models when trained…
arXiv cs.AI
TIER_1·Michael Cardei, Huu Binh Ta, Ferdinando Fioretto·
arXiv:2604.26985v1 Announce Type: cross Abstract: Masked diffusion models (MDMs) generate discrete sequences by iterative denoising under an absorbing masking process. In standard masked diffusion, if a token remains masked after a reverse update, the model discards its clean-sta…
arXiv:2604.27443v1 Announce Type: cross Abstract: Generating continuous-time, continuous-space stochastic processes (e.g., videos, weather forecasts) conditioned on partial observations (e.g., first and last frames) is a fundamental challenge. Existing approaches, (e.g., diffusio…
arXiv:2510.18165v3 Announce Type: replace-cross Abstract: Diffusion language models (DLMs) are emerging as a compelling alternative to the dominant autoregressive paradigm, offering inherent advantages in parallel generation and bidirectional context modeling. However, for the ta…
arXiv cs.LG
TIER_1·Hyukjun Lim, Soojung Yang, Lucas Pinède, Miguel Steiner, Yuanqi Du, Rafael Gómez-Bombarelli·
arXiv:2603.25980v2 Announce Type: replace-cross Abstract: Transition states, the first-order saddle points on the potential energy surfaces, govern the kinetics and mechanisms of chemical reactions and conformational changes. Locating them is challenging because transition pathwa…
arXiv:2604.24832v1 Announce Type: new Abstract: Masked diffusion language models (MDMs) have recently emerged as a promising alternative to standard autoregressive large language models (AR-LLMs), yet their optimization can be substantially less stable. We study blockwise MDMs an…
arXiv:2603.20645v2 Announce Type: replace Abstract: Diffusion models have become a leading framework in generative modeling, yet their theoretical understanding, especially for high-dimensional data concentrated on low-dimensional structures, remains incomplete. This paper in…
arXiv:2604.18471v2 Announce Type: replace Abstract: Discrete diffusion language models (dLLMs) have recently emerged as a promising alternative to traditional autoregressive approaches, offering the flexibility to generate tokens in arbitrary orders and the potential of parallel …
arXiv:2604.22901v1 Announce Type: new Abstract: Diffusion models achieve remarkable success in time series generation. However, slow inference limits their practical deployment. We propose E$^2$-CRF (Error-Feedback Event-Driven Cumulative Residual Feature caching) to accelerate f…
arXiv cs.LG
TIER_1·Yiming Zhang, Sitong Liu, Ke Li, Zhihong Wu, Alex Cloninger, Melvin Leok·
arXiv:2604.24238v1 Announce Type: new Abstract: Diffusion models are a leading paradigm for data generation, but training-free editing typically re-runs the full denoising trajectory for every edit strength, making iterative refinement expensive. To address this issue, we instead…
arXiv:2604.23806v1 Announce Type: new Abstract: The reverse process in score-based diffusion models is formally equivalent to overdamped Langevin dynamics in a time-dependent energy landscape. In our prior work we showed that a bilinearly-coupled analog substrate can physically r…
arXiv:2604.24357v1 Announce Type: new Abstract: Diffusion language models generate without a fixed left-to-right order, making token ordering a central algorithmic choice: which tokens should be revealed, retained, revised or verified at each step? Existing systems mainly use ran…
arXiv:2505.16024v2 Announce Type: replace Abstract: Diffusion trajectory distillation accelerates sampling by training a student model to approximate the multi-step denoising trajectories of a pretrained teacher model using far fewer steps. Despite strong empirical results, the t…
arXiv:2603.19670v3 Announce Type: replace Abstract: Nonasymptotic diffusion analyses often decompose sampling error into score estimation, continuous reverse-time propagation, discretization, and terminal conversion. We isolate the propagation module on certified scalar-isotropic…
Diffusion language models generate without a fixed left-to-right order, making token ordering a central algorithmic choice: which tokens should be revealed, retained, revised or verified at each step? Existing systems mainly use random masking or confidence-driven ordering. Rando…
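As a toy illustration of confidence-driven ordering (the interface below is a hypothetical stand-in, not any paper's API), each step fills in the k masked positions where the current prediction is most confident:

```python
import numpy as np

def decode_step(logits, tokens, mask_id, k=4):
    """Unmask the k masked positions where the model is most confident.

    logits: [seq_len, vocab] array of current predictions (hypothetical model output).
    tokens: [seq_len] int array; masked slots hold mask_id.
    """
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    conf = probs.max(axis=-1)              # per-position confidence
    pred = probs.argmax(axis=-1)           # per-position argmax token
    masked = np.where(tokens == mask_id)[0]
    if masked.size == 0:
        return tokens                      # fully decoded
    top = masked[np.argsort(conf[masked])[::-1][:k]]
    out = tokens.copy()
    out[top] = pred[top]                   # reveal only the most confident slots
    return out
```

Random masking would instead draw `top` uniformly from `masked`; the abstracts above study what better orderings buy beyond these two baselines.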
Diffusion models are a leading paradigm for data generation, but training-free editing typically re-runs the full denoising trajectory for every edit strength, making iterative refinement expensive. To address this issue, we instead edit near the data manifold, where small local …
arXiv:2603.20092v4 Announce Type: replace Abstract: Diffusion models generate structure by progressively transforming noise into data, yet the mechanisms underlying this transition remain poorly understood. In this work, we show that pattern formation in trained diffusion models …
arXiv:2605.11311v1 Announce Type: cross Abstract: Diffusion models typically generate image batches from independent Gaussian initial noises. We argue that this independence assumption is only one choice within a broader class of valid joint noise designs. Instead, one can specif…
arXiv stat.ML
TIER_1·Hao Chen, Renzheng Zhang, Scott S. Howard·
arXiv:2511.17038v3 Announce Type: replace-cross Abstract: From a Bayesian perspective, score-based diffusion solves inverse problems through joint inference, embedding the likelihood with the prior to guide the sampling process. However, this formulation fails to explain its prac…
arXiv:2506.23619v2 Announce Type: replace-cross Abstract: This paper investigates the impact of posterior drift on out-of-sample forecasting accuracy in overparametrized machine learning models. We document the loss in performance when the loadings of the data generating process …
Unlearning specific concepts in text-to-image diffusion models has become increasingly important for preventing undesirable content generation. Among prior approaches, sparse autoencoder (SAE)-based methods have attracted attention due to their ability to suppress target concepts…
Class imbalance is a persistent challenge in visual recognition, particularly in safety-critical domains where collecting positive examples is expensive and rare events are inherently underrepresented. We propose a lightweight synthetic data augmentation pipeline that fine-tunes …
Diffusion models typically generate image batches from independent Gaussian initial noises. We argue that this independence assumption is only one choice within a broader class of valid joint noise designs. Instead, one can specify a coupling of the initial noises: each noise rem…
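A minimal sketch of one such coupling, assuming nothing beyond NumPy: each sample's noise stays exactly $\mathcal{N}(0, I)$, but a shared component correlates the batch with strength rho.

```python
import numpy as np

def coupled_batch_noise(batch, shape, rho=0.5, seed=0):
    """Batch of initial noises: each marginally N(0, I), pairwise correlation rho."""
    rng = np.random.default_rng(seed)
    z_shared = rng.standard_normal(shape)            # common component
    z_indep = rng.standard_normal((batch, *shape))   # per-sample components
    # Per coordinate, Var = rho + (1 - rho) = 1, so marginals stay standard normal,
    # while any two samples share covariance rho through z_shared.
    return np.sqrt(rho) * z_shared + np.sqrt(1.0 - rho) * z_indep
```

Setting `rho=0` recovers the usual independent initialization; larger values trade batch diversity for coherence across the generated set.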
Low-Rank Adaptation (LoRA) has become a widely used mechanism for customizing diffusion models, enabling users to inject new visual concepts or styles through lightweight parameter updates. However, LoRAs can memorize training images, causing generated outputs to reproduce copyri…
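For reference, the low-rank update that makes LoRA "lightweight" (standard LoRA notation, not this paper's): a frozen weight $W_0$ is adapted as

$$W' \;=\; W_0 + \tfrac{\alpha}{r}\,BA, \qquad B\in\mathbb{R}^{d\times r},\ A\in\mathbb{R}^{r\times k},\ r\ll\min(d,k),$$

so only the small factors $B$ and $A$ are trained, and it is this trainable low-rank residual that can end up memorizing training images.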
arXiv stat.ML
TIER_1·Enrico Ventura, Beatrice Achilli, Luca Ambrogioni, Carlo Lucibello·
arXiv:2602.00716v4 Announce Type: replace Abstract: Classifier-free guidance (CFG) is the de facto standard for conditional sampling in diffusion models, yet it often reduces sample diversity. Using tools from statistical physics, we analyze the emergence of generative distortion…
arXiv stat.ML
TIER_1·Dongqing Li, Geoff K. Nicholls, Shiyi Sun, You Luo·
arXiv:2605.06976v1 Announce Type: new Abstract: Many ranking and agent trace datasets are recorded as linear orders even though their latent structure is only partially ordered. This is especially common in agent and workflow traces, where observed order may reflect arbitrary lin…
arXiv stat.ML
TIER_1·James Matthew Young, Paula Cordero-Encinar, Sebastian Reich, Andrew Duncan, O. Deniz Akyildiz·
arXiv:2601.21951v2 Announce Type: replace Abstract: We develop diffusion-based samplers for target distributions known up to a normalising constant. To this end, we rely on the well-known diffusion path that smoothly interpolates between a simple base distribution and the target,…
arXiv:2605.07625v1 Announce Type: cross Abstract: Denoising diffusion models have evolved into a state-of-the-art method for tasks in various fields, such as denoising and generation of images, text generation, or generation of synthetic data for training of other machine learnin…
arXiv:2605.07818v1 Announce Type: new Abstract: The expectation–maximization (EM) algorithm combines global monotonicity, local linear convergence, and strong practical robustness, but these features are usually analyzed separately. Global descent is nonlinear, whereas local con…
Sampling from score-based diffusion models incurs bias due to both time discretisation and the approximation of the score function. A common strategy for reducing this bias is to apply corrector steps based on the unadjusted Langevin algorithm (ULA) at each noise level within a p…
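The corrector in question is the standard unadjusted Langevin step run at a fixed noise level $t$, with the learned score $s_\theta \approx \nabla_x \log p_t$ in place of the exact one:

$$x_{k+1} \;=\; x_k + \eta\, s_\theta(x_k, t) + \sqrt{2\eta}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I).$$

ULA's lack of a Metropolis accept/reject step is what introduces the step-size bias the corrector analysis has to account for.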
Text-to-image diffusion models can generate visually stunning images, yet controlling what appears and how it appears remains surprisingly difficult, especially when operating solely within the constraints of the text-conditioning space. For example, changing a subject or adjus…
Tokenizers are a crucial component of latent diffusion models, as they define the latent space in which diffusion models operate. However, existing tokenizers are primarily designed to improve reconstruction fidelity or inherit pretrained representations, leaving unclear what kin…
The expectation–maximization (EM) algorithm combines global monotonicity, local linear convergence, and strong practical robustness, but these features are usually analyzed separately. Global descent is nonlinear, whereas local convergence is governed by the spectrum of the line…
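For readers wanting the two features side by side, the standard EM iteration alternates

$$Q(\theta\mid\theta^{(k)}) = \mathbb{E}_{z\sim p(z\mid x,\,\theta^{(k)})}\big[\log p(x, z\mid\theta)\big], \qquad \theta^{(k+1)} = \operatorname*{arg\,max}_{\theta}\ Q(\theta\mid\theta^{(k)}),$$

and Jensen's inequality gives the global monotonicity $\ell(\theta^{(k+1)}) \ge \ell(\theta^{(k)})$ of the log-likelihood, while the local linear rate is governed by the spectrum the abstract refers to.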
Recent video diffusion models (VDMs) synthesize visually convincing clips, yet still drop entities, mis-bind attributes, and weaken the interactions specified in the prompt. Representation-alignment objectives such as VideoREPA and MoAlign improve fine-grained text following by d…
Denoising diffusion models have evolved into a state-of-the-art method for tasks in various fields, such as denoising and generation of images, text generation, or generation of synthetic data for training of other machine learning models. First hitting diffusion models (FHDM) ar…
Many ranking and agent trace datasets are recorded as linear orders even though their latent structure is only partially ordered. This is especially common in agent and workflow traces, where observed order may reflect arbitrary linearization rather than true prerequisites. We in…
Large language models have achieved remarkable success under the autoregressive paradigm, yet high-quality text generation need not be tied to a fixed left-to-right order. Existing alternatives still struggle to jointly achieve generation efficiency, scalable representation learn…
Reinforcement learning fine-tuning has become the dominant approach for aligning diffusion models with human preferences. However, assessing images is intrinsically a multi-dimensional task, and multiple evaluation criteria need to be optimized simultaneously. Existing practice d…
Real-world datasets are inherently heterogeneous, yet how per-class structural differences and sampling imbalance shape the training dynamics of diffusion models (and potentially exacerbate disparities) remains poorly understood. While models typically transition from an initial ph…
Many normalizing flow architectures impose regularity constraints, yet their distributional approximation properties are not fully characterized. We study the expressivity of bi-Lipschitz normalizing flows through the lens of score-based diffusion models. For the probability flow…
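The probability-flow construction invoked here is the standard one from score-based diffusion: the deterministic ODE

$$\frac{\mathrm{d}x_t}{\mathrm{d}t} \;=\; f(x_t, t) - \tfrac{1}{2}\,g(t)^2\,\nabla_x \log p_t(x_t)$$

shares its marginals $p_t$ with the forward SDE $\mathrm{d}x = f\,\mathrm{d}t + g\,\mathrm{d}W$, which is what lets the regularity of a flow be analyzed through properties of the score.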
arXiv cs.CV
TIER_1·Bartlomiej Sobieski, Matthew Tivnan, Dawid Płudowski, Michał Jan Włodarczyk, Pengfei Jin, Przemyslaw Biecek, Quanzheng Li·
arXiv:2605.05026v1 Announce Type: new Abstract: Diffusion models are prone to generating structural hallucinations - samples that match the statistical properties of the training data yet defy underlying structural rules, resulting in anomalies like hands with more than five fing…
arXiv:2605.04412v1 Announce Type: new Abstract: 3D asset generation plays a pivotal role in fields such as gaming and virtual reality, enabling the rapid synthesis of high-fidelity 3D objects from a single or multiple images. Building on this capability, enabling style-controllab…
We study outlier tokens in Diffusion Transformers (DiTs) for image generation. Prior work has shown that Vision Transformers (ViTs) can produce a small number of high-norm tokens that attract disproportionate attention while carrying limited local information, but their role in g…
Recently, Deng et al. (2026) proposed Generative Modeling via Drifting (GMD), a novel framework for generative tasks. This note presents an analysis of GMD through the lens of Wasserstein Gradient Flows (WGF), i.e., the path of steepest descent for a functional in the space of pr…
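For orientation, the WGF of the KL functional is the classical example: with $\mathcal{F}(\mu) = \mathrm{KL}(\mu\,\|\,\pi)$, the steepest-descent path

$$\partial_t \mu_t \;=\; \nabla\cdot\Big(\mu_t\,\nabla\frac{\delta\mathcal{F}}{\delta\mu}(\mu_t)\Big) \;=\; \nabla\cdot\big(\mu_t\,\nabla\log(\mu_t/\pi)\big)$$

is exactly the Fokker-Planck equation of Langevin dynamics targeting $\pi$, the template against which a framework like GMD can be compared.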
Diffusion models are prone to generating structural hallucinations - samples that match the statistical properties of the training data yet defy underlying structural rules, resulting in anomalies like hands with more than five fingers. Recent research studied this failure mode f…
Hypergraphs model higher-order interactions, but realistic hypergraph generation remains difficult because incidence, hyperedge-size heterogeneity, and overlap structure are not faithfully captured by pairwise reductions. We propose HEDGE, a generative model defined directly on …
arXiv:2605.03317v1 Announce Type: new Abstract: Representation alignment has recently emerged as an effective paradigm for accelerating Diffusion Transformer training. Despite their success, existing alignment methods typically impose a fixed supervision target or a fixed alignme…
arXiv:2605.03877v1 Announce Type: new Abstract: Dataset distillation enables efficient training by distilling the information of large-scale datasets into significantly smaller synthetic datasets. Diffusion-based paradigms have emerged in recent years, offering novel perspectives…
arXiv:2605.00935v1 Announce Type: cross Abstract: Diffusion models have become the foundation of modern generative systems, with most research focusing primarily on improving generation efficiency and output quality. The timestep embedding component is a crucial part of the diffu…
arXiv:2605.01653v1 Announce Type: new Abstract: We introduce SteeringDiffusion, a bottlenecked activation-level control interface for diffusion models that exposes a smooth, monotonic, and runtime-adjustable control surface over the content--style trade-off. Our method keeps the …
arXiv:2605.01107v1 Announce Type: cross Abstract: Neural networks transform data through learned representations whose geometry affects separation, contraction, and generalization. Recent work studies this geometry using discrete curvature on neighborhood graphs, suggesting Ricci…
arXiv:2510.23633v2 Announce Type: replace-cross Abstract: Pretrained diffusion models have demonstrated strong capabilities in zero-shot inverse problem solving by incorporating observation information into the generation process of the diffusion models. However, this presents an…
Representation alignment has recently emerged as an effective paradigm for accelerating Diffusion Transformer training. Despite their success, existing alignment methods typically impose a fixed supervision target or a fixed alignment granularity throughout the entire denoising t…
arXiv cs.CV
TIER_1·Anne Harrington, A. Sophia Koepke, Shyamgopal Karthik, Trevor Darrell, Alexei A. Efros·
arXiv:2601.00090v2 Announce Type: replace Abstract: Contemporary text-to-image models exhibit a surprising degree of mode collapse, as can be seen when sampling several images given the same text prompt. Previous work has attempted to address this issue by steering the model usin…
arXiv:2406.14429v3 Announce Type: replace-cross Abstract: In the landscape of generative artificial intelligence, diffusion-based models have emerged as a promising method for generating synthetic images. However, the application of diffusion models poses numerous challenges, par…
arXiv:2511.07756v4 Announce Type: replace Abstract: Diffusion models initialize generation from an isotropic Gaussian latent, yet changing only the random seed can substantially alter prompt faithfulness, composition, and visual quality. We explain this gap by distinguishing the …
arXiv cs.CV
TIER_1·Saeed Mohseni-Sehdeh, Walid Saad, Kei Sakaguchi, Tao Yu·
arXiv:2507.18654v2 Announce Type: replace-cross Abstract: Diffusion models are powerful tools for sampling from high-dimensional distributions by progressively transforming pure noise into structured data through a denoising process. When equipped with a guidance mechanism, these…
Neural networks transform data through learned representations whose geometry affects separation, contraction, and generalization. Recent work studies this geometry using discrete curvature on neighborhood graphs, suggesting Ricci-flow-like behavior across layers. We develop a sm…
We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base density; a formulation that subsumes both sampling from unnormalized densities and reward fine-tuning of pre-trained models. This …
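The tilted target being referenced has the standard form

$$p_\lambda(x) \;\propto\; p_{\mathrm{base}}(x)\,\exp\!\big(r(x)/\lambda\big),$$

so taking $p_{\mathrm{base}}$ to be a pretrained model's density and $r$ a reward recovers reward fine-tuning, while taking $p_{\mathrm{base}}$ flat and $r(x) = -E(x)$ recovers sampling from the unnormalized density $e^{-E(x)/\lambda}$.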
arXiv:2604.26365v1 Announce Type: new Abstract: To address the high sampling cost of Diffusion Transformers (DiTs), feature caching offers a training-free acceleration method. However, existing methods rely on hand-crafted forecasting formulas that fail under aggressive skipping.…
arXiv cs.CV
TIER_1·Yang Yang, Feifan Meng, Han Fang, Weiming Zhang·
arXiv:2604.26348v1 Announce Type: new Abstract: Diffusion models have achieved remarkable success in image generation, yet their training is predominantly driven by full-reference objectives that enforce pixel-wise similarity to ground-truth images. Such supervision, while effecti…
arXiv:2405.13729v3 Announce Type: replace-cross Abstract: In this paper, we study an under-explored but important factor of diffusion generative models, i.e., the combinatorial complexity. Data samples are generally high-dimensional, and for various structured generation tasks, a…
arXiv cs.CV
TIER_1·Miguel Espinosa, Eva Gmelich Meijling, Valerio Marsocci, Elliot J. Crowley, Mikolaj Czerkawski·
arXiv:2603.03239v2 Announce Type: replace Abstract: Earth observation applications increasingly rely on data from multiple sensors, including optical, radar, elevation, and land-cover. Relationships between modalities are fundamental for data integration but are inherently non-in…
arXiv:2604.26503v1 Announce Type: new Abstract: Diffusion models have achieved remarkable success in synthesizing complex static and temporal visuals, a breakthrough largely driven by Classifier-Free Guidance (CFG). However, despite its pivotal role in aligning generated content …
Diffusion models have achieved remarkable success in synthesizing complex static and temporal visuals, a breakthrough largely driven by Classifier-Free Guidance (CFG). However, despite its pivotal role in aligning generated content with textual prompts, standard CFG relies on a g…
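For reference, the static rule these abstracts take as their starting point is the standard CFG combination of conditional and unconditional noise predictions with a single global scale $w$:

$$\tilde{\epsilon}_\theta(x_t, c) \;=\; \epsilon_\theta(x_t, \varnothing) \;+\; w\,\big(\epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \varnothing)\big),$$

where $w = 1$ recovers the conditional model and $w > 1$ extrapolates toward the condition, with $w$ conventionally held fixed across all timesteps.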
To address the high sampling cost of Diffusion Transformers (DiTs), feature caching offers a training-free acceleration method. However, existing methods rely on hand-crafted forecasting formulas that fail under aggressive skipping. We propose L2P (Learnable Linear Predictor), a …
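To illustrate the contrast the abstract draws (a toy sketch only; L2P's actual learned predictor is not reproduced here), a hand-crafted forecaster extrapolates cached features linearly, while a learnable one fits its extrapolation weights from a short feature history:

```python
import numpy as np

def handcrafted_forecast(f_prev, f_prev2):
    # First-order extrapolation: assume the feature keeps its last velocity.
    return f_prev + (f_prev - f_prev2)

def fit_linear_forecaster(history):
    """Least-squares fit predicting f[t] from (f[t-1], f[t-2]).

    history: [T, d] array of cached features along the denoising trajectory.
    Returns W of shape [2d, d]; predict via np.concatenate([f_prev, f_prev2]) @ W.
    """
    X = np.concatenate([history[1:-1], history[:-2]], axis=1)  # stacked inputs
    Y = history[2:]                                            # targets
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W
```

The fitted variant can capture feature dynamics that a fixed extrapolation formula misses, which is the gap under aggressive step skipping.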
Diffusion models have achieved remarkable success in image generation, yet their training is predominantly driven by full-reference objectives that enforce pixel-wise similarity to ground-truth images. Such supervision, while effective for fidelity, may be insufficient in terms of su…
arXiv:2604.25289v1 Announce Type: cross Abstract: Practically, training diffusion models typically requires explicit time conditioning to guide the network through the denoising sampling process. Especially in deterministic methods like DDIM, the absence of time conditioning lead…
arXiv cs.CV
TIER_1·Nishit Anand, Manan Suri, Christopher Metzler, Dinesh Manocha, Ramani Duraiswami·
arXiv:2604.24877v1 Announce Type: new Abstract: Controlling illumination in images is essential for photography and visual content creation. While closed-source models have demonstrated impressive illumination control, open-source alternatives either require heavy control inputs …
Practically, training diffusion models typically requires explicit time conditioning to guide the network through the denoising sampling process. Especially in deterministic methods like DDIM, the absence of time conditioning leads to significant performance degradation. However,…
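Time conditioning enters the sampler explicitly: the deterministic DDIM update evaluates $\epsilon_\theta(x_t, t)$ at every step,

$$\hat{x}_0 = \frac{x_t - \sqrt{1-\bar\alpha_t}\;\epsilon_\theta(x_t, t)}{\sqrt{\bar\alpha_t}}, \qquad x_{t-1} = \sqrt{\bar\alpha_{t-1}}\,\hat{x}_0 + \sqrt{1-\bar\alpha_{t-1}}\;\epsilon_\theta(x_t, t),$$

so dropping the $t$ argument perturbs the entire trajectory, which is the degradation the abstract describes.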
arXiv stat.ML
TIER_1·Fan Chen, Sinho Chewi, Constantinos Daskalakis, Alexander Rakhlin·
arXiv:2602.01338v2 Announce Type: replace-cross Abstract: We present algorithms for diffusion model sampling which obtain $\delta$-error in $\mathrm{polylog}(1/\delta)$ steps, given access to $\widetilde O(\delta)$-accurate score estimates in $L^2$. This is an exponential improve…
arXiv:2604.23536v1 Announce Type: new Abstract: Diffusion models have achieved unprecedented success in text-aligned generation, largely driven by Classifier-Free Guidance (CFG). However, standard CFG operates strictly on instantaneous gradients, omitting the intrinsic curvature …
arXiv:2604.23552v1 Announce Type: cross Abstract: Diffusion models are central to modern generative modeling, and understanding how they balance memorization and generalization is critical for reliable deployment. Recent work has shown that memorization in diffusion models is sha…
arXiv:2602.04749v3 Announce Type: replace Abstract: Long-tailed class imbalance remains a fundamental obstacle in semantic segmentation of high-resolution remote-sensing imagery, where dominant classes shape learned representations and rare classes are systematically under-segmen…
arXiv cs.CV
TIER_1·Zhongjie Duan, Hong Zhang, Yingda Chen·
arXiv:2604.24351v1 Announce Type: cross Abstract: Controllable diffusion methods have substantially expanded the practical utility of diffusion models, but they are typically developed as isolated, backbone-specific systems with incompatible training pipelines, parameter formats,…
Controlling illumination in images is essential for photography and visual content creation. While closed-source models have demonstrated impressive illumination control, open-source alternatives either require heavy control inputs like depth maps or do not release their data and…
Controllable diffusion methods have substantially expanded the practical utility of diffusion models, but they are typically developed as isolated, backbone-specific systems with incompatible training pipelines, parameter formats, and runtime hooks. This fragmentation makes it di…
arXiv:2604.22379v1 Announce Type: new Abstract: Recent advances in distilling expensive diffusion models into efficient few-step generators show significant promise. However, these methods typically demand substantial computational resources and extended training periods, limitin…
arXiv cs.CV
TIER_1·Mingxing Rao, Bowen Qu, Daniel Moyer·
arXiv:2509.25003v2 Announce Type: replace-cross Abstract: Membership inference attacks (MIAs) against Diffusion Models (DMs) raise pressing privacy concerns by revealing whether a sample was part of the training set. While existing methods typically rely on measuring reconstructi…
Diffusion models are central to modern generative modeling, and understanding how they balance memorization and generalization is critical for reliable deployment. Recent work has shown that memorization in diffusion models is shaped by training dynamics, with generalization and …
Recent advances in distilling expensive diffusion models into efficient few-step generators show significant promise. However, these methods typically demand substantial computational resources and extended training periods, limiting accessibility for resource-constrained researc…
Diffusion-based generative models have transformed generative AI, and have enabled new capabilities in the science domain, for example, generating 3D structures of molecules. Due to the intrinsic problem structure of certain tasks, there is often a symmetry in the system, which iden…
Selecting an appropriate kernel is a central challenge in kernel-based spectral methods. In \emph{Kernelized Diffusion Maps} (KDM), the kernel determines the accuracy of the RKHS estimator of a diffusion-type operator and hence the quality and stability of the recovered eigenfunc…
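For concreteness, here is classical diffusion maps with the Gaussian kernel, the construction KDM builds on (the paper's RKHS estimator is not reproduced); the bandwidth `eps` is exactly the kernel choice at issue:

```python
import numpy as np

def diffusion_map(X, eps, n_coords=2):
    """Classical diffusion maps: Gaussian affinities -> Markov matrix -> spectrum."""
    # Pairwise squared distances and Gaussian affinities.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / eps)
    # Row-normalize to a Markov transition matrix.
    P = W / W.sum(axis=1, keepdims=True)
    # Leading nontrivial eigenpairs give the diffusion coordinates
    # (the top eigenvalue 1 with constant eigenvector is skipped).
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    idx = order[1:n_coords + 1]
    return vecs[:, idx].real * vals[idx].real
```

A bandwidth that is too small fragments the neighborhood graph while one too large blurs the geometry, which is why kernel selection dominates the quality of the recovered eigenfunctions.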
Generative modeling within constrained sets is essential for scientific and engineering applications involving physical, geometric, or safety requirements (e.g., molecular generation, robotics). We present a unified framework for constrained diffusion models on generic nonconvex …
**Stability AI** launched **Stable Diffusion 3 Medium** with models ranging from **450M to 8B parameters**, featuring the MMDiT architecture and T5 text encoder for image text rendering. The community has shown mixed reactions following the departure of key researchers like Emad …
Over 2500 new community members joined following Soumith Chintala's shoutout, highlighting growing interest in SOTA LLM-based summarization. The major highlight is the detailed paper release of **Stable Diffusion 3 (SD3)**, showcasing advanced text-in-image control and complex …
Hacker News — AI stories ≥50 points
TIER_1·benanne·
The new stable diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on twitter, reddit, discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episo…