ICML 2026 Oral

Rex: Algebraically reversible solvers for diffusion models

Modern generative models, the engines behind today's image generators and molecular simulators, turn noise into data by following a smooth trajectory, integrated step by step with a numerical solver. Many tasks (editing a generated image, recovering the noise that produced it, running a simulation backwards in time) require retracing that trajectory in reverse, but ordinary solvers introduce a small discretization error at every step, so the return path drifts away from where it started. Rex is a new family of solvers that are algebraically reversible: the reverse step exactly undoes the forward step, so the full trajectory can be replayed in either direction to near-machine precision. It handles both the deterministic (ODE) and stochastic (SDE) settings used by diffusion and flow models, comes with theoretical guarantees on convergence order and stability, and delivers measurable improvements on Boltzmann sampling, image generation, and image editing.

1 AITHYRA · 2 Clarkson University
*Work was partially completed while a Ph.D. student at Clarkson.
[Figure 1 panels. Top: non-reversible solver, ε > 0. Bottom: Rex, reversible, ε = 0. Each panel runs from noise to conformation and back.]


Figure 1. Every atom of a tri-alanine peptide (Ala–Ala–Ala) is encoded from noise to a folded conformation (forward integration), then decoded back to noise (backward integration). Top: a generic non-reversible solver accumulates truncation error at every step; the round-trip endpoint $\boldsymbol x'_0$ drifts away from $\boldsymbol x_0$. Bottom: Rex retraces the forward chords exactly, and every atom returns to its starting position.
01 — Inversion of diffusion models

Diffusion sampling is a flow; existing solvers cannot reverse it exactly

Diffusion models have quickly become the state of the art across many generation tasks; sampling from them amounts to integrating an ODE or SDE from a simple prior to a complex data distribution. For diffusion models the reverse-time generative process is itself an SDE, and many downstream applications require solving the same equation in the encoding direction. Encoding samples from the data distribution back into the model's underlying prior is referred to as the inversion of the diffusion model, and it requires a bijective map between the two distributions.

Inversion underpins a range of tasks: image editing encodes a real image back to its latent noise and re-decodes under a new prompt; Boltzmann sampling uses the change-of-variables formula to assign exact likelihoods; gradient-based fine-tuning and reward-guided sampling require accurate adjoints. In each of these settings the round trip $\boldsymbol x_0 \!\to\! \boldsymbol x_T \!\to\! \boldsymbol x_0$ must close exactly; an approximate inverse is not sufficient.

The continuous flow itself is a bijection; truncation error from discretization is what breaks the round trip. Several prior works have proposed reversible solvers for the probability flow ODE, namely EDICT, BDIA, and BELM. These schemes, however, suffer from a low order of convergence and a lack of linear stability, among other undesirable properties. To the best of our knowledge, no prior scheme achieves exact inversion for diffusion SDEs without storing the entire Brownian trajectory in memory, and storing it precludes adaptive step sizes and is impractical at scale.

Moreover, reversibility has so far been constructed one solver at a time; no general procedure exists for converting an explicit (S)RK scheme — Euler, midpoint, RK4, Dormand–Prince, Euler–Maruyama — into a reversible solver tailored to the semi-linear structure of the diffusion ODE/SDE. This is the gap Rex closes.

[Figure 2 panels. Top: standard Euler, forward then back, does not quite return ($\varepsilon > 0$); update rule $\boldsymbol x_{n+1} = \boldsymbol x_n + h\,\boldsymbol f(\boldsymbol x_n)$. Bottom: Rex, algebraically reversible; the backward step is the exact inverse of the forward step ($\varepsilon = 0$). Both panels run from $t = 0$ (data) to $t = T$ (noise) through $\boldsymbol x_0, \dots, \boldsymbol x_5$ with step size $h = T/5$.]
Figure 2. Five forward Euler steps $\boldsymbol x_{n+1} = \boldsymbol x_n + h\,\boldsymbol f(\boldsymbol x_n)$ encode a data point to noise; the dotted curve is the true ODE solution and the bold polyline is what Euler computes. Top: the reverse Euler iteration uses different tangents at each step, so the recovered point $\boldsymbol x'_0$ drifts from $\boldsymbol x_0$ and the round trip incurs a nonzero error $\varepsilon > 0$. Bottom: in Rex, each backward step is the closed-form algebraic inverse of the corresponding forward step; the same chords are retraced and the trajectory closes exactly.
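The drift in the top panel of Figure 2 can be reproduced in a few lines. The sketch below is an illustration only: the linear drift $f(x) = -x$, the starting point, and the function names are assumptions chosen for simplicity, not anything taken from the paper.

```python
def f(x):
    # Toy linear drift dx/dt = -x (an assumption for illustration).
    return -x

def euler_forward(x, h, n_steps):
    # n_steps explicit Euler steps with step size +h.
    for _ in range(n_steps):
        x = x + h * f(x)
    return x

def euler_backward(x, h, n_steps):
    # The same explicit Euler rule with step size -h. Each step uses the
    # tangent at the current point, not the tangent used on the way out.
    for _ in range(n_steps):
        x = x - h * f(x)
    return x

x0 = 1.0
h, n = 0.2, 5                      # five steps, as in Figure 2
xT = euler_forward(x0, h, n)       # 0.8**5
x0_rec = euler_backward(xT, h, n)  # 0.8**5 * 1.2**5 = 0.96**5
eps = abs(x0_rec - x0)
print(f"x_T = {xT:.5f}, recovered x_0' = {x0_rec:.5f}, eps = {eps:.5f}")
# prints: x_T = 0.32768, recovered x_0' = 0.81537, eps = 0.18463
```

Each forward step multiplies by $1 - h$ and each backward step by $1 + h$, so one round-trip step contracts by $1 - h^2$ rather than returning to the start.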

Two largely independent research traditions have circled this problem from opposite directions. The first grew up inside the diffusion community itself, retrofitting bijective structure onto specific samplers: EDICT’s coupled latent trick, BDIA’s bidirectional integration, and BELM’s linear-multistep construction each reverse-engineer reversibility into one particular solver. The second comes from the neural differential-equation community, where reversibility has been pursued as a memory-efficiency device for training — from the original Neural ODE adjoint to MALI, reversible Heun, and the recent McCallum–Foster framework. Rex sits at the confluence: it borrows the McCallum–Foster coupling from the neural-DE lineage and adapts it to the exponential-integrator structure that the diffusion lineage relies on, yielding a single construction that subsumes both.

Lineages of reversibility
2018 · Neural ODEs: adjoint method; reverse-time assumed exact.
2020 · Neural SDEs: stochastic adjoint for SDE training.
2021 · Score-SDE / DDPM: continuous-time; no exact discrete inverse.
2021 · MALI: asynchronous leapfrog; ODE-only.
2021 · Reversible Heun: reversible SDE solver; vanishing stability.
2023 · EDICT: coupled-latent trick for DDIM.
2024 · BDIA: bidirectional integration.
2024 · McCallum–Foster: arbitrary-order reversible ODE solver.
2024 · BELM / O-BELM: linear multi-step method.
2026 · Rex: a single construction that turns any explicit (S)RK scheme into a reversible solver tailored to the diffusion ODE/SDE, subsuming both lineages.

The construction itself is the subject of the next section. We first present the explicit (S)RK family that Rex operates on, then the exponential-integrator step that absorbs the semi-linear drift, and finally the McCallum–Foster coupling that gives the round trip a closed-form inverse.

02 — The Rex construction

Constructing reversible solvers from arbitrary explicit (S)RK schemes

We propose Rex, a Reversible Exponential (Stochastic) Runge–Kutta family of solvers for diffusion models. Given any explicit (S)RK scheme $\boldsymbol\Phi$, Rex constructs an algebraically reversible solver tailored to the semi-linear structure of the diffusion ODE/SDE. The construction proceeds in two stages: an exponential-integrator step which absorbs the linear drift, followed by a McCallum–Foster-style coupling which yields a closed-form inverse.

The Rex construction

  1. Φ: explicit (S)RK scheme. An explicit (stochastic) Runge–Kutta scheme: Euler, midpoint, RK4, Dormand–Prince, Euler–Maruyama, ShARK.
  2. Ψ: exponential integrator (Princeps). Apply Lawson methods to fold $\boldsymbol\Phi$ around the semi-linear drift of the reparameterized diffusion equation. We call this intermediate family Princeps; it already subsumes DDIM, DPM-Solver, gDDIM, and SEEDS-1 as special cases.
  3. Υ: algebraically reversible solver (Rex). Wrap $\boldsymbol\Psi$ in a McCallum–Foster coupling with an auxiliary state $\hat{\boldsymbol x}$. The forward and backward steps are closed-form inverses of one another.
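To give a feel for what the Lawson step in stage 2 buys, here is a minimal sketch on a generic semi-linear ODE $\dot x = a\,x + g(x)$ with a constant coefficient $a$. It is not the Princeps scheme itself: the constant `A`, the nonlinearity `g`, and the function names are hypothetical, and the noise-schedule-dependent reparameterization of the diffusion equation is not modeled. Lawson–Euler applies plain Euler to the transformed variable $y = e^{-a t} x$, so the linear part is integrated exactly.

```python
import math

A = -50.0  # stiff linear coefficient (hypothetical value)

def g(x):
    # Nonlinear remainder of the drift (toy choice).
    return math.sin(x)

def euler_step(x, h):
    # Plain explicit Euler on the full drift a*x + g(x).
    return x + h * (A * x + g(x))

def lawson_euler_step(x, h):
    # Euler applied to y = exp(-A t) x, then mapped back to x:
    # the linear term is handled exactly by the exponential factor.
    return math.exp(A * h) * (x + h * g(x))

def integrate(step, x0, h, n_steps):
    x = x0
    for _ in range(n_steps):
        x = step(x, h)
    return x

h, n = 0.1, 10                     # A*h = -5, outside Euler's stability region
x_euler = integrate(euler_step, 1.0, h, n)
x_lawson = integrate(lawson_euler_step, 1.0, h, n)
print(f"Euler: {x_euler:.3e}   Lawson-Euler: {x_lawson:.3e}")
```

With this step size plain Euler amplifies the state by roughly a factor of four per step, while the Lawson variant decays as the true solution does.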

The coupling in step $\boldsymbol\Upsilon$ follows McCallum & Foster (2024), which is, to our knowledge, the only prior algebraically reversible ODE solver with a non-trivial region of stability and arbitrarily high convergence order. (McCallum & Foster refer to their scheme simply as reversible X, where X is the underlying single-step solver; we follow the paper's convention and call it the McCallum–Foster method.) Earlier algebraically reversible solvers, namely the asynchronous leapfrog and reversible Heun methods, were either restricted to ODEs or possessed a vanishing region of stability. Rex generalizes the McCallum–Foster construction to exponential integrators and to stochastic differential equations.

Algebraic reversibility

Every numerical scheme is reversible in the analytic sense: one can recover the previous step by fixed-point iteration on the update equation, provided the step size $h$ is small enough for the iteration to converge. This is, however, both expensive and only approximate. A scheme is algebraically reversible if the inverse map admits a closed-form expression. Rex maintains an auxiliary state $\hat{\boldsymbol x}_n$ alongside $\boldsymbol x_n$ precisely so that the inverse map can be written in closed form.
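As a rough illustration of analytic (as opposed to algebraic) reversibility, the sketch below inverts a single explicit Euler step by fixed-point iteration. The drift `f`, the step size, and all function names are assumptions made for the example; in a diffusion model every iteration would cost one further network evaluation.

```python
import math

def f(x):
    # Toy nonlinear drift (an assumption); its Lipschitz constant is at most 1.5.
    return -x + 0.5 * math.sin(x)

def euler_step(x, h):
    return x + h * f(x)

def invert_by_fixed_point(x_next, h, n_iters):
    # Solve x_next = y + h*f(y) for y by iterating y <- x_next - h*f(y).
    # The iteration contracts when h times the Lipschitz constant is below 1.
    y = x_next
    for _ in range(n_iters):
        y = x_next - h * f(y)
    return y

x_n, h = 1.0, 0.1
x_next = euler_step(x_n, h)

errs = {}
for k in (1, 5, 50):
    errs[k] = abs(invert_by_fixed_point(x_next, h, k) - x_n)
    print(f"{k:2d} iterations: |error| = {errs[k]:.3e}")
```

The error shrinks geometrically with the iteration count but is only approximate for any finite budget, which is the cost a closed-form inverse avoids.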

Forward / Backward step (data prediction, ODE form shown)

Forward step
$\boldsymbol x_{n+1} = \tfrac{\kappa_{n+1}}{\kappa_n}\!\big(\zeta\,\boldsymbol x_n + (1{-}\zeta)\,\hat{\boldsymbol x}_n\big) + \kappa_{n+1}\,\boldsymbol\Psi_h(\varsigma_n,\hat{\boldsymbol x}_n,\boldsymbol W_n)$

$\hat{\boldsymbol x}_{n+1} = \tfrac{\kappa_{n+1}}{\kappa_n}\hat{\boldsymbol x}_n - \kappa_{n+1}\,\boldsymbol\Psi_{-h}(\varsigma_{n+1},\boldsymbol x_{n+1},\boldsymbol W_n)$
Backward step (closed form)
$\hat{\boldsymbol x}_n = \tfrac{\kappa_n}{\kappa_{n+1}}\hat{\boldsymbol x}_{n+1} + \kappa_n\,\boldsymbol\Psi_{-h}(\varsigma_{n+1},\boldsymbol x_{n+1},\boldsymbol W_n)$

$\boldsymbol x_n = \tfrac{\kappa_n}{\kappa_{n+1}}\zeta^{-1}\boldsymbol x_{n+1} + (1{-}\zeta^{-1})\hat{\boldsymbol x}_n - \kappa_n\zeta^{-1}\boldsymbol\Psi_h(\varsigma_n,\hat{\boldsymbol x}_n,\boldsymbol W_n)$
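The forward and backward updates above can be sketched in a simplified setting. The code below is a sketch under stated assumptions, not the authors' implementation: it sets $\kappa_n \equiv 1$, drops the noise argument $\boldsymbol W_n$ (ODE case), and uses a plain Euler increment $h\,f(x)$ as a stand-in for $\boldsymbol\Psi_h$. The drift `f` and all function names are hypothetical.

```python
import math

def f(x):
    # Toy drift (an assumption for illustration).
    return -x + 0.5 * math.sin(x)

def psi(h, x):
    # Stand-in for Psi_h: a plain Euler increment.
    return h * f(x)

def forward_step(x, x_hat, h, zeta=1.0):
    x_new = zeta * x + (1.0 - zeta) * x_hat + psi(h, x_hat)
    x_hat_new = x_hat - psi(-h, x_new)
    return x_new, x_hat_new

def backward_step(x_new, x_hat_new, h, zeta=1.0):
    # Closed-form inverse: undo the two forward updates in reverse order.
    x_hat = x_hat_new + psi(-h, x_new)
    x = (x_new - (1.0 - zeta) * x_hat - psi(h, x_hat)) / zeta
    return x, x_hat

h, n = 0.05, 20
x0 = x_hat0 = 1.0

x, x_hat = x0, x_hat0
for _ in range(n):
    x, x_hat = forward_step(x, x_hat, h)
for _ in range(n):
    x, x_hat = backward_step(x, x_hat, h)

print(f"round-trip error in x:     {abs(x - x0):.3e}")
print(f"round-trip error in x_hat: {abs(x_hat - x_hat0):.3e}")
```

The backward step evaluates `psi` at exactly the same arguments as the forward step, so no iteration is needed and the residual is floating-point round-off only.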

The coupling parameter $\zeta \in (0,1]$ controls a trade-off between linear stability and exact inversion: $\zeta = 1$ recovers exact algebraic reversibility, whereas $\zeta < 1$ introduces a small mixing of the auxiliary state which enlarges the linear stability region at the cost of a residual inversion error. We use $\zeta = 1$ throughout the empirical results. The weights $\{\kappa_n\}$ and reparameterized time variable $\varsigma$ are determined by the noise schedule $(\alpha_t, \sigma_t)$; specializing these recovers the data-prediction, noise-prediction, ODE, and SDE cases under a single scheme.

Reversibility for SDEs poses an additional challenge: a naïve construction stores the entire realization of the Brownian motion in memory to replay it in reverse (the approach taken by CycleDiffusion and similar reversible-by-storage methods), which prohibits adaptive step sizes. Rex avoids this by fixing a single seed $\omega \in \Omega$ and reconstructing any sub-interval of the Brownian trajectory on demand through a splittable PRNG, following the Brownian interval construction of Kidger et al. The resulting scheme supports adaptive step-size SDE solvers while remaining exactly reversible.
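The idea of regenerating Brownian increments from a single seed can be sketched as follows. This is not the Brownian interval data structure referenced above; it is a simpler bisection scheme in the style of a virtual Brownian tree (Lévy's midpoint construction), and hashing `(seed, depth, index)` into a string-seeded `random.Random` is a stand-in for a splittable PRNG. All names and the default `depth` are assumptions.

```python
import math
import random

def _gauss(seed, depth, index):
    # One standard normal draw determined entirely by (seed, depth, index).
    return random.Random(f"{seed}:{depth}:{index}").gauss(0.0, 1.0)

def brownian_at(t, seed, T=1.0, depth=24):
    """Scalar W(t) on [0, T], resolved to a dyadic tolerance of T / 2**depth."""
    if t <= 0.0:
        return 0.0
    lo, hi = 0.0, T
    w_lo, w_hi = 0.0, math.sqrt(T) * _gauss(seed, 0, 0)
    if t >= T:
        return w_hi
    index = 0
    for d in range(1, depth + 1):
        mid = 0.5 * (lo + hi)
        # Brownian-bridge midpoint: mean is the average of the endpoints,
        # variance is (hi - lo) / 4.
        w_mid = 0.5 * (w_lo + w_hi) + 0.5 * math.sqrt(hi - lo) * _gauss(seed, d, index)
        if t < mid:
            hi, w_hi, index = mid, w_mid, 2 * index
        else:
            lo, w_lo, index = mid, w_mid, 2 * index + 1
    return w_lo

# Increments on a grid, queried forwards and then in reverse order.
grid = [k / 8 for k in range(9)]
pairs = list(zip(grid, grid[1:]))
forward = [brownian_at(b, 0) - brownian_at(a, 0) for a, b in pairs]
backward = [brownian_at(b, 0) - brownian_at(a, 0) for a, b in reversed(pairs)]
print(forward == backward[::-1])
```

Because every value is a pure function of the seed and the query time, the backward pass can request the same increments in any order and on any step grid without storing the path.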

03 — Convergence, stability, and subsumption

Rex inherits convergence and stability from $\boldsymbol\Phi$, and subsumes prior diffusion solvers

The numerical properties of Rex are inherited from the underlying scheme $\boldsymbol\Phi$: the order of convergence, the region of linear stability, and (in the SDE case) the strong order of convergence are all preserved by the construction. To the best of our knowledge, Rex is also the first reversible solver framework that achieves exact inversion for diffusion SDEs without storing the entire realization of the Brownian motion.

THM 3.5
$k$-th order convergence (ODE)
If $\boldsymbol\Phi$ is a $k$-th order explicit RK scheme, then Rex constructed from $\boldsymbol\Phi$ is also $k$-th order. In particular, instantiating $\boldsymbol\Phi$ as RK4 yields an algebraically reversible fourth-order solver for the probability flow ODE.
THM 3.6
Strong convergence (SDE)
If $\boldsymbol\Phi$ is an SRK scheme of strong order $\xi$, then Princeps $\boldsymbol\Psi$ inherits the same strong order on the reverse-time diffusion SDE.
PROP
Non-trivial region of stability
Unlike BDIA and O-BELM, which possess no non-trivial region of linear stability, Rex inherits the linear stability region of the McCallum–Foster construction; step sizes need not be vanishingly small for the scheme to remain numerically well-behaved.
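The order-inheritance statement of Theorem 3.5 can be probed numerically in a simplified setting. The sketch below is an assumption-laden illustration, not a verification of the theorem: it uses the plain coupling with $\kappa_n \equiv 1$ and $\zeta = 1$ (no exponential integrator), a linear test problem $\dot x = -x$, and hypothetical function names. It estimates the observed order by halving the step size.

```python
import math

def f(x):
    # Linear test problem with exact solution x(t) = exp(-t).
    return -x

def psi_euler(h, x):
    # First-order base increment.
    return h * f(x)

def psi_midpoint(h, x):
    # Second-order explicit midpoint increment.
    return h * f(x + 0.5 * h * f(x))

def reversible_solve(psi, x0, T, n_steps):
    # Coupled forward pass (kappa = 1, zeta = 1), returning x at time T.
    h = T / n_steps
    x, x_hat = x0, x0
    for _ in range(n_steps):
        x = x + psi(h, x_hat)
        x_hat = x_hat - psi(-h, x)
    return x

def observed_order(psi, n_steps=100):
    exact = math.exp(-1.0)
    err_h = abs(reversible_solve(psi, 1.0, 1.0, n_steps) - exact)
    err_half = abs(reversible_solve(psi, 1.0, 1.0, 2 * n_steps) - exact)
    return math.log2(err_h / err_half)

order_euler = observed_order(psi_euler)
order_midpoint = observed_order(psi_midpoint)
print(f"reversible Euler:    observed order {order_euler:.2f}")
print(f"reversible midpoint: observed order {order_midpoint:.2f}")
```

On this test problem the observed orders should come out close to 1 and 2 respectively, matching the orders of the base schemes.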

Princeps subsumes prior diffusion solvers

A consequence of the Lawson-based construction is that Princeps $\boldsymbol\Psi$, instantiated with appropriate choices of the base scheme $\boldsymbol\Phi$, reduces exactly to many of the most popular numerical solvers used in the diffusion literature. (A full statement appears in the paper as Theorem 3.10, "Rex subsumes previous solvers"; the proofs are deferred to the appendix.) Each row of the table below identifies a choice of $\boldsymbol\Phi$, the existing solver it recovers, and the corresponding algebraically reversible Rex variant.

Table 1. Princeps subsumes prior diffusion solvers: each row identifies a choice of the base scheme $\boldsymbol\Phi$, the existing solver that $\boldsymbol\Psi$ recovers, and the corresponding algebraically reversible Rex variant.

| Choice of base scheme $\boldsymbol\Phi$ | Princeps $\boldsymbol\Psi$ recovers | Rex variant |
|---|---|---|
| Euler (ODE) | DDIM | Reversible DDIM |
| Euler, noise prediction | DPM-Solver-1 | Reversible DPM-Solver-1 |
| Midpoint / 2-stage | DPM-Solver-2, DPM-Solver++(2S) | Reversible DPM-Solver-2 / ++ |
| 1-step (data), VP schedule | DPM-Solver++1, gDDIM | Reversible gDDIM |
| Euler–Maruyama | SDE-DPM-Solver-1 / SEEDS-1 | Reversible SDE-DPM-Solver-1 |
| RK4 / Dopri5 / ShARK | (no prior analogue) | High-order reversible solver |

Comparison with prior reversible solvers

Table 2. Comparison of properties across reversible diffusion solvers: EDICT, BDIA, O-BELM, and Rex.
| Property | EDICT | BDIA | O-BELM | Rex (ours) |
|---|---|---|---|---|
| Algebraically reversible | | | | |
| Arbitrary order of convergence | | | | |
| Non-trivial region of linear stability | | | | |
| Constructed from any explicit (S)RK $\boldsymbol\Phi$ | | | | |
| Diffusion SDE support | | | | |
| Adaptive step size | | | | |

04 — Empirical results

Empirical results across inversion, generation, editing, and Boltzmann sampling

We evaluate Rex in five regimes: (a) the empirical inversion error of a Stable Diffusion v1.5 round trip; (b) unconditional image generation on CelebA-HQ; (c) text-to-image generation on Stable Diffusion v1.5 with COCO captions; (d) round-trip image editing, for which exact inversion is essential; and (e) Boltzmann sampling of molecular conformations, for which an approximate inverse invalidates the change-of-variables likelihood and breaks self-normalized importance sampling.

Inversion error · Stable Diffusion v1.5

For an algebraically reversible solver, encoding a Stable Diffusion latent and then decoding it should recover the original latent up to floating-point rounding error. We measure the latent-space MSE of this round trip on Stable Diffusion v1.5 in fp32 at 10, 20, and 50 sampling steps; the results are shown in Figure 3.

[Figure 3 plot. Vertical axis: log₁₀ latent MSE, from 0 down to −11 (lower is closer to exact). Columns: 10, 20, and 50 steps. The fp32 round-off floor sits at ≈ 10⁻¹¹.]

Figure 3. Latent reconstruction MSE as a slopegraph: each line is one solver's error as the number of sampling steps grows from 10 to 50, plotted on a log₁₀ axis whose floor is the fp32 round-off level at $\sim\!10^{-11}$. Lines that slope down improve with more steps; O-BELM's line slopes up, a linear-stability failure consistent with §3.

Several observations follow. DDIM is not reversible, so its error simply reflects the truncation error of the underlying solver. EDICT and BDIA both improve with the number of steps but plateau in the $10^{-7}$–$10^{-9}$ range due to numerical instability. O-BELM exhibits a more striking failure mode: the error grows with the number of steps, since the iteration is nowhere linearly stable and round-off error accumulates. (This is consistent with the theoretical claim in Section 03: a scheme with no non-trivial region of linear stability cannot maintain bounded inversion error as the number of steps grows.) Rex sits at the floating-point round-off floor and improves monotonically with step count.

Unconditional generation · CelebA-HQ 256×256, DDPM

We evaluate $10^4$ samples under identical seeds, using the DINOv2 Fréchet distance, FD$_\infty$, precision/recall, density, and coverage. Rex with Euler–Maruyama (the SDE variant) outperforms every prior reversible solver at 20 and 50 sampling steps; the ODE variants (Midpoint, RK4) match O-BELM at lower step counts and exceed it at higher ones. Figure 4 and Table 3 summarize the results.

FD ↓ · 50 steps: 391.93 for Rex (E–M); O-BELM 476.29; BDIA 500.79
Density ↑ · 50 steps: 0.98 for Rex (E–M); O-BELM 0.77; DDIM 0.67
Coverage ↑ · 50 steps: 0.56 for Rex (E–M); O-BELM 0.45; DDIM 0.45
Precision ↑ · 50 steps: 0.87 for Rex (E–M); O-BELM 0.84
[Figure 4 plot. Three radar panels (10, 20, and 50 steps), each with axes FD, FD∞, Precision, Recall, Density, and Coverage. Solvers: EDICT, DDIM, BDIA, O-BELM, Rex (Midpoint), Rex (RK4), Rex (Euler–Maruyama, SDE).]
Figure 4. Six metrics on CelebA-HQ 256×256, DDPM, across three sampling-step budgets. Larger polygon ≡ better. The shaded violet region in each panel is Rex (Euler–Maruyama), the SDE variant; it attains the largest polygon at 20 and 50 steps.

Text-to-image · Stable Diffusion v1.5, COCO captions

We sample under 1000 COCO captions with shared seeds, scored by CLIP score, ImageReward, and PickScore. The stochastic variants of Rex (Euler–Maruyama and ShARK) achieve higher aesthetic and prompt-alignment scores than every reversible baseline; both also exceed the non-reversible DDIM baseline. Figure 5 and Table 4 summarize the metrics across step counts.

Image Reward ↑ · 50 steps: 0.264 for Rex (E–M); DDIM 0.247; O-BELM 0.160
PickScore ↑ · 50 steps: 21.72 for Rex (ShARK); DDIM 21.04
CLIP score ↑ · 10 steps: 31.69 for Rex (RK4); DDIM 31.78
[Figure 5 plot. Radar chart with nine axes (CLIP, Image Reward, and PickScore at 10, 20, and 50 steps) and rings at 20% to 100%. Solvers: DDIM (non-rev.), EDICT, BDIA (γ=0.5), O-BELM, Rex (Midpoint), Rex (RK4), Rex (E–M, SDE), Rex (ShARK, SDE).]
Figure 5. Stable Diffusion v1.5 on 1000 COCO captions. Each of the nine axes is one (metric, step-count) pair — CLIP, Image Reward, and PickScore at 10, 20, and 50 sampling steps — min–max normalised across the eight solvers so that the outer ring on every axis is the best score. The three sectors group the axes by step count. Rex SDE variants (E–M, ShARK) sweep the outer ring on Image Reward and PickScore at every step count; Rex (ShARK) is highlighted with a soft fill. EDICT collapses to the centre at 10 steps.

Round-trip image editing · pix2pix, Stable Diffusion v1.5

Round-trip editing is a canonical task for which exact inversion is required: a real image is encoded into noise under its original caption, then decoded under a modified edit caption. We evaluate on the pix2pix dataset, measuring prompt alignment with CLIP, ImageReward, and PickScore, and faithfulness to the source image with LPIPS.

LPIPS ↓: 0.107 for Rex (Dopri5), vs. O-BELM 0.140; DDIM 0.214; BDIA 0.885
Image Reward ↑: −0.547 for Rex (Dopri5), best of all reversible solvers
PickScore ↑: 18.72 for Rex (Euler), beats non-reversible DDIM

Rex with the Dopri5 base scheme (an adaptive 5(4) Dormand–Prince method) is, to our knowledge, the first adaptive-step-size reversible solver applied to diffusion-based editing. (Adaptive step-size SDE solvers, prior to Rex, required storing the entire Brownian trajectory in memory; the Brownian-interval construction discussed in Section 02 is what makes the adaptive variant reversible.) EDICT failed entirely on this benchmark, collapsing to the identity map; BDIA also failed, with an LPIPS of $0.885$, more than six times the O-BELM value of $0.140$. Rex preserves source-image faithfulness while still following the edit instruction, as shown in Figure 6 and Table 5.

[Figure 6 plot. Radar chart with axes Image Reward ↑, CLIP ↑, PickScore ↑, and LPIPS ↓. Solvers: DDIM, BDIA, O-BELM, Rex (Euler), Rex (Dopri5).]
Figure 6. Round-trip editing on the pix2pix dataset, Stable Diffusion v1.5, 50 inversion + 50 generation steps. Each axis is normalized across the five solvers (LPIPS inverted) so that larger = better. Rex (Dopri5) sits on or near the outer ring across all four axes; BDIA collapses to the center because its LPIPS (0.885) is roughly 6× that of O-BELM (0.140) and 8× that of Rex (Dopri5) (0.107). EDICT is omitted because it failed entirely on this benchmark.

Boltzmann sampling of molecular conformations · tri-alanine with DiT

Sampling from a Boltzmann distribution $p_{\text{target}}(\boldsymbol x) \propto \exp(-\mathcal E(\boldsymbol x))$ with a neural ODE requires exact likelihoods, computed via the change-of-variables formula. If the discretized solver is not a bijection, the resulting likelihoods are biased, and self-normalized importance sampling then produces inconsistent estimators. (The bias does not, in general, decrease with more samples; it reflects a systematic mismatch between the integrated density and the true density.)

We train a Diffusion Transformer on tri-alanine (Ala–Ala–Ala) and evaluate sampling quality with the effective sample size (ESS) and the 2-Wasserstein distance to the molecular-dynamics ground truth, taken with respect to the energy distribution ($\mathcal E\text{-}\mathcal W_2$) and the dihedral-angle distribution ($\mathbb T\text{-}\mathcal W_2$).
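For readers unfamiliar with the quantities involved, the sketch below shows how self-normalized importance weights and an effective sample size are typically computed from model log-likelihoods. The page does not state which ESS definition the paper uses; the Kish formula $(\sum_i w_i)^2 / \sum_i w_i^2$, reported as a fraction of the sample count, is an assumption here, as are all function names and the toy inputs.

```python
import math

def normalized_weights(neg_energy, log_q):
    # log w_i = -E(x_i) - log q(x_i); shift by the max before exponentiating
    # for numerical stability, then normalize so the weights sum to 1.
    log_w = [ne - lq for ne, lq in zip(neg_energy, log_q)]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]

def ess_fraction(weights):
    # Kish effective sample size divided by N, for weights that sum to 1.
    return 1.0 / (len(weights) * sum(w * w for w in weights))

def snis_estimate(values, weights):
    # Self-normalized importance sampling estimate of E_target[h(x)].
    return sum(v * w for v, w in zip(values, weights))

# Proposal matches target up to a constant: uniform weights, ESS fraction 1.
w_good = normalized_weights([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0])
# One sample dominates: ESS fraction collapses towards 1/N.
w_bad = normalized_weights([0.0, -100.0, -100.0, -100.0], [0.0, 0.0, 0.0, 0.0])

print(ess_fraction(w_good), ess_fraction(w_bad))
print(snis_estimate([1.0, 2.0, 3.0, 4.0], w_good))
```

The weights depend on `log_q` directly, which is why any bias in the change-of-variables likelihood propagates into both the estimator and the ESS.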

Table 6. Tri-alanine Boltzmann sampling · $10^4$ samples. Rex (Dopri5), dropped in for the non-reversible Dopri5 baseline within the same DiT pipeline, achieves the best energy-Wasserstein metric of any method evaluated.

| Model | Solver | $\mathcal E$-$\mathcal W_2$ ↓ | $\mathbb T$-$\mathcal W_2$ ↓ | ESS ↑ |
|---|---|---|---|---|
| RegFlow | | 1.051 | 1.612 | 0.029 |
| SBG (IS) | | 0.758 | 0.502 | 0.052 |
| SBG (SMC) | | 0.598 | 0.503 | |
| ECNF++ | Dopri5 | 2.206 | 0.962 | 0.003 |
| DiT | Dopri5 | 0.737 | 0.468 | 0.140 |
| DiT | Rex (Dopri5) | 0.495 | 0.497 | 0.104 |

Best in mauve, second-best underlined. Source: Table bg_results in the preprint.

Substituting Rex (Dopri5) for the non-reversible Dopri5 baseline within the same DiT pipeline reduces the energy-distribution Wasserstein error from $0.737$ to $\mathbf{0.495}$, a $33\%$ reduction, while the dihedral-angle metric changes only slightly ($0.468$ to $0.497$). This is consistent with the inversion-error analysis above: exact reversibility ensures that the change-of-variables likelihoods used in importance sampling are unbiased, which the non-reversible Dopri5 baseline cannot guarantee.
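For reference, the quoted reduction follows directly from the two $\mathcal E$-$\mathcal W_2$ entries for the DiT rows of Table 6:

$$\frac{0.737 - 0.495}{0.737} = \frac{0.242}{0.737} \approx 0.328 \approx 33\%.$$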

05 — Conclusion

Conclusion

We have proposed Rex, a family of algebraically reversible solvers for diffusion models, constructed by combining Lawson exponential integration with a McCallum–Foster-style coupling. Rex inherits an arbitrarily high order of convergence from the underlying explicit RK scheme in the ODE case, and to the best of our knowledge furnishes the first method for exact inversion of diffusion SDEs without storing the entire realization of the Brownian motion. The intermediate Princeps family subsumes several widely used diffusion solvers — DDIM, DPM-Solver, gDDIM, and SEEDS-1 — as special cases, so that Rex yields algebraically reversible analogues of each. Empirically, Rex functions as a capable numerical scheme across unconditional and conditional image generation, round-trip image editing, and Boltzmann sampling of molecular conformations; in the Boltzmann setting, exact reversibility yields an unbiased change-of-variables likelihood and a $33\%$ reduction in the energy-distribution Wasserstein metric over the non-reversible baseline.

References