DOI: 10.1145/3721238.3730646 (SIGGRAPH Conference Proceedings)

Abstract

Recent extensions to spatiotemporal path reuse, or ReSTIR, improve rendering efficiency in the presence of high-frequency content by augmenting path reservoirs to represent contributions over full pixel footprints. Still, if historical paths fail to contribute to future frames, these benefits disappear. Prior ReSTIR work backprojects to the prior frame to identify paths for reuse. Backprojection can fail to find relevant paths for many reasons, including moving cameras or subpixel geometry with differing motion.
We introduce reservoir splatting to reduce these failures. Splatting forward-projects the primary hits of prior-frame paths. Unlike backprojection, forward-projected path samples fall into the current-frame pixel relevant to their exact primary hits, making successful reuse more likely. This also enables motion blur for ReSTIR, by splatting at multiple time steps, and supports depth of field without the specialized shift maps needed previously.
Beyond enabling motion blur, splatting improves resampling quality over Zhang et al.'s [2024] Area ReSTIR at up to 10% lower cost. To improve robustness, we show how to combine splatted and backprojected samples via MIS, helping every current-frame pixel get at least one historical path proposed for reuse.
Fig. 1:
Prior ReSTIR methods [Zhang et al. 2024] degrade during camera motion, as sequential frames rarely shade identical primary hits, making perfect reuse tricky. This worsens near fine details, like foliage and fur, where sequential frames may not hit the same surface. By forward splatting hits from last frame, we guarantee they are reevaluated. Adding time to samples also enables resampling for motion blur while shading only one sample per pixel. Here we show two scenes under camera motion (Sheep In Forest and Subway) with stock Area ReSTIR [Zhang et al. 2024] and our new splatting-based ReSTIR. Insets also show an offline reference and a naïve motion blur baseline using Zhang et al. [2024] backprojection.

1 Introduction

Recent importance sampling techniques based on resampled importance sampling (RIS) [Talbot et al. 2005] and iterated RIS (also known as ReSTIR) [Bitterli et al. 2020] dramatically improve rendering efficiency in interactive contexts by spatiotemporally reusing path samples. Over time, these methods can converge to near-optimal importance sampling with only one new sample per pixel (spp). This convergence relies on maintaining an easily-reusable path history. When history resets, quality temporarily degrades to that of naive, 1 spp path tracing. Reducing the frequency of spurious sample history resets is thus key to maintaining ReSTIR’s efficiency.
Recently, Zhang et al. [2024] showed that elevating a path's dimensionality with the subpixel location allows better maintenance of temporal history in the presence of high-frequency normal maps. But like other ReSTIR papers [Lin et al. 2022; Ouyang et al. 2021], Zhang et al. still use temporal backprojection; whenever motion occurs, this regularly changes a path's primary hit point between frames. Such a history is challenging to reuse, especially where pixel color depends on aggregate geometry, e.g., hair or foliage; see Figure 2.
Fig. 2:
Prior work, like Area ReSTIR [Zhang et al. 2024], converges quite well with 1 spp using a static camera. When moving, quality degrades significantly, especially for the high-frequency geometry in Emerald Square. This motivates our search for better ways to maintain historical samples.
In this paper, we forward reproject (or splat) prior-frame samples to further improve reuse. Unless occluded in the current frame, such samples can be reused without changing their object-space primary hit; this stabilizes reuse and reduces history resets.
Our specific contributions in this paper include:
A scatter-based reservoir reuse that exactly preserves object-space primary hits between frames (section 4),
A simple “backup” sample mechanism to fill holes between splatted samples, e.g., during zooming (section 4.2),
Applying splatting to motion blur; we show the first ReSTIR-accelerated motion blur algorithm (section 4.4),
Enabling depth of field without Zhang et al.’s [2024] specialized shift map or hand-tuned MIS weights (section 4.5).
Overall, our work simplifies Area ReSTIR sampling, gives better performance, enhances resampling quality, and enables motion blur and depth of field out-of-the-box.

2 Related Work

Ray and path tracing [Kajiya 1986; Whitted 1979] have long proven costly, motivating researchers to explore techniques to optimize, reduce samples, and amortize costs. We overview a number of common methods, including temporal reprojection, denoising, gradient-domain rendering, path reuse, and resampling techniques. We then review algorithms specifically developed for motion blur rendering.

2.1 Temporal Reprojection and Denoising

Early real-time ray tracing researchers lacked the benefits of hardware acceleration [Kilgariff et al. 2018], often making do with significantly less than one ray per pixel. Frameless rendering [Bishop et al. 1994] recomputes a subset of pixels each frame, reusing others until they get updated. Adding prior-frame reprojection [Adelson and Hodges 1995; Corso et al. 2017] repositions reused pixels, and a render cache [Nehab et al. 2007; Scherzer et al. 2007; Walter et al. 1999] reuses computation over multiple frames. Adaptive frameless rendering [Dayal et al. 2005] carefully places new samples and runs (cached) samples through spatiotemporal reconstruction. Yang et al. [2011] reprojected bidirectionally, i.e., from both forward and backward frames, to help reconstruct intermediate frames.
Outside of ray tracing, games widely apply temporal antialiasing (TAA) [Yang et al. 2020] to reduce undersampling using prior-frame data. Modern TAA methods also upsample their output [Yang et al. 2009], but can introduce ghosting, blur, and shimmer. To reduce these artifacts, many apply fast neural networks [Liu 2020; Xiao et al. 2020] that extend and accelerate offline superresolution networks (e.g., Dong et al. [2015]).
Real-time denoisers (e.g., Schied et al. [2017; 2018]) often ingest low-sample inputs, so they employ temporal reprojection to increase effective sample counts [NVIDIA 2020b]. As in temporal superresolution, many denoisers also apply neural networks [Bako et al. 2017; Chaitanya et al. 2017; Işık et al. 2021], sometimes to uncover adaptive sampling opportunities [Hasselgren et al. 2020; Kuznetsov et al. 2018].
Our method is the first to demonstrate how prior-frame samples can be reused via forward reprojection without introducing bias.

2.2 Path Reuse

Path reuse [Bekaert et al. 2002] amortizes costs by reusing paths or segments within pixel blocks, though such reuse has challenges; Xu and Sbert [2007] explore ways to reduce tile correlations. Gradient-domain rendering builds correlated paths for close-by pixels with shift mappings [Kettunen et al. 2015; Lehtinen et al. 2013] to evaluate finite differences. Bauszat et al. [2017] use these shift mappings to better reuse specular paths within pixel blocks.
Recently, resampling algorithms [Bitterli et al. 2020] have significantly improved path reuse, demonstrating real-time performance for many light transport problems; our work fits into this category of spatiotemporal reservoir resampling (or ReSTIR) algorithms.

2.3 Resampling for Rendering

Resampled importance sampling (RIS) [Talbot et al. 2005] aggregates M samples, which are then resampled into N samples, approximately distributed according to a normalized target function. Bitterli et al.’s [2020] ReSTIR DI combines RIS (using N = 1) with weighted reservoir sampling [Chao 1982], with each per-pixel reservoir storing a sample whose distribution is refined by resampling from spatial and temporal neighbors. Ouyang et al. [2021] treat longer paths as virtual point lights [Keller 1997] to handle indirect light. Lin et al. [2022] shift full paths between neighbors to properly handle glossy surfaces. Their generalized RIS (GRIS) provides a mathematical foundation for ReSTIR. Zhang et al.’s [2024] Area ReSTIR improves robustness for depth of field and subpixel details by reusing lens and subpixel coordinates with fractional motion vectors. We build on Area ReSTIR, but redesign its temporal reuse.

2.4 Motion Blur

Motion blur arises as cameras have non-zero exposure times. Perceptually, it conveys relative motion within a frame. But temporal integration is expensive, so real-time methods use various approximations, e.g., accumulating multiple frames [Haeberli and Akeley 1990], blurring per-pixel [Rosado 2007], extruding geometry [Gribel et al. 2010; Tatarchuk et al. 2003], or screen-space velocity maps [McGuire et al. 2012]. See Navarro et al. [2011] for a survey. Such approximations typically favor performance over accuracy.
Ray tracers can stochastically sample time to render motion blur; this is generally more expensive but produces better results. Multidimensional adaptive sampling [Hachisuka et al. 2008; Meister and Hachisuka 2022] concentrates samples in motion-blurred regions. Frequency analysis [Egan et al. 2009] enables sparser sampling driven by a sheared space-time filter. Light field reconstruction [Lehtinen et al. 2011; 2012; Munkberg et al. 2014] can efficiently render motion blur. Covariance tracing [Belcour et al. 2013] computes a required per-pixel sample count, and reconstructs in image-space. Manzi et al. [2016] accelerate motion blur by evaluating temporal differences with cross-frame shift maps. Oberberger et al. [2022] further account for motion blur in the denoising process.
To our knowledge, we achieve the first motion blur via resampling, potentially enabling real-time motion blur in low-sample renderers.

3 Preliminaries

We first provide a review of key concepts related to ReSTIR.

3.1 Unbiased Contribution Weights

Since the result of RIS does not have a tractable PDF, Lin et al. [2022] abandon the traditional f/p estimators
\mathbb{E}\left[\frac{f(X)}{p(X)}\right] = \int_{\operatorname{supp} X} f(x)\,\mathrm{d}x
(1)
in favor of the more general f(X)WX estimators with
\mathbb{E}\left[f(X)\,W_X\right] = \int_{\operatorname{supp} X} f(x)\,\mathrm{d}x,
(2)
where random variable W_X is an unbiased contribution weight (UCW) for X. A random variable W_X is a UCW for X if and only if Equation (2) holds for all integrable f; this is equivalent to E[W_X | X] = 1/p_X(X) [Lin et al. 2022].
While RIS resampling generally results in intractable PDFs, it allows unbiased integration and iterative resampling with simple UCWs via this f(X)WX integration framework.
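As a concreteness check, the f(X)W_X framework can be exercised numerically. The sketch below is our own toy illustration, not the paper's code (`ris_estimate` and its parameter names are hypothetical): it runs RIS with N = 1 over M independent candidates and returns f(Y)·W_Y, whose expectation matches the integral of f even though Y's PDF is intractable.

```python
import random

def ris_estimate(f, p_hat, sample_p, pdf_p, M=8):
    """One RIS estimate of the integral of f: draw M candidates from p,
    resample one proportionally to p_hat/p, and return f(Y) * W_Y (Eq. 2)."""
    xs = [sample_p() for _ in range(M)]
    ws = [p_hat(x) / (M * pdf_p(x)) for x in xs]  # w_i with MIS weight 1/M
    total = sum(ws)
    # pick Y proportionally to the resampling weights w_i
    r, acc, y = random.random() * total, 0.0, xs[-1]
    for x, w in zip(xs, ws):
        acc += w
        if r <= acc:
            y = x
            break
    W_y = total / p_hat(y)  # unbiased contribution weight (UCW)
    return f(y) * W_y
```

Averaging many such estimates for f(x) = x² on [0, 1] with uniform proposals recovers the exact integral 1/3, despite the target function p̂ being unnormalized.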

3.2 Shift Mappings

Reusing a path between pixels requires modifying some vertices to change which pixel it contributes to. Shift mappings T move paths between domains, e.g., pixels. Formally, a shift map from Ω1 to Ω2 is a bijective function from a subset of Ω1 to a subset of Ω2.
The reconnection shift [Lehtinen et al. 2013] is a commonly used shift mapping. Given two camera positions x_0 and y_0 and primary hits x_1 and y_1, this shift maps base path x̄ = [x_0 x_1 x_2 … x_n] into offset path T(x̄) = [y_0 y_1 x_2 … x_n], reconnecting to x̄ immediately after primary hit y_1. This works well on rough surfaces, but fails the goal of good shifts (that f(x̄) ≈ f(T(x̄))) if any of x_1, x_2, or y_1 are nearly specular. Kettunen et al.'s [2015] half-vector shift and Lin et al.'s [2022] hybrid shift postpone reconnection until rough vertices are found, more effectively handling specular materials.
Forward and backward shifts T and T⁻¹ need not be defined for all paths, but where defined, bijectivity is essential for unbiased path reuse¹. Shift mappings also change path and probability densities, so formulas using shifts need Jacobian determinants |T′(x)| or |∂T/∂x|.
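A minimal sketch of the reconnection shift may make this concrete. This is our own illustration, not the paper's implementation: paths are plain vertex lists, and the visibility test and Jacobian are omitted.

```python
def reconnection_shift(base_path, y0, y1):
    """Reconnection shift (Lehtinen et al. 2013), sketched: replace the
    camera vertex and primary hit, then reconnect to the base path at its
    second bounce. Returns None where the shift is undefined (no x2 to
    reconnect to); a full version would also test y1-x2 visibility."""
    if len(base_path) < 3:
        return None
    return [y0, y1] + base_path[2:]
```

Note the shift is trivially invertible on its domain: applying it again with the original x_0, x_1 recovers the base path, illustrating the bijectivity that unbiased reuse requires.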

3.3 GRIS

Given candidate samples X_1, …, X_N, their corresponding domains Ω_1, …, Ω_N, a target domain Ω for resampling, and a non-negative target function p̂ defined on Ω, GRIS [Lin et al. 2022] aggregates X_1, …, X_N into a result Y ∈ Ω, approximately distributed proportionally to p̂. First, each X_i ∈ Ω_i is shifted to a similar sample Y_i ∈ Ω via shift map T_i:
Y_i = T_i(X_i).
(3)
Then, a resampling weight wi is computed for each Yi:
w_i = m_i(Y_i)\,\hat{p}(Y_i)\,W_{X_i}\left|\frac{\partial T_i}{\partial X_i}\right|,
(4)
where m_i is a resampling MIS weight, W_{X_i} is sample X_i's UCW in its original domain Ω_i, and |∂T_i/∂X_i| is the Jacobian of shift T_i. Finally, output Y is resampled from the Y_i proportionally to the weights w_i. The new contribution weight for Y is
W_Y = \frac{1}{\hat{p}(Y)} \sum_{i=1}^{N} w_i.
(5)
If the supports of the candidates Y_i together cover the support of p̂, W_Y is unbiased, as per section 3.1. Taking one canonical sample guarantees this coverage; a canonical X_c is sampled from Ω_c = Ω with an identity shift map T_c, so that X_c alone covers p̂'s support.
Lin et al. [2022] introduce the generalized balance heuristic to compute resampling MIS weights:
m_i(y) = \frac{c_i\,\hat{p}_{\leftarrow i}(y)}{\sum_{j=1}^{N} c_j\,\hat{p}_{\leftarrow j}(y)},
(6)
which uses the “p̂ from” function
\hat{p}_{\leftarrow j}(y) = \hat{p}_j\!\left(T_j^{-1}(y)\right)\left|\frac{\partial T_j^{-1}}{\partial y}\right|
(7)
as an (unnormalized) proxy for the PDF of y = T_j(z_j), originating in domain Ω_j as z_j = T_j⁻¹(y). The Jacobian adjusts the proxy the same way probability densities transform under shift mappings. The confidence weights c_j control the relative weight of candidate samples.
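The generalized balance heuristic is mechanical to implement once each domain exposes its target function, inverse shift, and inverse-shift Jacobian. A sketch under that assumption (our own hypothetical interface, with each domain given as a triple of callables):

```python
def p_hat_from(p_hat_j, T_inv_j, jac_inv_j, y):
    """Eq. (7): shift y back into domain j and weight by the inverse-shift
    Jacobian, giving an unnormalized PDF proxy; zero if the shift fails."""
    z = T_inv_j(y)
    return 0.0 if z is None else p_hat_j(z) * jac_inv_j(y)

def balance_mis(i, y, domains, confidences):
    """Eq. (6): generalized balance heuristic for candidate i at point y.
    Each entry of `domains` is a (p_hat_j, T_inv_j, jac_inv_j) triple."""
    terms = [c * p_hat_from(*dom, y) for dom, c in zip(domains, confidences)]
    total = sum(terms)
    return 0.0 if total == 0.0 else terms[i] / total
```

By construction the weights of all candidates at a given y sum to one, which is what makes the downstream resampling unbiased.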

3.4 ReSTIR

ReSTIR repeatedly applies GRIS to share samples spatiotemporally. Each pixel i stores a reservoir containing a sample X_i, its UCW W_{X_i}, and confidence weight c_i. See Wyman et al.'s [2023] course notes for details, but at a high level, the process typically goes as follows:
Initial sampling. Each pixel i gets some number N_init of initial candidates, e.g., newly traced independent paths. One is selected via RIS, producing a canonical initial sample X_i with UCW W_{X_i}.
Temporal resampling. Each pixel i backprojects along its backward motion vector to choose a prior-frame reservoir j. The result of GRIS between Y_j = T_j(X_j) and X_i, using confidence weights c_j and N_init, replaces X_i; a new W_{X_i} is computed, and c_i is set to min(c_cap, c_j + N_init), where c_cap is a fixed confidence cap.
Spatial resampling. Each pixel i chooses N_spat spatial neighbors j_1, …, j_{N_spat} randomly from a box around i, and performs GRIS between X_i and each X_{j_k}, using the corresponding confidence weights. The result overwrites X_i; W_{X_i} is updated with Equation (5), and c_i sums the samples' confidence weights, again clamped to c_cap.
Shading. The final color for pixel i is evaluated as f(Xi)WXi, and the same process repeats in the next frame.
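The per-frame bookkeeping above can be sketched in miniature. The toy code below is our own simplification (scalar "paths", identity shifts, and no MIS weights, so it is illustrative rather than unbiased); it shows the weighted-reservoir update and the confidence cap, with all names hypothetical.

```python
import random

class Reservoir:
    """Per-pixel reservoir: one sample, its resampling-weight sum (which
    yields the UCW via Eq. (5)), and a confidence weight."""
    def __init__(self):
        self.sample, self.w_sum, self.c = None, 0.0, 0.0

    def update(self, x, w, c, rng=random):
        """Weighted reservoir sampling [Chao 1982]: keep x with chance w/w_sum."""
        self.w_sum += w
        self.c += c
        if self.w_sum > 0.0 and rng.random() * self.w_sum < w:
            self.sample = x

def temporal_step(pixel_res, prior_res, p_hat, c_cap=20.0):
    """One toy temporal resampling step: fold the prior reservoir into the
    current pixel's reservoir, then clamp confidence to c_cap (Sec. 3.4).
    MIS weights and shift Jacobians are omitted for brevity."""
    if prior_res.sample is not None and prior_res.w_sum > 0.0:
        ucw = prior_res.w_sum / p_hat(prior_res.sample)  # Eq. (5)
        pixel_res.update(prior_res.sample,
                         p_hat(prior_res.sample) * ucw, prior_res.c)
    pixel_res.c = min(pixel_res.c, c_cap)
    return pixel_res
```

The confidence cap is what keeps a long, stale history from permanently outweighing fresh canonical samples.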

3.5 Area ReSTIR

Prior to Area ReSTIR [Zhang et al. 2024], screen-space algorithms such as Lin et al. [2022] applied ReSTIR to path space starting at secondary hits, x2, finding primary hits x1 by ray tracing at predetermined subpixel locations (e.g., pixel centers).
Zhang et al. [2024] note this adds instabilities near high-frequency details: even if a reused path suffix [x_2 x_3 … x_n] remains valid, it may be incompatible with a new pixel's implicitly defined prefix [x_0 x_1]. To cure this, they add subpixel location (and lens position) as additional dimensions in reservoirs, essentially storing full paths [x_0 x_1 … x_n]. Target functions no longer vary with each pixel's implicit prefix, as the shift maps propose complete, reusable paths. This significantly boosts reuse quality.
To improve subpixel precision for temporal path reuse, Zhang et al. [2024] backproject the hit point at each pixel i's center to the prior frame, via an image-space shift with motion vector δ_i, building a 1 × 1 pixel fractional reservoir around the result. This off-grid fractional reservoir is populated with samples from the overlapping 2 × 2 block of pixels (via RIS), and the chosen sample is shifted back to the current frame with motion vector −δ_i, approximately retaining the primary hit.
As no samples in the 2 × 2 block of prior-frame pixels may fall in the fractional reservoir, Zhang et al.’s “fast” variant can leave this reservoir empty, producing excess noise. Prior to RIS, their “robust” variant first spatially shifts all samples in the 2 × 2 block to the fractional reservoir. This solves the issue but is more expensive.
Mathematically, Area ReSTIR integrates the measurement contribution function for each pixel i:
I_i = \int_{\Omega_i} f_i(\bar{x})\,\mathrm{d}\bar{x} = \int_{\Omega_i} h_i(\bar{x})\,f(\bar{x})\,\mathrm{d}\bar{x},
where pixel i's path space Ω_i ⊂ Ω is the set of paths x̄ for which the pixel filter h_i(x̄) > 0, and f(x̄) is the path contribution. We use p̂_i(x̄) to denote the pixel-dependent target function for resampling, which we assume is defined as
\hat{p}_i(\bar{x}) = h_i(\bar{x})\,\hat{p}(\bar{x}),
(8)
often with p̂ = f, though cheaper approximations can also be used.
Note that Area ReSTIR only approximately retains primary hits; this may degrade temporal reuse near subpixel details. Our reservoir splatting exactly retains primary hits, improving temporal reuse.

3.6 Motion Blur

Zhang et al.'s Area ReSTIR evaluates the measurement integral at a fixed time. However, rendering motion blur requires integrating over a time interval [t_0, t_1), where Δt = t_1 − t_0 is the exposure time. The intensity of pixel i is then
I_i = \int_{t_0}^{t_1}\!\int_{\Omega_i(t)} h_i(\bar{x}, t)\,f(\bar{x}, t)\,\mathrm{d}\bar{x}\,\mathrm{d}t,
(9)
where hi(x¯,t) is the camera’s pixel filter at time t ∈ [t0, t1).
We experimented with naïvely extending Area ReSTIR for motion blur by adding sample time t to the reservoirs. But backprojecting each pixel's motion δ_i at a single time causes suboptimal reuse, as apparent motion depends on both sample time and subpixel location, which vary independently. Our splatting instead uses each sample's real motion between frames, contributing exactly to the relevant pixels.

4 Reservoir Splatting

Prior ReSTIR methods gather reusable temporal neighbors in the prior frame; each pixel j is backprojected to the prior frame by a single backward motion vector δ_j to find candidates for reuse [Bitterli et al. 2020; Zhang et al. 2024]. Paths are then reused with the negated image-space motion −δ_j, regardless of the reuse candidate's real image-space motion. This inexact tracking of shading points between frames can cause reuse failures in the presence of subpixel detail, e.g., multiple overlapping objects (foliage) or region scaling (zooming), spuriously discarding the sample history that ReSTIR exploits.
We instead scatter prior-frame candidates by forward reprojection, or splatting, to map them to the current frame. Each splatted point maps uniquely into the current image, always contributing to the pixel it lands on unless occluded. This exact mapping avoids discarding history unnecessarily.
Section 4.1 details how the math of resampling changes when splatting. Section 4.2 combines scatter- and gather-based reuse with appropriate MIS for better quality, albeit at higher cost. Section 4.3 discusses how to appropriately update reservoir confidence when splatting. Section 4.4 extends our sample splatting to real-time motion blur, and section 4.5 describes how splatting also improves Zhang et al.'s [2024] depth of field.

4.1 GRIS with Scatter

Lin et al.’s [2022] GRIS theory requires defining input domains without looking at the samples. Prior work worked around this by reusing based on pixel-center motion vectors. But exactly preserving prior-frame samples’ primary hits requires examining which samples contribute to which pixel, which is not allowed. To reconcile this, we conceptually define all prior samples as contributing to every pixel, but ensure zero weight for those not splatting into current pixel j.2 This allows applying GRIS for temporal resampling assuming:
(1) The target domain Ω is the current frame's full path space.
(2) The target function for pixel j is p̂_j(x̄) = h_j(x̄) p̂(x̄).³
(3) Input samples X_1, …, X_N come from all prior-frame area reservoirs, i.e., N is the screen size H × W. Sample domains are Ω_i = Ω^prev, the entire prior frame's path space.
(4) Prior-frame candidates X_i use the same shift map T_i = T, where T is forward projection followed by the hybrid shift.
(5) Input X_{N+1} is a canonical sample Y* for current pixel j. It uses the identity shift T_{N+1}, as it already lies in the current frame.
For efficient computation, we first initialize a GRIS reservoir with a canonical sample for each current-frame pixel. We then shift all prior paths Xi to the current frame as Yi = T(Xi), identifying the pixels each contributes to, and stream Yi to these pixels’ reservoirs (while avoiding race conditions).
While a balance heuristic (section 4.1.2) for all-to-all reuse across all N pixels would require O(N³) total shifts, the splat operation implicitly zero-weights most prior-frame samples for any given pixel. This reduces the work to two shifts per pixel (forward-splatting the prior reservoir and reverse-splatting the initial sample), totaling O(N) shifts.
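The scatter pass described above can be sketched as follows. This is our own CPU stand-in with hypothetical names: per-pixel proposal lists replace the GPU's atomic in-place merges, and the shift is an opaque callable returning a landing pixel and shifted sample, or None when undefined.

```python
def splat_pass(prior_reservoirs, forward_shift, width, height):
    """Scatter pass (Sec. 4.1 sketch): forward-shift every prior-frame
    sample and stream it to the current-frame pixel it lands in. Returns
    per-pixel lists of (source_index, shifted_sample) proposals; a GPU
    version would instead merge into reservoirs with atomics."""
    proposals = [[] for _ in range(width * height)]
    for i, sample in enumerate(prior_reservoirs):
        shifted = forward_shift(sample)  # reprojection + path-suffix shift
        if shifted is None:
            continue                     # occluded: shift undefined
        px, py, y = shifted              # landing pixel and shifted path
        if 0 <= px < width and 0 <= py < height:
            proposals[py * width + px].append((i, y))
    return proposals
```

Each prior sample is shifted exactly once, which is where the O(N) total shift count comes from.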

4.1.1 Reprojection Shift Mapping.

To splat path x̄ into the current frame, we first transform its primary hit x_1, keeping its object-space position; given model-to-world transform M, we get y_1 = M(M^prev)⁻¹ x_1. For static scenes, y_1 = x_1. We then map lens vertex x_0 to the current frame. For pinhole cameras, y_0 is the new camera position. For other camera types, we can retain local lens coordinates or apply the cross-frame camera transform, i.e., y_0 = M_v (M_v^prev)⁻¹ x_0 given view-to-world matrix M_v. We test visibility between y_0 and y_1; if occluded, the shift fails and remains undefined. Finally, we shift path suffix [x_2 x_3 … x_n] into [y_2 y_3 … y_n] using any good shift, e.g., Lin et al.'s hybrid shift. Reprojection induces an additional Jacobian we must consider to avoid bias; we discuss this in section 4.1.3.

4.1.2 GRIS Formulas.

Section 4.1 defined resampling for current pixel j using N + 1 inputs: N prior samples X_i shifted to Y_i = T(X_i) in the current frame, plus a canonical sample Y_{N+1} = Y*. In this subsection, we show how the box filter reduces the balance heuristic to simple expressions; the supplemental document derives MIS weights for general pixel filters.
Starting with prior sample X_i mapped to Y_i = T(X_i), we substitute p̂_j(x̄) = h_j(x̄) p̂(x̄) into Equation (4). With a box filter, w_i = 0 if Y_i is not in pixel j. Otherwise, for Y_i inside pixel j, h_j(Y_i) = 1, so we get
w_i = m_i(Y_i)\,\hat{p}(Y_i)\,W_{X_i}\left|\frac{\partial T}{\partial X_i}\right|.
(10)
The generalized balance heuristic [Lin et al. 2022] then reduces to
m_i(Y_i) = \frac{c_i\,\hat{p}^{\mathrm{prev}}(X_i)\,\left|\partial T/\partial X_i\right|^{-1}}{c_*\,\hat{p}(Y_i) + c_i\,\hat{p}^{\mathrm{prev}}(X_i)\,\left|\partial T/\partial X_i\right|^{-1}},
(11)
given confidence weights c_i for samples X_i and c_* for the canonical sample Y*. This is because shifting Y_i into any prior-frame domain Ω_k results in the same path T_k⁻¹(Y_i) = T⁻¹(Y_i) = X_i, whose primary hit belongs only to pixel i's box filter. As a result, the balance heuristic in Equation (6) reduces from N + 1 denominator terms to the two in Equation (11). Scattering thus lets us shift each sample X_i only once, contributing the shifted sample Y_i to the pixel it falls in.
The initial sample Y_{N+1} = Y* gets resampling weight
w_{N+1} = m_{N+1}(Y_*)\,\hat{p}(Y_*)\,W_{Y_*},
(12)
with
m_{N+1}(Y_*) = \frac{c_*\,\hat{p}(Y_*)}{c_*\,\hat{p}(Y_*) + c_r\,\hat{p}^{\mathrm{prev}}\!\left(T^{-1}(Y_*)\right)\left|\partial T^{-1}/\partial Y_*\right|},
(13)
given the confidence weight c_r at the prior-frame pixel identified by the reverse splat T⁻¹(Y*). If the reverse splat fails, due to occlusion or falling outside the image, that term becomes zero. Similarly, we only shift the canonical sample to the prior frame once, and at most one prior-frame pixel has a positive box filter value for the splat T⁻¹(Y*). Thus, we again reduce from N + 1 denominator terms to two.
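The two-term weights of Equations (11) and (13) are simple to implement once the target-function values and Jacobians are at hand. A sketch with hypothetical function and parameter names, treating those values as scalars:

```python
def mis_prior(c_i, c_star, p_hat_prev_x, p_hat_y, jac):
    """Eq. (11): MIS weight for prior sample X_i splatted to Y_i, where
    `jac` is the forward-shift Jacobian |dT/dX_i|."""
    t = c_i * p_hat_prev_x / jac
    return t / (c_star * p_hat_y + t)

def mis_canonical(c_star, c_r, p_hat_y, p_hat_prev_rev, jac_inv):
    """Eq. (13): MIS weight for the canonical sample Y*; the reverse-splat
    term vanishes when T^{-1}(Y*) fails (occlusion or off-screen)."""
    denom = c_star * p_hat_y + c_r * p_hat_prev_rev * jac_inv
    return 0.0 if denom == 0.0 else c_star * p_hat_y / denom
```

Evaluated consistently at the same path (with reciprocal Jacobians), the two weights sum to one, and a failed reverse splat leaves the canonical sample with full weight.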

4.1.3 Jacobian.

Renderers often parametrize primary hits with subpixel locations. Consider a splat that maps a previous-frame subpixel location v to a current-frame location u. In this parametrization, the Jacobian determinant of the reprojection (section 4.1.1) is
\left|\frac{\partial u}{\partial v}\right| = \left|\frac{\partial u}{\partial y_1}\right|\left|\frac{\partial y_1}{\partial x_1}\right|\left|\frac{\partial x_1}{\partial v}\right|,
(14)
where |∂y_1/∂x_1| accounts for object scaling and |∂y_1/∂u| accounts for the conversion between screen-space and world-space measures [Lehtinen et al. 2013]:
\left|\frac{\partial y_1}{\partial u}\right| = \left|\frac{\partial y_1}{\partial \omega}\right|\left|\frac{\partial \omega}{\partial u}\right| = \frac{\|y_1 - y_0\|^2}{\cos\theta_N}\,\cos^3\theta_V,
(15)
given angle θN between y1’s normal and (y0y1), and angle θV between the camera’s forward vector and (y1y0). The full Jacobian |T/x¯| multiplies the subpixel Jacobian |∂u/∂v| and the Jacobian of the remaining path, e.g., from the hybrid shift:
\left|\frac{\partial T}{\partial \bar{x}}\right| = \left(\frac{\cos\theta_N}{\cos\theta_N^{\mathrm{prev}}}\cdot\frac{\cos^3\theta_V^{\mathrm{prev}}}{\cos^3\theta_V}\cdot\frac{\|x_1 - x_0\|^2}{\|y_1 - y_0\|^2}\right)\left|\frac{\partial y_1}{\partial x_1}\right|\left|T'_{\mathrm{hybrid}}\right|,
(16)
where |∂y_1/∂x_1| = 1 for rigid transformations. For non-rigid deformations, it is computed as the ratio between the corresponding triangle's current and prior areas.
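The screen-space factor of Equations (14)-(16) can be sketched numerically. The helper names below are our own, and squared distances and cosines are passed in directly rather than derived from scene geometry:

```python
def screen_to_area(dist2, cos_n, cos_v):
    """Eq. (15): |dy1/du| = ||y1 - y0||^2 * cos^3(theta_V) / cos(theta_N),
    converting a screen-space measure to a world-space area measure."""
    return dist2 * cos_v ** 3 / cos_n

def subpixel_jacobian(prev, cur, dA_ratio=1.0):
    """Eqs. (14)/(16): |du/dv| = |du/dy1| * |dy1/dx1| * |dx1/dv|, with
    `prev` and `cur` as (dist2, cos_thetaN, cos_thetaV) tuples and
    dA_ratio = |dy1/dx1| (1 for rigid motion)."""
    return screen_to_area(*prev) * dA_ratio / screen_to_area(*cur)
```

For a static rigid scene the two factors cancel and the Jacobian is exactly 1; moving a hit farther from the camera shrinks its screen-space density accordingly.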

4.1.4 Summary.

We first scatter prior samples to the current frame, using them for resampling in the pixels where they land. Next, we reverse-splat canonical samples to the prior frame to evaluate MIS weights. Last, we update the UCW of each pixel's chosen sample. Reservoir splatting remains unbiased, as the Jacobians compensate for any screen-space density change and the initial samples ensure full coverage of the path space. Efficiently implemented, reservoir splatting costs about the same as Zhang et al.'s [2024] fast reuse in Area ReSTIR.

4.2 Backup Sample

Splatting projects to the current frame using the last frame’s forward motion vectors; this leaves holes where no prior-frame pixel contributes. We can optionally fill holes with backup samples, e.g., backprojecting to the prior frame to find relevant samples.
To obtain a backup, we follow the rounded motion vector δ at each pixel center back to a prior-frame pixel b, whose sample Xb we give as an additional input to our temporal reuse. We directly use Zhang et al.’s [2024] proposed temporal shift Tb.
This simple method improves robustness, albeit at increased cost similar to Zhang et al.’s [2024] robust variant. Like any shift moving the shading point, this works best for low frequency geometry and lighting. In the supplementary material, we extend the MIS weights from Equations 11 and 13 to correctly account for backup samples.

4.3 Confidence Weight Update

Earlier methods set the confidence weight cj to the capped sum of the inputs (section 3.4). However, we conceptually have H × W inputs, of which only a few contribute. Setting the confidence weight based only on contributing reservoirs produces bias. Instead, we model confidence after Zhang et al. [2024], backprojecting the current pixel-center into the prior frame and bilinearly interpolating the confidence weights of the 2 × 2 overlapped pixels:
c_j = \min\!\left(c_i + \sum_{k=1}^{4} \beta_k c_k,\; c_{\mathrm{cap}}\right).
(17)
Here, β_k is the bilinear weight corresponding to how much prior-frame pixel k overlaps the backprojected pixel j. With the backup, we also add its confidence before clamping.
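Equation (17) amounts to a bilinear blend plus a clamp. A sketch with a hypothetical helper name, assuming the 2 × 2 prior pixels are listed in (top-left, top-right, bottom-left, bottom-right) order (our convention, not the paper's):

```python
def confidence_update(c_init, back_u, back_v, prior_c, c_cap=20.0):
    """Eq. (17): bilinearly blend the confidences of the 2x2 prior-frame
    pixels overlapped by the backprojected pixel center, add this frame's
    input confidence c_init, and clamp to c_cap."""
    fu, fv = back_u % 1.0, back_v % 1.0  # fractional offsets within the 2x2
    betas = ((1 - fu) * (1 - fv), fu * (1 - fv),
             (1 - fu) * fv, fu * fv)
    return min(c_init + sum(b * c for b, c in zip(betas, prior_c)), c_cap)
```

When history is uniform, the blend simply returns the common confidence; when the backprojection lands exactly on a prior pixel center, only that pixel contributes.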

4.4 Motion Blur in ReSTIR

No prior spatiotemporal resampling algorithms handle motion blur, which requires integrating over a shutter time [t0, t1) and associating a specific time t with each sample.

4.4.1 Naïve Area ReSTIR Extension.

We started by naïvely extending Zhang et al. [2024], augmenting paths x¯ into path-time pairs (x¯,t) and seeking a new shift between prior- and current-frame pairs. Shift mappings rely on invariants (e.g., a vertex or half-vector), and a sample’s offset from the frame start is a possible invariant. A shift might preserve such an offset; given frame duration Δt, a sample time of t maps to t ± Δt in future or prior frames.
Backprojecting sample (x̄_j, t_j) from current pixel j to previous time t_j − Δt defines a motion vector. But this represents only the motion at x̄_j, not of all surfaces in pixel j. Area ReSTIR reuses from reservoirs with fixed motion. Even with a new time-offset shift, many paths in the reservoir may be irrelevant for the current pixel due to differing motion, sample time, or subpixel detail. This makes failed shifts and lost history more likely, degrading rendering quality; see Figure 3.
Fig. 3:
This Landscape has 4.3 billion triangles. For static cameras, Zhang et al. [2024] get good results, but the quality is lost in motion, even with naïve motion blur (section 4.4.1). Splatting improves quality, lowers cost, and extends to multi-splatting to resample paths at multiple times during the exposure, further improving quality.
Fig. 4:
To remain robust, Area ReSTIR [Zhang et al. 2024] reuses temporally by one shift that preserves lens coordinates and another that preserves the primary hit, and merges with MIS. Our reservoir splatting alone preserves both. Left: Area ReSTIR computes a motion vector by tracing a primary hit through the current lens and pixel centers (1), then reprojecting towards the prior lens center onto the prior image plane (2). Middle: Lens vertex copy reuses prior path’s lens coordinates (3), adds the motion vector to the image-space location (4), and traces a new primary hit (5). Primary hit reconnection adds the motion vector (6) and reprojects the old primary hit onto the lens through the image-space vertex (7). Right: Our reservoir splatting copies the local lens coordinates (1) and reprojects the old primary hit onto the image-space towards the new lens vertex (2).

4.4.2 Splatting for Motion Blur.

Comparatively, using scattering for motion blur is simple. Splatting forward-projects a single path, whereas backprojection seeks contributions for an entire pixel.
Splatting by shifting forward in time Δt simply projects the prior primary hit to the correct point along its motion, mapping it to a specific pixel. Given the shift T from section 4.1, our motion blur shift map T̂ is:
(\bar{y}, t) = \hat{T}(\bar{x}, t^{\mathrm{prev}}) = \left(T(\bar{x}),\; t^{\mathrm{prev}} + \Delta t\right),
(18)
which should be read as “forward-project x̄ to time t^prev + Δt”, with T also implicitly depending on t.

4.4.3 Multi-Splatting for Motion Blur.

Some motion blur methods [Haeberli and Akeley 1990] render repeatedly and accumulate. Similarly, splatting can repeatedly shift prior samples and splat them into the current frame at different times. This reduces sample sparsity, improving reuse at increased cost (see Figures 3 and 5).
To do this, we partition the shutter time τ = t_1 − t_0 into K intervals of length τ/K, and splat each sample into all K intervals with the shifts
\left(\bar{y}^{(n)}, t^{(n)}\right) = \hat{T}_n(\bar{x}, t^{\mathrm{prev}}) = \left(T_n(\bar{x}),\; t_0 + \frac{n\tau}{K} + \frac{t^{\mathrm{prev}} - t_0^{\mathrm{prev}}}{K}\right),
(19)
where T_n reprojects x̄ from t^prev to t^(n), and n = 0, …, K − 1. As T̂_n compresses the time dimension by 1/K, we have |T̂′_n| = |T′_n|/K in the resampling weight and MIS formulas.
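The time remapping in Equation (19) can be sketched as follows. The helper name is hypothetical, and only the time component is shown; the accompanying path shift T_n is omitted.

```python
def multi_splat_times(t_prev, t0_prev, t0, tau, K):
    """Eq. (19), time component: map a prior-frame sample time into one
    time per interval of the current shutter [t0, t0 + tau). The sample's
    intra-frame offset is compressed by 1/K, which is why each shift's
    Jacobian gains a matching 1/K factor."""
    offset = (t_prev - t0_prev) / K
    return [t0 + n * tau / K + offset for n in range(K)]
```

With K = 1 this reduces to the single motion blur shift of Equation (18): the sample keeps its offset from the frame start, advanced by one frame.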

4.5 Splatting for Depth of Field

Zhang et al. [2024] achieved real-time depth of field by reusing the subpixel location u. Shifts could then be defined by either reusing the lens position s (the lens vertex copy shift), or the primary hit x1 (the primary hit reconnection shift). Primary hit reconnection improves reuse of bokeh samples for large apertures, but often leads to shift failures for small apertures. Neither shift produces acceptable results alone, so they must be combined via heuristic-based MIS weights.
In contrast, splatting explicitly preserves the primary hit x_1 while also reusing a fixed lens sample s; the image-space location u is simply defined by where the sample splats. This achieves high-quality bokeh with our single shift map, saving compute time and simplifying the code. See Figure 4 for an illustration.

5 Implementation Details

We implement our reservoir splatting4 in Falcor [Kallweit et al. 2022]. We use Zhang et al.’s [2024] area reservoir storage format, adding the explicit object-space primary hit, as well as a sample time when enabling motion blur. For correctness and efficiency, a few key details should be considered.
Multiple splats can land in any pixel, requiring atomics to avoid race conditions, as in order-independent transparency algorithms [Maule et al. 2011] such as real-time K-buffering [Bavoil et al. 2007].
Motion blur is tricky, as it requires tracing rays at arbitrary times t, and hence arbitrary-time BVH queries. While recent hardware accelerates such motion BVHs [NVIDIA 2020; 2020a], Falcor does not. Our prototype thus only handles blur from dynamic cameras in static scenes, allowing use of regular BVHs, as the geometry is time-invariant. Our algorithm should extend to dynamic scenes once we can trace rays at arbitrary times.
The supplemental material discusses these topics in further detail.

6 Results

We generated all results at 1920 × 1080, with timings captured on an NVIDIA GeForce RTX 4090. We follow most settings from Zhang et al. [2024] and Lin et al. [2022], e.g., ccap = 20, spatial neighbors from a 30 pixel radius. We enable Russian roulette, so average path depth is < 3 in most scenes. Unless stated, motion blur results use a simulated shutter of 1/24 seconds, regardless of frame time. We use MAPE to measure numerical errors and FLIP for perceptual errors [Andersson et al. 2021].
Figure 7 shows a cross-section of results, comparing our single splatting, with and without backup (Sections 4.4.2 and 4.4.3), to various baselines in tricky scenes.
The top three rows (Sheep In Forest, Hair, and Residential Lobby) compare with Zhang et al.’s [2024] Area ReSTIR, using their fast and robust temporal methods. Our splatting consistently runs 5-10% faster with 10-20% lower error than fast reuse, and splatting with a backup similarly outperforms robust reuse. In tricky cases, splatting alone also beats robust reuse; in simpler cases, e.g., along flat surfaces, splatting gives quality only on par with Zhang et al.’s fast reuse, though it usually retains a performance advantage.
The bottom rows (Bistro Exterior and Emerald Square) show motion blur, which was incompatible with ReSTIR before our work. To avoid comparing to ad hoc post-process blur, we use a baseline of Zhang et al. [2024] augmented by our naïve, backprojected motion blur from Section 4.4.1. Here, splatting again has 10-15% lower error and usually lower cost. Qualitatively, Area ReSTIR’s robust reuse exhibits quite noticeable 2 × 2 pixel correlations. Such correlations are absent with splatting (alone), so even where the robust baseline has lower error metrics, splatting’s artifacts are often less objectionable, and it runs twice as fast.
Qualitatively, multi-splatting could improve all our results in Figure 7 further (e.g., as in Figure 5), albeit at increased cost.
Fig. 5:
The Living Room using multi-splatting with varying splats per prior pixel, with camera rotation generally leftwards.
Fig. 6:
The Wooden Staircase captured under extreme forward motion. In such cases, splatting prior reservoirs into single pixels leaves holes between splats. Using a backup sample (Section 4.2) helps fill these holes, improving reuse.
Fig. 7:
Here, we compare our work with various baselines. For Sheep In Forest, Hair, and Residential Lobby, we directly compare with Zhang et al.’s [2024] fast and robust variants. No prior resampling method quickly handles motion blur, so in Bistro Exterior and Emerald Square we compare against the naïve backprojected approach described in Section 4.4.1 (along with a robust variant similar to Zhang et al.’s).

7 Discussion

Below we discuss some key takeaways from our work and results.
Resampled reuse deteriorates under motion. While ReSTIR converges impressively for static cameras, more frequent mismatches between temporal samples degrade the history under motion (see Figure 2). This is particularly apparent near high frequencies.
Gather versus scatter. This question arises repeatedly in graphics. Typically, similar algorithms can be designed either way. But a splat-based scatter guarantees that every prior reservoir is tested for relevance to a current pixel; gathering via backprojected motion vectors makes no such guarantee, especially under motion. Quality gains from scattering alone are small in well-sampled scenes, but the differences grow near high frequencies (e.g., high-frequency geometry in Figures 1, 3, 7, and 9, and high-frequency material in Figure 8).
Fig. 8:
The Zero Day scene captured under moderate upward motion. Splatting improves high-frequency normal-mapped glossy materials compared to Area ReSTIR.
Splatting is fast and stable. Splatting performance depends on atomic contention. Even with fast motion, we averaged 0.6 to 0.9 valid prior-frame splats per pixel, with a maximum of around 5. These counts were well distributed, leading to good GPU utilization. Despite the additional overheads discussed in Section 5, splatting only requires visibility queries; these are cheaper than the primary rays traced by other shifts, which must find a new primary hit.
Backup samples are important for robustness. Splatting introduces holes, which grow when splats’ relative motion diverges. Backup samples help reduce these holes, albeit at increased cost from using both scattering and gathering. Figure 6 shows extremely fast motion, where these hole-filling benefits are clear. Interesting future work includes identifying backup reuse methods with smaller overheads. Even so, splatting with backup requires only six total shifts per pixel, while Zhang et al.’s [2024] robust reuse uses ten.
Multi-splatting. Multi-splatting offers another hole-reduction strategy; its performance cost is relatively small, and it can significantly improve motion blur quality (see Figures 3 and 5). The Sheep In Forest and Subway scenes in Figure 1 use 2 × splatting.
Improving depth of field. Zhang et al. [2024] introduced the first resampling method for depth of field, improving quality significantly. However, they combined two shift maps with MIS, as neither shift handled samples across all circles of confusion. This MIS combination adds variance, as one shift is nearly always better than the other.
Our splatting-based approach uses a single shift valid across the depth range, reducing noise near the focal plane and on high-frequency geometry. Additionally, due to the reduced shift count, our approach is routinely up to 10% faster (e.g., Figure 9 and Figure 7).
Fig. 9:
Captured under motion in the Bistro. Motion increases Area ReSTIR’s [Zhang et al. 2024] noise due to less temporal reuse. Our splatting better preserves sample history, reducing noise, especially on foliage and specular highlights where correspondences between frames are more difficult to maintain.

8 Limitations and Future Work

While splatting improves reuse for high-frequency details that remain in the image frame-to-frame, it does not improve quality for newly disoccluded surfaces; many of our tricky scenes have such pixels. While Zhang et al.’s [2024] and our backup samples can help fill such holes, better shifts or path mutations are likely needed for significant improvement (e.g., Kettunen et al. [2023]).
Extreme forward motion may map one pixel to many. Backward reprojection reuses a sample for many pixels, introducing correlation; splatting reuses one-to-one but may leave holes without backup samples. In simple cases where backprojected pixel footprints map well to the prior frame (e.g., Dining Room in Figure 10), splatting with backup may be slightly noisier than Area ReSTIR’s robust reuse, because robust reuse pulls from a 2 × 2 neighborhood, which provides more samples but induces more correlation.
Fig. 10:
The Dining Room captured under slow rightward movement. Due to simple geometry and linear movement, Area ReSTIR’s fractional reservoirs map well to the prior frame. Zhang et al.’s [2024] robust variant reuses from the 2 × 2 neighborhood, thus providing more effective samples than splatting plus a single backup sample.
While multi-splatting also helps fill holes, it can spread fireflies present in reservoirs into multiple pixels along directions of movement (see Figure 3, far right). Such fireflies may cause problems for many modern denoisers, so mitigating them requires some thought.
Splatting follows a forward motion vector, so it inherits problems related to such methods; pixels representing geometry seen in a mirror will be splatted based on the mirror geometry, not the virtual image. Our multi-splatting suggests we could follow multiple motion vectors per pixel to further improve the rendering of mirror-like surfaces, e.g., the clear-coated regions in Figure 8. This requires careful theoretical development.

9 Conclusion

We present a new method of forward projecting, or splatting, reservoirs between frames that improves reuse by reducing spuriously discarded sample histories. We show this works within Lin et al.’s [2022] generalized RIS theory, enabling splatting-based ReSTIR to remain unbiased.
Our work naturally applies to motion blur by repeatedly splatting samples for resampling. Furthermore, since we project exact primary hits, we also render depth of field with a single shift map, improving quality and performance over Zhang et al. [2024].
We thus believe splatting is the way forward for implementing ReSTIR in all scenarios. Enabling resampling for both scatter and gather operations enlarges the algorithmic toolbox, allowing ReSTIR to apply to a wider variety of problems, perhaps including better handling of mismatches between sampling density and screen size, or combination with more complex bidirectional rendering techniques.

Acknowledgments

We thank Aaron Lefohn, Bill Dally, and Eric Shaffer for supporting this research. We also thank the anonymous reviewers for pointing out areas for improvement and clarifications.
The following environment maps were acquired from PolyHaven: Kloofendaal Sky for Dining Room, Spruit Dawn for Sheep in Forest, Shanghai Bund for Hair, Misty Pines for Residential Lobby, and Signal Hill Sunrise for Emerald Square. The Dining Room and Wooden Staircase scenes were acquired from Bitterli’s [2016] rendering resources.

Footnotes

1
Failure to ensure invertibility, i.e., T(T⁻¹(y)) = y and T⁻¹(T(x)) = x, is a common source of bias.
2
Similar formulations are used in light tracing, bidirectional path tracing, vertex connection and merging, and related techniques [Dutré et al. 1993; Georgiev et al. 2012; Lafortune and Willems 1993; Veach and Guibas 1995].
3
Building on ideas from Area ReSTIR [Zhang et al. 2024], the target function is the measurement contribution function defined by Veach [1997].

Supplemental Material

PDF File - Supplemental Document
Derivations and details to supplement the main paper.
MP4 File - Supplemental Video
A supplemental video comparing the new technique to prior state of the art in various scenarios.
ZIP File - Results Viewer
Interactive comparison of full-resolution images in the paper's figures.

References

[1]
Stephen J Adelson and Larry F Hodges. 1995. Generating Exact Ray-Traced Animation Frames by Reprojection. IEEE Computer Graphics and Applications 15, 3 (1995), 43–52. https://doi.org/10.1109/38.376612
[2]
Pontus Andersson, Jim Nilsson, Peter Shirley, and Tomas Akenine-Möller. 2021. Visualizing Errors in Rendered High Dynamic Range Images. In Conference of the European Association for Computer Graphics (Eurographics) Short Papers. Eurographics-European Association for Computer Graphics. https://doi.org/10.2312/egs.20211015
[3]
Steve Bako, Thijs Vogels, Brian McWilliams, Mark Meyer, Jan Novák, Alex Harvill, Pradeep Sen, Tony DeRose, and Fabrice Rousselle. 2017. Kernel-Predicting Convolutional Networks for Denoising Monte Carlo Renderings. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2017) 36, 4, Article 97 (2017), 97:1–97:14 pages. https://doi.org/10.1145/3072959.3073708
[4]
Pablo Bauszat, Victor Petitjean, and Elmar Eisemann. 2017. Gradient-Domain Path Reusing. ACM Transactions on Graphics (TOG) 36, 6 (2017), 229:1–229:9. https://doi.org/10.1145/3130800.3130886

Index Terms

  1. Reservoir Splatting for Temporal Path Resampling and Motion Blur
