linumpy.gpu.bspline#

Tensor-product cubic B-spline scattered-data approximation.

Provides a simple GPU/CPU primitive for fitting a smooth 3-D field to scattered (weighted) voxel samples on a regular control-point lattice and evaluating the resulting field at arbitrary voxel grids.

Used by linumpy.gpu.n4 for the bias-field B-spline update step, but kept generic so other smoothing/warp primitives can reuse it.

The fit implements the single-level Lee-Wolberg-Shin (1997) B-spline approximation that ITK uses inside BSplineScatteredDataPointSetToImageFilter (the engine of N4). For each scattered sample p with value v_p the locally-optimal value at surrounding control point c is:

phi_c(p) = w_c(p) * v_p / sum_d w_d(p)^2

and the per-control-point coefficient is the squared-weight average:

coeff[c] = sum_p w_c(p)^2 * phi_c(p) / sum_p w_c(p)^2
         = sum_p gamma_p * w_c(p)^3 * v_p / S(p)
           -------------------------------------
                   sum_p gamma_p * w_c(p)^2

where S(p) = sum_d w_d(p)^2 and gamma_p folds in the per-voxel mask/weight. Because the tensor-product basis is separable, w_c(p)^k factorises across axes and S(p) factorises into a product of per-axis sums of squared basis weights, so the fit reduces to three contiguous tensor contractions – one through B^3 for the numerator and one through B^2 for the denominator. This matches the ITK behaviour while remaining a single GPU-friendly tensordot chain.

An earlier implementation used a Nadaraya-Watson kernel regression (coeff[c] = sum_p w_c(p) * v_p / sum_p w_c(p)). That form has no implicit smoothness penalty and, at the dense control grids reached by later N4 fitting levels, lets the fit absorb tissue-scale features (e.g. white-matter contrast) into the bias estimate. PSDB’s squared weights regularise short-range support and recover the contrast.

Functions#

`bspline_fit_precompute`(bases, *[, eps])	Build the iteration-invariant constants used by `bspline_fit()`.
`bspline_fit`(values, weights, mask, n_control_points, *)	Fit a tensor-product cubic B-spline to scattered voxel samples.
`bspline_evaluate`(control_coeffs, target_shape, *[, ...])	Evaluate a cubic B-spline given control coefficients on a regular grid.

Module Contents#

linumpy.gpu.bspline.bspline_fit_precompute(bases, *, eps=1e-08)[source]#

Build the iteration-invariant constants used by bspline_fit().

The squared/cubed per-axis basis matrices and the separable per-voxel denominator S(p) = (sum_c M_z[z,c]^2)(sum_c M_y[y,c]^2)(sum_c M_x[x,c]^2) depend only on bases, so callers that issue many fits at the same shape (e.g. the N4 fitting loop) can build them once and pass them in via bspline_fit()’s precomputed argument.

Returns (M_z2, M_y2, M_x2, M_z3, M_y3, M_x3, S_safe) where S_safe = maximum(S, eps) – the per-iteration maximum against eps is folded into the precompute.

Parameters:

bases (tuple[Any, Any, Any])
eps (float)

Return type:

tuple[Any, Any, Any, Any, Any, Any, Any]

linumpy.gpu.bspline.bspline_fit(values, weights, mask, n_control_points, *, use_gpu=True, eps=1e-08, bases=None, precomputed=None)[source]#

Fit a tensor-product cubic B-spline to scattered voxel samples.

Parameters:

values (np.ndarray) – Sample values, shape (Z, Y, X), float32.
weights (np.ndarray or None) – Per-voxel non-negative weights (same shape). None = all ones.
mask (np.ndarray or None) – Boolean mask selecting which voxels participate in the fit. None = all voxels.
n_control_points (tuple of int) – Control-grid size (Cz, Cy, Cx). Each value must be >= 4.
use_gpu (bool) – Use CuPy when available; falls back to NumPy.
eps (float) – Floor on the kernel-weight denominator to avoid division by zero for control points with no support.
bases (tuple of arrays, optional) – Pre-built per-axis basis matrices (M_z, M_y, M_x) from _build_axis_basis() matching values.shape and n_control_points. When provided, skips the per-call build; useful when the caller (e.g. an N4 fitting level) issues many fits at the same shape.
precomputed (tuple of arrays, optional) – Output of bspline_fit_precompute() for the same bases. When provided, skips rebuilding the squared/cubed bases and the separable denominator S – a per-iteration full-volume allocation in the N4 fit loop.

Returns:

Control coefficients, shape n_control_points, float32 NumPy array (always returned on the host).

Return type:

np.ndarray

linumpy.gpu.bspline.bspline_evaluate(control_coeffs, target_shape, *, use_gpu=True, bases=None)[source]#

Evaluate a cubic B-spline given control coefficients on a regular grid.

Inverse of bspline_fit()’s coordinate mapping: target voxel 0 maps to control coordinate 0; target voxel N - 1 maps to Cn - 3.

Parameters:

control_coeffs (np.ndarray) – Control-grid coefficients, shape (Cz, Cy, Cx).
target_shape (tuple of int) – Output volume shape (Z, Y, X).
use_gpu (bool) – Use CuPy when available.
bases (tuple of arrays, optional) – Pre-built per-axis basis matrices (M_z, M_y, M_x) matching target_shape and control_coeffs.shape. When provided, skips the per-call build.

Returns:

Evaluated field, shape target_shape, float32.

Return type:

np.ndarray