linumpy.gpu.n4#
GPU N4 bias field correction.
Implements the Tustison 2010 N4 algorithm using the B-spline primitive
in linumpy.gpu.bspline and a CuPy/NumPy-shared histogram
sharpening routine.
Each fitting level loops over:
Compute the log-residual
r = log(v) - log_biason masked voxels.Sharpen the residual histogram by Wiener-deconvolving it with a Gaussian PSF (Sled 1998 / Tustison 2010), producing a LUT mapping observed log-intensity to expected (unbiased) log-intensity.
Voxel-wise, compute the per-voxel log-bias update
delta = log(v) - LUT(log(v) - log_bias).Fit a tensor-product cubic B-spline to
deltaon a regular control grid, evaluate at full resolution, and add tolog_bias.
The next fitting level doubles the number of control points per axis.
Memory budget (per N4 call):
~6 x volume_size x 4 bytes
i.e. ~12 GB for a (256, 1024, 1024) float32 volume.
Functions#
|
Return the per-voxel sharpened log-intensity (LUT-mapped). |
|
GPU-accelerated N4 bias field correction. |
Module Contents#
- linumpy.gpu.n4.sharpen_residual(log_v, mask, *, n_bins=200, fwhm_log=0.15, wiener_noise=0.01, use_gpu=True, mask_weights=None)[source]#
Return the per-voxel sharpened log-intensity (LUT-mapped).
Implements the Sled/Tustison histogram sharpening: build the weighted log-intensity histogram restricted to mask, deconvolve it by a Gaussian PSF (Wiener-regularised), and return the LUT
E[v_true | v_obs]evaluated at every voxel in log_v.- Parameters:
log_v (np.ndarray) – Log-intensity volume (any shape, float32).
mask (np.ndarray or None) – Boolean mask; only masked voxels contribute to the histogram. When
None, all voxels are used.n_bins (int) – Histogram bin count.
fwhm_log (float) – Full-width-half-maximum of the Gaussian PSF in log-intensity units. Controls how much sharpening is applied (smaller FWHM means less sharpening, since the deconvolution kernel is narrower). N4 default is approximately 0.15.
wiener_noise (float) – Wiener regularisation term. Larger values stabilise the deconvolution at the expense of sharpening.
use_gpu (bool) – Use CuPy when available.
mask_weights (np.ndarray, optional) – Float32 view of mask (
mask.astype(float32)). When provided, skips the per-call cast – a full-volume allocation that the N4 fit loop otherwise repeats every iteration.
- Returns:
Sharpened log-intensity, same shape and dtype as log_v. Outside the mask, the input log-intensity is returned unchanged.
- Return type:
np.ndarray
- linumpy.gpu.n4.n4_correct_gpu(vol, mask=None, *, shrink_factor=4, n_iterations=None, spline_distance_mm=10.0, voxel_size_mm=(1.0, 1.0, 1.0), n_bins=200, fwhm_log=0.15, wiener_noise=0.01, convergence_tol=0.001, use_gpu=True, out=None, bias_out=None)[source]#
GPU-accelerated N4 bias field correction.
Faithful CuPy/NumPy port of the Tustison 2010 N4 algorithm: at each fitting level, alternate Sled-style histogram sharpening and tensor cubic B-spline scattered-data fitting until convergence. The B-spline control mesh is fixed across levels (matching SimpleITK’s behaviour);
n_iterationsonly controls per-level iteration counts and the residual is composed across levels.Parameters mirror
linumpy.intensity.bias_field.n4_correct()so the two backends are interchangeable. Extra knobs (n_bins,fwhm_log,wiener_noise) tune the sharpening histogram.- Parameters:
vol (np.ndarray) – Float32 input volume (Z, Y, X).
mask (np.ndarray or None) – Boolean tissue mask.
None= full volume.shrink_factor (int) – Isotropic spatial subsampling factor for the fit (>=1).
n_iterations (list of int or None) – Max iterations per fitting level. Length sets the number of levels. Default
[20, 20, 20]. Fewer iterations than the SimpleITK CPU backend because the GPU PSDB residual update has no internal multilevel dampening, so each iteration has full effect; more than ~20 per level causes the bias field to absorb true tissue contrast (verified empirically on live OCT).spline_distance_mm (float) – Approximate distance between B-spline control knots at level 0.
voxel_size_mm (3-tuple of float) – Voxel size (z, y, x) in mm.
n_bins (sharpening parameters) – See
sharpen_residual().fwhm_log (sharpening parameters) – See
sharpen_residual().wiener_noise (sharpening parameters) – See
sharpen_residual().convergence_tol (float) – Per-iteration convergence threshold on the relative L2 change of
log_bias. Iterations stop early when the change drops below this value.use_gpu (bool) – Use CuPy when available.
out (np.ndarray, optional) – Destination buffer for the corrected output (full
vol.shape, float32). When provided, the streaming Z-tile loop writes results directly into this buffer instead of allocating a fresh array. May safely alias the inputvol(the host buffer is not read after the initial H2D upload at function entry), saving a full-volume float32 allocation on large mosaics.bias_out (np.ndarray, optional) – Destination buffer for the bias-field output, same shape and dtype constraints as out.
- Returns:
corrected (np.ndarray) – Bias-corrected float32 volume (Z, Y, X), full resolution.
bias_field (np.ndarray) – Estimated multiplicative bias field, float32, full resolution.
- Return type: