linumpy.stack_alignment.filter

Outlier filtering and tile-offset correction for inter-slice shift fields.

Functions

filter_outlier_shifts(shifts_df[, max_shift_mm, ...])

Detect and filter outlier shifts that cause excessive drift.

correct_tile_offset_shifts(shifts_df, tile_fov_x_mm[, ...])

Correct pairwise shifts that are spurious integer multiples of an artifact step.

filter_step_outliers(shifts_df[, max_step_mm, window, ...])

Fix per-step spikes in shifts, independent of global outlier detection.

Module Contents

linumpy.stack_alignment.filter.filter_outlier_shifts(shifts_df, max_shift_mm=0.5, method='rehome', iqr_multiplier=1.5, return_fraction=0.4)

Detect and filter outlier shifts that cause excessive drift.

Parameters:
  • shifts_df (pd.DataFrame) – DataFrame with columns: fixed_id, moving_id, x_shift_mm, y_shift_mm

  • max_shift_mm (float) – Maximum allowed pairwise shift in mm. For the ‘iqr’ method, this value acts as a floor on the detection threshold.

  • method (str) –

    ‘clamp’, ‘median’, ‘zero’, ‘local’, ‘iqr’, or ‘rehome’.

    ‘rehome’ (recommended default): distinguishes genuine re-homing events from encoder glitch spikes. A step is corrected only if it is both large and approximately self-cancelling with an adjacent step. Specifically, the step at position i is treated as a spike when:

    |step[i] + step[i±1]| < return_fraction * |step[i]|
    

    i.e. the adjacent step reverses most of the displacement. Re-homing events (large step followed by small steps that stay at the new position) are left untouched. This makes the filter safe to enable by default without manual threshold tuning per subject.

  • iqr_multiplier (float) – Multiplier for IQR-based detection (only used by ‘iqr’ method).

  • return_fraction (float) – For ‘rehome’: a large step counts as self-cancelling when the residual of its round trip with an adjacent step is below this fraction of the step magnitude. With the default of 0.4, a large step is treated as a glitch spike if the adjacent step reverses more than 60 % of it.

Returns:

Filtered DataFrame with outlier shifts corrected.

Return type:

pd.DataFrame
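
The ‘rehome’ test described above can be illustrated on a plain list of per-slice steps. The following is a minimal sketch, not the linumpy implementation: find_glitch_spikes is a hypothetical helper, and it assumes that "large" means a magnitude above max_shift_mm, which the documentation does not state explicitly.

```python
def find_glitch_spikes(steps, max_shift_mm=0.5, return_fraction=0.4):
    """Return indices of steps that look like self-cancelling glitch spikes.

    Hypothetical helper, not part of linumpy. It only illustrates the test
    |step[i] + step[i±1]| < return_fraction * |step[i]|.
    Assumption: a step is "large" when its magnitude exceeds max_shift_mm.
    """
    spikes = []
    for i, step in enumerate(steps):
        if abs(step) <= max_shift_mm:
            continue  # small steps are never spike candidates
        # Compare with the previous and the next step, where they exist.
        for j in (i - 1, i + 1):
            if 0 <= j < len(steps) and abs(step + steps[j]) < return_fraction * abs(step):
                spikes.append(i)
                break
    return spikes


# Index 2 jumps +1.0 mm and index 3 jumps back: a glitch spike pair.
# Index 6 jumps +1.0 mm and the following steps stay small: a re-homing event.
steps = [0.01, -0.02, 1.0, -0.97, 0.02, 0.0, 1.0, 0.03, -0.01]
spike_indices = find_glitch_spikes(steps)
print(spike_indices)  # [2, 3]
```

Only the out-and-back pair is flagged; the persistent jump at index 6 is left alone, which is the behaviour the ‘rehome’ method is described as providing.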

linumpy.stack_alignment.filter.correct_tile_offset_shifts(shifts_df, tile_fov_x_mm, tile_fov_y_mm=None, tolerance=0.05, min_step_mm=0.0)

Correct pairwise shifts that are spurious integer multiples of an artifact step.

The XY shifts file records xmin_mm[fixed] - xmin_mm[moving], where xmin_mm[i] is the left-edge position of the mosaic grid for slice i. After each slice the acquisition software calls detect_mosaic to find the tissue boundary; if the boundary has moved, mosaic_xmin_mm is reset to the new position (minus a margin). This repositioning is recorded in the shifts file as an apparent lateral tissue drift even though the tissue itself did not move. The magnitude equals however far the detected tissue boundary shifted, which is determined by tissue geometry and the ROI detection algorithm – not by the overlap-corrected tile step or any stage hardware quantum.

Note

The artifact step tile_fov_x_mm must be empirically determined from the shifts_xy.csv data. It is not equal to tile_size_um x (1 - overlap_fraction) / 1000 (the stitching tile step). To find the correct value, inspect the x_shift_mm column for a cluster of near-equal large steps; that common value is the artifact step.

These steps are persistent (not self-cancelling) and therefore survive the spike detector in filter_outlier_shifts unmodified. This function strips the integer-artifact-step component from each shift, leaving only the true inter-slice tissue drift.

This function checks each pairwise step independently: if the X component is within tolerance of N x tile_fov_x_mm (for integer N ≠ 0), the offset N x tile_fov_x_mm is subtracted, recovering the true tissue drift. The same is done for the Y component independently.
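
The note above recommends finding the artifact step by looking for a cluster of near-equal large values in x_shift_mm. One rough way to automate that inspection is sketched below. estimate_artifact_step is a hypothetical helper, not part of linumpy, and its min_step_mm and bin_mm defaults are illustrative assumptions that would need tuning on real data. Manual inspection of the column remains the documented procedure.

```python
from collections import Counter
from statistics import mean


def estimate_artifact_step(shifts_mm, min_step_mm=0.2, bin_mm=0.02):
    """Guess the artifact step as the mean of the most populated bin of large steps.

    Hypothetical helper, not part of linumpy. min_step_mm and bin_mm are
    illustrative values, not library defaults.
    """
    large = [abs(s) for s in shifts_mm if abs(s) >= min_step_mm]
    if not large:
        return None  # no large steps, so nothing to estimate
    # Assign each magnitude to the nearest multiple of bin_mm and count.
    counts = Counter(round(v / bin_mm) for v in large)
    best_bin, _ = counts.most_common(1)[0]
    cluster = [v for v in large if round(v / bin_mm) == best_bin]
    return mean(cluster)


# Made-up values standing in for shifts_df["x_shift_mm"].tolist().
x_shift_mm = [0.01, 0.44, -0.02, 0.445, 0.0, -0.44, 0.03, 0.9]
artifact_step = estimate_artifact_step(x_shift_mm)
print(artifact_step)
```

Fixed-width binning can split a cluster that straddles a bin edge, so the result is best treated as a starting point to confirm against the raw values.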

Parameters:
  • shifts_df (pd.DataFrame) – DataFrame with columns: fixed_id, moving_id, x_shift_mm, y_shift_mm (and optionally x_shift, y_shift in pixels).

  • tile_fov_x_mm (float) – Empirically determined artifact step size in X (mm). Must be found from the shifts data – see note above.

  • tile_fov_y_mm (float, optional) – Artifact step size in Y (mm), determined in the same way as tile_fov_x_mm. Defaults to tile_fov_x_mm.

  • tolerance (float) – Fractional tolerance: a component is treated as a tile-multiple when |component - N x fov| / fov < tolerance. Default 0.05 (5 %).

  • min_step_mm (float) – Only inspect steps whose magnitude exceeds this value (mm). Default 0 – all steps are checked.

Returns:

  • pd.DataFrame – Corrected DataFrame.

  • list[int] – Indices of rows that were modified.

Return type:

tuple[pandas.DataFrame, list[int]]
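
The per-component correction and the tolerance test can be sketched on plain floats. This is a sketch under stated assumptions, not the library code: strip_artifact_multiple and correct_components are hypothetical helpers, and min_step_mm is applied here to the single component, whereas the real function may compare it against the full step magnitude.

```python
def strip_artifact_multiple(component_mm, step_mm, tolerance=0.05, min_step_mm=0.0):
    """Remove the N x step_mm part of one shift component, if present.

    Hypothetical helper, not part of linumpy.
    Returns (corrected_value, n), where n is the integer multiple removed
    (0 when the component is left unchanged).
    """
    if step_mm <= 0 or abs(component_mm) <= min_step_mm:
        return component_mm, 0
    n = round(component_mm / step_mm)  # nearest integer multiple
    # Documented test: |component - N x fov| / fov < tolerance, with N != 0.
    if n != 0 and abs(component_mm - n * step_mm) / step_mm < tolerance:
        return component_mm - n * step_mm, n
    return component_mm, 0


def correct_components(values_mm, step_mm, tolerance=0.05):
    """Apply the correction to each value and report which indices changed."""
    corrected, modified = [], []
    for idx, value in enumerate(values_mm):
        new_value, n = strip_artifact_multiple(value, step_mm, tolerance)
        corrected.append(new_value)
        if n != 0:
            modified.append(idx)
    return corrected, modified


# Made-up X shifts with an assumed artifact step of 0.44 mm.
x_shifts = [0.01, 0.45, -0.02, -0.89, 0.20]
corrected, modified = correct_components(x_shifts, step_mm=0.44)
print(modified)  # [1, 3]
```

Here 0.45 is within 5 % of 1 x 0.44 and -0.89 is within 5 % of -2 x 0.44, so both are reduced to a residual of about ±0.01 mm, while 0.20 is not close to any non-zero multiple and is kept.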

linumpy.stack_alignment.filter.filter_step_outliers(shifts_df, max_step_mm=0.0, window=2, method='local_median', mad_threshold=3.0, return_fraction=0.0)

Fix per-step spikes in shifts, independent of global outlier detection.

Parameters:
  • shifts_df (pd.DataFrame) – DataFrame with shift columns.

  • max_step_mm (float) – Maximum allowed per-step shift in mm. Set to 0 to disable the limit (applies to the ‘clamp’ and ‘local_median’ methods).

  • window (int) – Neighbor window size.

  • method (str) – ‘clamp’, ‘local_median’, or ‘local_mad’.

  • mad_threshold (float) – Number of median absolute deviations (MADs) above the local median at which a step is flagged as an outlier (used by the ‘local_mad’ method).

  • return_fraction (float) – For all methods: if a flagged large step is NOT self-cancelling with an adjacent step (round-trip > return_fraction * step_mag), it is treated as a re-homing event and left unchanged. Set to 0 to disable this guard (legacy behaviour).

Returns:

Filtered DataFrame.

Return type:

pd.DataFrame
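
The interaction between max_step_mm, window and the return_fraction guard can be illustrated with a ‘local_median’-style filter on a plain list. This is a rough sketch rather than the linumpy implementation: local_median_filter is a hypothetical helper, and it assumes that window counts neighbours on each side and that a flagged step is replaced by the median of its neighbours.

```python
from statistics import median


def local_median_filter(steps, max_step_mm, window=2, return_fraction=0.0):
    """Replace over-large steps by the median of their neighbours.

    Hypothetical helper, not part of linumpy.
    Assumptions: window is the number of neighbours taken on each side, and
    neighbours are read from the original (unfiltered) steps.
    """
    if max_step_mm <= 0:
        return list(steps)  # 0 disables the filter
    out = list(steps)
    for i, step in enumerate(steps):
        if abs(step) <= max_step_mm:
            continue
        if return_fraction > 0:
            # Re-homing guard: keep the step unless an adjacent step undoes it.
            adjacent = [steps[j] for j in (i - 1, i + 1) if 0 <= j < len(steps)]
            round_trip = min((abs(step + a) for a in adjacent), default=abs(step))
            if round_trip > return_fraction * abs(step):
                continue  # persistent jump, treated as re-homing
        lo, hi = max(0, i - window), min(len(steps), i + window + 1)
        neighbours = [steps[j] for j in range(lo, hi) if j != i]
        if neighbours:
            out[i] = median(neighbours)
    return out


steps = [0.01, 0.02, 0.8, -0.78, 0.01, 0.02, 0.9, 0.03, 0.01]
guarded = local_median_filter(steps, max_step_mm=0.5, return_fraction=0.4)
legacy = local_median_filter(steps, max_step_mm=0.5, return_fraction=0.0)
print(guarded[6], legacy[6])
```

With the guard enabled, the out-and-back pair at indices 2 and 3 is smoothed while the persistent 0.9 mm jump at index 6 is kept. With return_fraction=0 the jump is smoothed as well, which corresponds to the legacy behaviour described above.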