API Reference

class spotiflow.model.spotiflow.Spotiflow(config: SpotiflowModelConfig | None = None)

Supervised spot detector that uses a multi-stage neural network backbone for feature extraction, followed by resolution-dependent post-processing modules that allow loss computation and optimization at different resolution levels.

fit(train_images: Sequence[ndarray], train_spots: Sequence[ndarray], val_images: Sequence[ndarray], val_spots: Sequence[ndarray], augment_train: bool | Pipeline = True, save_dir: str | None = None, train_config: dict | SpotiflowTrainingConfig | None = None, device: 'auto' | 'cpu' | 'cuda' | 'mps' = 'auto', logger: 'none' | 'tensorboard' | 'wandb' = 'tensorboard', number_of_devices: int | None = 1, num_workers: int | None = 0, callbacks: Sequence[Callback] | None = None, deterministic: bool | None = True, benchmark: bool | None = False, **dataset_kwargs)

Train a Spotiflow model.

Parameters:
train_images : Sequence[np.ndarray]

training images

train_spots : Sequence[np.ndarray]

training spots

val_images : Sequence[np.ndarray]

validation images

val_spots : Sequence[np.ndarray]

validation spots

augment_train : Union[bool, Pipeline]

whether to augment the training data. Alternatively, a custom augmentation pipeline can be passed. Defaults to True.

save_dir : Optional[str], optional

directory to save the model to. Must be given if no checkpointing callback is provided. Defaults to None.

train_config : Optional[Union[dict, SpotiflowTrainingConfig]], optional

training config. If not given, the default config will be used. Defaults to None.

device : Literal["auto", "cpu", "cuda", "mps"], optional

computing device to use. Can be “auto”, “cpu”, “cuda” or “mps”. If “auto”, the device is chosen based on the available hardware. Defaults to “auto”.

logger : Literal["none", "tensorboard", "wandb"], optional

logger to use. Can be “none”, “tensorboard” or “wandb”. Defaults to “tensorboard”.

number_of_devices : Optional[int], optional

number of accelerating devices to use. Only applicable to “cuda” acceleration. Defaults to 1.

num_workers : Optional[int], optional

number of workers to use for data loading. Defaults to 0 (main process only).

callbacks : Optional[Sequence[pl.callbacks.Callback]], optional

callbacks to use during training. Defaults to no callbacks.

deterministic : Optional[bool], optional

whether to use deterministic training. Set to True for deterministic behaviour at a cost of performance. Defaults to True.

benchmark : Optional[bool], optional

whether to use benchmarking. Set to False for deterministic behaviour at a cost of performance. Defaults to False.

**dataset_kwargs

additional arguments to pass to the SpotsDataset class. Defaults to no additional arguments.
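
A minimal training sketch. The arrays are synthetic placeholders, the save_dir path is illustrative, and the train_config dict only overrides a single field:

    import numpy as np
    from spotiflow.model import Spotiflow

    # Synthetic placeholder data: images with matching (y, x) spot coordinates.
    train_images = [np.random.rand(512, 512).astype(np.float32) for _ in range(4)]
    train_spots = [np.random.uniform(0, 512, size=(20, 2)) for _ in range(4)]
    val_images = [np.random.rand(512, 512).astype(np.float32)]
    val_spots = [np.random.uniform(0, 512, size=(20, 2))]

    model = Spotiflow()  # default SpotiflowModelConfig
    model.fit(
        train_images,
        train_spots,
        val_images,
        val_spots,
        save_dir="models/my_spotiflow_model",  # required if no checkpointing callback is given
        train_config={"num_epochs": 10},       # optional overrides, passed as a dict
        device="auto",
    )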

classmethod from_folder(pretrained_path: str, inference_mode=True, which: str = 'best', map_location: 'auto' | 'cpu' | 'cuda' | 'mps' = 'auto', verbose: bool = False) Self

Load a pretrained model.

Parameters:
pretrained_path : str

path to the model folder

inference_mode : bool, optional

whether to set the model in eval mode. Defaults to True.

which : str, optional

which checkpoint to load. Defaults to “best”.

map_location : str, optional

device string to load the model to. Defaults to ‘auto’ (hardware-based).

Returns:

loaded model

Return type:

Self
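
A loading sketch, assuming a model folder previously produced by fit() or save() (the path is illustrative):

    from spotiflow.model import Spotiflow

    # Load the best checkpoint from a previously saved model folder.
    model = Spotiflow.from_folder(
        "models/my_spotiflow_model",
        inference_mode=True,
        which="best",
        map_location="auto",
    )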

classmethod from_pretrained(pretrained_name: str, inference_mode: bool = True, which: str = 'best', map_location: 'auto' | 'cpu' | 'cuda' | 'mps' = 'auto', cache_dir: Path | str | None = None, verbose: bool = True, **kwargs) Self

Load a pretrained model with the given name

Parameters:
pretrained_name : str

name of the pretrained model to be loaded

inference_mode : bool, optional

whether to set the model in eval mode. Defaults to True.

which : str, optional

which checkpoint to load. Defaults to “best”.

map_location : str, optional

device string to load the model to. Defaults to ‘auto’ (hardware-based).

cache_dir : Optional[Union[Path, str]], optional

directory to cache the model. Defaults to None. If None, will use the default cache directory (given by the env var SPOTIFLOW_CACHE_DIR if set, otherwise ~/.spotiflow).

Returns:

loaded model

Return type:

Self
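
A loading sketch; “general” is used here only as an example of a registered model name:

    from spotiflow.model import Spotiflow

    # Download (and cache) a pretrained model by its registry name.
    model = Spotiflow.from_pretrained("general", map_location="auto")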

load(path: str, which: 'best' | 'last' = 'best', inference_mode: bool = True, map_location: str = 'cuda') None

Load a model from disk.

Parameters:
path : str

folder to load the model from

which : Literal['best', 'last'], optional

which checkpoint to load. Defaults to “best”.

inference_mode : bool, optional

whether to set the model in eval mode. Defaults to True.

map_location : str, optional

device string to load the model to. Defaults to ‘cuda’.

optimize_threshold(val_ds: Dataset, cutoff_distance: int = 3, min_distance: int = 1, exclude_border: bool = False, threshold_range: tuple[float, float] = (0.3, 0.7), niter: int = 11, batch_size: int = 1, device: device | 'auto' | 'cpu' | 'cuda' | 'mps' | None = None, subpix: bool | None = None) None

Optimize the probability threshold on an annotated dataset.

The metric used to optimize is the F1 score.

Parameters:
val_ds : torch.utils.data.Dataset

dataset to optimize on

cutoff_distance : int, optional

distance tolerance considered for points matching. Defaults to 3.

min_distance : int, optional

Minimum distance between spots for NMS. Defaults to 1.

exclude_border : bool, optional

Whether to exclude spots at the border. Defaults to False.

threshold_range : Tuple[float, float], optional

Range of thresholds to consider. Defaults to (0.3, 0.7).

niter : int, optional

number of iterations for both coarse- and fine-grained search. Defaults to 11.

batch_size : int, optional

batch size to use. Defaults to 1.

device : Optional[Union[torch.device, Literal["auto", "cpu", "cuda", "mps"]]], optional

computing device to use. If None, will infer from model location. If “auto”, will infer from available hardware. Defaults to None.

subpix : Optional[bool], optional

whether to use the stereographic flow to compute subpixel localization. If None, will deduce from the model configuration. Defaults to None.
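
A usage sketch, assuming model is a trained Spotiflow instance and val_ds is an annotated validation dataset such as a SpotsDataset (see below):

    # Tune the detection threshold on annotated validation data. The optimized
    # threshold is stored with the model and used by predict() when prob_thresh is None.
    model.optimize_threshold(
        val_ds,
        cutoff_distance=3,
        threshold_range=(0.3, 0.7),
        niter=11,
        batch_size=1,
    )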

predict(img: ndarray | Array, prob_thresh: float | None = None, n_tiles: tuple[int] = None, max_tile_size: int = None, min_distance: int = 1, exclude_border: bool = False, scale: int | None = None, subpix: bool | int | None = None, peak_mode: 'skimage' | 'fast' = 'fast', normalizer: Callable | 'auto' | None = 'auto', verbose: bool = True, progress_bar_wrapper: Callable | None = None, device: device | 'auto' | 'cpu' | 'cuda' | 'mps' | None = None, distributed_params: dict | None = None) tuple[ndarray, SimpleNamespace]

Predict spots in an image.

Parameters:
img : Union[np.ndarray, da.Array]

input image

prob_thresh : Optional[float], optional

Probability threshold for peak detection. If None, will load the optimal one. Defaults to None.

n_tiles : Optional[Tuple[int, ...]], optional

Number of tiles to split the image into, one entry per spatial axis. Defaults to None (equivalent to (1, 1), i.e. no tiling).

min_distance : int, optional

Minimum distance between spots for NMS. Defaults to 1.

exclude_border : bool, optional

Whether to exclude spots at the border. Defaults to False.

scale : Optional[int], optional

Scale factor to apply to the image. Defaults to None.

subpix : Optional[bool], optional

Whether to use the stereographic flow to compute subpixel localization. If None, will deduce from the model configuration. Defaults to None.

peak_mode : str, optional

Peak detection mode (can be either “skimage” or “fast”, which is a faster custom C++ implementation). Defaults to “fast”.

normalizer : Optional[Union[Literal["auto"], callable]], optional

Normalizer to use. If None, will use the default normalizer. Defaults to “auto” (percentile-based normalization with p_min=1, p_max=99.8).

verbose : bool, optional

Whether to print logs and progress. Defaults to True.

progress_bar_wrapper : Optional[callable], optional

Progress bar wrapper to use. Defaults to None.

device : Optional[Union[torch.device, Literal["auto", "cpu", "cuda", "mps"]]], optional

computing device to use. If None, will infer from model location. If “auto”, will infer from available hardware. Defaults to None.

Returns:

Tuple of (points, details). Points are the coordinates of the spots. Details is a namespace containing the spot-wise probabilities (prob), the heatmap (heatmap), the stereographic flow (flow), the 2D local offset vector field (subpix) and the spot intensities (intens).

Return type:

Tuple[np.ndarray, SimpleNamespace]
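
A minimal inference sketch; the pretrained name and the random input image are placeholders:

    import numpy as np
    from spotiflow.model import Spotiflow

    model = Spotiflow.from_pretrained("general")       # example pretrained model name
    img = np.random.rand(512, 512).astype(np.float32)  # placeholder image

    points, details = model.predict(img, device="auto")
    print(points.shape)        # (n_spots, 2) spot coordinates
    print(details.prob.shape)  # spot-wise probabilities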

save(path: str, which: 'best' | 'last' = 'best', update_thresholds: bool = False) None

Save the model to disk.

Parameters:
path : str

folder to save the model to

which : Literal["best", "last"]

which checkpoint to save. Should be either “best” or “last”.

update_thresholds : bool, optional

whether to update the thresholds file. Defaults to False.
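
A usage sketch, assuming model is a trained Spotiflow instance (the path is illustrative):

    # Persist the best checkpoint together with the optimized detection thresholds.
    model.save("models/my_spotiflow_model", which="best", update_thresholds=True)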

class spotiflow.model.config.SpotiflowModelConfig(backbone: 'resnet' | 'unet' | 'unet_res' = 'unet', in_channels: int = 1, out_channels: int = 1, initial_fmaps: int = 32, fmap_inc_factor: Number = 2, n_convs_per_level: int = 3, levels: int = 4, downsample_factor: int = 2, kernel_size: int = 3, padding: int | str = 'same', mode: 'direct' | 'fpn' | 'slim' = 'slim', background_remover: bool = False, compute_flow: bool = True, batch_norm: bool = True, downsample_factors: tuple[tuple[int, int]] | None = None, kernel_sizes: tuple[tuple[int, int]] | None = None, dropout: float = 0.0, sigma: Number = 1.0, is_3d: bool = False, grid: int | tuple[int, int, int] = (1, 1, 1), **kwargs)
class spotiflow.model.config.SpotiflowTrainingConfig(crop_size: int | tuple[int, int] | tuple[int, int, int] = 512, smart_crop: bool = False, heatmap_loss_f: str = 'bce', flow_loss_f: str = 'l1', loss_levels: int | None = None, num_train_samples: int | None = None, pos_weight: Number = 10.0, lr: float = 0.0003, optimizer: str = 'adamw', batch_size: int = 4, lr_reduce_patience: int = 10, num_epochs: int = 200, finetuned_from: str | None = None, early_stopping_patience: int = 0, crop_size_depth: int = 32, **kwargs)
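
A configuration sketch using a few of the fields above; the values are illustrative, not recommendations:

    from spotiflow.model import Spotiflow
    from spotiflow.model.config import SpotiflowModelConfig, SpotiflowTrainingConfig

    # A two-channel model with a slightly smaller backbone, and a shorter training schedule.
    model_config = SpotiflowModelConfig(in_channels=2, initial_fmaps=16, levels=3)
    train_config = SpotiflowTrainingConfig(crop_size=256, batch_size=2, num_epochs=50)

    model = Spotiflow(model_config)
    # train_config can then be passed to model.fit(..., train_config=train_config)
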
class spotiflow.data.spots.SpotsDataset(images: Sequence[ndarray], centers: Sequence[ndarray], augmenter: Callable | None = None, downsample_factors: Sequence[int] = (1,), sigma: float = 1.0, mode: str = 'max', compute_flow: bool = False, image_files: Sequence[str] | None = None, normalizer: 'auto' | Callable | None = 'auto', add_class_label: bool = True, grid: Sequence[int] | None = None)

Base spot dataset class instantiated with loaded images and centers.

__init__(images: Sequence[ndarray], centers: Sequence[ndarray], augmenter: Callable | None = None, downsample_factors: Sequence[int] = (1,), sigma: float = 1.0, mode: str = 'max', compute_flow: bool = False, image_files: Sequence[str] | None = None, normalizer: 'auto' | Callable | None = 'auto', add_class_label: bool = True, grid: Sequence[int] | None = None)

Constructor

Parameters:
images : Sequence[np.ndarray]

Sequence of images.

centers : Sequence[np.ndarray]

Sequence of center coordinates.

augmenter : Optional[Callable], optional

Augmenter function. If given, it should take two arguments (the image first, then the spots). Defaults to None.

downsample_factors : Sequence[int], optional

Downsample factors. Defaults to (1,).

sigma : float, optional

Sigma of Gaussian kernel to generate heatmap. Defaults to 1.

mode : str, optional

Mode of heatmap generation. Defaults to “max”.

compute_flow : bool, optional

Whether to compute flow from centers. Defaults to False.

image_files : Optional[Sequence[str]], optional

Sequence of image filenames. If the dataset was not constructed from a folder, this will be None. Defaults to None.

normalizer : Union[Literal["auto"], Callable, None], optional

Normalizer function. Defaults to “auto” (percentile-based normalization with p_min=1 and p_max=99.8).
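
A construction sketch with small in-memory placeholder data:

    import numpy as np
    from spotiflow.data.spots import SpotsDataset

    # Placeholder images with matching (y, x) spot coordinates.
    images = [np.random.rand(256, 256).astype(np.float32) for _ in range(4)]
    centers = [np.random.uniform(0, 256, size=(30, 2)) for _ in range(4)]

    ds = SpotsDataset(images, centers, sigma=1.0, compute_flow=False)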

property augmenter : Callable

Return augmenter function.

Returns:

Augmenter function. It should take image and centers as input.

Return type:

Callable

property centers : Sequence[ndarray]

Return centers of spots in dataset.

Returns:

Sequence of center coordinates.

Return type:

Sequence[np.ndarray]

classmethod from_folder(path: Path | str, augmenter: Callable | None = None, downsample_factors: Sequence[int] = (1,), sigma: float = 1.0, image_extensions: Sequence[str] = ('tif', 'tiff', 'png', 'jpg', 'jpeg'), mode: str = 'max', max_files: int | None = None, compute_flow: bool = False, normalizer: Callable | 'auto' | None = 'auto', random_state: int | None = None, add_class_label: bool = True, grid: Sequence[int] | None = None) Self

Build dataset from folder. Images and centers are loaded from disk and normalized.

Parameters:
path : Union[Path, str]

Path to folder containing images (with given extensions) and centers.

augmenter : Callable

Augmenter function.

downsample_factors : Sequence[int], optional

Downsample factors. Defaults to (1,).

sigma : float, optional

Sigma of Gaussian kernel to generate heatmap. Defaults to 1.

image_extensions : Sequence[str], optional

Image file extensions to look for. Defaults to (“tif”, “tiff”, “png”, “jpg”, “jpeg”).

mode : str, optional

Mode of heatmap generation. Defaults to “max”.

max_files : Optional[int], optional

Maximum number of files to load. Defaults to None (all of them).

compute_flow : bool, optional

Whether to compute flow from centers. Defaults to False.

normalizer : Optional[Union[Callable, Literal["auto"]]], optional

Normalizer function. Defaults to “auto” (percentile-based normalization with p_min=1 and p_max=99.8).

random_state : Optional[int], optional

Random state used to shuffle the file names when “max_files” is not None. Defaults to None.

Returns:

Dataset instance.

Return type:

Self
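
A usage sketch; the folder path and its layout (image files with matching coordinate CSV files) are assumptions for illustration:

    from spotiflow.data.spots import SpotsDataset

    # Build a dataset directly from an annotated folder on disk.
    train_ds = SpotsDataset.from_folder("data/train", compute_flow=True, sigma=1.0)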

property image_files : Sequence[str]

Return image filenames with the same order as in the dataset.

Returns:

Sequence of image filenames. If the dataset was not constructed from a folder, this will be None.

Return type:

Union[Sequence[str], None]

property images : Sequence[ndarray]

Return images in dataset.

Returns:

Sequence of images.

Return type:

Sequence[np.ndarray]

property n_classes : int

Return number of classes in the dataset.

Returns:

number of spot classes.

Return type:

int

spotiflow.utils.get_data(path: Path | str, normalize: bool = True, include_test: bool = False, is_3d: bool = False) tuple[ndarray, ndarray, ndarray, ndarray]

Get data from a given path. The path should contain ‘train’ and ‘val’ subfolders.

Parameters:
path : Union[Path, str]

Path to the data.

normalize : bool, optional

Whether to normalize the data. Defaults to True.

Returns:

A tuple of four arrays: the training images, training spots, validation images and validation spots.

Return type:

Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]
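
A usage sketch, assuming data/ contains ‘train’ and ‘val’ subfolders:

    from spotiflow.utils import get_data

    # Returns normalized training and validation images together with their spots.
    train_imgs, train_spots, val_imgs, val_spots = get_data("data", normalize=True)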

spotiflow.utils.normalize(x: ndarray, pmin: float = 1.0, pmax: float = 99.8, subsample: int = 1, clip: bool = False, ignore_val: int | float | None = None) ndarray

Normalizes (percentile-based) a 2D image, with the additional option to ignore a value. The normalization is done as follows:

x_norm = (x - I_pmin) / (I_pmax - I_pmin)

where I_pmin and I_pmax are the pmin-th and pmax-th percentiles of the image intensity, respectively.

Parameters:
x : np.ndarray

Image to be normalized

pmin : float, optional

Minimum percentile. Defaults to 1.0.

pmax : float, optional

Maximum percentile. Defaults to 99.8.

subsample : int, optional

Subsampling factor for percentile calculation. Defaults to 1.

clip : bool, optional

Whether to clip the normalized image. Defaults to False.

ignore_val : Optional[Union[int, float]], optional

Value to be ignored. Defaults to None.

Returns:

Normalized image

Return type:

np.ndarray
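
A usage sketch with a placeholder image:

    import numpy as np
    from spotiflow.utils import normalize

    img = (np.random.rand(512, 512) * 1000).astype(np.float32)  # placeholder image
    img_norm = normalize(img, pmin=1, pmax=99.8)                 # rescale using intensity percentiles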

spotiflow.utils.read_coords_csv(fname: str, add_class_column: bool = False) ndarray

Parses a CSV file and returns a correctly ordered points array

Parameters:
fname : str

Path to the csv file

add_class_column : bool, optional

Whether to add a class column to the points array. Defaults to False.

Returns:

A 2D array of spot coordinates. If add_class_column is True, the array will have shape (N, 3), where N is the number of points. Otherwise, the array will have shape (N, 2).

Return type:

np.ndarray

spotiflow.utils.write_coords_csv(pts: ndarray, fname: Path | str) None

Writes points in a NumPy array to a CSV file

Parameters:
pts : np.ndarray

2D or 3D array of points

fname : Union[Path, str]

Path to the CSV file
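
A round-trip sketch using both CSV helpers (the filename is illustrative):

    import numpy as np
    from spotiflow.utils import read_coords_csv, write_coords_csv

    pts = np.random.uniform(0, 512, size=(50, 2))  # placeholder coordinates
    write_coords_csv(pts, "spots.csv")
    pts_loaded = read_coords_csv("spots.csv")      # (50, 2) array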

spotiflow.sample_data.test_image_hybiss_2d()

Single test HybISS image from the Spotiflow paper (doi.org/10.1101/2024.02.01.578426)

spotiflow.sample_data.test_image_synth_3d()

Single synthetic volumetric stack from the Spotiflow paper (doi.org/10.1101/2024.02.01.578426)

spotiflow.sample_data.test_image_terra_2d()

Single test Terra frame from the Spotiflow paper (doi.org/10.1101/2024.02.01.578426)
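
These helpers return ready-to-use test images, e.g. for a quick prediction; the pretrained name below is only an example:

    from spotiflow.model import Spotiflow
    from spotiflow.sample_data import test_image_hybiss_2d

    img = test_image_hybiss_2d()
    model = Spotiflow.from_pretrained("general")  # example pretrained model name
    points, details = model.predict(img)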