Models Module
Module for defining models for landmark detection.
- class landmarker.models.AddCoordChannels
Adds the x and y coordinates of each pixel as additional channels to the input tensor. Optionally, it can also add the radial distance of each pixel to the center of the image as an additional channel. This is done to provide the network with spatial information.
- Parameters:
radial_channel (bool, optional) - whether to add the radial distance of each pixel to the center of the image as an additional channel. Defaults to False.
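To make the description concrete, here is a minimal plain-Python sketch of the channels such a layer would append. It is not the library's implementation: the helper name `coord_channels` is hypothetical, and scaling the coordinates to [-1, 1] is an assumption (a common CoordConv convention), so the actual value range in `AddCoordChannels` may differ.

```python
import math


def coord_channels(height, width, radial_channel=False):
    """Return coordinate channels for a height x width image as nested lists.

    Assumption: coordinates are scaled linearly to [-1, 1]; the library may
    use a different range.
    """

    def scaled(index, size):
        # Map pixel index 0..size-1 onto [-1, 1]; a single pixel maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    # x varies along columns, y varies along rows.
    x_channel = [[scaled(col, width) for col in range(width)] for _ in range(height)]
    y_channel = [[scaled(row, height) for _ in range(width)] for row in range(height)]
    channels = [x_channel, y_channel]
    if radial_channel:
        # Euclidean distance of each pixel to the image center at (0, 0).
        channels.append(
            [
                [math.hypot(x, y) for x, y in zip(x_row, y_row)]
                for x_row, y_row in zip(x_channel, y_channel)
            ]
        )
    return channels


channels = coord_channels(3, 3, radial_channel=True)
print(channels[0][0])     # x coordinates of the first row: [-1.0, 0.0, 1.0]
print(channels[2][1][1])  # radial distance at the center pixel: 0.0
```

In a network, these channels would be concatenated to the input along the channel dimension, so a 2D input gains two channels, or three with the radial channel.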
- class landmarker.models.CholeskyHourglass
Proposed in "UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss" - Kumar et al. (2019). Note that Kumar et al. use DU-Net as the backbone, whereas this implementation uses the residual hourglass.
- Parameters:
img_size (tuple[int, int]) - size of the input image.
in_channels (int) - number of input channels.
out_channels (int) - number of output channels.
channels (Sequence[int], optional) - number of output channels for each convolutional layer.
subunits (int, optional) - number of subunits in each convolutional layer.
up_sample_mode (str, optional) - upsampling mode. Defaults to "nearest".
- Returns:
pred: predicted heatmaps of shape (batch_size, out_channels, *img_dims)
cen: covariance matrices of the predicted heatmaps of shape (batch_size, out_channels, 2, 2)
- __init__(img_size, in_channels, out_channels, channels=[64, 128, 256, 512], conv_block=<class 'monai.networks.blocks.convolutions.ResidualUnit'>, up_sample_mode='nearest')
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
img_size (tuple[int, int])
in_channels (int)
out_channels (int)
channels (Sequence[int])
conv_block (Module)
up_sample_mode (str)
- Return type:
None
- class landmarker.models.CoordConvLayer
CoordConv is a convolutional layer that adds the x and y coordinates of each pixel as additional channels to the input tensor. Optionally, it can also add the radial distance of each pixel to the center of the image as an additional channel. This is done to provide the network with spatial information.
- source: "An intriguing failing of convolutional neural networks and the CoordConv solution" - Liu et al.
- Parameters:
spatial_dims (int) - number of spatial dimensions of the input image.
in_channels (int) - number of input channels.
out_channels (int) - number of output channels.
radial_channel (bool, optional) - whether to add the radial distance of each pixel to the center of the image as an additional channel. Defaults to False.
conv_block (nn.Module, optional) - convolutional block to use. Defaults to ResidualUnit.
- __init__(spatial_dims, in_channels, out_channels, radial_channel=False, conv_block=<class 'monai.networks.blocks.convolutions.ResidualUnit'>)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
spatial_dims (int)
in_channels (int)
out_channels (int)
radial_channel (bool)
conv_block (Module)
- class landmarker.models.Hourglass
The hourglass network has symmetrical encoder and decoder paths. The encoder path downsamples the input image, while the decoder path upsamples it again. Skip connections between the encoder and decoder paths preserve spatial information. The network was originally proposed for human pose estimation.
Proposed in: "Stacked Hourglass Networks for Human Pose Estimation" - Newell et al. (2016)
- Parameters:
spatial_dims (int) - number of spatial dimensions of the input image.
in_channels (int) - number of input channels.
out_channels (int) - number of output channels.
channels (Sequence[int], optional) - number of output channels for each convolutional layer.
conv_block (nn.Module, optional) - convolutional block to use. Defaults to ResidualUnit.
pooling (nn.Module, optional) - pooling layer to use. Defaults to nn.MaxPool2d.
up_sample_mode (str, optional) - upsampling mode. Defaults to "nearest".
- __init__(spatial_dims, in_channels, out_channels, channels=[64, 128, 256, 512], conv_block=<class 'monai.networks.blocks.convolutions.ResidualUnit'>, pooling=<class 'torch.nn.modules.pooling.MaxPool2d'>, up_sample_mode='nearest')
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
spatial_dims (int)
in_channels (int)
out_channels (int)
channels (Sequence[int])
conv_block (Module)
pooling (Module)
up_sample_mode (str)
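Because the decoder output is merged with encoder features through skip connections, the spatial sizes on both paths have to line up. The sketch below traces one spatial dimension through an hourglass under stated assumptions that are not taken from the library: each encoder level pools by a factor of 2, each decoder level upsamples by 2, and there is one level per entry in `channels`. The function name `hourglass_sizes` is hypothetical.

```python
def hourglass_sizes(size, levels, factor=2):
    """Trace one spatial dimension through an hourglass.

    Assumptions (not taken from the library): every encoder level pools by
    `factor` and every decoder level upsamples by `factor`.
    """
    encoder = [size]
    for _ in range(levels):
        if encoder[-1] % factor != 0:
            raise ValueError(
                f"size {encoder[-1]} is not divisible by {factor}; "
                "the upsampled map would not match its skip connection"
            )
        encoder.append(encoder[-1] // factor)
    # The decoder mirrors the encoder: bottleneck first, input resolution last.
    decoder = [encoder[-1] * factor**step for step in range(levels + 1)]
    return encoder, decoder


encoder, decoder = hourglass_sizes(256, levels=4)
print(encoder)  # [256, 128, 64, 32, 16]
print(decoder)  # [16, 32, 64, 128, 256]
```

Under these assumptions, an input whose side length is not divisible by 2 raised to the number of levels would produce mismatched skip connections, so padding or resizing the image beforehand is a reasonable precaution.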
- class landmarker.models.OriginalSpatialConfigurationNet
Implementation of the Spatial Configuration Network (SCN) from the paper "Integrating spatial configuration into heatmap regression based CNNs for landmark localization" by Payer et al. (2019). https://www.sciencedirect.com/science/article/pii/S1361841518305784
- Parameters:
in_channels (int, optional) - number of input channels. Defaults to 1.
out_channels (int, optional) - number of output channels. Defaults to 4.
la_channels (int, optional) - number of output channels for each convolutional layer. Defaults to 128.
la_depth (int, optional) - number of convolutional layers. Defaults to 3.
la_kernel_size (int, optional) - kernel size for the convolutional layers. Defaults to 3.
la_dropout (float, optional) - dropout probability. Defaults to 0.5.
sp_channels (int, optional) - number of channels for the convolutional layers. Defaults to 128.
sp_kernel_size (int, optional) - kernel size for the convolutional layers. Defaults to 11.
sp_downsample (int, optional) - factor by which the image is downsampled. Defaults to 16.
init_weigths (bool, optional) - whether to initialize the weights of the convolutional layers; the keyword is spelled init_weigths in the signature. Defaults to False.
spatial_dim (int, optional) - number of spatial dimensions of the input image. Defaults to 2.
- __init__(in_channels=1, out_channels=4, la_channels=128, la_depth=3, la_kernel_size=3, la_dropout=0.5, sp_channels=128, sp_kernel_size=11, sp_downsample=16, init_weigths=False, spatial_dim=2)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
in_channels (int)
out_channels (int)
la_channels (int)
la_depth (int)
la_kernel_size (int | tuple[int, ...])
la_dropout (float)
sp_channels (int)
sp_kernel_size (int)
sp_downsample (int)
init_weigths (bool)
spatial_dim (int)
- Return type:
None
- class landmarker.models.OriginalSpatialConfigurationNet3d
Implementation of the Spatial Configuration Network (SCN) from the paper "Integrating spatial configuration into heatmap regression based CNNs for landmark localization" by Payer et al. (2019). https://www.sciencedirect.com/science/article/pii/S1361841518305784
This is the 3D version of the original SCN.
- __init__(in_channels=1, out_channels=4, la_channels=64, la_depth=3, la_kernel_size=3, la_dropout=0.5, sp_channels=64, sp_kernel_size=7, sp_downsample=4)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
in_channels (int)
out_channels (int)
la_channels (int)
la_depth (int)
la_kernel_size (int | tuple[int, ...])
la_dropout (float)
sp_channels (int)
sp_kernel_size (int)
sp_downsample (int)
- class landmarker.models.ProbSpatialConfigurationNet
Probabilistic Spatial Configuration Network (PSCN)
Adapted, probabilistic variant of the Spatial Configuration Network (SCN) from the paper "Integrating spatial configuration into heatmap regression based CNNs for landmark localization" by Payer et al. (2019). It is the same as the SCN except for the last layer: instead of multiplying the output of the local appearance network with the output of the spatial configuration network, the two outputs are added, since the output of the spatial configuration network is a probability distribution in logit space.
- Parameters:
spatial_dims (int, optional) - number of spatial dimensions of the input image. Defaults to 2.
in_channels (int, optional) - number of input channels. Defaults to 1.
out_channels (int, optional) - number of output channels. Defaults to 4.
la_channels (Sequence[int], optional) - number of output channels for each convolutional layer. Defaults to (128, 128, 128, 128, 128).
la_kernel_size (int | tuple[int, int], optional) - kernel size for the convolutional layers. Defaults to 3.
la_strides (Sequence[int], optional) - strides for the convolutional layers. Defaults to (2, 2, 2, 2).
la_num_res_units (int, optional) - number of residual units in the convolutional layers. Defaults to 2.
la_norm (str, optional) - type of normalization to use. Defaults to "instance".
la_activation (str, optional) - type of activation to use. Defaults to "PRELU".
la_adn_ordering (str, optional) - ordering of the layers in the residual units. Defaults to "NDA".
la_dropout (float, optional) - dropout probability. Defaults to 0.0.
sp_channels (int, optional) - number of channels for the convolutional layers. Defaults to 128.
sp_kernel_size (int, optional) - kernel size for the convolutional layers. Defaults to 11.
sp_downsample (int, optional) - factor by which the image is downsampled. Defaults to 16.
sp_image_input (bool, optional) - whether to use the input image as input for the spatial configuration network. Defaults to True.
- __init__(spatial_dims=2, in_channels=1, out_channels=4, la_channels=(128, 128, 128, 128, 128), la_kernel_size=3, la_strides=(2, 2, 2, 2), la_num_res_units=2, la_norm='instance', la_activation='PRELU', la_adn_ordering='NDA', la_dropout=0.0, sp_channels=128, sp_kernel_size=11, sp_downsample=16, sp_image_input=True)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
spatial_dims (int)
in_channels (int)
out_channels (int)
la_channels (Sequence[int])
la_kernel_size (int | tuple[int, int])
la_strides (Sequence[int])
la_num_res_units (int)
la_norm (str)
la_activation (str)
la_adn_ordering (str)
la_dropout (float)
sp_channels (int)
sp_kernel_size (int)
sp_downsample (int)
sp_image_input (int)
- Return type:
None
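The reasoning behind adding the two branch outputs instead of multiplying them can be checked numerically: adding logits and then normalizing gives the same distribution as multiplying the two normalized probability maps and renormalizing. The plain-Python sketch below illustrates this identity with made-up logits; it is not the library's forward pass, and applying a softmax over the heatmap is an assumption made for the illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical logits for one landmark over a flattened 4-pixel heatmap.
local_logits = [2.0, 0.5, -1.0, 0.0]
spatial_logits = [0.0, 1.5, -2.0, 0.5]

# PSCN-style combination: add in logit space, then normalize.
added = softmax([a + b for a, b in zip(local_logits, spatial_logits)])

# SCN-style combination: multiply the two probability maps, then renormalize.
raw_product = [p * q for p, q in zip(softmax(local_logits), softmax(spatial_logits))]
product_total = sum(raw_product)
product = [value / product_total for value in raw_product]

# Both routes agree up to floating-point error.
max_difference = max(abs(a - b) for a, b in zip(added, product))
print(max_difference < 1e-12)
```

Working in logit space avoids multiplying small probabilities, which tends to be numerically better behaved when the result feeds a log-likelihood style loss.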
- class landmarker.models.SpatialConfigurationNet
Adapted implementation of the Spatial Configuration Network (SCN) from the paper "Integrating spatial configuration into heatmap regression based CNNs for landmark localization" by Payer et al. (2019). https://www.sciencedirect.com/science/article/pii/S1361841518305784
- Parameters:
spatial_dims (int) - number of spatial dimensions of the input image.
in_channels (int) - number of input channels.
out_channels (int) - number of output channels.
la_channels (Sequence[int], optional) - number of output channels for each convolutional layer.
la_kernel_size (int, optional) - kernel size for the convolutional layers.
la_strides (Sequence[int], optional) - strides for the convolutional layers.
la_num_res_units (int, optional) - number of residual units in the convolutional layers.
la_norm (str, optional) - type of normalization to use. Defaults to "INSTANCE".
la_activation (str, optional) - type of activation to use. Defaults to "PRELU".
la_adn_ordering (str, optional) - ordering of the layers in the residual units. Defaults to "ADN".
la_dropout (float, optional) - dropout probability. Defaults to 0.0.
sp_channels (int, optional) - number of channels for the convolutional layers.
sp_kernel_size (int, optional) - kernel size for the convolutional layers.
sp_downsample (int, optional) - factor by which the image is downsampled.
sp_image_input (bool, optional) - whether to use the input image as input for the spatial configuration network.
- __init__(spatial_dims=2, in_channels=1, out_channels=4, la_channels=(128, 128, 128, 128), la_kernel_size=3, la_strides=(2, 2, 2), la_num_res_units=2, la_norm='INSTANCE', la_activation='PRELU', la_adn_ordering='ADN', la_dropout=0.0, sp_channels=128, sp_kernel_size=11, sp_downsample=16, sp_image_input=True)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
spatial_dims (int)
in_channels (int)
out_channels (int)
la_channels (Sequence[int])
la_kernel_size (int | tuple[int, int])
la_strides (Sequence[int])
la_num_res_units (int)
la_norm (str)
la_activation (str)
la_adn_ordering (str)
la_dropout (float)
sp_channels (int)
sp_kernel_size (int)
sp_downsample (int)
sp_image_input (bool)
- class landmarker.models.StackedCholeskyHourglass
Stacked Cholesky Hourglass Network as proposed in "UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss" - Kumar et al. (2019). It is a stack of hourglass networks with a Cholesky Estimator Network at the bottleneck of each hourglass. The output of the Cholesky Estimator Network is a lower triangular matrix that is used to estimate the covariance matrix of the Gaussian distribution of the predicted heatmaps. The covariance matrix is then used to compute the Gaussian Log-Likelihood Loss.
- Parameters:
nb_stacks (int) - number of hourglass networks to stack.
img_size (tuple[int, int]) - size of the input image.
in_channels (int) - number of input channels.
out_channels (int) - number of output channels.
channels (Sequence[int], optional) - number of output channels for each convolutional layer.
conv_block (nn.Module, optional) - convolutional block to use. Defaults to ResidualUnit.
up_sample_mode (str, optional) - upsampling mode. Defaults to "nearest".
- __init__(nb_stacks, img_size, in_channels, out_channels, channels=[64, 128, 256, 512], conv_block=<class 'monai.networks.blocks.convolutions.ResidualUnit'>, up_sample_mode='nearest')
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
nb_stacks (int)
img_size (tuple[int, int])
in_channels (int)
out_channels (int)
channels (Sequence[int])
conv_block (Module)
up_sample_mode (str)
- Return type:
None
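The step from a lower triangular matrix to a covariance matrix, and from there to a Gaussian log-likelihood loss, can be sketched in a few lines of plain Python for the 2D case. This is an illustration under assumptions rather than the library's code: the function names are hypothetical, the factor is parameterized by three scalars, and the loss is written for a single landmark.

```python
import math


def covariance_from_cholesky(l11, l21, l22):
    """Return the 2x2 covariance matrix L @ L^T for L = [[l11, 0], [l21, l22]].

    The result is positive definite whenever l11 and l22 are non-zero.
    """
    return [
        [l11 * l11, l11 * l21],
        [l11 * l21, l21 * l21 + l22 * l22],
    ]


def gaussian_nll(target, mean, cov):
    """Negative log-likelihood of a 2D point under N(mean, cov)."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = target[0] - mean[0], target[1] - mean[1]
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    mahalanobis = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return 0.5 * (mahalanobis + math.log(det)) + math.log(2.0 * math.pi)


cov = covariance_from_cholesky(2.0, 1.0, 3.0)
print(cov)  # [[4.0, 2.0], [2.0, 10.0]]

# The loss grows as the target moves away from the predicted mean.
near = gaussian_nll((10.0, 12.0), (10.0, 12.0), cov)
far = gaussian_nll((15.0, 12.0), (10.0, 12.0), cov)
print(near < far)
```

Predicting the Cholesky factor rather than the covariance itself is a convenient design choice: any factor with non-zero diagonal entries yields a valid covariance matrix, so no extra constraint is needed during training.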
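- class landmarker.models.StackedHourglass
Stacked hourglass network: a stack of nb_stacks hourglass modules, as proposed in "Stacked Hourglass Networks for Human Pose Estimation" - Newell et al. (2016).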
- Parameters:
nb_stacks (int) - number of hourglass modules to stack.
spatial_dims (int) - number of spatial dimensions of the input image.
in_channels (int) - number of input channels.
out_channels (int) - number of output channels.
channels (Sequence[int], optional) - number of output channels for each convolutional layer.
up_sample_mode (str, optional) - upsampling mode. Defaults to "nearest".
- __init__(nb_stacks, spatial_dims, in_channels, out_channels, channels=[64, 128, 256, 512], conv_block=<class 'monai.networks.blocks.convolutions.ResidualUnit'>, pooling=<class 'torch.nn.modules.pooling.MaxPool2d'>, up_sample_mode='nearest')
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
nb_stacks (int)
spatial_dims (int)
in_channels (int)
out_channels (int)
channels (Sequence[int])
conv_block (Module)
pooling (Module)
up_sample_mode (str)