Vision Transformers for Dense Prediction
- class vformer.models.dense.dpt.AddReadout(start_index=1)[source]
Handles the readout operation when the readout parameter is add. Removes the cls_token or readout_token from the tensor and adds it to each of the remaining tokens.
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
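The "add" readout described above can be illustrated with a minimal sketch. The class name AddReadoutSketch is hypothetical and this is not the vformer source; it assumes the input has shape (batch, tokens, features) with the cls_token or readout_token at index 0.

```python
import torch
from torch import nn


class AddReadoutSketch(nn.Module):
    """Hypothetical stand-in for AddReadout; an assumption, not the vformer source."""

    def __init__(self, start_index=1):
        super().__init__()
        self.start_index = start_index

    def forward(self, x):
        # x: (batch, tokens, features); token 0 is assumed to be the readout token.
        readout = x[:, 0]
        # Drop the leading token(s), then broadcast-add the readout to every patch token.
        return x[:, self.start_index:] + readout.unsqueeze(1)


tokens = torch.arange(24, dtype=torch.float32).reshape(1, 4, 6)
# Call the module instance, not .forward(), so that registered hooks would run.
out = AddReadoutSketch()(tokens)
print(out.shape)
```

The output has one token fewer than the input, since the readout token is folded into the patch tokens rather than kept.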
- class vformer.models.dense.dpt.DPTDepth(backbone, in_channels=3, img_size=(384, 384), readout='project', hooks=(2, 5, 8, 11), channels_last=False, use_bn=False, enable_attention_hooks=False, non_negative=True, scale=1.0, shift=0.0, invert=False)[source]
Implementation of "Vision Transformers for Dense Prediction" https://arxiv.org/abs/2103.13413
- Parameters
backbone (str) – Name of the ViT model to be used as backbone, must be one of {vitb16, vitl16, vit_tiny}
in_channels (int) – Number of channels in input image, default is 3
img_size (tuple[int]) – Input image size, default is (384,384)
readout (str) – Method used to handle the readout_token or cls_token. Must be one of {add, ignore, project}, default is project
hooks (list[int]) – Indices of the encoder blocks on which hooks will be registered. These hooks extract features (e.g. attention) from different ViT blocks, default is (2, 5, 8, 11).
channels_last (bool) – Alters the memory format used to store tensors, default is False. For more information, see this blog post: https://pytorch.org/tutorials/intermediate/memory_format_tutorial.html
use_bn (bool) – If True, batch normalisation is used in FeatureFusionBlock_custom, default is False
enable_attention_hooks (bool) – If True, the get_attention hook is registered, default is False
non_negative (bool) – If True, a ReLU operation will be applied in the DPTDepth.model.head block, default is True
invert (bool) – If True, forward pass output of DPTDepth.model.head will be transformed (inverted) according to scale and shift parameters, default is False
scale (float) – Float value by which the forward pass output from DPTDepth.model.head will be multiplied, default is 1.0
shift (float) – Float value that will be added to the forward pass output from DPTDepth.model.head after scaling, default is 0.0
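The interaction of scale, shift and invert can be sketched as follows. This is one plausible reading of the parameter descriptions, patterned on the reference DPT implementation; the vformer code may differ. The helper postprocess_depth and its eps argument are hypothetical, and the sketch assumes that scale and shift take effect only when invert is True.

```python
import torch


def postprocess_depth(head_out, scale=1.0, shift=0.0, invert=False, eps=1e-8):
    """Hypothetical helper: map a head output (inverse depth) to depth."""
    if not invert:
        # Assumption: without inversion the head output is returned unchanged.
        return head_out
    # Scale first, then shift, as the parameter descriptions state.
    depth = scale * head_out + shift
    # Guard against division by zero before taking the reciprocal.
    depth = depth.clamp(min=eps)
    return 1.0 / depth


head_out = torch.tensor([[0.5, 2.0], [4.0, 8.0]])
depth = postprocess_depth(head_out, scale=2.0, shift=0.0, invert=True)
print(depth)
```

With the defaults (scale=1.0, shift=0.0, invert=False) the helper is the identity, which matches the default behaviour described for DPTDepth.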
- class vformer.models.dense.dpt.FeatureFusionBlock_custom(features, activation, deconv=False, bn=False, expand=False, align_corners=True)[source]
Feature fusion block.
- class vformer.models.dense.dpt.Interpolate(scale_factor, mode, align_corners=False)[source]
Interpolation module
- Parameters
scale_factor (float) – Scaling factor used in interpolation
mode (str) – Interpolation mode
align_corners (bool) – Whether to align corners in Interpolation operation
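A module with these three parameters can be sketched as a thin wrapper around torch.nn.functional.interpolate. The class name InterpolateSketch is hypothetical, and the wrapper is an assumption about how such a module is typically built rather than the vformer source.

```python
import torch
import torch.nn.functional as F
from torch import nn


class InterpolateSketch(nn.Module):
    """Hypothetical stand-in for Interpolate; an assumption, not the vformer source."""

    def __init__(self, scale_factor, mode, align_corners=False):
        super().__init__()
        self.scale_factor = scale_factor
        self.mode = mode
        self.align_corners = align_corners

    def forward(self, x):
        # align_corners is only valid for the linear, bilinear, bicubic and
        # trilinear modes; "nearest" requires align_corners=None.
        return F.interpolate(
            x,
            scale_factor=self.scale_factor,
            mode=self.mode,
            align_corners=self.align_corners,
        )


feature_map = torch.randn(1, 8, 12, 12)
upsampled = InterpolateSketch(scale_factor=2, mode="bilinear")(feature_map)
print(upsampled.shape)
```

Wrapping the functional call in an nn.Module lets the upsampling step sit inside an nn.Sequential alongside convolution layers.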
- class vformer.models.dense.dpt.ProjectReadout(in_features, start_index=1)[source]
Handles the readout operation when the readout parameter is project.
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
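A "project" readout can be sketched as below. The page does not describe the projection itself, so the details here are assumptions taken from the DPT paper: the readout token is concatenated to every patch token and the result is mapped back to in_features with a linear layer followed by GELU. The class name ProjectReadoutSketch is hypothetical.

```python
import torch
from torch import nn


class ProjectReadoutSketch(nn.Module):
    """Hypothetical stand-in for ProjectReadout; an assumption, not the vformer source."""

    def __init__(self, in_features, start_index=1):
        super().__init__()
        self.start_index = start_index
        # Assumed projection: concatenated features (2 * in_features) -> in_features.
        self.project = nn.Sequential(nn.Linear(2 * in_features, in_features), nn.GELU())

    def forward(self, x):
        patches = x[:, self.start_index:]
        # Repeat the readout token (assumed at index 0) once per patch token.
        readout = x[:, 0].unsqueeze(1).expand_as(patches)
        return self.project(torch.cat((patches, readout), dim=-1))


tokens = torch.randn(2, 5, 16)
projected = ProjectReadoutSketch(in_features=16)(tokens)
print(projected.shape)
```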
- class vformer.models.dense.dpt.ResidualConvUnit_custom(features, activation=<class 'torch.nn.modules.activation.GELU'>, bn=True)[source]
Residual convolution module
- Parameters
features (int) – Number of features
activation (nn.Module) – Activation module, default is nn.GELU
bn (bool) – Whether to use batch normalisation
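A residual convolution unit with this signature might look like the sketch below. The layer layout (two 3x3 convolutions, each preceded by the activation, with a skip connection) follows the common RefineNet-style design and is an assumption, as is the class name ResidualConvUnitSketch. Since the signature's default is the nn.GELU class itself, the sketch instantiates the activation inside the constructor.

```python
import torch
from torch import nn


class ResidualConvUnitSketch(nn.Module):
    """Hypothetical stand-in for ResidualConvUnit_custom; an assumption, not the vformer source."""

    def __init__(self, features, activation=nn.GELU, bn=True):
        super().__init__()
        self.activation = activation()  # the default is a class, so instantiate it
        # A bias is redundant when the convolution is followed by batch normalisation.
        self.conv1 = nn.Conv2d(features, features, kernel_size=3, padding=1, bias=not bn)
        self.conv2 = nn.Conv2d(features, features, kernel_size=3, padding=1, bias=not bn)
        self.norm1 = nn.BatchNorm2d(features) if bn else nn.Identity()
        self.norm2 = nn.BatchNorm2d(features) if bn else nn.Identity()

    def forward(self, x):
        out = self.norm1(self.conv1(self.activation(x)))
        out = self.norm2(self.conv2(self.activation(out)))
        # Skip connection: spatial size and channel count are preserved.
        return out + x


feature_map = torch.randn(2, 8, 4, 4)
refined = ResidualConvUnitSketch(features=8)(feature_map)
print(refined.shape)
```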
- class vformer.models.dense.dpt.Slice(start_index=1)[source]
Handles the readout operation when the readout parameter is ignore. Removes the cls_token or readout_token by index slicing.
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
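The index slicing mentioned above amounts to a single operation on the token dimension. The snippet below is a minimal illustration in plain PyTorch, assuming an input of shape (batch, tokens, features) with the readout token at index 0; it does not use the vformer class.

```python
import torch

tokens = torch.arange(12.0).reshape(1, 4, 3)  # 1 readout token + 3 patch tokens
start_index = 1

# Keep only the patch tokens; the readout token at index 0 is discarded.
patch_tokens = tokens[:, start_index:]
print(patch_tokens.shape)
```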
- class vformer.models.dense.dpt.Transpose(dim0, dim1)[source]
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
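The page gives no description for Transpose. Judging by its name and arguments, it presumably swaps two tensor dimensions so the operation can be placed inside an nn.Sequential. The sketch below works under that assumption; the class name TransposeSketch and the choice of shapes are hypothetical. It shows how such a module could turn a token sequence into a spatial feature map together with nn.Unflatten.

```python
import torch
from torch import nn


class TransposeSketch(nn.Module):
    """Hypothetical stand-in for Transpose; an assumption, not the vformer source."""

    def __init__(self, dim0, dim1):
        super().__init__()
        self.dim0 = dim0
        self.dim1 = dim1

    def forward(self, x):
        return x.transpose(self.dim0, self.dim1)


tokens = torch.randn(2, 16, 8)  # (batch, patch tokens, features)
to_grid = nn.Sequential(
    TransposeSketch(1, 2),      # -> (batch, features, patch tokens)
    nn.Unflatten(2, (4, 4)),    # -> (batch, features, height, width)
)
grid = to_grid(tokens)
print(grid.shape)
```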