Patch

class vformer.encoder.embedding.patch.PatchEmbedding(img_size, patch_size, in_channels, embedding_dim, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]
Parameters
  • img_size (int) – Size (height and width) of the square input image

  • patch_size (int) – Size of each square patch

  • in_channels (int) – Number of input channels in the image

  • embedding_dim (int) – Number of linear projection output channels

  • norm_layer (nn.Module) – Normalization layer; default is nn.LayerNorm
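
Example (a usage sketch; the hyperparameter values below are illustrative ViT-Base-style choices, not library defaults):

>>> from vformer.encoder.embedding.patch import PatchEmbedding
>>> patch_embed = PatchEmbedding(
...     img_size=224, patch_size=16, in_channels=3, embedding_dim=768
... )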

forward(x)[source]
Parameters

x (torch.Tensor) – Input image tensor of shape (batch_size, in_channels, img_size, img_size)

Returns

Output tensor produced by applying a convolution whose kernel_size and stride both equal the patch size, so that each non-overlapping patch of the input is projected to embedding_dim channels.

Return type

torch.Tensor
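
Example (a minimal sketch of the equivalent operation, assuming the standard ViT patch-embedding recipe rather than the verbatim vformer source: a convolution with kernel_size = stride = patch_size projects each non-overlapping patch, the spatial grid is flattened into a sequence, and the normalization layer is applied):

>>> import torch
>>> import torch.nn as nn
>>> x = torch.randn(1, 3, 224, 224)               # (batch, in_channels, img_size, img_size)
>>> proj = nn.Conv2d(3, 768, kernel_size=16, stride=16)
>>> patches = proj(x)                             # (1, 768, 14, 14): one column per patch
>>> patches = patches.flatten(2).transpose(1, 2)  # (1, 196, 768): sequence of patch embeddings
>>> out = nn.LayerNorm(768)(patches)              # normalize each patch embedding
>>> out.shape
torch.Size([1, 196, 768])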