Swin Encoder

class vformer.encoder.swin.SwinEncoder(dim, input_resolution, depth, num_heads, window_size, mlp_ratio=4.0, qkv_bias=True, qkv_scale=None, p_dropout=0.0, attn_dropout=0.0, drop_path=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, downsample=None, use_checkpoint=False)[source]
Parameters
  • dim (int) – Number of input channels.

  • input_resolution (tuple[int]) – Input resolution.

  • depth (int) – Number of blocks.

  • num_heads (int) – Number of attention heads.

  • window_size (int) – Local window size.

  • mlp_ratio (float) – Ratio of MLP hidden dim to embedding dim.

  • qkv_bias (bool, default is True) – Whether to add a bias vector to the q, k, and v matrices.

  • qkv_scale (float, optional) – Override the default qk scale of head_dim ** -0.5 in Window Attention if set.

  • p_dropout (float) – Dropout rate.

  • attn_dropout (float) – Attention dropout rate.

  • drop_path (float or tuple[float]) – Stochastic depth rate.

  • norm_layer (nn.Module) – Normalization layer, default is nn.LayerNorm.

  • downsample (nn.Module, optional) – Downsample layer (like PatchMerging) at the end of the layer, default is None.

  • use_checkpoint (bool) – Whether to use gradient checkpointing to save memory, default is False.

forward(x)[source]
Parameters

x (torch.Tensor) – Input tensor

Returns

Output tensor

Return type

torch.Tensor
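As a rough illustration of how these parameters interact (a sketch, not part of vformer's API): with an input resolution of (H, W) and a local window of size window_size, attention is computed independently within (H // window_size) * (W // window_size) non-overlapping windows, and a PatchMerging-style downsample halves each spatial dimension while doubling dim.

```python
# Sketch of the shape bookkeeping behind SwinEncoder's parameters.
# These helpers are illustrative only; they are not vformer functions.

def num_windows(input_resolution, window_size):
    """Count the non-overlapping local windows attention runs over."""
    h, w = input_resolution
    return (h // window_size) * (w // window_size)

def after_patch_merging(input_resolution, dim):
    """A PatchMerging-style downsample halves each spatial side
    and doubles the channel dimension."""
    h, w = input_resolution
    return (h // 2, w // 2), dim * 2

# A 56x56 feature map with 7x7 windows -> 64 windows of 49 tokens each.
print(num_windows((56, 56), 7))           # 64
print(after_patch_merging((56, 56), 96))  # ((28, 28), 192)
```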

class vformer.encoder.swin.SwinEncoderBlock(dim, input_resolution, num_heads, window_size=7, shift_size=0, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, p_dropout=0.0, attn_dropout=0.0, drop_path_rate=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, drop_path_mode='batch')[source]
Parameters
  • dim (int) – Number of input channels.

  • input_resolution (int or tuple[int]) – Input resolution of patches.

  • num_heads (int) – Number of attention heads.

  • window_size (int) – Window size.

  • shift_size (int) – Shift size for Shifted Window Multi-head Self Attention (SW-MSA).

  • mlp_ratio (float) – Ratio of MLP hidden dimension to embedding dimension.

  • qkv_bias (bool, default is True) – Whether to add a bias vector to the q, k, and v matrices.

  • qk_scale (float, optional) – Override the default qk scale of head_dim ** -0.5 if set.

  • p_dropout (float) – Dropout rate.

  • attn_dropout (float) – Attention dropout rate.

  • drop_path_rate (float) – Stochastic depth rate.

  • norm_layer (nn.Module) – Normalization layer, default is nn.LayerNorm.

forward(x)[source]
Parameters

x (torch.Tensor) – Input tensor

Returns

Output tensor

Return type

torch.Tensor
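The shift_size parameter is what distinguishes the two block variants in Swin: consecutive blocks alternate between shift_size=0 (regular window attention, W-MSA) and shift_size=window_size // 2 (shifted window attention, SW-MSA), where the feature map is cyclically shifted before window partitioning. The sketch below mimics that cyclic shift with a plain-Python stand-in for torch.roll; it is illustrative only and not vformer code.

```python
# Illustrative sketch of the cyclic shift applied before window
# partitioning when shift_size > 0 (SW-MSA). Swin implements this
# with torch.roll; this helper is a plain-Python stand-in.

def cyclic_shift(grid, shift_size):
    """Roll a 2D grid up and left by shift_size, with wrap-around."""
    h = len(grid)
    rows = [grid[(i + shift_size) % h] for i in range(h)]
    w = len(rows[0])
    return [[row[(j + shift_size) % w] for j in range(w)] for row in rows]

grid = [[0,  1,  2,  3],
        [4,  5,  6,  7],
        [8,  9, 10, 11],
        [12, 13, 14, 15]]

# With window_size=4, consecutive blocks would use shift_size=0 and
# then shift_size = window_size // 2 = 2:
shifted = cyclic_shift(grid, 2)
print(shifted[0])  # [10, 11, 8, 9]
```

Shifting the grid before partitioning lets tokens near former window boundaries attend to each other in the next block, which is how Swin builds cross-window connections without global attention.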