Swin

class vformer.encoder.swin.SwinEncoder(dim, input_resolution, depth, num_heads, window_size, mlp_ratio=4.0, qkv_bias=True, qkv_scale=None, p_dropout=0.0, attn_dropout=0.0, drop_path=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, downsample=None, use_checkpoint=False)[source]
dim: int

Number of input channels.

input_resolution: tuple[int]

Input resolution.

depth: int

Number of blocks.

num_heads: int

Number of attention heads.

window_size: int

Local window size.

mlp_ratio: float

Ratio of MLP hidden dim to embedding dim.

qkv_bias: bool, default is True

Whether to add a bias vector to the q, k, and v matrices.

qkv_scale: float, optional

Override the default qk scale of head_dim ** -0.5 in Window Attention if set.

p_dropout: float

Dropout rate.

attn_dropout: float, optional

Attention dropout rate.

drop_path: float or tuple[float]

Stochastic depth rate.

norm_layer: nn.Module

Normalization layer, default is nn.LayerNorm.

downsample: nn.Module, optional

Downsample layer (like PatchMerging) at the end of the encoder, default is None.
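The drop_path argument accepts either a single float or a tuple of per-block rates. A minimal sketch of how such a value can be expanded into one stochastic-depth rate per block (the helper name and exact expansion rule are illustrative assumptions, not the vformer implementation):

```python
# Hedged sketch: expand a drop_path value (float or tuple[float]) into one
# stochastic-depth rate per encoder block. The exact rule inside vformer
# may differ; this only illustrates the accepted argument shapes.

def drop_path_schedule(drop_path, depth):
    """Return a list with one stochastic-depth rate per block."""
    if isinstance(drop_path, (tuple, list)):
        # One rate per block must be supplied explicitly.
        assert len(drop_path) == depth, "need one rate per block"
        return list(drop_path)
    # A single float is broadcast to every block in the stage.
    return [drop_path] * depth

print(drop_path_schedule(0.1, 4))                  # -> [0.1, 0.1, 0.1, 0.1]
print(drop_path_schedule((0.0, 0.05, 0.1, 0.15), 4))
```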

forward(x)[source]
Parameters

x (torch.Tensor) – Input tensor

Returns

Returns output tensor

Return type

torch.Tensor
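The shape bookkeeping of a Swin encoder stage can be sketched in plain Python (the function below is illustrative, not part of vformer): window attention partitions the input_resolution token grid into non-overlapping window_size × window_size windows, and an optional downsample layer like PatchMerging halves the resolution while doubling dim.

```python
# Illustrative shape arithmetic for one Swin encoder stage (an assumption-
# labelled sketch, not the vformer implementation itself).

def swin_stage_shapes(input_resolution, window_size, dim, downsample=False):
    """Return (num_windows, tokens_per_window, out_resolution, out_dim)."""
    H, W = input_resolution
    # Window attention requires the grid to tile exactly into windows.
    assert H % window_size == 0 and W % window_size == 0, \
        "input_resolution must be divisible by window_size"
    num_windows = (H // window_size) * (W // window_size)
    tokens_per_window = window_size * window_size
    if downsample:
        # PatchMerging-style downsample: halve H and W, double channels.
        return num_windows, tokens_per_window, (H // 2, W // 2), dim * 2
    return num_windows, tokens_per_window, (H, W), dim

# e.g. a 56x56 grid of 96-dim tokens with 7x7 windows and a downsample layer:
print(swin_stage_shapes((56, 56), 7, 96, downsample=True))
# -> (64, 49, (28, 28), 192)
```

Attention is computed independently inside each of the 64 windows of 49 tokens, which is what keeps the cost linear in image size rather than quadratic.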

class vformer.encoder.swin.SwinEncoderBlock(dim, input_resolution, num_heads, window_size=7, shift_size=0, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, p_dropout=0.0, attn_dropout=0.0, drop_path_rate=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, drop_path_mode='batch')[source]
Parameters
  • dim (int) – Number of input channels

  • input_resolution (int or tuple[int]) – Input resolution of patches

  • num_heads (int) – Number of attention heads

  • window_size (int) – Window size

  • shift_size (int) – Shift size for Shifted Window Masked Self Attention (SW-MSA)

  • mlp_ratio (float) – Ratio of MLP hidden dimension to embedding dimension

  • qkv_bias (bool, default=True) – Whether to add a bias vector to the q, k, and v matrices

  • qk_scale (float, optional) – Override the default qk scale of head_dim ** -0.5 in Window Attention if set

  • p_dropout (float) – Dropout rate

  • attn_dropout (float) – Attention dropout rate

  • drop_path_rate (float) – Stochastic depth rate

  • norm_layer (nn.Module) – Normalization layer, default is nn.LayerNorm

  • drop_path_mode (str) – Stochastic depth mode, default is 'batch'

forward(x)[source]
Parameters

x (torch.Tensor) – Input tensor

Returns

Returns output tensor

Return type

torch.Tensor
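The shift_size parameter drives the cyclic shift at the heart of SW-MSA: before attention the token grid is rolled by -shift_size along both spatial axes so that new windows straddle the old window boundaries, and after attention the roll is reversed. A minimal pure-Python sketch of that roll (mimicking torch.roll on a 2-D list; the helper name is an assumption for illustration):

```python
# Hedged sketch of the cyclic shift used by shifted-window attention.
# cyclic_shift(grid, s) behaves like torch.roll(x, shifts=(s, s), dims=(0, 1))
# on a 2-D list: elements wrap around both axes.

def cyclic_shift(grid, shift):
    """Roll a 2-D grid by ``shift`` rows and columns, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r - shift) % h][(c - shift) % w] for c in range(w)]
            for r in range(h)]

# A 4x4 grid of token ids:
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

shifted = cyclic_shift(grid, -2)   # roll by -shift_size before attention
restored = cyclic_shift(shifted, 2)  # reverse roll afterwards
assert restored == grid            # the shift is exactly invertible
```

Because wrapped-around tokens are not truly adjacent in the image, the real block additionally applies an attention mask inside the affected windows, which is why the operation is called masked self-attention.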