Spatial Self Attention
- class vformer.attention.spatial.SpatialAttention(dim, num_heads, sr_ratio=1, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0, linear=False, activation=<class 'torch.nn.modules.activation.GELU'>)[source]
Bases:
Module
Spatial Reduction Attention, introduced in Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. This class also supports the linear-complexity spatial attention introduced in the improved paper (PVT v2).
- Parameters
dim (int) – Dimension of the input tensor
num_heads (int) – Number of attention heads
sr_ratio (int) – Spatial Reduction ratio
qkv_bias (bool) – If True, adds a learnable bias to the query, key and value projections, default is
False
qk_scale (float, optional) – Override default qk scale of head_dim ** -0.5 if set
attn_drop (float, optional) – Dropout rate applied to the attention weights
proj_drop (float, optional) – Dropout rate applied to the output projection
linear (bool) – Whether to use linear spatial attention, default is
False
activation (nn.Module) – Activation function, default is
nn.GELU
- forward(x, H, W)[source]
- Parameters
x (torch.Tensor) – Input tensor
H (int) – Height of the feature map, in patches
W (int) – Width of the feature map, in patches
- Returns
Output tensor obtained by applying spatial attention to the input tensor
- Return type
torch.Tensor
- training: bool
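To illustrate how spatial-reduction attention works, here is a minimal self-contained sketch in plain PyTorch. This is not the vformer implementation (it omits the linear variant, dropout, and qk_scale override); the class name `SpatialReductionAttention` and its internals are illustrative assumptions. Keys and values are downsampled by `sr_ratio` with a strided convolution, so attention costs O(N · N/sr_ratio²) instead of O(N²):

```python
import torch
import torch.nn as nn

class SpatialReductionAttention(nn.Module):
    """Illustrative sketch of PVT-style spatial-reduction attention;
    not the vformer implementation."""
    def __init__(self, dim, num_heads, sr_ratio=1, qkv_bias=False):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.q = nn.Linear(dim, dim, bias=qkv_bias)
        self.kv = nn.Linear(dim, dim * 2, bias=qkv_bias)
        self.proj = nn.Linear(dim, dim)
        self.sr_ratio = sr_ratio
        if sr_ratio > 1:
            # Shrink the spatial size of keys/values by sr_ratio
            # using a strided convolution.
            self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
            self.norm = nn.LayerNorm(dim)

    def forward(self, x, H, W):
        B, N, C = x.shape  # N == H * W patches
        head_dim = C // self.num_heads
        q = self.q(x).reshape(B, N, self.num_heads, head_dim).transpose(1, 2)
        if self.sr_ratio > 1:
            # Reshape to an image grid, spatially reduce, flatten back.
            x_ = x.transpose(1, 2).reshape(B, C, H, W)
            x_ = self.sr(x_).reshape(B, C, -1).transpose(1, 2)
            x_ = self.norm(x_)
        else:
            x_ = x
        kv = self.kv(x_).reshape(B, -1, 2, self.num_heads, head_dim)
        kv = kv.permute(2, 0, 3, 1, 4)
        k, v = kv[0], kv[1]
        # Queries attend over the reduced set of keys/values.
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

attn = SpatialReductionAttention(dim=64, num_heads=8, sr_ratio=2)
x = torch.randn(2, 196, 64)          # 14 x 14 = 196 patches
out = attn(x, H=14, W=14)
print(out.shape)                     # torch.Size([2, 196, 64])
```

With `sr_ratio=2` the 196 keys/values are reduced to 49 before attention, while the output keeps the input's shape, matching the forward signature documented above.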