Vanilla Transformer

class vformer.encoder.vanilla.VanillaEncoder(embedding_dim, depth, num_heads, head_dim, mlp_dim, p_dropout=0.0, attn_dropout=0.0, drop_path_rate=0.0, drop_path_mode='batch')
Parameters
  • embedding_dim (int) – Dimension of the embedding

  • depth (int) – Number of self-attention layers

  • num_heads (int) – Number of attention heads

  • head_dim (int) – Dimension of each head

  • mlp_dim (int) – Dimension of the hidden layer in the feed-forward layer

  • p_dropout (float) – Dropout probability (default: 0.0)

  • attn_dropout (float) – Dropout probability applied in the attention layers (default: 0.0)

  • drop_path_rate (float) – Stochastic drop path rate (default: 0.0)

  • drop_path_mode (str) – Mode used for stochastic drop path (default: 'batch')
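
The following is a minimal sketch of how constructor parameters like these typically map onto the layers of a pre-norm Transformer encoder. It is not vformer's implementation: SketchSelfAttention and SketchEncoder are hypothetical names, the layer layout is an assumption, and stochastic drop path is left out for brevity. One point it illustrates is that, with a separate head_dim, the attention width is num_heads * head_dim and need not equal embedding_dim.

```python
import torch
from torch import nn


class SketchSelfAttention(nn.Module):
    """Hypothetical multi-head self-attention (not vformer's class)."""

    def __init__(self, embedding_dim, num_heads, head_dim, attn_dropout=0.0):
        super().__init__()
        inner_dim = num_heads * head_dim  # may differ from embedding_dim
        self.num_heads = num_heads
        self.scale = head_dim ** -0.5
        self.to_qkv = nn.Linear(embedding_dim, inner_dim * 3, bias=False)
        self.dropout = nn.Dropout(attn_dropout)
        self.to_out = nn.Linear(inner_dim, embedding_dim)

    def forward(self, x):
        b, n, _ = x.shape
        # Project once, then split into queries, keys, and values per head.
        qkv = self.to_qkv(x).reshape(b, n, 3, self.num_heads, -1)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (b, heads, n, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = self.dropout(attn.softmax(dim=-1))
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.to_out(out)


class SketchEncoder(nn.Module):
    """Hypothetical pre-norm encoder: depth x (attention + feed-forward)."""

    def __init__(self, embedding_dim, depth, num_heads, head_dim, mlp_dim,
                 p_dropout=0.0, attn_dropout=0.0):
        super().__init__()
        self.layers = nn.ModuleList()
        for _ in range(depth):
            self.layers.append(nn.ModuleList([
                nn.LayerNorm(embedding_dim),
                SketchSelfAttention(embedding_dim, num_heads, head_dim,
                                    attn_dropout),
                nn.LayerNorm(embedding_dim),
                nn.Sequential(
                    nn.Linear(embedding_dim, mlp_dim),
                    nn.GELU(),
                    nn.Dropout(p_dropout),
                    nn.Linear(mlp_dim, embedding_dim),
                    nn.Dropout(p_dropout),
                ),
            ]))

    def forward(self, x):
        # Residual connection around each normalized sub-layer.
        for norm1, attn, norm2, mlp in self.layers:
            x = x + attn(norm1(x))
            x = x + mlp(norm2(x))
        return x


encoder = SketchEncoder(embedding_dim=64, depth=2, num_heads=4,
                        head_dim=8, mlp_dim=128)
out = encoder(torch.randn(2, 10, 64))  # (batch, tokens, embedding_dim)
print(out.shape)  # torch.Size([2, 10, 64])
```

In this sketch num_heads * head_dim is 32 while embedding_dim is 64; the final projection in the attention module maps back to embedding_dim so the residual addition is well-defined.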

forward(x)
Parameters

x (torch.Tensor) – Input tensor

Returns

Output tensor

Return type

torch.Tensor
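
As a rough illustration of the forward contract, the sketch below uses PyTorch's built-in nn.TransformerEncoder as a stand-in, not VanillaEncoder itself. It assumes the input is laid out as (batch, tokens, embedding_dim) and that the output keeps that shape, which is the usual convention for encoder stacks but is not stated in the reference above. The built-in layer has no head_dim argument; it fixes the per-head width to d_model / nhead.

```python
import torch
from torch import nn

# Illustrative values only; they mirror the parameter names documented above.
embedding_dim, depth, num_heads, mlp_dim = 64, 2, 4, 128

layer = nn.TransformerEncoderLayer(
    d_model=embedding_dim,
    nhead=num_heads,
    dim_feedforward=mlp_dim,
    dropout=0.0,
    batch_first=True,  # accept (batch, tokens, embedding_dim) inputs
)
stand_in = nn.TransformerEncoder(layer, num_layers=depth)
stand_in.eval()  # disable dropout for a deterministic forward pass

x = torch.randn(8, 16, embedding_dim)
with torch.no_grad():
    y = stand_in(x)

print(y.shape == x.shape)  # True
```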