How Positional Embeddings work in Self-Attention (code in Pytorch)

Understand how positional embeddings emerged and how we use the inside self-attention to model highly structured data such as images