Nerd For Tech

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/.

NLP Zero to One: Transformers (Part 13/30)

Kowshik chilamkurthy · Published in Nerd For Tech · 4 min read · Mar 2, 2021


Introduction

Self-Attention

[Figure: self-attention weights, generated by author]
[Figure: scaled dot-product attention]
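The figure above illustrates scaled dot-product attention, the building block defined in the original Transformer paper: queries are compared against keys, the scores are scaled by the square root of the key dimension, and a softmax turns them into weights over the values. A minimal NumPy sketch (the function names and the toy dimensions are my own, not from the article):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_q, n_k) similarity scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

# Toy self-attention: Q = K = V = X, three tokens of dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, weights = scaled_dot_product_attention(X, X, X)
```

Using the same matrix as query, key, and value is exactly what makes this *self*-attention: each token's output is a weighted mixture of all tokens in the same sequence.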

Multi-Headed Self-Attention

[Figure: left, scaled dot-product attention; right, multi-headed attention. Ref: [1]]
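As the right-hand panel of the figure shows, multi-headed attention runs several attention operations in parallel, each in a lower-dimensional subspace, then concatenates the heads and applies an output projection. A rough sketch, assuming `d_model` is divisible by `n_heads` (the weight matrices and shapes here are illustrative, not the article's):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, W_q, W_k, W_v, W_o, n_heads):
    """Split d_model into n_heads subspaces, attend in each, concat, project."""
    n, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    heads = []
    for h in range(n_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        q, k, v = Q[:, s], K[:, s], V[:, s]
        scores = q @ k.T / np.sqrt(d_head)   # scaled dot-product per head
        heads.append(softmax(scores) @ v)    # (n, d_head)
    return np.concatenate(heads, axis=-1) @ W_o  # back to (n, d_model)

rng = np.random.default_rng(0)
d_model, n_heads = 8, 2
X = rng.normal(size=(5, d_model))
W_q, W_k, W_v, W_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))
out = multi_head_attention(X, W_q, W_k, W_v, W_o, n_heads)
```

Each head can learn to attend to a different kind of relationship (e.g. syntactic vs. positional), which is the motivation for using several smaller heads instead of one full-width attention.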

Transformers

[Figure: Transformer architecture]
[Figure: comparing the decoder setup between RNN-based and Transformer-based encoder-decoder models, generated by author]
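The decoder comparison in the figure comes down to parallelism: an RNN decoder must consume its own previous output step by step, while a Transformer decoder trains on all target positions at once, using a causal mask so position *i* cannot attend to positions after *i*. A small sketch of that masking, under my own toy setup:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(X):
    """Masked self-attention: position i may only attend to positions <= i."""
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)
    mask = np.triu(np.ones((n, n), dtype=bool), k=1)  # True above the diagonal
    scores[mask] = -1e9   # future positions get ~zero weight after softmax
    weights = softmax(scores, axis=-1)
    return weights @ X, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))
out, weights = causal_self_attention(X)
# The weight matrix is lower-triangular: row 0 attends only to itself.
```

Because the mask (not recurrence) enforces the left-to-right constraint, every decoder position is computed in one matrix multiply during training, which is the speedup over the RNN setup the figure contrasts.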


Written by Kowshik chilamkurthy

RL | ML | Algo Trading | Transportation | Game Theory
