Aniruddha Deb

The Properly Illustrated Transformer

That’s a f@!#ton of dropouts

The title is (obviously) a shout-out to Jay Alammar’s blog, which is probably the best source for learning about transformers, followed by The Annotated Transformer. Both of them gloss over some of the nuances, though (mostly where the dropouts go), so I made this to clear that up.

PS: Yes, this is just the encoder stack. I’ll probably do the decoder stack in a part 2.