The Properly Illustrated Transformer
Posted on Wed 08 March 2023 in Machine Learning
The title is (obviously) a shout-out to Jay Alammar's blog, which is probably the best source for learning about transformers, followed by The Annotated Transformer. Both are unclear about some of the nuances, though (mostly where the dropouts go), so I made this to clear that up.
PS: Yes, this is just the encoder stack. I'll probably cover the decoder stack in a part 2.