Table of Contents

- NLP
  - Seq2Seq
  - Additive Attention
  - Dot Product Attention
  - Transformer
  - Efficient Transformer: A Survey
  - GPT
  - BERT
  - T5
  - GPT-2
  - Llama
  - Structured State Space Models (S4)
  - Mamba
- Computer Vision
  - ResNet
  - ViT
    - How ViT Works?
- Inference
- Training/Finetuning
  - LoRA
  - RLHF