The project also has a website: https://nn.labml.ai/
It pairs each paper with an annotated PyTorch implementation, which makes the papers much easier to study. The implementations include:
Multi-headed attention
Transformer building blocks
Transformer XL
Relative multi-headed attention
Rotary Positional Embeddings (RoPE)
Attention with Linear Biases (ALiBi)
RETRO
Compressive Transformer
GPT Architecture
GLU Variants
kNN-LM: Generalization through Memorization
Feedback Transformer
Switch Transformer
Fast Weights Transformer
FNet
Attention Free Transformer
Masked Language Model
MLP-Mixer: An all-MLP Architecture for Vision
Pay Attention to MLPs (gMLP)
Vision Transformer (ViT)
Primer EZ
Hourglass
Generate on a 48GB GPU
Finetune on two 48GB GPUs
LLM.int8()
Denoising Diffusion Probabilistic Models (DDPM)
Denoising Diffusion Implicit Models (DDIM)
Latent Diffusion Models
Stable Diffusion
Original GAN
GAN with deep convolutional network
Cycle GAN
Wasserstein GAN
Wasserstein GAN with Gradient Penalty
StyleGAN 2
Graph Attention Networks (GAT)
Graph Attention Networks v2 (GATv2)
Proximal Policy Optimization with Generalized Advantage Estimation
Deep Q Networks with Dueling Network, Prioritized Replay and Double Q Network
Solving imperfect-information games such as poker with Counterfactual Regret Minimization (CFR)
Kuhn Poker
Adam
AMSGrad
Adam Optimizer with warmup
Noam Optimizer
Rectified Adam Optimizer
AdaBelief Optimizer
Batch Normalization
Layer Normalization
Instance Normalization
Group Normalization
Weight Standardization
Batch-Channel Normalization
DeepNorm
PonderNet
Evidential Deep Learning to Quantify Classification Uncertainty
Fuzzy Tiling Activations
Greedy Sampling
Temperature Sampling
Top-k Sampling
Nucleus Sampling
Zero3 memory optimizations
Let's walk through the ResNet example together: https://nn.labml.ai/resnet/index.html

This is the Block.

This is what's inside the Block.
Reading PyTorch code alongside the paper like this makes it much easier to understand.
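To make the Block concrete, here is a minimal sketch of a ResNet residual block in PyTorch. It loosely follows the structure annotated on that page (two 3x3 convolutions with batch normalization and a skip connection); the class and parameter names here are illustrative, not copied from the repository.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """A basic residual block: two 3x3 convolutions plus a skip connection."""

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        # First convolution may downsample (stride > 1) and change channel count
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)
        # Second convolution keeps the spatial size and channel count
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Project the shortcut with a 1x1 convolution when the shape changes,
        # otherwise pass the input through unchanged
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.act(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The residual connection: add the (possibly projected) input
        return self.act(out + self.shortcut(x))
```

For example, `ResidualBlock(16, 32, stride=2)` maps a `(2, 16, 8, 8)` input to a `(2, 32, 4, 4)` output, with the 1x1 shortcut handling the shape change.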