Layer Norm modification #313

RyanKim17920 · 2024-05-25T03:31:23Z

Modified the layer norms in the encoder layers to match the original ViT paper more closely.

tonyyunyang · 2024-05-31T15:18:36Z

There is a pre-norm versus post-norm issue here. I believe the author went for post-norm because it stabilizes training. GPT-2 experienced unstable training due to pre-norm, and since then all GPT models use post-norm.
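For readers following the thread, the two orderings differ only in where the normalization sits relative to the residual connection. The snippet below is a minimal, dependency-free sketch of that difference, not the repository's code: `layer_norm` omits the learned scale and shift, and `sublayer` is a hypothetical stand-in for the attention or feed-forward block.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def sublayer(x):
    """Hypothetical stand-in for attention or the MLP: doubles each element."""
    return [2.0 * v for v in x]

def pre_norm_block(x):
    # Pre-norm: normalize the sublayer input, then add the untouched residual.
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

def post_norm_block(x):
    # Post-norm: add the residual first, then normalize the sum.
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

x = [1.0, 2.0, 3.0, 4.0]
pre = pre_norm_block(x)
post = post_norm_block(x)
```

In this sketch the post-norm output is always re-centered and re-scaled, while the pre-norm output keeps the residual stream unnormalized, so the two blocks return different values for the same input.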

RyanKim17920 · 2024-05-31T20:46:35Z

Oh, that's interesting, thanks for letting me know.

Layer Norm modification

bb4841e

lucidrains force-pushed the main branch 3 times, most recently from 19eb6d4 to 5e808f4 · August 21, 2024 14:23

lucidrains force-pushed the main branch from 43cbcad to f50d7d1 · October 9, 2024 14:32

lucidrains force-pushed the main branch from 1de866d to db05a14 · March 5, 2025 18:50
