Batch Normalization

Batch Normalization

Batch Normalization์€ ๋ณดํ†ต activation layer ์ „์— ์‚ฌ์šฉ๋˜์–ด ๋„คํŠธ์›Œํฌ ์—ฐ์‚ฐ ๊ฒฐ๊ณผ๊ฐ€ ์›ํ•˜๋Š” ๋ฐฉํ–ฅ์˜ ๋ถ„ํฌ๋Œ€๋กœ ๋‚˜์˜ค๋Š” ๊ฒƒ์„ ๋ชฉ์ ์œผ๋กœ ํ•œ๋‹ค.
notion image
ย 

Batch Normalization ๋ฐฉ๋ฒ•

  1. mini-batch์—์„œ์˜ ํ‰๊ท ์„ ๊ณ„์‚ฐํ•œ ํ›„ ๋ชจ๋“  mini-batch๋งˆ๋‹ค ํ‰๊ท ๊ณผ ๋ถ„์‚ฐ์„ ๊ฐ๊ฐ ๊ณ„์‚ฐ
  1. ํ‰๊ท ๊ณผ ๋ถ„์‚ฐ์œผ๋กœ Normalize
  1. ๋‹ค์‹œ ์ถ”๊ฐ€์ ์ธ scaling(), shifting factor()๋ฅผ ์‚ฌ์šฉ
ย 

ํŠน์ง•

  • gradient๊ฐ€ ๊ฐœ์„ ๋˜์–ด ํ•™์Šต์ด ์ž˜๋˜๊ฒŒ ๋งŒ๋“ค์–ด์ค€๋‹ค.
  • Weight์˜ ์ดˆ๊ธฐํ™”์— ์˜์กดํ•˜์ง€ ์•Š๋Š”๋‹ค.
  • Regularization์˜ ์—ญํ• ์„ ํ•˜์—ฌ Overfitting์„ ๋ง‰์•„์ค€๋‹ค.
  • Learning rate ๊ฐ€ ๋†’์•„๋„ ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•˜๋‹ค.
  • train data๋Š” batch์˜ mean์„ ์ด์šฉํ•˜๊ณ , test data๋Š” train์„ ๊ฑฐ์นœ ํ›„ ์ „์ฒด data์˜ mean๋ฅผ ์ด์šฉํ•ด ์ •๊ทœํ™”๋ฅผ ํ•œ๋‹ค.
ย 
ย 
๐Ÿฅ‘
๋…ผ๋ฌธ์ •๋ฆฌ ์ฐธ๊ณ ์ž๋ฃŒ
ย 
ย 
ย