Gradient Descent

Gradient Descent

ย 
์ผ์ฐจ ํ•จ์ˆ˜์ธ ๊ฒฝ์šฐ ์•„๋ž˜ ์‹๊ณผ ๊ฐ™์ด ํ•จ์ˆ˜์„ ๋ฏธ๋ถ„ํ•˜์—ฌ ๊ณ„์‚ฐํ•œ๋‹ค. ์ด๋ ‡๊ฒŒ ์ˆ˜์น˜์ ์œผ๋กœ ๊ฒฝ์‚ฌ๋ฅผ ๊ตฌํ•˜๋Š” ๋ฐฉ๋ฒ•์„ Numerical Gradient๋ผ๊ณ  ํ•œ๋‹ค.
ย 
notion image
ย 
๋งจ ์™ผ์ชฝ์ด ํ˜„์žฌ์˜ Weight ๊ฐ’์ด๊ณ  ์•„๋ž˜์—๋Š” Loss ๊ฐ’์ด ๋‚˜ํƒ€๋‚˜์žˆ๋‹ค. ๋ฅผ 0.0001์œผ๋กœ ์„ค์ •ํ•œ ํ›„ Gradient๋ฅผ ๊ตฌํ•˜๊ธฐ์œ„ํ•ด ์œ„์˜ ์ˆ˜์‹์„ ์ด์šฉํ•œ ๊ฒฐ๊ณผ๋Š” ๋งจ ์˜ค๋ฅธ์ชฝ๊ณผ ๊ฐ™๋‹ค. -2.5๋ผ๋Š” Gradient ๊ฐ’์„ ํ™•์ธ ํ•  ์ˆ˜ ์žˆ๋Š”๋ฐ ์ด ๋•Œ ๊ธฐ์šธ๊ธฐ๊ฐ€ ์Œ์ˆ˜๋ผ๋Š” ๊ฒƒ์€ Loss์— ์Œ์˜ ์˜ํ–ฅ์„ ์ค€๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธ ํ•  ์ˆ˜ ์žˆ๋‹ค.
Numerical Gradient์˜ ๊ฒฝ์šฐ Debugging ํ•˜๊ธฐ ์ข‹๋‹ค๋Š” ์žฅ์ ์ด ์žˆ์ง€๋งŒ ์ด๋ ‡๊ฒŒ ํ•˜๋‚˜์˜ ๊ฐ’๋“ค์„ ์ผ์ผ์ด ๊ณ„์‚ฐํ•˜๋ฉด ๋ถ€์ •ํ™•ํ•˜๊ณ  ๋งค์šฐ ์˜ค๋ž˜๊ฑธ๋ฆด ๊ฒƒ์ด๋‹ค.
ย 
์ด๋Ÿฌํ•œ ๋‹จ์ ์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ๋ฏธ๋ถ„์‹์„ ์ด์šฉํ•˜์—ฌ Gradient ๊ฐ’์„ ํ•œ๋ฒˆ์— ์–ป๋Š” ๋ฐฉ๋ฒ•์„ Analytic Gradient๋ผ๊ณ  ํ•œ๋‹ค.
notion image
ย 
Analytic Gradient์€ ๋น ๋ฅด๊ณ  ๊ฐ„ํŽธํ•˜์ง€๋งŒ ์ฝ”๋“œ ์ž‘์„ฑ์‹œ ์—๋Ÿฌ๊ฐ€ ๋ฐœ์ƒํ•˜๊ธฐ ์‰ฝ๊ธฐ ๋•Œ๋ฌธ์— ๊ฒ€ํ† ๊ณผ์ •์—์„œ Numerical Gradient ๋ฐฉ๋ฒ•์„ ํ™œ์šฉํ•˜์—ฌ Gradient Check๋ฅผ ์‹ค์‹œํ•œ๋‹ค.
# ๋ฏธ๋ถ„ ์ด์šฉํ•œ ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ• (Analytic Gradient) while True: weights_grad = evaluate_gradient(loss_fun, data, weights) weights += - step_size * weights_grad # ํŒŒ๋ผ๋ฏธํ„ฐ ์—…๋ฐ์ดํŠธ
ย