🎧 “Hurt even a tenth as much as I do, and then be happy😭❤️” Making a Breakup-Song Playlist with NLP

Created
Aug 9, 2022

🎧 Intro. What is E-byul? (feat. NLP)

🎧 Is he good to you? Honestly, it's more than I can bear. I wish you'd hurt a little more, really, just a little. Hurt even a tenth as much as I do, and then be happy. - Yoon Jong-shin, ‘Like It’ (좋니) (2017)

How do you spend your time on a gloomy day? Music comforts countless people and often conveys feelings that are hard to put into plain words. People with depression tend to choose sad music over cheerful music. Listening to sad music does not make them more depressed; rather, they feel that the music understands their emotional state, and they find comfort in that.
Popular music is full of breakup songs, and the emotions of parting expressed in their lyrics are delicate and complex. So, among all those breakup songs, could we pick out the ones whose lyrics carry the kind of feeling we want to hear? We can, if we apply ‘emotion classification’, a natural language processing (NLP) task that assigns a specific emotion to a given text, to song lyrics! In this post, we introduce a paper that uses NLP to classify the emotions of breakup lyrics.

To classify the emotions of breakup lyrics, the paper builds an emotion dictionary by training CBOW, trains an LSTM on the lyrics, and then presents a model that groups lyrics by similar emotion. CBOW is one of the Word2Vec models, and LSTM is one kind of RNN. Let's look at how a CBOW model is structured and implemented, and how an LSTM model learns from data!

🎧
deep daiv. 1st Digital Single <What is E-byul? (feat. NLP)> Tracklist

🎧 Track 1. Building an Emotion Dictionary with CBOW

1) Word2Vec

Last time, we obtained distributed word representations with a ‘count-based (statistical) method’. This time, we look at a more advanced approach, the ‘inference-based method’. This is exactly where Word2Vec comes in! The biggest difference between the two lies in how they handle data. A count-based method obtains the distributed representations in a single pass over the whole corpus, whereas an inference-based method learns by splitting the data into many mini-batches. When the corpus is small, a count-based method works fine, but real-world data is enormous, so count-based methods run into their limits.
notion image

Word2Vec has two representative models, CBOW and skip-gram. CBOW predicts the target word from its context (the surrounding words), while skip-gram predicts the context (the surrounding words) from the target word. Skip-gram is usually considered the more effective of the two, but because the paper we introduce uses CBOW to capture the meaning of words in song lyrics and to build an emotion dictionary on top of that, this post focuses on CBOW. * Most of the explanations and figures about CBOW are based on the textbook ‘밑바닥부터 시작하는 딥러닝 2’ (Deep Learning from Scratch 2).


2) CBOW

notion image
The figure above shows that the CBOW model consists of an input layer, a hidden layer, and an output layer. These layers are the basic structure of any neural network; what characterizes CBOW is that it has multiple inputs. The number of inputs depends on how large a context we choose. In the example above the window size is 1, meaning one word on each side of the target, so there are two inputs. In the sentence "You say goodbye and I say hello.", if the target is the (first) word say, the one-hot vectors of the surrounding words you and goodbye are used as the inputs.

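To make the window idea concrete, here is a minimal pure-Python sketch that pairs each target word with its context words and turns the first context into one-hot vectors. The helper names `make_contexts_targets` and `one_hot` are hypothetical and written only for this illustration; they are not the utilities from the book's repository.

```python
def make_contexts_targets(words, window_size=1):
    """Pair every word that has a full window with its surrounding words."""
    contexts, targets = [], []
    for i in range(window_size, len(words) - window_size):
        left = words[i - window_size:i]
        right = words[i + 1:i + 1 + window_size]
        contexts.append(left + right)
        targets.append(words[i])
    return contexts, targets


def one_hot(index, size):
    """Vector of zeros with a single 1 at position `index`."""
    return [1 if j == index else 0 for j in range(size)]


# Lower-cased, pre-tokenized version of the example sentence.
words = "you say goodbye and i say hello .".split()
vocab = sorted(set(words), key=words.index)  # unique words, first-seen order
word_to_id = {w: i for i, w in enumerate(vocab)}

contexts, targets = make_contexts_targets(words, window_size=1)
# One-hot inputs for the first target ("say"): its neighbors "you" and "goodbye".
first_context_vectors = [one_hot(word_to_id[w], len(vocab)) for w in contexts[0]]
```

With a window size of 1, every target gets exactly two context words, which is why the network in the figure has two inputs.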
notion image

Between the layers are weights (W). Weights and biases are the foundation of a neural network. A network learns by computing the loss from the inputs, weights, and biases in a forward pass, then computing gradients in a backward pass (backpropagation) and using them to update the weights. This process is called training. As it repeats, the parameters (weights, biases, and so on) are updated so as to minimize the loss function, so the network performs better the longer it trains.

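The forward-then-backward loop can be sketched on the smallest possible "network": a single weight and bias fitted to one data point. This is a toy example under assumed values (input 2.0, target 1.0, learning rate 0.1), not the CBOW model itself.

```python
def loss_and_grads(w, b, x, t):
    """Forward pass (prediction and loss), then backward pass (gradients)."""
    y = w * x + b            # forward: prediction
    error = y - t
    loss = 0.5 * error ** 2  # squared-error loss
    dw = error * x           # backward: d(loss)/dw
    db = error               # backward: d(loss)/db
    return loss, dw, db


w, b = 0.0, 0.0          # initial parameters
x, t = 2.0, 1.0          # hypothetical input and target
learning_rate = 0.1
losses = []
for _ in range(20):
    loss, dw, db = loss_and_grads(w, b, x, t)
    losses.append(loss)
    # Move each parameter a small step against its gradient.
    w -= learning_rate * dw
    b -= learning_rate * db
```

Each pass through the loop lowers the loss a little, which is the same behavior the loss curve of a real network should show.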
Softmax is a function that converts the scores produced by the output layer into probabilities. The probabilities from the Softmax layer are then compared with the correct labels by the cross-entropy error layer. The resulting loss is propagated backward through backpropagation and updates the parameters of the earlier layers. Now let's implement a simple CBOW model!

MatMul

    import numpy as np


    class MatMul:
        def __init__(self, W):
            self.params = [W]
            self.grads = [np.zeros_like(W)]
            self.x = None

        # Forward pass
        def forward(self, x):
            W, = self.params
            # NumPy's matmul performs matrix multiplication.
            out = np.matmul(x, W)
            self.x = x
            return out

        # Backward pass
        def backward(self, dout):
            W, = self.params
            dx = np.matmul(dout, W.T)  # .T is the transpose
            dW = np.matmul(self.x.T, dout)
            # [...] overwrites the existing gradient array in place,
            # so references to self.grads[0] held elsewhere stay valid.
            self.grads[0][...] = dW
            return dx

SoftmaxWithLoss

softmax

    import numpy as np


    # Softmax function
    def softmax(a):
        # Work along the last axis so that a batch of score rows
        # is normalized row by row, not across the whole batch.
        c = np.max(a, axis=-1, keepdims=True)
        exp_a = np.exp(a - c)  # subtracting the max guards against overflow
        sum_exp_a = np.sum(exp_a, axis=-1, keepdims=True)
        return exp_a / sum_exp_a

cross_entropy_error

    # Cross-entropy error (t is a one-hot label)
    def cross_entropy_error(y, t):
        if y.ndim == 1:
            t = t.reshape(1, t.size)
            y = y.reshape(1, y.size)

        batch_size = y.shape[0]
        # 1e-7 keeps log() away from log(0)
        return -np.sum(t * np.log(y + 1e-7)) / batch_size

    # The Softmax layer and the cross-entropy error layer combined!
    class SoftmaxWithLoss:
        def __init__(self):
            self.loss = None  # loss
            self.y = None     # output of softmax
            self.t = None     # correct label (one-hot vector)

        # Forward pass
        def forward(self, x, t):
            self.t = t
            self.y = softmax(x)
            self.loss = cross_entropy_error(self.y, self.t)
            return self.loss

        # Backward pass
        def backward(self, dout=1):
            batch_size = self.t.shape[0]
            dx = (self.y - self.t) * dout / batch_size
            return dx

    import numpy as np

    # MatMul and SoftmaxWithLoss are the classes defined above.


    class SimpleCBOW:
        def __init__(self, vocab_size, hidden_size):
            # Vocabulary size, number of hidden-layer neurons
            V, H = vocab_size, hidden_size

            # Initialize the weights.
            # random.randn draws random initial weights;
            # astype('f') stores them as 32-bit floats.
            W_in = 0.01 * np.random.randn(V, H).astype('f')
            W_out = 0.01 * np.random.randn(H, V).astype('f')

            # Create the layers.
            # One input MatMul layer is needed per context word
            # (two here, because the window size is 1).
            self.in_layer0 = MatMul(W_in)
            self.in_layer1 = MatMul(W_in)
            self.out_layer = MatMul(W_out)
            self.loss_layer = SoftmaxWithLoss()

            # Collect all weights and gradients into lists.
            layers = [self.in_layer0, self.in_layer1, self.out_layer]
            self.params, self.grads = [], []
            for layer in layers:
                self.params += layer.params
                self.grads += layer.grads

            # Keep the distributed word representations as an instance variable.
            self.word_vecs = W_in

        # Forward pass
        def forward(self, contexts, target):
            h0 = self.in_layer0.forward(contexts[:, 0])
            h1 = self.in_layer1.forward(contexts[:, 1])
            h = (h0 + h1) * 0.5  # average of the two context projections
            score = self.out_layer.forward(h)
            loss = self.loss_layer.forward(score, target)
            return loss

        # Backward pass
        def backward(self, dout=1):
            ds = self.loss_layer.backward(dout)
            da = self.out_layer.backward(ds)
            da *= 0.5
            self.in_layer1.backward(da)
            self.in_layer0.backward(da)
            return None

Training a CBOW model is similar to training an ordinary neural network. We feed the training data to the network, compute the gradients, and update the weight parameters step by step. An optimizer is the algorithm that performs the parameter update; examples include SGD (stochastic gradient descent) and AdaGrad. The code below uses an algorithm called Adam!
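    import sys
    # Assumption: this script sits one directory below the root of the book's
    # repository, so that the `common` package can be imported.
    sys.path.append('..')
    from common.trainer import Trainer
    from common.optimizer import Adam
    from simple_cbow import SimpleCBOW
    from common.util import preprocess, create_contexts_target, convert_one_hot

    # Hyperparameters
    window_size = 1
    hidden_size = 5
    batch_size = 3
    max_epoch = 1000

    # Build the corpus and vocabulary from the example sentence.
    text = 'You say goodbye and I say hello.'
    corpus, word_to_id, id_to_word = preprocess(text)

    # Create (contexts, target) pairs and convert them to one-hot vectors.
    vocab_size = len(word_to_id)
    contexts, target = create_contexts_target(corpus, window_size)
    target = convert_one_hot(target, vocab_size)
    contexts = convert_one_hot(contexts, vocab_size)

    # Train the model with the Adam optimizer and plot the loss.
    model = SimpleCBOW(vocab_size, hidden_size)
    optimizer = Adam()
    trainer = Trainer(model, optimizer)

    trainer.fit(contexts, target, max_epoch, batch_size)
    trainer.plot()

    # Print the learned distributed representation of each word.
    word_vecs = model.word_vecs
    for word_id, word in id_to_word.items():
        print(word, word_vecs[word_id])
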
Here is the result of running it ourselves on Colab! The vertical axis is the loss and the horizontal axis is the number of iterations; ideally, the loss decreases as training proceeds. Perhaps because the corpus is so small, the result is not entirely satisfying, but we could confirm that training did make progress!
notion image

If you would like to try it yourself, the link below is a good place to start!
GitHub - WegraLee/deep-learning-from-scratch-2: “밑바닥부터 시작하는 딥러닝 2” (Deep Learning from Scratch 2; Hanbit Media, 2019)

3) Building an Emotion Dictionary with CBOW

Now let's see how the paper puts CBOW to use! Preprocessing is essential before Word2Vec can be applied. It includes tokenization, part-of-speech tagging, stop-word removal, and so on. What comes after preprocessing? We now need to capture the meaning of each word. For example, the word ‘love’ (사랑) does not always mean happiness. It depends on the context. In lines such as ‘I'm now trying to let go of you, whom I loved so very much’ (아주 많이 사랑했던 나의 그대를 이젠 떠나 보내려해) or ‘I don't think I can love anyone else’ (다른 사랑 못 할 거 같아요), love should be semantically closer to the word ‘farewell’ (이별). An embedding model trained with Word2Vec captures these relationships by measuring the distance-based similarity between word vectors.

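A common way to measure how close two word vectors are is cosine similarity. The sketch below uses it on three tiny made-up vectors; both the numbers and the English stand-in words are assumptions for illustration, since real vectors would come from a CBOW model trained on the lyrics corpus, and the paper may use a different similarity measure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small epsilon avoids division by zero


# Hypothetical 3-dimensional embeddings, invented for this example.
embeddings = {
    "love": [0.9, 0.1, 0.3],
    "farewell": [0.8, 0.2, 0.4],
    "picnic": [-0.2, 0.9, -0.1],
}

sim_love_farewell = cosine_similarity(embeddings["love"], embeddings["farewell"])
sim_love_picnic = cosine_similarity(embeddings["love"], embeddings["picnic"])
```

In a breakup-lyrics corpus we would expect the first similarity to be high and the second to be low, which is the pattern these toy vectors were chosen to mimic.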
The words are learned with the CBOW model described so far, and the emotion dictionary is built on top of it. The paper sets four representative emotions, ‘sadness’, ‘denial’, ‘anger’, and ‘indifference’, and vectorizes the similarity of words to each of them. The table below, taken from the paper, lists the five words most similar to each of the four emotions.
notion image
As in the table above, words are ranked by similarity, and only those with a similarity of 0.5 or higher are used to build the emotion dictionary for each representative emotion. On this basis, an LSTM is then used to classify the emotions of breakup lyrics!

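The filtering step can be sketched in a few lines. The function name `build_emotion_dictionary`, the English words, and every similarity score below are made up for illustration; only the 0.5 threshold and the "highest similarity first" ordering come from the description above.

```python
def build_emotion_dictionary(similarities, threshold=0.5):
    """Keep words at or above the threshold, most similar first."""
    dictionary = {}
    for emotion, scored_words in similarities.items():
        kept = [(word, score) for word, score in scored_words.items()
                if score >= threshold]
        kept.sort(key=lambda pair: pair[1], reverse=True)
        dictionary[emotion] = [word for word, _ in kept]
    return dictionary


# Hypothetical similarity scores between emotions and candidate words.
similarities = {
    "sadness": {"cry": 0.74, "tears": 0.81, "weather": 0.32},
    "anger": {"lie": 0.55, "hate": 0.69, "coffee": 0.12},
}

emotion_dictionary = build_emotion_dictionary(similarities)
```

Words below the threshold ("weather", "coffee") are dropped, so each emotion ends up with a short list of its closest words.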

🎧 Track 2. Learning Emotion Classification with LSTM

1) RNN (Recurrent Neural Network): a network that learns with the order of the data in mind


The algorithms used to train deep learning models are ‘artificial neural networks’, which were designed by imitating the principles and structure of the human nervous system. The RNN (Recurrent Neural Network) introduced in this post is one kind of artificial neural network. It is specialized for learning ‘sequential data’, that is, data that carries ‘order’ information. Sequential data has an order, just like a film sequence in which scenes unfold in time.
Text is sequential data too. There is an ‘order’ in which a sentence is read, and the meaning of a word changes depending on the words before and after it, in other words, on where it sits in the sentence. Text can be processed without taking order into account, but an RNN is built to learn with all of that order information included.
So how does an RNN take the order of text into account? As its name suggests, an RNN has a ‘recurrent’ structure inside the model, so it can keep feeding what it computed in the past into the present computation. It does this by passing the state computed at the previous step, multiplied by a weight, into the current step.
The recurrent structure of an RNN can also be drawn unrolled cell by cell along the time steps, as shown on the right. Image source: https://wikidocs.net/22886

How the RNN algorithm works
* The explanation, images, and examples draw on the video ‘[딥러닝] RNN 기초 (순환신경망 - Vanilla RNN)’. https://www.youtube.com/watch?v=PahF2hZM6cs
Let's take a quick look at how the RNN algorithm works. First, imagine a POS tagging (Part-Of-Speech tagging) task that labels the part of speech of each word in the following sentence.
📎 I / work / at / google (I work at Google.) → pronoun, verb, preposition, noun
notion image
Example of an RNN model used as a part-of-speech tagger. Image source: https://www.youtube.com/watch?v=PahF2hZM6cs
✅ Steps of the RNN model at time step t
a. Input layer: feed in x_t
b. Hidden layer: apply the activation function to x_t and h_{t-1} → produce y_t and h_t (the weights and biases are the parameters)
c. Output layer: apply the activation function to y_t → produce the prediction
At the next time step, when ‘work’ is processed, the model takes the state computed for ‘I’ into account. Having learned that a verb is likely to follow a sentence-initial pronoun, it judges the part of speech of ‘work’ to be ‘verb’. The unit that computes one word in the hidden layer and passes the result on is called a cell. In the cell that processes ‘work’, the state previously computed for ‘I’ enters the calculation mathematically. Likewise, when ‘at’ is processed, the states of ‘I’ and ‘work’ are taken into account, and when ‘google’ is processed, the states of ‘I’, ‘work’, and ‘at’ are.
✅ The formula that computes the current state h_t at time step t from the current input x_t and the state h_{t-1} processed at the previous time step t-1
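For reference, a vanilla RNN is usually written in the following textbook form. The symbols $W_x$, $W_h$, $W_y$, $b$, and $b_y$ are naming assumptions for this sketch rather than notation taken from the paper, and the value handed to the output layer (called y_t in the steps above) is taken here to be h_t itself.

```latex
h_t = \tanh\left(W_x x_t + W_h h_{t-1} + b\right), \qquad
\hat{y}_t = \mathrm{softmax}\left(W_y h_t + b_y\right)
```

Because $h_{t-1}$ appears inside the expression for $h_t$, every earlier input influences the current state indirectly.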
When these time steps repeat, the process unfolds as in the following figure. The same cell is used at every time step, with the same weights and biases. During training, the weights and biases are adjusted again and again in the direction that reduces the difference between the model's final prediction and the correct answer. Backpropagation, the process that optimizes these parameters, also proceeds step by step along the time axis, which is why it is called BPTT (Backpropagation Through Time).
✅ Supervised training (training in which the correct answers are given) of the RNN model at time step t: optimize the parameters (weights & biases) so that the difference (error) between the prediction produced by the output layer and the correct answer shrinks

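The idea that one cell with one set of parameters is reused at every time step can be sketched with a scalar RNN in pure Python. The parameter values (`w_x=0.5`, `w_h=0.8`, `b=0.0`) and the input numbers are arbitrary assumptions; a real model would use vectors and learned weights.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One cell: combine the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run the same cell (same w_x, w_h, b) over every time step."""
    h = 0.0  # initial state before any input is seen
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


states = rnn_forward([1.0, 0.5, -0.3, 0.8])
# Changing only the first input still changes the last state,
# because each state is computed from the one before it.
states_changed_first = rnn_forward([-1.0, 0.5, -0.3, 0.8])
```

Comparing the last entries of the two runs shows how information from the first word is carried forward through the states.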

2) LSTM: Making Up for the Weakness of RNN


The information from x1, the input at the first time step, is gradually lost further down the sequence. Image source: https://wikidocs.net/22888
An RNN works well on relatively short sequences, but as the sequence grows longer it struggles to remember the data processed early on all the way to the end. The reason is that each RNN output is computed from the previous result. With a calculation that depends on earlier results in this way, the more input words there are and the more time steps it takes to process them, the harder it becomes for information from earlier time steps to be passed along in full. The model that appeared to overcome this problem is LSTM (Long Short-Term Memory)! Inside an LSTM cell there is not only the computation that turns the input into a hidden state, but also a memory-cell mechanism that turns the input into a cell state.
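One standard way to see why early information fades is to track how strongly the final state of a simple RNN depends on its starting state. In the scalar sketch below, each time step multiplies that dependence by `w_h * (1 - h_t ** 2)`, the derivative of the tanh update with respect to the previous state. The weights and the constant input are assumptions chosen only to show the trend.

```python
import math


def influence_of_first_state(num_steps, w_h=0.8, w_x=0.5, x_value=0.5):
    """Derivative of the final state with respect to the initial state."""
    h = 0.0
    influence = 1.0
    for _ in range(num_steps):
        h = math.tanh(w_x * x_value + w_h * h)
        # Chain rule: d h_t / d h_{t-1} = w_h * (1 - h_t^2)
        influence *= w_h * (1.0 - h * h)
    return influence


short_sequence = influence_of_first_state(5)
long_sequence = influence_of_first_state(50)
```

Every factor is smaller than 1 in this setup, so the product shrinks toward zero as the sequence gets longer, which matches the picture of x1 fading away.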
Internal structure of the RNN model. Image source: https://wikidocs.net/22888
Now let's walk through what happens inside an LSTM cell with some example sentences. * The explanation, images, and examples draw on the video ‘[딥러닝] LSTM 쉽게 이해하기’. https://www.youtube.com/watch?v=bX6GLbpw-A4
📎 John is my best friend, .... (x1, x2, x3, x4, x5, ....)
He likes basketball, .... (x23, x24, x25, ....)
1)_____ is still my best friend, .... (x50, x51, x52, x53, x54, x55, ....)
Jane is his wife, .... (x80, x81, x82, x83, ....)
2)_____ knows I am best friend of John .... (x100, x101, x102, x103, ....)
→ The pronoun for blank 1) is ‘He’, and the pronoun for blank 2) is ‘She’.
Imagine an LSTM learning the example sentences above. The time step of each word was chosen arbitrarily. Until it reaches the first blank, the model has to remember the information about John from the earlier sentences and hold on to the prediction that the blank will be ‘he’. Then, at the 100th time step, where the second blank is processed, it has to remember the newly introduced information about Jane and set aside the earlier information about John for a while in order to predict that the blank will be ‘she’. In this way, an LSTM has a mechanism for forgetting unnecessary information and retaining what is needed: the ‘forget gate’. Let's follow how the LSTM learns, starting from the 80th time step, where the word Jane first appears.
🚪 a. Input gate (Input Mechanism)
notion image
🚪 b. Forget gate (Forget Mechanism)
notion image
🚪 c. Output gate (Output Mechanism)
notion image
In this way, an LSTM uses its input, forget, and output gates to decide which information to remember and which to discard. With an LSTM, even long sequences can be classified with precision.
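The three gates can be sketched as one time step of a scalar LSTM, following the standard textbook equations. The parameter layout (`(w_x, w_h, b)` per gate) and all numbers are assumptions made for this illustration; the biases are set to extreme values so that each gate is almost fully open or closed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step with a scalar input, hidden state, and cell state."""
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new content to write
    g = math.tanh(pre_activation("candidate"))  # the new content itself
    o = sigmoid(pre_activation("output"))       # how much memory to reveal
    c_t = f * c_prev + i * g                    # updated cell state
    h_t = o * math.tanh(c_t)                    # updated hidden state
    return h_t, c_t


# "Forget the old, write the new": forget gate near 0, input gate near 1.
params_forget = {
    "forget": (0.0, 0.0, -10.0),
    "input": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 10.0),
}
h_new, c_new = lstm_step(x_t=0.5, h_prev=0.0, c_prev=5.0, params=params_forget)

# "Keep the old, ignore the new": forget gate near 1, input gate near 0.
params_keep = dict(params_forget, forget=(0.0, 0.0, 10.0), input=(0.0, 0.0, -10.0))
_, c_kept = lstm_step(x_t=0.5, h_prev=0.0, c_prev=5.0, params=params_keep)
```

In the first call the old cell state (5.0) is almost entirely replaced by the new content, much like dropping John when Jane appears; in the second call it is carried over almost unchanged.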

3) Learning Emotion Classification with LSTM


The breakup-lyrics emotion classification model proposed in the paper consists of three stages: collecting breakup-song data and preprocessing the training data, building the breakup-lyrics emotion dictionary, and classifying the emotions of breakup lyrics. In the classification stage, an LSTM model is used to follow the flow of each sentence and classify the emotion of the lyrics accurately. For supervised training of the LSTM, the paper reportedly used the emotion dictionary built earlier with Word2Vec as the training data and labeled emotion-word data as the ground truth.
notion image
A total of 30 words are used for training, so there are also 30 time steps, one per word. These 30 words are embedded in a word-embedding layer of 3-dimensional vectors. The word-embedding layer is the input X, and the labeled emotion is the output Y. The inputs x1, x2, x3, ..., x30 in X each pass through the LSTM cell and yield the results y1, y2, y3, ..., y30, and a softmax activation was presumably applied when turning these results into the final prediction. After supervised training, the LSTM model can assign a given word input to one of the four breakup emotions defined in the paper: ‘sadness’, ‘denial’, ‘indifference (detachment)’, and ‘anger’. The picture below shows that even the same word, ‘to break up’ (헤어지다), is classified into different emotions depending on the content of the sentence and its position within it. In “Let's try breaking up” (우리 헤어져 보자) it is classified as indifference, and in “I can't break up, you can't leave me” (난 못 헤어져 나를 떠나면 안돼) it is classified as denial.
notion image
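The last step, turning the LSTM's output scores into one of the four emotions, might look like the sketch below. The label order in `EMOTIONS` and the score values are hypothetical; in the paper's model the scores would come from the trained LSTM rather than being typed in by hand.

```python
import math

# Assumed label order; the paper's actual class indices are not given here.
EMOTIONS = ["sadness", "denial", "indifference", "anger"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable emotion and the full probability list."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda k: probs[k])
    return EMOTIONS[best], probs


# Made-up output scores for one lyric line.
label, probs = classify([0.2, 2.1, 0.4, -1.0])
```

The emotion with the highest probability becomes the predicted label for that line of lyrics.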

🎧 Outro. More & More


We have seen how Word2Vec and LSTM, two natural language processing models, make it possible to classify the emotions in breakup-song lyrics. It is fascinating that even among breakup songs the emotional texture can be divided into indifference, denial of reality, sadness, and so on, and that NLP can tell these subtle differences apart without a person analyzing every line. Song lyrics in particular often convey emotion through metaphor rather than mentioning “farewell” or “love” directly, and we could see that a Word2Vec model is able to capture the meaning of such metaphorical expressions as well. If models that classify the topic or emotion of lyrics with NLP keep improving, song recommendation and playlist curation should be able to go beyond today's systems, which rely on melodic similarity or on hashtag keywords about a song's overall genre and mood. And if a system could recommend a specific part of a song's lyrics that expresses a similar emotion, the unit of recommendation would no longer be a whole song but a finer-grained lyric passage, which should make recommendations richer. Researchers keep working to build better-performing NLP models, and the range of things those models can be used for keeps growing. Aren't you looking forward to what comes next?


🎧 References


  • ✨ AI ML DL, [케라스] 무작정 튜토리얼1 - Sequential Model 구현 ([Keras] Tutorial 1: Implementing a Sequential Model), 2019.12.11., https://ebbnflow.tistory.com/120
  • 하얀종이개발자, 딥러닝에서 가중치(W), 편향(Bias)의 역할 (The Role of Weights (W) and Bias in Deep Learning), 2021.8.30., https://jh2021.tistory.com/3