Character-level convolutional networks for text classification

Created: Aug 3, 2022
Tags: NLP
cleanUrl: 'paper/character-level-cnn'

1. Background for Selecting the Paper

Earlier CNN-based models for text classification used words (embedded word vectors) as the smallest unit of input: typically word2vec-embedded word vectors, TFIDF information, or bag-of-words representations aggregating n-gram information.
This paper, in contrast, focuses on characters, a more raw form of information than the words used by previous approaches. It is significant as the first to apply ConvNets at the character level for text classification. It was also striking that, by working with characters, the authors aim to extract features of the fundamental structure of language. We therefore chose this paper to examine its contents closely and share them.
ย 

2. Introduction

Text classification is a classic topic in natural language processing. To date, all text-classification techniques have operated at the word level, and among them simple statistics of certain ordered word combinations (e.g., n-grams) usually perform best.
Meanwhile, many researchers have found that CNNs are useful for extracting information from raw signals, as in speech recognition. This paper studies how to apply 1-D CNNs to text treated as a raw signal at the character level. Since CNNs require large amounts of data, the authors also built several large-scale datasets. Conveniently, the CNN requires no knowledge about words (including syntactic or semantic structure). Models trained on characters also have the advantages that they can be applied to multiple languages with little modification, and that they naturally learn spelling errors and emoticons as well.
ย 

3. Character-level Convolutional Networks

3.1 Key Modules

๋ชจ๋ธ์˜ ์ฃผ๋œ ๊ตฌ์„ฑ์€ ๋‹จ์ˆœํžˆ 1D Convolution๋งŒ ๊ณ„์‚ฐํ•˜๋Š” ์‹œ๊ฐ„์˜ Conv. module์ž…๋‹ˆ๋‹ค.
์ด์‚ฐ input function ์™€ ์ด์‚ฐ kernel function ์„ ๊ฐ€์ •ํ•ฉ๋‹ˆ๋‹ค.
๋‹ค์‹œ ๋งํ•˜๋ฉด, input function ๋Š” ์‹ค์ˆ˜ ๊ณต๊ฐ„ ๋‚ด ์›์†Œ๋กœ ์ •์˜๋˜๋ฉฐ, (kernel function) ๋Š” ์‹ค์ˆ˜๊ณต๊ฐ„ ๋‚ด ์›์†Œ๋กœ ์ •์˜๋ฉ๋‹ˆ๋‹ค.
stride ๋ฅผ ๊ฐ–๋Š” ์™€ ์˜ Convolution ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ •์˜๋ฉ๋‹ˆ๋‹ค.
  • Stride : ์ž…๋ ฅ๋ฐ์ดํ„ฐ์— ํ•„ํ„ฐ๋ฅผ ์ ์šฉํ•  ๋•Œ ๊ฐ„๊ฒฉ์„ ์กฐ์ ˆํ•˜๋Š” ๊ฒƒ, ์ฆ‰ ํ•„ํ„ฐ๊ฐ€ ์ด๋™ํ•  ๊ฐ„๊ฒฉ์˜๋ฏธ.
    • ex) Stride = 1์ธ ํ•ฉ์„ฑ๊ณฑ
      notion image
๋‹จ, ์ด ๋•Œ ๋กœ, ์˜คํ”„์…‹ ์ƒ์ˆ˜์ž…๋‹ˆ๋‹ค.
  • ์˜คํ”„์…‹ ์ƒ์ˆ˜ : ๋™์ผ ์˜ค๋ธŒ์ ํŠธ ์•ˆ์—์„œ ์˜ค๋ธŒ์ ํŠธ ์ฒ˜์Œ๋ถ€ํ„ฐ ์ฃผ์–ด์ง„ ์š”์†Œ๋‚˜ ์ง€์ ๊นŒ์ง€์˜ ๋ณ€์œ„์ฐจ๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ์ •์ˆ˜ํ˜•.
ย 
ย 
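As a concrete check of the definition above, here is a minimal NumPy sketch of the temporal convolution with stride d (the helper name `temporal_conv` is ours, not the paper's). Written with 0-based indices, each output is the flipped kernel dotted with a stride-d window of the input; for stride 1 it agrees with NumPy's own `np.convolve` in 'valid' mode.

```python
import numpy as np

def temporal_conv(g, f, d=1):
    """h(y) = sum_{x=1}^{k} f(x) * g(y*d - x + c) with c = k - d + 1,
    i.e., a true (kernel-flipped) 1-D convolution with stride d."""
    l, k = len(g), len(f)
    out_len = (l - k) // d + 1
    h = np.empty(out_len)
    for y in range(out_len):
        window = g[y * d : y * d + k]    # current length-k window
        h[y] = np.dot(f[::-1], window)   # kernel flipped: true convolution
    return h

g = np.arange(1.0, 8.0)         # input of length l = 7
f = np.array([1.0, 0.0, -1.0])  # kernel of length k = 3
print(temporal_conv(g, f, d=1))  # matches np.convolve(g, f, mode='valid')
print(temporal_conv(g, f, d=2))  # stride 2: floor((7-3)/2)+1 = 3 outputs
```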
As with traditional convolutional networks in vision, the module is parameterized by a set of kernel functions f_ij(x) (i = 1, …, m and j = 1, …, n), called weights, on a set of inputs g_i(x) and outputs h_j(y).
g_i: input features
h_j: output features
m: input feature size
n: output feature size
The output h_j is therefore obtained by summing over i the convolutions between g_i and f_ij.
ย 
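This parameterization can be sketched as follows (a hedged NumPy illustration, not the authors' code; the helper name `conv_feature_layer` is ours): with m input features and n output features, each output h_j sums the convolutions of every input g_i with its kernel f_ij.

```python
import numpy as np

def conv_feature_layer(g, f):
    """g: (m, l) input features; f: (m, n, k) kernel functions ("weights").
    Returns h of shape (n, l - k + 1), where h_j = sum_i conv(g_i, f_ij)."""
    m, l = g.shape
    m2, n, k = f.shape
    assert m == m2
    h = np.zeros((n, l - k + 1))
    for j in range(n):
        for i in range(m):
            h[j] += np.convolve(g[i], f[i, j], mode='valid')
    return h

g = np.ones((2, 5))     # m = 2 input features of length 5
f = np.ones((2, 3, 2))  # kernels of length k = 2 for n = 3 outputs
print(conv_feature_layer(g, f).shape)  # (3, 4)
```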
๋” ๊นŠ์€ ๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œํ‚ค๋Š”๋ฐ ๋„์›€์ด ๋œ ํ•ต์‹ฌ ๋ชจ๋“ˆ ์ค‘ ํ•˜๋‚˜๋Š” ์‹œ๊ฐ„ max-pooling์ž…๋‹ˆ๋‹ค. ์ปดํ“จํ„ฐ ๋น„์ „์—์„œ ์‚ฌ์šฉ๋˜๋Š” max-pooling์˜ 1-D ๋ฒ„์ „์ด๋ผ๊ณ  ์ƒ๊ฐํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค. (2์ฐจ์› โ†’ 1์ฐจ์›์œผ๋กœ ์ฐจ์› ์ถ•์†Œ)
input function ๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ, ์˜ max-pooling function ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ •์˜ ๋ฉ๋‹ˆ๋‹ค.
๋‹จ, ์ด ๋•Œ ๋กœ, ์˜คํ”„์…‹ ์ƒ์ˆ˜์ž…๋‹ˆ๋‹ค.
๋ฐ”๋กœ ์ด pooling module์€ 6๊ฐœ์˜ layer๋ณด๋‹ค ๋” ๊นŠ์€ ConvNets๋ฅผ ํ•™์Šต๊ฐ€๋Šฅํ•˜๊ฒŒ ๋งŒ๋“ค์—ˆ์Šต๋‹ˆ๋‹ค.
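A minimal sketch of temporal max-pooling (our own helper, mirroring the convolution definition above): with k = d the windows do not overlap, which is the configuration this model uses.

```python
import numpy as np

def temporal_max_pool(g, k, d):
    """h(y) = max over a window of size k starting every d positions."""
    out_len = (len(g) - k) // d + 1
    return np.array([g[y * d : y * d + k].max() for y in range(out_len)])

g = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
print(temporal_max_pool(g, k=3, d=3))  # non-overlapping windows -> [3. 6.]
```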
๋ชจ๋ธ์˜ ๋น„์„ ํ˜•์„ฑ์€ thresholding function ์ด๋ฉฐ, ์ด๊ฒƒ์€ Convolutional layer๋ฅผ Rectified Linear Units(ReLUs)์™€ ๋น„์Šทํ•˜๊ฒŒ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
์‚ฌ์šฉ๋œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ๋ฏธ๋‹ˆ๋ฐฐ์น˜ ์‚ฌ์ด์ฆˆ๊ฐ€ 128์ธ ํ™•๋ฅ ์  ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•(SGD)์ด๋ฉฐ, momentum 0.9, initial step size๋Š” 0.01์„ ์‚ฌ์šฉํ•˜์˜€์Šต๋‹ˆ๋‹ค.
๊ฐ epoch๋Š” ํด๋ž˜์Šค ์ „์ฒด์—์„œ ๊ท ์ผํ•˜๊ฒŒ ์ƒ˜ํ”Œ๋ง ๋˜์–ด ๊ณ ์ •๋œ ์ˆ˜ ๋งŒํผ ๋ฌด์ž‘์œ„๋กœ train sample์„ ์ทจํ•ฉ๋‹ˆ๋‹ค.
์ด ๋ชจ๋ธ์€ ์ธ์ฝ”๋”ฉ๋œ ๋ฌธ์ž ์‹œํ€€์Šค๋ฅผ ์ž…๋ ฅ์œผ๋กœ ๋ฐ›์•„๋“ค์ž…๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ ์ธ์ฝ”๋”ฉ์€ ๊ฐœ์˜ ์•ŒํŒŒ๋ฒณ์— ๋Œ€ํ•ด one-hot ์ธ์ฝ”๋”ฉ ๋ฐฉ์‹์„ ์‚ฌ์šฉํ–ˆ๋‹ค. ๋”ฐ๋ผ์„œ ๊ฐ ์ž…๋ ฅ์€ ๊ณ ์ • ๊ธธ์ด๊ฐ€ ์ธ ์ฐจ์›์˜ ๋ฒกํ„ฐ๊ฐ€ ๋˜๋ฉฐ, ์ „์ฒด ์‹œํ€€์Šค๋Š” ์ฐจ์›์˜ ํ–‰๋ ฌ๋กœ ํ‘œํ˜„๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค. ์ด๋•Œ ๊ธธ์ด๊ฐ€ ์„ ์ดˆ๊ณผํ•˜๋Š” ๋ชจ๋“  ๋ฌธ์ž๋Š” ๋ฌด์‹œ๋˜๋ฉฐ, ๊ณต๋ฐฑ ๋ฌธ์ž๋ฅผ ํฌํ•จํ•˜์—ฌ ์•ŒํŒŒ๋ฒณ์ด ์•„๋‹Œ ๋ชจ๋“  ๋ฌธ์ž๋Š” ๋ชจ๋‘ ์ œ๋กœ ๋ฒกํ„ฐ๋กœ ์–‘์žํ™”๋ฉ๋‹ˆ๋‹ค.
notion image
ย 
์ด ๋ชจ๋ธ์—์„œ๋Š” ์•ŒํŒŒ๋ฒณ์„ ์ด 70๊ฐœ์˜ ๋ฌธ์ž๋กœ ์ •์˜ํ–ˆ์Šต๋‹ˆ๋‹ค. 26๊ฐœ์˜ ์˜์–ด ๋ฌธ์ž, 10๊ฐœ์˜ ์ˆซ์ž, ๊ทธ๋ฆฌ๊ณ  33๊ฐœ์˜ ํŠน์ˆ˜๋ฌธ์ž์™€ ์ค„ ๋‚ด๋ฆผ ๋ฌธ์ž๋กœ ๊ตฌ์„ฑ๋˜์—ˆ์œผ๋ฉฐ ์†Œ๋ฌธ์ž๋กœ ์ž…๋ ฅ๋ฐ›๋„๋ก ํ•˜์˜€์Šต๋‹ˆ๋‹ค. ์ „์ฒด ์•ŒํŒŒ๋ฒณ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.
notion image
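A sketch of this quantization step (`quantize` is a hypothetical helper of ours, and the alphabet string here is an approximation of the 70-character set shown above, not an exact copy):

```python
import numpy as np

# Approximate alphabet: 26 letters, 10 digits, punctuation, and new-line.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}\n"
CHAR_IDX = {ch: i for i, ch in enumerate(ALPHABET)}

def quantize(text, l0=1014):
    """One-hot encode up to l0 characters; out-of-alphabet characters
    (and padding positions) remain all-zero rows."""
    mat = np.zeros((l0, len(ALPHABET)))
    for pos, ch in enumerate(text.lower()[:l0]):
        i = CHAR_IDX.get(ch)
        if i is not None:
            mat[pos, i] = 1.0
    return mat

x = quantize("Hello!", l0=10)
print(x.shape)       # (10, 69) with this approximate alphabet
print(int(x.sum()))  # 6: every character of "hello!" is in the alphabet
```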
ย 

3.3 Model design

notion image
In the end, two ConvNets were designed: one with many features and one with few, identical except for the number of features. Each is nine layers deep, with six convolutional layers and three fully-connected layers.
In more detail, the number of input features is 70 and the input length is 1014. The 70 dimensions come from the one-hot encoding described above, and 1014 means only the first 1014 characters are taken as input. According to the paper, a character sequence of this length can already capture most of the main content of a text.
For regularization, dropout with probability 0.5 is applied twice, between the three fully-connected layers. Weights are initialized from Gaussian distributions, with mean and standard deviation (0, 0.02) for the large model and (0, 0.05) for the small one.
ย 
notion image
notion image
The tables above show the detailed configurations of the large and small models, which, as mentioned, differ in their number of features: convolutions use 1024 features in the large model and 256 in the small one, i.e., different numbers of filters. Note that the convolutions use stride 1, and the pooling windows do not overlap.
ย 
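The per-layer sizes in the tables can be sanity-checked by tracing the sequence length through the six convolutional layers (stride-1 valid convolutions; non-overlapping pooling of size 3 after layers 1, 2, and 6):

```python
def conv_out(l, k):    # valid convolution with stride 1
    return l - k + 1

def pool_out(l, k=3):  # non-overlapping max-pooling of size 3
    return l // k

l = 1014                          # input length l0
l = pool_out(conv_out(l, 7))      # layer 1: kernel 7 -> 1008 -> 336
l = pool_out(conv_out(l, 7))      # layer 2: kernel 7 -> 330 -> 110
l = conv_out(l, 3)                # layer 3: kernel 3, no pooling -> 108
l = conv_out(l, 3)                # layer 4 -> 106
l = conv_out(l, 3)                # layer 5 -> 104
l = pool_out(conv_out(l, 3))      # layer 6: kernel 3 -> 102 -> 34
print(l, l * 256)                 # 34 frames x 256 features = 8704 flattened
```

This matches the flattened dimension (8704) that appears in the small model's code in Section 8.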

3.4 Data Augmentation using Thesaurus

๋ฐ์ดํ„ฐ ์ฆ๊ฐ•์€ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์—์„œ ์ผ๋ฐ˜ํ™” ์ •๋„๋ฅผ ํ–ฅ์ƒ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š” ํšจ๊ณผ์ ์ธ ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค. ํ•˜์ง€๋งŒ ํ…์ŠคํŠธ์˜ ๊ฒฝ์šฐ ๋ฌธ์ž์˜ ์ˆœ์„œ๊ฐ€ ๋งค์šฐ ์ค‘์š”ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์ด๋ฏธ์ง€๋‚˜ ์Œ์„ฑ ์ธ์‹์—์„œ์ฒ˜๋Ÿผ ๋ฐ์ดํ„ฐ ๋ณ€ํ™˜์„ ํ†ตํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ๋Š˜๋ฆฌ๋Š” ๊ฒƒ์€ ๋ฐ”๋žŒ์งํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ์‚ฌ์‹ค ๊ฐ€์žฅ ์ข‹์€ ๋ฐฉ๋ฒ•์€ ์‚ฌ๋žŒ์ด ์ง์ ‘ ๋ฌธ์žฅ์„ ๋ฐ”๊ฟ”์“ฐ๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ํ•˜์ง€๋งŒ ์ด๋Š” ๋ฐ์ดํ„ฐ์˜ ํฌ๊ธฐ๊ฐ€ ์ฆ๊ฐ€ํ• ์ˆ˜๋ก ๋น„์šฉ์ด ๋งŽ์ด ์†Œ์š”๋˜๋ฏ€๋กœ ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋‹จ์–ด๋‚˜ ๊ตฌ๋ฅผ ์œ ์˜์–ด๋กœ ๋Œ€์ฒด์‹œํ‚ค๋Š” ๋ฐฉ์‹์„ ํƒํ–ˆ์Šต๋‹ˆ๋‹ค(English Thesaurus ์‚ฌ์šฉ).
๋จผ์ € ์ฃผ์–ด์ง„ ํ…์ŠคํŠธ์—์„œ ๋Œ€์ฒด ๊ฐ€๋Šฅํ•œ ๋ชจ๋“  ๋‹จ์–ด๋ฅผ ์ถ”์ถœํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋Ÿฐ ๋‹ค์Œ ~ ๋ฅผ ํ†ตํ•ด ์ƒ˜ํ”Œ๋ง ๋œ ๊ฐœ์˜ ๋‹จ์–ด๋ฅผ ์œ ์˜์–ด๋กœ ๋Œ€์ฒดํ•˜์˜€์œผ๋ฉฐ, ๋™์ผํ•œ ๊ธฐํ•˜๋ถ„ํฌ์ธ ~ ๋กœ๋ถ€ํ„ฐ ์ƒ˜ํ”Œ๋ง๋œ s๋กœ๋ถ€ํ„ฐ ์œ ์˜์–ด์˜ index๋ฅผ ๊ฒฐ์ •ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ธฐํ•˜๋ถ„ํฌ๋ฅผ ์‚ฌ์šฉํ•˜์˜€๊ธฐ ๋•Œ๋ฌธ์— ์ž์ฃผ ์‚ฌ์šฉ๋˜๋Š” ์˜๋ฏธ์™€ ๋ฉ€์–ด์งˆ์ˆ˜๋ก ์œ ์˜์–ด๊ฐ€ ์„ ํƒ๋  ๊ฐ€๋Šฅ์„ฑ์ด ์ ์„ ๊ฒƒ์ด๋ผ๊ณ  ์ถ”์ธกํ•  ์ˆ˜ ์žˆ์„ ๊ฒƒ์ž…๋‹ˆ๋‹ค.
ex) [์—ฌ์•„, ์†Œ๋…€, ์ฒ˜๋…€, ์•„์คŒ๋งˆ]์ผ ๋•Œ, ์—ฌ์•„์˜ ์œ ์˜์–ด๋กœ ์„ ํƒ๋  ํ™•๋ฅ : ์†Œ๋…€ > ์ฒ˜๋…€ > ์•„์คŒ๋งˆ
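The sampling step can be sketched like this (`sample_geometric` is our hypothetical helper; the parameter values are illustrative, not taken from the paper):

```python
import random

def sample_geometric(p, rng=random):
    """Return k >= 0 with probability proportional to p**k
    (P[k] = (1 - p) * p**k), so larger k is exponentially less likely."""
    k = 0
    while rng.random() < p:
        k += 1
    return k

# r: how many words of a text to replace; s: which synonym to use,
# where index 0 is the most frequently used meaning in the thesaurus.
r = sample_geometric(0.5)
s = sample_geometric(0.5)
print(r, s)  # small values are by far the most likely
```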

4. Comparison Models

This section compares the character-level CNN against traditional linear models and non-linear deep-learning models.

4.1 Traditional Methods (conventional linear models): all use multinomial logistic regression

  • Bag-of-words and its TFIDF: the bag-of-words model is built from the 50,000 most frequent words of each dataset
    • Bag of Words: a numeric representation of text that ignores word order entirely and considers only word occurrence frequencies
    • TFIDF
      [TFIDF formula]
      notion image
      TF: the frequency of each term in each document
      DF: the number of documents in which term t appears
      IDF: inversely proportional to DF

      1) The more often a term appears in the given document (TF),
      2) and the less often it appears in other documents (IDF),
      → the more it serves as a keyword representing that document
      → unlike bag-of-words, which merely counts word frequencies!
  • Bag-of-ngrams and its TFIDF: built from the 500,000 most frequent n-grams of up to 5-grams; the TFIDF variant follows the same procedure as above
    • understanding n-grams
      notion image
  • Bag-of-means on word embedding: k-means clustering is applied to word2vec embeddings, and each word is then represented by the representative word of its cluster
embedding dimension: 300
*How word2vec vectorizes words: words that occur with the same neighbors (context window) receive similar vector values
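The TF and IDF bullets above can be made concrete with a toy computation (one common tf·idf variant; the exact weighting used by the paper's baselines may differ):

```python
import math

docs = [["cnn", "text", "cnn"], ["text", "words"], ["words", "words"]]
N = len(docs)

def tfidf(term, doc):
    tf = doc.count(term)               # term frequency in this document
    df = sum(term in d for d in docs)  # number of documents containing the term
    return tf * math.log(N / df)       # idf = log(N / df)

# "cnn" appears twice in the first document and nowhere else,
# so it outweighs "text", which also occurs in another document.
print(tfidf("cnn", docs[0]) > tfidf("text", docs[0]))  # True
```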

4.2 Deep Learning Methods: both use word2vec to embed words (embedding size: 300)

  • Word-based ConvNets
    • *Our model is character-based, while the deep-learning baselines are word-based
  • LSTM (long short-term memory)
    • trained with gradient clipping, with multinomial logistic regression on top

5. Large-scale Datasets and Results

CNN์€ ๋ณดํ†ต ํฐ ๋ฐ์ดํ„ฐ์…‹์— ํšจ๊ณผ์ ์ธ๋ฐ ํŠนํžˆ ์šฐ๋ฆฌ์˜ ๋ชจ๋ธ์ฒ˜๋Ÿผ character๋‹จ์œ„์˜ low-level์˜ raw features๋“ค์— ๋”์šฑ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ํ•˜์ง€๋งŒ ๋Œ€๋ถ€๋ถ„์˜ ํ…์ŠคํŠธ ๋ถ„๋ฅ˜๋ฅผ ์œ„ํ•œ ๋ฐ์ดํ„ฐ์˜ ํฌ๊ธฐ๊ฐ€ ์ž‘์œผ๋ฏ€๋กœ ํ•„์ž๋Š” ๋ฐ์ดํ„ฐ์…‹์„ ๋งŒ๋“ค์—ˆ์Šต๋‹ˆ๋‹ค.

5.1 Dataset

notion image
ย 
  • AGโ€™s news corpus
  • Sogou news corpus
  • DBPedia ontology dataset
  • Yelp reviews
  • Yahoo! Answers dataset
  • Amazon reviews
ย 

5.2 Result

์œ„์˜ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ๋ชจ๋ธ๋“ค์„ ๋Œ๋ฆฐ testing error(%)๋ฅผ ๋‚˜ํƒ€๋‚ธ ํ‘œ์ž…๋‹ˆ๋‹ค. (๊ฐ’์ด ์ž‘์„์ˆ˜๋ก ์ข‹์€ ๊ฒƒ)
good : ํŒŒ๋ž€์ƒ‰, bad : ๋นจ๊ฐ„์ƒ‰์— ํ•ด๋‹น
Lg : large
Sm : small
w2v : word2vec
LK : lookup table
Th : thesaurus
notion image

6. Discussion

1) Character-level ConvNets are an effective method.
They show that, beyond words, the character level can also be an effective basis for text classification.
2) The size of the dataset makes the difference between the traditional models and the ConvNets.
Small datasets → traditional NLP models perform better
Large datasets → ConvNets perform better
⇒ This is because CNNs need large amounts of data to train.
3) ConvNets do well on user-generated data → which suggests they are well suited to real-world text.
(The authors note, however, that more experiments are needed to confirm whether ConvNets are truly robust to typos and emoticons.)
4) Performance depends considerably on the choice of alphabet.
Adding uppercase letters to the model led to worse results.
The authors analyze that, since there is usually no real semantic difference between upper- and lowercase letters, using only lowercase brings a regularization effect.
5) There is no performance difference between tasks.
Checking performance on the two tasks of sentiment analysis and topic classification showed no notable difference.
6) Embedding via k-means clustering of word2vec vectors (bag-of-means) performed poorly on every dataset for the text-classification task.
The authors attribute this to the naive use of distributed representations backfiring.
7) No single model is best for every dataset.
In the end, experiments are needed to find the model best suited to each dataset.

7. Conclusion and Outlook

This paper shows that character-level convolutional networks can be used effectively for text classification. When the character-level CNN is compared against many traditional and deep-learning methods on large-scale datasets, the results also vary with many factors, such as the size of the dataset and which alphabet is used.

+)
Alongside word-based CNNs, this paper proposes a character-level CNN model for text classification. The method is not widely used today, but it was striking that text classification is possible even at the character level, rather than at the word or sentence level.
ย 
ย 

8. Code

Training was run on the AG's News dataset from the paper. The code was written in Colab.
ย 
ย 
0) Load Data
import numpy as np
import pandas as pd
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.layers import Input, Embedding, Activation, Flatten, Dense
from keras.layers import Conv1D, MaxPooling1D, Dropout
from keras.models import Model
from google.colab import drive

drive.mount('/content/drive')

train_df = pd.read_csv('AG_news/train.csv')
test_df = pd.read_csv('AG_news/test.csv')
train_df.rename(columns={'Class Index': 0, 'Title': 1, 'Description': 2}, inplace=True)
test_df.rename(columns={'Class Index': 0, 'Title': 1, 'Description': 2}, inplace=True)

# Concatenate the title (column 1) and description (column 2) into one text
for df in [train_df, test_df]:
    df[1] = df[1] + df[2]
    df.drop(columns=[2], inplace=True)  # must drop in place: `df = df.drop(...)` would not modify the original DataFrame
ย 
1) Preprocessing
  • Lowercase the text
train_texts = train_df[1].values
train_texts = [s.lower() for s in train_texts]
test_texts = test_df[1].values
test_texts = [s.lower() for s in test_texts]
  • Tokenizer
# Initialization
tk = Tokenizer(num_words=None, char_level=True, oov_token='UNK')
# Fitting
tk.fit_on_texts(train_texts)
  • Construct Vocab
# Construct a new vocabulary
alphabet = "abcdefghijklmnopqrstuvwxyz0123456789,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}"
char_dict = {}
for i, char in enumerate(alphabet):
    char_dict[char] = i + 1

# Use char_dict to replace the tk.word_index
tk.word_index = char_dict.copy()
# Add 'UNK' to the vocabulary
# oov_token: the out-of-vocabulary (OOV) token handles characters the vocabulary does not know
tk.word_index[tk.oov_token] = max(char_dict.values()) + 1

# Convert strings to index sequences
train_sequences = tk.texts_to_sequences(train_texts)
test_sequences = tk.texts_to_sequences(test_texts)

  • Padding
train_data = pad_sequences(train_sequences, maxlen=1014, padding='post')
test_data = pad_sequences(test_sequences, maxlen=1014, padding='post')

# Convert to numpy arrays
train_data = np.array(train_data, dtype='float32')
test_data = np.array(test_data, dtype='float32')
ย 
  • Get Label
train_classes = train_df[0].values
train_class_list = [x - 1 for x in train_classes]  # shift labels 1-4 to 0-3
test_classes = test_df[0].values
test_class_list = [x - 1 for x in test_classes]

from tensorflow.keras.utils import to_categorical
train_classes = to_categorical(train_class_list)
test_classes = to_categorical(test_class_list)
ย 
2) Char CNN
  • Parameter
input_size = 1014
vocab_size = len(tk.word_index)
embedding_size = 69
conv_layers = [[256, 7, 3],
               [256, 7, 3],
               [256, 3, -1],
               [256, 3, -1],
               [256, 3, -1],
               [256, 3, 3]]  # [filters, kernel size, pooling size] per layer (-1: no pooling)
fully_connected_layers = [1024, 1024]
num_of_classes = 4
dropout_p = 0.5
optimizer = 'adam'
loss = 'categorical_crossentropy'
  • Embedding Layer
# Embedding weights: shape (70, 69); row 0 is the all-zero padding vector
embedding_weights = []
embedding_weights.append(np.zeros(vocab_size))
for char, i in tk.word_index.items():  # indices 1 to 69
    onehot = np.zeros(vocab_size)
    onehot[i - 1] = 1
    embedding_weights.append(onehot)
embedding_weights = np.array(embedding_weights)
print('Load')

# Embedding layer initialization
embedding_layer = Embedding(vocab_size + 1,
                            embedding_size,
                            input_length=input_size,
                            weights=[embedding_weights])
  • Model Construction
# Input
inputs = Input(shape=(input_size,), name='input', dtype='int64')  # shape=(?, 1014)

# Embedding
x = embedding_layer(inputs)

# Convolutional layers
for filter_num, filter_size, pooling_size in conv_layers:
    x = Conv1D(filter_num, filter_size)(x)
    x = Activation('relu')(x)
    if pooling_size != -1:
        x = MaxPooling1D(pool_size=pooling_size)(x)  # final shape=(None, 34, 256)
x = Flatten()(x)  # (None, 8704)

# Fully connected layers
for dense_size in fully_connected_layers:
    x = Dense(dense_size, activation='relu')(x)  # dense_size == 1024
    x = Dropout(dropout_p)(x)

# Output layer
predictions = Dense(num_of_classes, activation='softmax')(x)

# Build model
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])  # Adam, categorical_crossentropy
model.summary()
notion image
ย 
  • Shuffle
indices = np.arange(train_data.shape[0])
np.random.shuffle(indices)

x_train = train_data[indices]
y_train = train_classes[indices]
x_test = test_data
y_test = test_classes
ย 
  • Training
learning_history = model.fit(x_train, y_train,
                             validation_data=(x_test, y_test),
                             batch_size=128,
                             epochs=10,
                             verbose=2)
ย 
  • Result
import matplotlib.pyplot as plt

hist = pd.DataFrame(learning_history.history)
hist['epoch'] = learning_history.epoch
hist.tail()

plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.plot(hist['epoch'], hist['accuracy'], label='Train accuracy')
plt.plot(hist['epoch'], hist['val_accuracy'], label='Val accuracy')
plt.legend()
plt.show()
notion image