6-3 RNN Practice: Surname Nationality Classification (2)

The SurnameClassifier model

Now let's build SurnameClassifier, the model that classifies surnames. It consists of an embedding layer, an ElmanRNN layer, and Linear layers. When tokens that SequenceVocabulary has mapped to integers are fed into the model, the embedding layer first embeds those integers. The RNN then computes a vector representation of the sequence, that is, a hidden vector for each character of the surname. The result of running through the whole sequence is the vector that corresponds to the surname's last character, right? Finally, this last vector is passed to the Linear layers to compute the prediction vector.

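The data flow described above can be traced shape by shape. This is only a rough sketch: the sizes are made-up toy values, and PyTorch's built-in nn.RNN is used as a stand-in for ElmanRNN, which is not defined in this section.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy sizes chosen for illustration only (assumptions, not the post's values).
num_embeddings, embedding_size, rnn_hidden_size, num_classes = 30, 8, 16, 4

emb = nn.Embedding(num_embeddings, embedding_size, padding_idx=0)
# nn.RNN stands in for ElmanRNN here; it returns (all hidden states, last state).
rnn = nn.RNN(embedding_size, rnn_hidden_size, batch_first=True)
fc = nn.Linear(rnn_hidden_size, num_classes)

x_in = torch.randint(1, num_embeddings, (2, 5))  # (batch=2, seq_len=5) integer tokens
x_embedded = emb(x_in)                           # (2, 5, 8): one vector per character
hidden_all, _ = rnn(x_embedded)                  # (2, 5, 16): one hidden vector per character
final_vector = hidden_all[:, -1, :]              # (2, 16): hidden vector at the last position
prediction = fc(final_vector)                    # (2, 4): one score per class

print(x_embedded.shape, hidden_all.shape, final_vector.shape, prediction.shape)
```

Taking index -1 is only correct here because both toy sequences have the same length; padded batches need the length-aware lookup.
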
The SurnameClassifier model takes the following four parameters.
  • Character embedding size (embedding_size, type int)
  • Number of characters to embed (num_embeddings, type int)
  • Number of classes (num_classes, type int)
  • Size of the RNN hidden state (rnn_hidden_size, type int)

Of these, the number of characters to embed (num_embeddings) and the number of classes (num_classes) are determined by the data, whereas the embedding size (embedding_size) and the RNN hidden state size (rnn_hidden_size) are hyperparameters that you have to choose yourself.

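To make the distinction concrete, here is a small sketch with a hypothetical three-row dataset. The special-token list and the hyperparameter values are assumptions for illustration, not values taken from this post.

```python
import torch.nn as nn

# Hypothetical toy data: (surname, nationality) pairs.
data = [("Kim", "Korean"), ("Nakamura", "Japanese"), ("Smith", "English")]

# Determined by the data: vocabulary size and number of labels.
special_tokens = ["<MASK>", "<UNK>", "<BEGIN>", "<END>"]  # assumed special tokens
chars = sorted({ch for surname, _ in data for ch in surname})
num_embeddings = len(special_tokens) + len(chars)
num_classes = len({nationality for _, nationality in data})

# Hyperparameters: values we pick ourselves (arbitrary here).
embedding_size = 100
rnn_hidden_size = 64

emb = nn.Embedding(num_embeddings, embedding_size, padding_idx=0)
print(num_embeddings, num_classes, tuple(emb.weight.shape))
```

Changing the dataset changes num_embeddings and num_classes automatically, while embedding_size and rnn_hidden_size stay whatever you set them to.
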
import torch
import torch.nn as nn
import torch.nn.functional as F

# ElmanRNN is not defined in this section; column_gather is defined further below.

class SurnameClassifier(nn.Module):
    '''A classifier that extracts features with an RNN and classifies them with an MLP'''
    def __init__(self, embedding_size, num_embeddings, num_classes,
                 rnn_hidden_size, batch_first=True, padding_idx=0):
        super(SurnameClassifier, self).__init__()
        self.emb = nn.Embedding(num_embeddings=num_embeddings,
                                embedding_dim=embedding_size,
                                padding_idx=padding_idx)
        self.rnn = ElmanRNN(input_size=embedding_size,
                            hidden_size=rnn_hidden_size,
                            batch_first=batch_first)
        self.fc1 = nn.Linear(in_features=rnn_hidden_size,
                             out_features=rnn_hidden_size)
        self.fc2 = nn.Linear(in_features=rnn_hidden_size,
                             out_features=num_classes)

    def forward(self, x_in, x_lengths=None, apply_softmax=False):
        '''Forward pass of the classifier'''
        x_embedded = self.emb(x_in)
        y_out = self.rnn(x_embedded)

        if x_lengths is not None:
            # pick the hidden vector at each sequence's true last position
            y_out = column_gather(y_out, x_lengths)
        else:
            y_out = y_out[:, -1, :]

        # training=self.training turns dropout off in eval mode
        # (F.dropout defaults to training=True)
        y_out = F.dropout(y_out, 0.5, training=self.training)
        y_out = F.relu(self.fc1(y_out))
        y_out = F.dropout(y_out, 0.5, training=self.training)
        y_out = self.fc2(y_out)

        if apply_softmax:
            y_out = F.softmax(y_out, dim=1)
        return y_out

Inside forward(), the model has to find the last vector of each sequence in the batch. When x_lengths is given, the column_gather() function does this by iterating over the batch's row indices and extracting the vector at each sequence's last position.

The forward() method takes three parameters.
  • Input data tensor (x_in, type torch.Tensor)
  • Length of each sequence in the batch (x_lengths, type torch.Tensor)
  • Flag for the softmax activation function (apply_softmax, type bool)

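The apply_softmax flag matters because of how the output is consumed. The sketch below uses made-up logits and assumes training uses a cross-entropy loss such as F.cross_entropy, which applies log-softmax internally. Under that assumption the flag would stay False during training and be set to True only when probabilities are wanted.

```python
import torch
import torch.nn.functional as F

# Made-up model outputs for a batch of 2 surnames and 3 classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 0.3]])
targets = torch.tensor([0, 2])

# apply_softmax=False: raw scores go straight into the loss,
# because F.cross_entropy applies log-softmax itself.
loss = F.cross_entropy(logits, targets)

# apply_softmax=True: convert scores to probabilities for inspection or prediction.
probs = F.softmax(logits, dim=1)
predicted = probs.argmax(dim=1)

print(loss.item(), probs, predicted)
```

Applying softmax before a loss that already includes it would squash the scores twice, which is why the flag defaults to False.
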
import torch

def column_gather(y_out, x_lengths):
    '''Extract the last vector from each data point in y_out.

    Args:
        y_out (torch.FloatTensor, torch.cuda.FloatTensor)
            shape: (batch, sequence, feature)
        x_lengths (torch.LongTensor, torch.cuda.LongTensor)
            shape: (batch,)

    Returns:
        y_out (torch.FloatTensor, torch.cuda.FloatTensor)
            shape: (batch, feature)
    '''
    # convert lengths to zero-based indices of each sequence's last element
    x_lengths = x_lengths.long().detach().cpu().numpy() - 1

    out = []
    for batch_index, column_index in enumerate(x_lengths):
        out.append(y_out[batch_index, column_index])

    return torch.stack(out)

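The same lookup can also be written without a Python loop. The sketch below is an alternative, not the version used in this post: column_gather_indexed is a hypothetical name, and it relies on PyTorch's integer-array indexing to pick one position per row.

```python
import torch

def column_gather_indexed(y_out, x_lengths):
    """Loop-free variant (hypothetical helper): pick y_out[i, x_lengths[i] - 1] for every row i."""
    batch_indices = torch.arange(y_out.size(0), device=y_out.device)
    last_positions = x_lengths.long().to(y_out.device) - 1
    return y_out[batch_indices, last_positions]

# Toy input: batch of 2, sequence length 3, feature size 4.
y_out = torch.arange(2 * 3 * 4, dtype=torch.float32).reshape(2, 3, 4)
x_lengths = torch.tensor([2, 3])  # first sequence has 2 real steps, second has 3

picked = column_gather_indexed(y_out, x_lengths)
print(picked)
```

For the first row this selects position 1 rather than position 2, which is the point of passing x_lengths: with padding, the last column is not always the last real character.
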

Model training and results

First, apply the model to a single batch and compute the prediction vector. Then compute the loss, backpropagate it to obtain the gradients, and let the optimizer use them to update the model's weights. This training step is repeated for every batch, and the resulting performance is what guides the search for the hyperparameters that work best.

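The steps above can be sketched as a minimal loop. Everything here is an assumption made for illustration: a tiny nn.Sequential stands in for SurnameClassifier, the batches are random tensors, and the choice of CrossEntropyLoss and Adam is not taken from this post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in model and synthetic batches (assumptions, not the post's setup).
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 3))
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batches = [(torch.randn(8, 10), torch.randint(0, 3, (8,))) for _ in range(5)]

initial_weight = model[0].weight.detach().clone()

model.train()
losses = []
for x_batch, y_target in batches:
    optimizer.zero_grad()               # clear gradients from the previous batch
    y_pred = model(x_batch)             # 1. compute the prediction vector
    loss = loss_func(y_pred, y_target)  # 2. compute the loss
    loss.backward()                     # 3. backpropagate to get gradients
    optimizer.step()                    # 4. update the weights
    losses.append(loss.item())

print(losses)
```

With the real model, the forward call would also receive x_lengths so that the last real character of each padded surname is used.
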