Deep Neural Networks for YouTube Recommendations

Created
Mar 8, 2022
Editor
Tags
Recommendation System
Paper: Deep Neural Networks for YouTube Recommendations. Authors: Paul Covington, Jay Adams, Emre Sargin

Why I Chose This Paper

When it comes to where recommender systems are most widely used and best known, Netflix and YouTube come to mind. YouTube is something we use every day, and it was the first application of recommendation algorithms I thought of, so I wanted to look at its recommender system in detail. That is why I chose the recommender-system paper YouTube published in 2016.

Introduction

YouTube has published three recommender-system papers, in 2010, 2016, and 2019. This post covers the 2016 paper, Deep Neural Networks for YouTube Recommendations. The system described in the 2010 paper was built on traditional data mining, so its algorithms were simple and easy to debug. As deep learning gained attention, YouTube moved to a deep-learning-based recommender, which is what the 2016 paper presents.
notion image
The deep-learning-based recommender has a two-stage structure: (1) Candidate Generation -> (2) Ranking. Candidate Generation uses the user's YouTube activity history to narrow the full video corpus down to a personalized set of candidates the user is likely to prefer. Ranking then orders those candidates by the user's predicted preference and recommends the top results.

Candidate Generation

Candidate Generation narrows the number of candidate videos passed to the Ranking stage to a few hundred, which is what makes the system scalable.

1. Recommendation as Classification

Recommendation is framed as multiclass classification: given a user (U) and a context (C), predict which item (i) among millions of items (V) is watched at a specific time (t).
U : user information
C : context information
u : user embedding built by combining the user and context information
v : embedding of each candidate video
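
In the paper this classification is written as a softmax over dot products of the user embedding and the video embeddings, $P(w_t = i \mid U, C) = \frac{e^{v_i u}}{\sum_{j \in V} e^{v_j u}}$. The sketch below only illustrates that formula on toy data; the vectors, their dimension, and the function name `watch_probabilities` are assumptions made for the example, not values from the paper.

```python
import math

def watch_probabilities(user_vec, video_vecs):
    """Softmax over dot products: P(w_t = i | U, C) for every video i."""
    logits = [sum(a * b for a, b in zip(v, user_vec)) for v in video_vecs]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy data (hypothetical): one 3-d user embedding u, four 3-d video embeddings.
u = [0.2, -0.1, 0.4]
V = [[0.1, 0.3, 0.5], [-0.2, 0.1, 0.0], [0.4, -0.4, 0.2], [0.0, 0.0, 0.1]]

probs = watch_probabilities(u, V)
# Indices of the two most probable videos, i.e. a tiny candidate set.
top2 = sorted(range(len(V)), key=lambda i: probs[i], reverse=True)[:2]
```

Because the softmax is monotonic in the dot product, picking the top candidates only needs the dot products themselves, which is why serving can be done as a nearest-neighbor search instead of a full softmax.
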
YouTube has both explicit feedback and implicit feedback. Explicit feedback is extremely sparse, however, so the model is trained on implicit feedback, where a video the user watched to completion counts as a positive example.

2. Model Architecture

notion image
Each user's watched-video list, searched keywords, and demographic features (age, gender, location, device, logged-in state, ...) are embedded into fixed-length vectors. The watch vector (watch history), search vector (search history), geographic embedding (location), example age, gender, and so on are then all concatenated and passed through the hidden ReLU layers to obtain the user vector u.
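A minimal sketch of that input pipeline, assuming the variable-length watch and search embeddings are averaged into fixed-length vectors (as the paper describes) before concatenation. The dimensions, random weights, and the single hidden layer are placeholders for illustration; the real model is much larger and its weights are learned.

```python
import random

def average(vectors):
    """Element-wise mean: turns a variable-length list of embeddings
    into one fixed-length vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def relu_layer(x, weights, bias):
    """One fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

random.seed(0)
# Hypothetical embeddings: 5 watched videos and 3 search tokens, 4-d each.
watch_embs = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
search_embs = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
other = [0.3, 1.0, 0.0]  # hypothetical geographic, example-age, gender inputs

# List concatenation: 4 + 4 + 3 = 11 input values.
x = average(watch_embs) + average(search_embs) + other

# One randomly initialised hidden layer with 6 units (untrained placeholder).
W = [[random.uniform(-0.5, 0.5) for _ in range(len(x))] for _ in range(6)]
b = [0.0] * 6
u = relu_layer(x, W, b)  # 6-d user vector
```

Averaging is what lets a user with 5 watches and a user with 5,000 watches feed the same fixed-width network.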

3. Heterogeneous Signals

The part YouTube paid particular attention to here is accounting for the age of a video. According to the paper, this was a key factor in improving model performance.
Because the model tends to be biased toward older videos in the training data, a feature called Example Age is added to correct for this: it tells the model explicitly how old each training example (user watch log) is relative to the time of training.
notion image
  • Blue line: watch probability predicted without example age.
  • Red line: watch probability predicted with example age.
  • Green line: actual (empirical) watch probability.
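As a rough illustration of how such a feature could be computed, the sketch below measures each logged watch against the end of the training window and fixes the feature at zero when serving, which is how the paper describes serving-time use. The dates, the unit (days), and the function name are assumptions made for the example.

```python
from datetime import datetime

def example_age_days(watch_time, training_end):
    """Age of a training example: time between the logged watch and
    the end of the training window, in days."""
    return (training_end - watch_time).total_seconds() / 86400.0

# Hypothetical training window end and watch log timestamps.
training_end = datetime(2016, 9, 15)
watch_log = [datetime(2016, 9, 1), datetime(2016, 9, 14, 12)]

# Feature values seen during training: 14 days and half a day.
train_ages = [example_age_days(t, training_end) for t in watch_log]

# At serving time the feature is held at zero, so the model predicts
# as if it were at the very end of the training window.
SERVING_EXAMPLE_AGE = 0.0
```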

4. Label and Context Selection

If a user watched a video through some route other than YouTube's recommendations, that data can still drive collaborative filtering, so the training data includes watches from those other sources as well. Heavy viewers could also skew the model, so the number of training examples per user is fixed, which keeps every user equally weighted. Finally, when selecting training data and predicting the recommended item, building the data to predict the next item is more effective than predicting a randomly held-out item, so, as shown below, each training example is built only from data before a given point in time.
notion image
notion image
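The two ideas above, predicting the next watch from earlier history only and limiting how many examples each user contributes, can be sketched as follows. The log format, the `max_per_user` parameter, and the choice to keep each user's most recent examples are assumptions made for this illustration, not details from the paper.

```python
from collections import defaultdict

def build_examples(watch_log, max_per_user):
    """watch_log: (user, timestamp, video_id) tuples.
    Returns (user, history, label) triples in which history contains
    only the watches that happened before the label."""
    by_user = defaultdict(list)
    for user, ts, video in sorted(watch_log, key=lambda r: r[1]):
        by_user[user].append(video)

    examples = []
    for user, videos in by_user.items():
        # Every watch after the first becomes a label; its history is
        # everything the user watched earlier (no future information).
        user_examples = [(user, videos[:i], videos[i])
                         for i in range(1, len(videos))]
        # Cap the examples per user so heavy viewers do not dominate.
        examples.extend(user_examples[-max_per_user:])
    return examples

# Hypothetical log: user "a" is a heavier viewer than user "b".
log = [("a", 1, "v1"), ("a", 2, "v2"), ("a", 3, "v3"), ("a", 4, "v4"),
       ("b", 1, "v9"), ("b", 5, "v2")]
examples = build_examples(log, max_per_user=2)
```

Here user "a" contributes two examples and user "b" one, even though "a" has twice as many watches in the log.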

Ranking

The Ranking model uses each user's features to characterize the candidate items. Its structure is similar to the Candidate Generation model, but it assigns a score to each item and sorts by that score to decide what is shown to the user.
notion image

1. Feature Representation

By form, features are either categorical or continuous. By meaning, they are either query features (describing the user and context) or impression features (describing the candidate video).

Feature Engineering

Hundreds of features are used, and the raw data needs some preprocessing before it can be fed to the neural network.

Embedding Categorical Features

Video IDs and search history are embedded before being fed to the neural network. When a categorical feature has too many distinct values, only the top N are kept, ranked by click frequency. Values that fall outside that vocabulary are mapped to the zero embedding.
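
A small sketch of the truncation and fallback described above, assuming index 0 of the embedding table is reserved for out-of-vocabulary IDs. The click list, the table values, and the helper names `build_vocab` and `embed` are made up for the example.

```python
from collections import Counter

def build_vocab(clicked_ids, top_n):
    """Keep the top_n most frequently clicked IDs.
    Index 0 is reserved for out-of-vocabulary IDs."""
    most_common = Counter(clicked_ids).most_common(top_n)
    return {vid: idx + 1 for idx, (vid, _) in enumerate(most_common)}

def embed(video_id, vocab, table):
    """Look up an embedding; unknown IDs fall back to row 0,
    the all-zero embedding."""
    return table[vocab.get(video_id, 0)]

# Hypothetical click log: "a" clicked 3 times, "b" twice, "c" and "d" once.
clicks = ["a", "b", "a", "c", "a", "b", "d"]
vocab = build_vocab(clicks, top_n=2)

# Toy 2-d embedding table; row 0 is the zero embedding.
table = [[0.0, 0.0], [0.1, 0.9], [0.7, -0.2]]
```

With this setup `embed("a", vocab, table)` returns a learned row, while `embed("c", vocab, table)` returns the zero vector. In a trained model, row 0 would also have to be excluded from updates so that it stays zero.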

Normalizing Continuous Features

Continuous features are scaled to the range 0 to 1, and so that the network can learn super-linear and sub-linear functions of each feature, the powers $\tilde{x}^2$ and $\sqrt{\tilde{x}}$ of the normalized value $\tilde{x}$ are also fed in as inputs.
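The sketch below shows one way to do this scaling: map each value to its position in the empirical cumulative distribution of the training data, then add the squared and square-root versions. The paper approximates the cumulative distribution by interpolating over precomputed quantiles; the step-function CDF used here is a simplification, and the sample values are invented.

```python
import bisect
import math

def make_normalizer(training_values):
    """Return a function mapping x to the fraction of training values
    that are <= x (the empirical CDF), so outputs lie in [0, 1]."""
    sorted_vals = sorted(training_values)
    n = len(sorted_vals)

    def normalize(x):
        return bisect.bisect_right(sorted_vals, x) / n

    return normalize

def expand(x_tilde):
    """Feed the normalized value together with its square (super-linear)
    and square root (sub-linear)."""
    return [x_tilde, x_tilde ** 2, math.sqrt(x_tilde)]

# Hypothetical raw feature, e.g. seconds since the last watch.
normalize = make_normalizer([3, 10, 10, 45, 120, 600, 3600, 7200])
features = expand(normalize(45))  # 4 of the 8 training values are <= 45
```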

2. Modeling Expected Watch Time

The goal is to predict expected watch time from impressions the user clicked (positive) and impressions the user did not click (negative). Watch time is logged for positive items, so the model is trained with weighted logistic regression: positive examples are weighted by the observed watch time, while negative examples receive unit weight.
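To see why this weighting yields a watch-time estimate, the sketch below fits a single constant logit to a handful of invented impressions. With watch-time weights on the positives, the loss is minimized when the odds $e^{\text{logit}}$ equal the total watch time divided by the number of negative impressions, which the paper notes is close to the expected watch time when clicks are rare. The impression data and function names are assumptions for this example.

```python
import math

def weighted_log_loss(logit, clicked, watch_time=0.0):
    """Cross-entropy for one impression: positives are weighted by
    watch time, negatives get unit weight."""
    p = 1.0 / (1.0 + math.exp(-logit))
    if clicked:
        return -watch_time * math.log(p)
    return -math.log(1.0 - p)

def total_loss(logit, impressions):
    return sum(weighted_log_loss(logit, c, t) for c, t in impressions)

def predicted_watch_time(logit):
    """At serving time the odds exp(logit) are used as the score."""
    return math.exp(logit)

# Hypothetical impressions: 2 clicks (30 s and 90 s watched), 8 non-clicks.
impressions = [(True, 30.0), (True, 90.0)] + [(False, 0.0)] * 8

# Closed-form minimizer for a constant logit:
# odds = total watch time / number of negatives = 120 / 8 = 15.
best_logit = math.log((30.0 + 90.0) / 8)
```

Evaluating `total_loss` slightly above and below `best_logit` gives larger values in both directions, which confirms the minimum numerically.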

3. Experiments with Hidden Layers

notion image
These are the weighted per-user loss results obtained as the hidden ReLU layers are made wider and deeper.

Conclusion

  • By adopting deep learning, the model substantially improved on the previous system.
  • Introducing the age of the video and weighting each video by its watch time played a large part in improving the model.

