2-5 Classifying Words: POS Tagging

In natural language processing, just as labels are assigned to documents, the words in a sentence can be classified according to specific criteria. Two common ways of classifying words are named entity recognition (NER) and part-of-speech (POS) tagging.

Named Entity Recognition

Named entity recognition classifies words by the type of entity they refer to, such as a person, an organization, or a time expression. Let's use the following code to classify the words in a short passage about the Indian Space Research Organisation, stored in raw_text, with named entity recognition.

    import spacy  # use spaCy

    # NER stands for Named Entity Recognition; load the small English pipeline
    NER = spacy.load("en_core_web_sm")

    # raw_text is the passage we will analyze
    raw_text = "The Indian Space Research Organisation or is the national space agency of India, headquartered in Bengaluru. It operates under Department of Space which is directly overseen by the Prime Minister of India while Chairman of ISRO acts as executive of DOS as well."

    # Run the pipeline on the passage
    text1 = NER(raw_text)

    # Print each recognized entity together with its label
    for word in text1.ents:
        print(word.text, word.label_)
(Source: https://www.analyticsvidhya.com/blog/2021/06/nlp-application-named-entity-recognition-ner-in-python-with-spacy/ )
Running the code prints the result below. The Indian Space Research Organisation, the national space agency, Department of Space, ISRO, and DOS are classified as organizations (ORG), while India and Bengaluru are classified as geopolitical entities (GPE), the label spaCy uses for countries, cities, and states. The exact entities can vary slightly with the model version.

    The Indian Space Research Organisation ORG
    the national space agency ORG
    India GPE
    Bengaluru GPE
    Department of Space ORG
    India GPE
    ISRO ORG
    DOS ORG
(Source: https://www.analyticsvidhya.com/blog/2021/06/nlp-application-named-entity-recognition-ner-in-python-with-spacy/ )
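Once the (text, label) pairs are available, a natural next step is to group them by label. The sketch below hard-codes the pairs from the output above instead of calling spaCy, so it runs without the model installed. The helper name `group_by_label` is an assumption made for illustration and is not part of spaCy.

```python
from collections import defaultdict

# (text, label) pairs copied from the output above. With spaCy these would
# come from (ent.text, ent.label_) for each entity in doc.ents.
entities = [
    ("The Indian Space Research Organisation", "ORG"),
    ("the national space agency", "ORG"),
    ("India", "GPE"),
    ("Bengaluru", "GPE"),
    ("Department of Space", "ORG"),
    ("India", "GPE"),
    ("ISRO", "ORG"),
    ("DOS", "ORG"),
]


def group_by_label(pairs):
    """Group entity texts by label, dropping repeats and keeping first-seen order."""
    grouped = defaultdict(list)
    for text, label in pairs:
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


grouped = group_by_label(entities)
for label, texts in grouped.items():
    print(label, texts)
```

Grouping this way makes it easy to see that "India" appears twice in the passage but names a single geopolitical entity.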
Named entity recognition is very useful in machine translation. For example, when translating a sentence that mentions Apple, the company that makes the iPhone and MacBook, recognizing the word as a company name lets the system render it in Korean as '애플' (the company name) instead of translating it literally as '사과' (the fruit).
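The idea can be sketched with a toy lookup that takes an optional entity label. This is only an illustration: the dictionaries `WORD_DICT` and `ORG_NAMES` and the function `translate_word` are hypothetical, and a real translation system is far more involved than a per-word lookup.

```python
# Toy English-to-Korean dictionary for ordinary words (hypothetical data).
WORD_DICT = {"apple": "사과"}

# Korean renderings for names recognized as organizations (hypothetical data).
ORG_NAMES = {"Apple": "애플"}


def translate_word(word, entity_label=None):
    """Translate one word, treating ORG entities as names instead of common nouns."""
    if entity_label == "ORG":
        # Unknown organization names are left unchanged rather than translated.
        return ORG_NAMES.get(word, word)
    return WORD_DICT.get(word.lower(), word)


print(translate_word("Apple"))         # no entity information: 사과
print(translate_word("Apple", "ORG"))  # recognized as a company: 애플
```

In a fuller pipeline the `entity_label` argument would come from an NER step, such as the `label_` of a spaCy entity.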

POS Tagging

POS tagging classifies each word according to its part of speech. Let's use the following code to tag the words in the sentence "Mary slapped the green witch."

    import spacy

    # Load the small English pipeline (the 'en' shortcut no longer works in spaCy 3)
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Mary slapped the green witch.")

    # Print each token with its coarse-grained POS tag
    for token in doc:
        print('{} - {}'.format(token, token.pos_))
Running the code prints the result below. Mary is tagged as a proper noun (PROPN), slapped as a verb (VERB), the as a determiner (DET), green as an adjective (ADJ), witch as a noun (NOUN), and the period as punctuation (PUNCT).

    Mary - PROPN
    slapped - VERB
    the - DET
    green - ADJ
    witch - NOUN
    . - PUNCT
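One common use of tagged output is filtering tokens by part of speech. The sketch below hard-codes the (token, tag) pairs from the output above instead of calling spaCy, so it runs without the model installed. `select_by_pos` is a hypothetical helper name, not a spaCy function.

```python
# (token, POS) pairs copied from the output above. With spaCy these would
# come from (token.text, token.pos_) for each token in doc.
tagged = [
    ("Mary", "PROPN"),
    ("slapped", "VERB"),
    ("the", "DET"),
    ("green", "ADJ"),
    ("witch", "NOUN"),
    (".", "PUNCT"),
]


def select_by_pos(pairs, wanted):
    """Return the tokens whose POS tag is in the wanted set, in sentence order."""
    return [token for token, pos in pairs if pos in wanted]


nouns = select_by_pos(tagged, {"NOUN", "PROPN"})
print(nouns)  # ['Mary', 'witch']
```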
As mentioned in 2-3 Lemmas and Stems, POS tagging plays an important role in producing accurate results during lemmatization.
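To illustrate why the POS tag matters for lemmatization, here is a minimal sketch built on a tiny hand-written lemma table. The table entries and the `lemmatize` helper are assumptions made for illustration. This is not how spaCy's lemmatizer is implemented; it only shows that the same surface form can map to different lemmas depending on its part of speech.

```python
# Tiny lemma table keyed by (lowercased word, POS tag); hypothetical data.
LEMMA_TABLE = {
    ("saw", "VERB"): "see",       # "I saw her" -> see
    ("saw", "NOUN"): "saw",       # "a rusty saw" -> saw
    ("meeting", "VERB"): "meet",  # "we are meeting" -> meet
    ("meeting", "NOUN"): "meeting",
}


def lemmatize(word, pos):
    """Look up the lemma for a (word, POS) pair, falling back to the lowercased word."""
    key = (word.lower(), pos)
    return LEMMA_TABLE.get(key, word.lower())


print(lemmatize("saw", "VERB"))  # see
print(lemmatize("saw", "NOUN"))  # saw
```

Without the tag, a lemmatizer has no way to choose between "see" and "saw" for the same input word.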