
DICORA-TR-2016-07

2. MAIN I: INDONESIAN
Indonesian Sentiment Lexicon and Sentiment-Annotated Corpus
Edited by Soon-Kang Park

NO. 2 INDONESIAN SELEX SESAC TOSAC INDONESIA

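Overview (translated from the Korean original)

In 2015 Indonesia was Korea's 14th-largest export destination among its trading partners worldwide. It has the world's fourth-largest population and abundant natural resources, and it is the only G20 member in Southeast Asia. In terms of the number of online users worldwide and the online usage rate by country, it ranked 13th in 2014. Its traffic on global search engines such as Google also ranks high, interest in global news is strong, and interest in the Korean Wave is explosive.

Indonesian is an agglutinative language with subject-verb-object word order. Unlike English, it has no tense inflection and expresses tense with time-indicating words. It also has no inflection for gender or number; the plural of a noun can be expressed through reduplication. All expressions are therefore regular, without tense, gender, or plural forms.

As of 2016, the second year of the project, DECO-SELEX contains a total of 7,276 sentiment words: 1,839 nouns, 852 adjectives, 1,777 verbs, 498 adverbs, and 971 others. SENT-SELEX contains 4,501 sentiment words: 1,479 nouns, 823 adjectives, 912 verbs, 58 adverbs, and 1,196 others. For SESAC, 539,167 tweets were reviewed and 18,477 meaningful sentences were obtained.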

1. Overview

1.1. Motivation

1.1.1. Relations with Korea

Indonesia is Korea's 14th-largest economic partner. Indonesia's GDP was US$ 888 billion in 2014, and its 2016 Big Mac Index was US$ 2.19, which The Economist ranked 37th in the world. The volume of trade was US$ 12.3 billion in 2014. Korea's exports were worth US$ 1.1 billion (a 1.8% decrease from the previous year); Korea mostly exports petroleum products, synthetic rubber, precision chemicals, flat panel displays, steel products, textiles, synthetic resins, fittings for marine vessels, electronic imaging devices, and so on. Korea's imports were worth US$ 1.2 billion (a 7% decrease from the previous year); Korea imports natural gas, crude oil, coal, petroleum products, paper stock, timber, copper ore, and so on.

1.1.2. Nation

Indonesia

1.1.3. Language

Indonesian Language, or Bahasa Indonesia

1.1.4. Language-population Rank

Indonesian has about 250 million speakers. Indonesia is the fourth most populous country in the world (population 255,462,000 as of 2015), and almost 95% of its population speaks Indonesian, which makes it one of the five most widely spoken languages in the world.

1.1.5. Linguistic Characteristics

• Agglutinative language
• SVO (Subject-Verb-Object) word order

Generally, Indonesian grammar (Tata Bahasa) is similar to that of European languages (except Latin), especially English, which is classified as Subject-Verb-Object (SVO), rather than to that of East Asian languages such as Japanese or Korean, which use SOV word order. Indonesian and English also use the same alphabet, syntax, and punctuation. Indonesian is not a tonal language like Chinese, Vietnamese, or Thai. (from Wikipedia)

1.2. Indonesian Language (from Wikipedia)

Indonesian (Bahasa Indonesia) is the official language of Indonesia. It is a standardized register of Malay, an Austronesian language that has been used as a lingua franca in the Indonesian archipelago for centuries. Most Indonesians also speak one of more than 700 indigenous languages.

Indonesia is the fourth most populous nation in the world (after China, India and the United States). Of its large population, the majority speak Indonesian, making it one of the most widely spoken languages in the world.

Most Indonesians, aside from speaking the national language, are often fluent in another regional language (examples include Javanese, Sundanese and Madurese), which is commonly used at home and within the local community. Most formal education, and nearly all national media and other forms of communication, are conducted in Indonesian.

The Indonesian name for the language is Bahasa Indonesia (literally "the language of Indonesia"). This term is occasionally found in English, and additionally "Malay-Indonesian" is sometimes used to refer collectively to the standardized language of Indonesia (Bahasa Indonesia) and the Malay language of Malaysia, Brunei, and Singapore (Bahasa Melayu).

2. Indonesian Sentiment Lexicon (SELEX)

This is the guideline for constructing DECO-SELEX, SENT-SELEX, and CORP-SELEX. For DECO-SELEX and SENT-SELEX, we asked native-speaker students to check the words translated with Google Translate and to annotate them with POS and polarity, using the same scheme for both lexicons. We found no major problems in this work, but machine translation such as Google Translate did not always translate properly, so we had to revise the guideline from time to time.

2.1. DECO-SELEX

We translated the entries of the Korean DECO dictionary into 11 languages with Google Translate and added two columns for POS and polarity. The first column (Column 1) is for POS, with the following choices: No word (#0), Noun (#1), Adjective (#2), Verb (#3), Adverb (#4), Others (1 word, #5), and Phrase (two or more words, #6). The second column is for polarity, with 9 choices: strongly positive (QXSP, #1), positive (QXPO, #2), neutral (QXNE, #3), negative (QXNG, #4), strongly negative (QXSN, #5), dependent (QXDE, #6), accentuated dependent (QXAD, #7), not sure (#8), and no opinion (#9).

While classifying the entries in DECO-SELEX, we found many dialect and colloquial forms. We decided that forms which are normally used and understood should be tagged, and that all others should be ignored and tagged #0 (trash).

2.1.1. Example of DECO-SELEX

COL 1 (POS): #0 No word, #1 Noun, #2 Adjective, #3 Verb, #4 Adverb, #5 Others, #6 Phrase
COL 2 (Polarity): #1 Strongly positive (QXSP), #2 Positive (QXPO), #3 Neutral (QXNE), #4 Negative (QXNG), #5 Strongly negative (QXSN), #6 Dependent (QXDE), #7 Accentuated dependent (QXAD), #8 Not sure, #9 No opinion

POS | POLARITY | NO | Korean | Indonesian | COL 1 | COL 2
NS | QXSP | 1 | 축복 | berkat | 1 | 2
NS | QXSP | 2 | 행복 | kebahagiaan | 1 | 2
NS | QXSP | 3 | 감격 | senang | 4 | 2
NS | QXSP | 4 | 감동 | emosi | 4 | 4
NS | QXSP | 5 | 감명 | kagum | 4 | 2
NS | QXSP | 6 | 감탄 | kekaguman | 1 | 2
NS | QXSP | 7 | 강인성 | kekerasan | 1 | 4
NS | QXSP | 8 | 격상 | upgrade | 0 | 9
NS | QXSP | 9 | 극존 | Geukjon | 0 | 9
NS | QXSP | 10 | 대망 | ditunggu | 3 | 3
NS | QXSP | 11 | 만끽 | menikmati | 3 | 2
NS | QXSP | 12 | 만능 | Serba bisa | 4 | 2
NS | QXSP | 13 | 만세 | hore | 5 | 2
NS | QXSP | 14 | 만점 | di luar | 2 | 3
NS | QXSP | 15 | 만족 | kadar | 1 | 3
NS | QXSP | 16 | 박학 | pengetahuan | 1 | 2
NS | QXSP | 17 | 신중 | kebijaksanaan | 1 | 2
NS | QXSP | 18 | 야망 | ambisi | 1 | 4
NS | QXSP | 19 | 열성 | semangat | 1 | 2
NS | QXSP | 20 | 열성적 | antusias | 1 | 2

2.2. SENT-SELEX

SENT-SELEX is constructed in the same way as DECO-SELEX. First, we extracted words according to the intensity of their polarity: we collected the words with a positive or negative sentiment score over 0.5 and translated them into 12 foreign languages with Google Translate. We then identified the sentiment words actually used in each country and tagged their polarity.

The first column (Column 1) is for polarity, with 8 choices: strongly positive (QXSP, #1), positive (QXPO, #2), neutral (QXNE, #3), negative (QXNG, #4), strongly negative (QXSN, #5), dependent (QXDE, #6), accentuated dependent (QXAD, #7), and not sure (#8). The second column is for POS, with the following choices: No word (#0), Noun (#1), Adjective (#2), Verb (#3), Adverb (#4), Others (1 word, #5), and Phrase (two or more words, #6). For SENT-SELEX we wrote the specific guideline below.

SentiWordNet-based Lexicon (SENT-SELEX) GUIDELINE

1. If a word has more than one POS (part of speech), please put all of them in Column 1. For example, "hope" in English is both a noun and a verb, so it should be annotated #1, 3.
2. Words that have no polarity must be left blank. That is, if a word carries no sentiment, leave it blank without any information. For reference, Neutral means that the word carries some kind of sentiment but is hard to classify without context.
3. If the word becomes usable after a spelling correction, please correct it and write the revised word in Column 3. This applies only to minor problems such as spelling errors, incorrect word order, or a missing hyphen. When a word would have to be revised and rewritten completely, please ignore it and leave it blank.

2.2.1. Example of SENT-SELEX

COL 1 (Polarity): #1 Strongly positive, #2 Positive, #3 Neutral, #4 Negative, #5 Strongly negative, #6 Dependent, #7 Accentuated dependent, #8 Not sure
COL 2 (POS): #1 Noun, #2 Adjective, #3 Verb, #4 Adverb, #5 Others, #6 Phrase

NO | POS | POSI | NEGA | English | Indonesian | COL 1 | COL 2
1 | a | 0 | 0.8 | unable | tidak | 4 | 4
2 | v | 0 | 0.5 | hyperventilate | hyperventilate | |
3 | r | 1 | 0 | fundamentally | fundamental | 6 | 2
4 | r | 1 | 0 | essentially | dasarnya | 6 | 4
5 | r | 1 | 0 | basically | pada dasarnya | 6 | 6
6 | r | 1 | 0 | blessedly | Syukurlah | 1 | 4
7 | r | 0 | 0.5 | boiling | mendidih | |
8 | r | 1 | 0.1 | enviably | mengagumkan | 1 | 3
9 | r | 0 | 0.8 | negatively | negatif | 4 | 2
10 | r | 1 | 0 | kindly | silakan | |
11 | r | 1 | 0.1 | unkindly | unkindly | |
12 | r | 1 | 0 | simply | hanya | 6 | 4
13 | a | 1 | 0 | uncut | yg belum diasah | |
14 | a | 1 | 0 | full-length | penuh | 6 | 2
15 | a | 1 | 0 | absolute | mutlak | 6 | 2
16 | a | 1 | 0 | direct | langsung | 6 | 4
17 | a | 1 | 0.5 | unquestioning | tidak perlu diragukan lagi | 2 | 6
18 | a | 1 | 0.5 | implicit | implisit | 6 | 2
19 | r | 1 | 0.1 | alarmingly | mengkhawatirkan | 4 | 3
20 | a | 1 | 0.1 | living | hidup | |
21 | a | 0 | 0.5 | relative | relatif | 6 | 2
22 | a | 0 | 0.5 | comparative | komparatif | 6 | 2
23 | r | 1 | 0 | significantly | secara signifikan | 6 | 6
24 | r | 0 | 0.6 | insignificantly | dgn remeh-temeh | 4 | 6
25 | v | 0 | 0.5 | wheeze | desah | |
26 | r | 1 | 0 | sprucely | sprucely | |
27 | r | 1 | 0 | smartly | dgn tangkas | 2 | 6
28 | r | 1 | 0 | modishly | modishly | |
29 | a | 0 | 0.8 | assimilatory | assimilatory | |
30 | a | 0 | 0.8 | assimilative | asimilatif | 6 | 2
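The SENT-SELEX procedure (select words by score, then annotate polarity and POS) can be illustrated with a minimal Python sketch. It is not the project's tooling: the class `SentEntry` and the functions `is_candidate` and `describe` are hypothetical names introduced here, and the cut-off is read as "0.5 or higher" because the example table includes words scored exactly 0.5. The code tables are copied from the legends of this section.

```python
from dataclasses import dataclass
from typing import Optional

# Code tables taken from the SENT-SELEX legend (Column 1 = polarity, Column 2 = POS).
POLARITY = {1: "QXSP", 2: "QXPO", 3: "QXNE", 4: "QXNG",
            5: "QXSN", 6: "QXDE", 7: "QXAD", 8: "not sure"}
POS = {0: "no word", 1: "noun", 2: "adjective", 3: "verb",
       4: "adverb", 5: "other", 6: "phrase"}


@dataclass
class SentEntry:
    """One lexicon row (hypothetical structure, for illustration only)."""
    english: str
    indonesian: str
    posi: float                      # positive score of the source word
    nega: float                      # negative score of the source word
    polarity: Optional[int] = None   # Column 1, None when left blank
    pos: Optional[int] = None        # Column 2, None when left blank


def is_candidate(entry: SentEntry, threshold: float = 0.5) -> bool:
    """Assumed selection rule: keep words whose score reaches the threshold."""
    return entry.posi >= threshold or entry.nega >= threshold


def describe(entry: SentEntry) -> str:
    """Turn the numeric codes of a row into readable labels."""
    if entry.polarity is None:
        # Guideline item 2: words without sentiment stay blank.
        return f"{entry.indonesian}: no annotation"
    pos_label = POS.get(entry.pos, "unspecified")
    return f"{entry.indonesian}: {POLARITY[entry.polarity]}, {pos_label}"


# Three rows copied from the example table (rows 1, 2 and 17).
rows = [
    SentEntry("unable", "tidak", 0, 0.8, polarity=4, pos=4),
    SentEntry("hyperventilate", "hyperventilate", 0, 0.5),
    SentEntry("unquestioning", "tidak perlu diragukan lagi", 1, 0.5, polarity=2, pos=6),
]

for row in rows:
    if is_candidate(row):
        print(describe(row))
```

Keeping the code tables as plain dictionaries makes it easy to swap in the DECO-SELEX variant, which uses the same codes with the two columns in the opposite order and one extra polarity value (#9, no opinion).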

3. Sentence-Level Sentiment-Annotated Corpus (SESAC)

3.1. Annotation Guideline

Guideline for Sentiment-Classification of tweets
Version 2016-03-04 HUFS

1. Classification for Tweets with 3 kinds of following annotations

To carry out the research more efficiently, we annotate each tweet in three columns, as described below. Column 1 is compulsory; Columns 2 and 3 are optional.

Column 1 (Obligatory: only for texts)
[1] #1 Positive (e.g. Samsung G-phone is nice)
[2] #2 Negative (e.g. Samsung G-phone is bad)
[3] #3 Neutral (e.g. Samsung G-phone is average / so-so -> ONLY VERY RARELY)
[4] #4 Complex (e.g. Samsung G-phone is beautiful, but is too slow -> two predicates)
[5] #5 Objective (e.g. Samsung G-phone is a Korean product; I bought Galaxy S7 Edge. -> OBJECTIVE SENTENCES, FACT)
[6] #6 AD (e.g. We offer 50% off for Samsung phone!! Vivid, Slim, Fast Samsung GALAXY S II -> ADVERTISEMENT)
[7] #7 Trash (e.g. foreign languages, or messages that are hard to decide on or not understandable)

Column 1 records the basic information about the text of the tweet, regardless of emoticons. #3 will be annotated very rarely, because almost every person voices an opinion in a tweet. Please be cautious when you annotate #3.

Column 2 (Optional!!: Additional only for emoticon)
[1] #1 Positive (e.g. ;-), :), ;-D, ^^)
[2] #2 Negative (e.g. ;(, =((, x(, -.-)
[3] #3 Complex (e.g. positive and negative emoticons together)

Column 2 carries optional information about the tweet. Please remember that Column 2 is used only when the tweet contains an emoticon, regardless of the content of the text.

Column 3 (Optional!!)
[1] #1 Nonstandard (e.g. bd -> bad, gud -> good, or Samsung good is, https://www )

If the tweet contains a spelling error, a grammatical error, a hyperlink, an incomplete sentence, or any other type of error or irregularity, please mark 1 in this column (labelled Col. 4 in the example sheet).

2. Notice

Every annotator should fill in Columns 2 and 3 when they choose #1, 2, 3, 4, or 5 in Column 1.

Note: Originally Column 3 was used for syntactic information. However, Indonesian tweets are written without regard to syntactic rules, so we removed that column for this year.
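The column rules above can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the function name `check_annotation` and its parameters are hypothetical, and the last rule (optional columns are expected only for Column 1 values 1 to 5) is one reading of the Notice, not something the guideline states as a prohibition.

```python
from typing import List, Optional

# Value sets copied from the guideline.
COLUMN1 = {1: "Positive", 2: "Negative", 3: "Neutral", 4: "Complex",
           5: "Objective", 6: "AD", 7: "Trash"}
EMOTICON = {1: "Positive", 2: "Negative", 3: "Complex"}


def check_annotation(col1: Optional[int],
                     emoticon: Optional[int] = None,
                     nonstandard: Optional[int] = None) -> List[str]:
    """Return a list of problems found in one annotated tweet (empty if none)."""
    problems = []
    if col1 not in COLUMN1:
        problems.append("Column 1 is obligatory and must be 1-7")
    if emoticon is not None and emoticon not in EMOTICON:
        problems.append("Column 2 must be 1-3 or empty")
    if nonstandard not in (None, 1):
        problems.append("the nonstandard flag must be 1 or empty")
    # Assumed reading of the Notice: AD (#6) and Trash (#7) tweets
    # are not given emoticon or nonstandard annotations.
    if col1 in (6, 7) and (emoticon is not None or nonstandard is not None):
        problems.append("optional columns are only expected for Column 1 values 1-5")
    return problems


# A positive tweet with a positive emoticon and a hyperlink: no problems.
print(check_annotation(1, emoticon=1, nonstandard=1))
# An out-of-range Column 1 value is reported.
print(check_annotation(9))
```

Returning a list of messages rather than raising on the first error lets an annotator see every problem in a row at once.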

Example

No. | Tweets | Col. 1 | Col. 2 | Col. 3 | Col. 4

1 ?Korea :: "@GADISmagz: Leeteuk & Kangin resmi membuka Korea Festival 2015 tgl 1-4 Oktober 2015 ^^ http://t.co/3gh6kydinv"
2 ?korea :: "@Hanna3424: RT @GADISmagz: Leeteuk & Kangin resmi membuka Korea Festival 2015 tgl 1-4 Oktober 2015 ^^ http://t.co/3krtyaim2v""
3 ?korea :: "@Heeyekim_96: "@.SUJUforINA: Korea Festival Indonesia 2015 Press Conference - Kangin & Leeteuk by GADISmagz http://t.co/v1cok75e8a"""
4 ?korea :: "@Lindaangeline02: Lebih sering dengerin lagu Indo / Barat / Korea? #MBMdiHati" barat
Column values for tweets 1-4, in extraction order: 1 1 1 1 4 2

5 ?Korea :: "@MaiDavika_: pengen ke korea ketemu abang. naik elang http://t.co/hljow7vdlk"gblj
6 ?korea :: "@MentionRemaja: #buzzersquad Girl Band Korea Kesukaan kak?"a pink
7 ?Korea :: "@Miharukiee: ELF_bandung Korea Festival #Leeteuk #Kangin #SuperJunior #Lotte http://t.co/znuf3xtgak cr.shiningkyu92"
Column values for tweets 5-7, in extraction order: 1 1 4

3.3. Sentiment Classification Tagset for SESAC

1. POSITIVE: Positive emotion, feeling, assessment and judgement
(1.1) [KOR] 갤럭시 노트 4는 디자인이 정말 완벽해요
      [ENG] The design of the Galaxy Note 4 is really perfect
      [IND] Desain galaxy note 4 sangat sempurna
(1.2) [KOR] 목란 탕수육은 바삭하고 맛있어요.
      [ENG] The sweet and sour pork at Moknan is crisp and delicious
      [IND] Moknan Sweet and Sour Pork renyah dan enak

2. NEGATIVE: Negative emotion, feeling, assessment and judgement
(2.1) [KOR] 아이폰은 배터리 교체가 안되서 아쉬워요
      [ENG] Too bad the iPhone's battery is not replaceable
      [IND] Sayang sekali batere iPhone tidak bisa ditukar
(2.2) [KOR] 홍콩반점 직원들 너무 불친절합니다.
      [ENG] The staff of Hongkong-Banjum are very unkind
      [IND] Pegawai Hongkong-Banjum sangat tidak ramah

3. NEUTRAL: In-between/neutral emotion, feeling, assessment and judgement
(3.1) [KOR] 이수사 음식 맛은 보통 수준입니다.
      [ENG] The taste of the food at Leesusa is moderate
      [IND] Rasa makanan di Leesusa lumayan
(3.2) [KOR] 소니 엑스페리아는 그럭저럭 쓸 만합니다.
      [ENG] The Sony Xperia is reasonably usable
      [IND] Sony Xperia cukup layak digunakan

4. COMPLEX: Both positive and negative emotion, feeling, assessment and judgement in parallel
(4.1) [KOR] 아이폰 6s 플러스 화면은 커서 마음에 드는데 가격이 너무 비쌉니다.
      [ENG] I like the big display screen on the iPhone 6+, but it is too expensive
      [IND] Saya suka iPhone 6 karena layarnya besar tetapi harganya sangat mahal
(4.2) [KOR] 이 식당 맛은 괜찮은 것 같은데 서비스가 영 아니네요!
      [ENG] The food at this restaurant is good, but the service is terrible
      [IND] Makanan di restoran ini baik tetapi pelayanannya tidak memuaskan

5. OBJECTIVE: No subjective emotion, feeling, assessment and judgement; objective sentence
(5.1) [KOR] 이번 주말에 계림한정식에 방문했었습니다.
      [ENG] We went to visit the Gerim Korean restaurant last weekend
      [IND] Kami pergi ke restoran Korea Gerim minggu lalu
(5.2) [KOR] 2013년에 출시된 옵티머스 GK는 80만원이었습니다.
      [ENG] The Optimus GK, launched in 2013, cost 800,000 won
      [IND] Optimus GK yang diluncurkan pada tahun 2013 harganya 800.000 won

6. AD: Advertisement
(6.1) [KOR] 최신폰 공짜, 사은품 최대 16종!!
      [ENG] Latest phones for free, up to 16 kinds of free gifts
      [IND] Gratis handphone keluaran terbaru dan 16 macam hadiah gratis lainnya
(6.2) [KOR] 설빙: 5월 한 달간 딸기빙수 50프로 할인
      [ENG] Sulbing: Strawberry ice flakes 50% discount in May
      [IND] Sulbing: Strawberry ice flakes diskon 50% di bulan Mei

7. TRASH: Foreign language, meaningless sentence totally unrelated to the target, sentence consisting only of hash tags or hyperlinks, etc.
(7.1) Betul lah dekat pv128 ada 설빙!!!! I saw the banner haihhhh"
(7.2) 아이폰 6 :: #증명사진찍음 #여권사진!!!

4. Token-Level Sentiment-Annotated Corpus (TOSAC)

4.1. POS TAGSET

Each entry is listed as Korean | English | Indonesian.

1. NS (NOUN)
카메라 | camera | kamera
배터리 | battery | baterai
요리 | cuisine | kuliner, makanan, masakan
분위기 | ambiance | suasana, feeling
가격 | price | harga
직원 | staff | pegawai/karyawan
서비스 | service | servis
디자인 | design | desain
메뉴 | menu | menu
도시 | city | kota

2. VS (VERB)
사다 | buy | beli, membeli
만들다 | make | membuat
바꾸다 | change |
먹다 | eat | makan
주문하다 | order | memesan
좋아하다 | like | menyukai/suka
즐기다 | enjoy | menikmati
궁금하다 | wonder | bertanya-tanya/ingin tahu
가지다 | have | memiliki/mempunyai
놀라다 | surprise | kejutan

3. AS (ADJECTIVE)
비싸다 | expensive | mahal
무겁다 | heavy | berat
맛있다 | delicious | enak, lazat
시끄럽다 | noisy | berisik, ribut
푸짐하다 | generous | berlimpah-limpah
유명하다 | famous | terkenal
깔끔하다 | tidy | rapi
좋다 | good | bagus
많다 | numerous | banyak
불편하다 | uncomfortable | tidak nyaman

4. DS (ADVERB)
너무 | too | terlalu
완전히 | completely | benar-benar
전혀 | absolutely | sama sekali
물론 | of course | tentu saja
자주 | often | sering
항상 | always | selalu
여전히 | still | masih
다시 | again | lagi
진짜 | really | benar-benar
그냥 | simply | biasa

5. OT (OTHERS)

6. PH (PHRASES-FROZEN)
애를 먹다 | have a hard job doing/to do something | terganggu
어이가 없다 | be dumbfounded | tidak masuk akal

4.2. NAMED ENTITY TAGSET

Each entry is listed as Korean | English | Indonesian.

1. XXPE (PERSON: Individual, People & Human group names)
박근혜 | Park Geun-Hye | Park Geun-Hye
오바마 | Obama | Obama
중국인 | Chinese people | orang Cina
엑소 | EXO | EXO
빌 게이츠 | Bill Gates | Bill Gates
송중기 | Song Joong-Ki | Song Joong-Ki
소녀시대 | Girls' Generation | Girls' Generation
김수현 | Kim Soo-Hyun | Kim Soo-Hyun
슈퍼주니어 | Super Junior | Super Junior
스티브 잡스 | Steve Jobs | Steve Jobs

2. XXOR (ORGANIZATION: Institution, Company & Organization names)
삼성 | Samsung | Samsung
애플 | Apple | Apple
국정원 | National Intelligence Service |
우리은행 | Woori Bank | Woori bank
한국외국어대학교 | Hankuk University of Foreign Studies |
엘지 | LG | LG
현대 | Hyundai | Hyundai
네이버 | NAVER | NAVER
KBS | Korean Broadcasting System |
S.M. 엔터테인먼트 | S.M. ENTERTAINMENT |

3. XXGE (GEOGRAPHY: All naturally formed space names)
지리산 | Jiri-mountain | gunung Jiri
한강 | Han-river | sungai Han
동강 | Dong-river | sungai Dong
동해 | East Sea | Laut Timur
용추계곡 | Yongchu-valley | lembah Yongchu
청계산 | Cheonggye-mountain | gunung Cheonggye
남산 | Nam-mountain | gunung Nam
고수동굴 | Kosoo-cave | Gua Gosu
천지연폭포 | Cheonjiyeon-waterfall | Air terjun Cheonjiyeon

4. XXLO (LOCATION: All artificially built location names)
부산 | Busan |
미국 | USA | Amerika Serikat
세종시 | SeJong-city | kota Sejong
서울 | Seoul |
강남 | Gangnam |
가로수길 | Garosu-street |
인사동 | Insa-dong |
이태원 | Itaewon |
제주도 | Jeju-island | pulau Jeju
해운대 | Haeundae |

5. XXTI (TIME: All explicit time-(& event-)related names)
임진왜란 | The Imjin War | Perang Imjin
발렌타인데이 | St. Valentine's Day | Hari Valentine
2016년 | 2016 | Tahun 2016
크리스마스 | Christmas Day | Natal
삼일절 | Independence Movement Day | Hari peringatan gerakan anti Jepang
서울올림픽 | Seoul Olympic Games | Seoul Olympic Games
스승의 날 | Teacher's Day | hari guru
한국전쟁일 | Korean War Memorial Day | Hari peringatan perang Korea
한글날 | Hangeul Proclamation Day | Hari Hangeul
추석 | Korean Thanksgiving Day | hari thanksgiving Korea

6. XXEV (EVENT: All implicit event-(& time-)related names)
매일 | every day | setiap hari
정기세일 | regular sale | Dijual biasa
방학 | vacation | liburan
작년 | last year | tahun lalu
간조시간 | low tide time |
일출시간 | sunrise time | waktu matahari terbit
연말 | the end of the year | akhir tahun
장마철 | rainy season | musim hujan
성수기 | peak season | musim ramai
개화기 | the blooming season | masa pemekaran

7. XXCO (CONCRETE: All immobile concrete construction names)
불국사 | Bulguksa | Candi Bulkuk
에펠탑 | Eiffel Tower | menara eifel
자유의 여신상 | Statue of Liberty | patung Liberty
인공호수 | artificial lake | danau buatan
동대문 | Heunginjimun Gate | dongdaemun
장례식장 | funeral hall | pemakaman
스타벅스 | Starbucks | Starbucks
면세점 | Duty Free Shop | Duty Free/ toko bebas cukai
에버랜드 | Everland | Everland
88 고속도로 | Olympic Expressway | Jalan bebas hambatan 88

8. XXPR (PRODUCT: All mobile concrete product & creation names)
갤럭시폰 | Galaxy phone | Handphone Galaxy
소니 디카 | Sony Digital camera | Kamera Digital Sony
탕수육 | Sweet and Sour Pork | Sweet and Sour Pork
고려청자 | Goryeo celadon | Porselen dari zaman Goryo
모나리자 | Mona Lisa | Mona Lisa
김치 | Kimchi | Kimchi
비비크림 | Blemish Balm Cream | Blemish Balm Cream
노트북 | Laptop | Laptop
그린티라떼 | Greentea latte | Greentea latte
책 | book | buku

9. XXCR (CREATION: All abstract creation & created entity names, with no color and no form)
오페라의 유령 | The Phantom of the Opera |
애국가 | National anthem | Lagu kebangsaan
난타 | Nanta | Nanta
미션 임파서블 | Mission Impossible | Mission Impossible
겨울왕국 | Frozen | Frozen
임금피크제 | Salary peak | Salary peak (puncak gaji)
태양의 후예 | Descendants of the Sun | Descendants of the Sun
런닝맨 | Running man | Running man
벚꽃엔딩 | Cherry Blossom Ending | Cherry Blossom Ending
강남스타일 | Gangnam Style | Gangnam Style

4.3. SENTIMENT POLARITY TAGSET

Each entry is listed as Korean | English | Indonesian.

1. QXSP (STRONGLY POSITIVE: Score +2)
최고 | the best | terbaik
완벽하다 | perfect | sempurna
열광하다 | exuberate | sangat antusias
뛰어나게 | outstandingly | mengungguli/lebih unggul
행복 | happiness | kebahagiaan
소망 | desire | harapan
사모하다 | love | mencintai
굉장히 | extremely | sangat/terlalu
뛰어나다 | excellent | melebihi/unggul
위대하다 | great | terkemuka/besar

2. QXPO (POSITIVE: Score +1)
좋다 | good | bagus
아름답다 | beautiful | indah
만족 | satisfaction | kepuasan
좋아하다 | like | suka/menyukai
신중하게 | carefully | dengan teliti/dengan hati-hati
기적 | miracle | keajaiban
달성하다 | achieve | mencapai
거룩하게 | holy | suci/mulia
감동적이다 | impressive | mengesankan
겸손하게 | modestly | dengan rendah hati

3. QXNE (NEUTRAL: Score 0)
보통 | average | sedang/biasa/rata-rata
평범하다 | ordinary | sederhana
평범해지다 | normal | normal/sederhana
그럭저럭 | somehow | biasa-biasa saja
평균값 | average value | harga rata-rata
적당하다 | moderate | cukup
중립적이다 | neutral | netral
어쨌거나 | anyway | bagaimanapun
중간적이다 | medium | medium/pertengahan
개그적이다 | humor | humoris

4. QXNG (NEGATIVE: Score -1)
불행 | misfortune | tidak bahagia/ketidakbahagiaan/kesengsaraan
나쁘다 | bad | buruk
싫어하다 | hate | benci
고통스럽게 | painfully | menyakitkan
징벌 | punishment | hukuman
상처 | wound | luka
갈등하다 | conflict | konflik
가난하다 | poor | miskin
무례하게 | rudely | dengan kasar
귀찮게 | annoyingly | merepotkan

5. QXSN (STRONGLY NEGATIVE: Score -2)
증오 | hatred | dendam
잔혹하다 | cruel | sangat kejam
혐오하다 | detest | menjijikan
잔인하게 | brutally | dengan kejam/brutal
사형 | execution | hukuman mati
독살 | poison | racun
살해하다 | murder | membunuh
참수하다 | behead | memenggal kepala
가증스럽게 | hatefully | jahat/tercela
잔인하다 | brutal | dengan kejam

6. QXDE (DEPENDENT: Score <E>)
소량 | little | jumlah kecil/sedikit
길다 | long | panjang
감소하다 | decrease | menurun
심하게 | severely | parah
개방 | open | pembukaan
넓히다 | widen | memperluas
가늘다 | thin | tipis
가볍다 | light | ringan
거꾸로 | backwards | terbalik
까다롭게 | picky | rumit/terlalu pemilih

7. QXAD (ACCENTUATED DEPENDENCY: Score <E>)
극소량 | minimum | minimum
거대하다 | huge | besar/sangat besar
장악하다 | dominate | menguasai
극심하게 | badly | ekstrem
독주 | leave sb far behind | meninggalkan orang lain jauh di belakang
강조하다 | emphasize | menekankan
급하다 | urgent | darurat
가장 | most | paling
미미하게 | marginally | tidak berarti/sepele
끈질기다 | persistent | mendesak
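The scores attached to the polarity tags can be put to work in a small Python sketch. The tag-to-score table comes from the tagset above; everything else is an assumption made for illustration: the function name `score_tokens` is hypothetical, the input format (token, tag) is invented here, and summing token scores is a simple stand-in, not a method described in this report. Tokens tagged QXDE or QXAD carry no fixed score (shown as <E> above), so the sketch sets them aside instead of guessing a value.

```python
from typing import Iterable, List, Tuple

# Scores as given in the sentiment polarity tagset.
SCORE = {"QXSP": 2, "QXPO": 1, "QXNE": 0, "QXNG": -1, "QXSN": -2}
# Tags whose polarity depends on context and therefore has no fixed score.
CONTEXT_DEPENDENT = {"QXDE", "QXAD"}


def score_tokens(tagged: Iterable[Tuple[str, str]]) -> Tuple[int, List[str]]:
    """Sum the fixed scores and collect the context-dependent tokens.

    Summation is an illustrative assumption, not the project's method.
    """
    total = 0
    unresolved = []
    for token, tag in tagged:
        if tag in SCORE:
            total += SCORE[tag]
        elif tag in CONTEXT_DEPENDENT:
            unresolved.append(token)
        else:
            raise ValueError(f"unknown polarity tag: {tag}")
    return total, unresolved


# Tokens and tags taken from the example lists above.
tokens = [("sempurna", "QXSP"), ("bagus", "QXPO"),
          ("buruk", "QXNG"), ("panjang", "QXDE")]
total, unresolved = score_tokens(tokens)
print(total, unresolved)
```

Returning the unresolved tokens separately keeps the dependent tags visible, so that a later step with access to context can decide their polarity.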