Understanding Society through Social Big Data Mining
www.daumsoft.com
June 2012
kysong@daumsoft.com
0. Social Media
[Slide statistics; labels lost in extraction: 300,000; 10,000,000; 80%; 30; 1; 80%]
What people write and read online is more important than ever. It both reflects and shapes people's opinions.
"A report last year by the McKinsey Global Institute, the research arm of the consulting firm, projected that the United States needs 140,000 to 190,000 more workers with deep analytical expertise and 1.5 million more data-literate managers, whether retrained or hired." (New York Times, February 11, 2012)
[Diagram: Holistic View, combining a Market View, an Industry view, and a U&A view by level. Surviving labels: Risk Management; SKT, SK, KT; 3G, LTE, WARP; U&A. Other labels lost in extraction.]
1. Social Media Mining
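JoongAng Ilbo, August 19, 2010 (page 1). Investigative feature: "Dissecting the Netizen World." Using Daumsoft's text mining technology, blog data from the preceding 2 years and 3 months was analyzed, covering 65 million posts from 1.2 million blogs.
JoongAng Ilbo, October 5, 2011 (page 6).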
2011.10.24, 5: http://joongang.joinsmsn.com/article/920/6487920.html?ctg=1000
SOCIALmetrics http://campaign.socialmetrics.co.kr/
[Chart: daily buzz volume, 2011.10.3 ~ 2011.10.12, y-axis 0 to 3,000. SOCIALmetrics, 2011.10.3 ~ 2011.10.12] [Source: SOCIALmetrics 20110901-20110930]
Mood. Scope: 6 months (2011.5.1 ~ 2011.10.29). Two counts are reported, with labels lost in extraction: 521,435,373 and 2,504,018.
Mood Network, 6 months: ratio = 65.1%. SocialMetrics: Twitter, 2011.5.1 ~ 2011.10.31. The ratio is defined as [ freq] / [ + freq]; the term labels were lost in extraction.
Journal of Computational Science, Volume 2, Issue 1, March 2011, Pages 1-8: 87.6%
Connected: points (1), (2), (3); 15%. Example words: '(happy)', '(hug)', '(sick)', '(vile)'. [Korean text lost in extraction.]
2. Technology
Automated Online Opinion Analyzer
Three elements are listed: 1) [label lost], 2) Natural Language Processing (NLP) and Text Mining, 3) Intelligent Opinion Mining Software.
Components: Talkro CIMS, Talkro CIMS Filter, Talkro Analyzer, Social Metrics (viewing result).
[Example panel: an opinion about iphone4s versus iphone4 classified as Negative; a UI view links each result to its source URL. Korean labels lost in extraction.]
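Data Sources
Documents are collected from every kind of website: specialist sites, communities, shopping malls, media (newspapers, broadcasters, magazines), brand sites, government agency sites, and social media such as blogs, Twitter, and Facebook. The collection scope varies with the nature of each project. Internal documents written by a customer center can also be included.
Source categories: Portal, Community, Shopping, Media, Forum, Social Media, Organization, Brand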
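One way to picture a per-project collection scope is as a filter over typed sources. The sketch below is only an illustration: the class names, the INTERNAL category, and the example URLs are assumptions introduced here and are not part of the Daumsoft system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Set


class SourceType(Enum):
    # The eight categories named on the slide.
    PORTAL = "Portal"
    COMMUNITY = "Community"
    SHOPPING = "Shopping"
    MEDIA = "Media"
    FORUM = "Forum"
    SOCIAL_MEDIA = "Social Media"
    ORGANIZATION = "Organization"
    BRAND = "Brand"
    # Assumption: a separate tag for customer-center documents.
    INTERNAL = "Internal"


@dataclass(frozen=True)
class Source:
    url: str
    source_type: SourceType


@dataclass
class ProjectScope:
    """Hypothetical description of what one project collects."""
    name: str
    source_types: Set[SourceType]
    include_internal_docs: bool = False

    def accepts(self, source: Source) -> bool:
        # Internal documents are opt-in; everything else follows the type set.
        if source.source_type is SourceType.INTERNAL:
            return self.include_internal_docs
        return source.source_type in self.source_types


def select_sources(sources: List[Source], scope: ProjectScope) -> List[Source]:
    """Keep only the sources that fall inside the project's scope."""
    return [s for s in sources if scope.accepts(s)]


ALL_SOURCES = [
    Source("https://blog.example.com", SourceType.SOCIAL_MEDIA),
    Source("https://shop.example.com", SourceType.SHOPPING),
    Source("internal://customer-center", SourceType.INTERNAL),
]
```

A brand-monitoring project might pass only SourceType.SOCIAL_MEDIA, while a customer-service project would also set include_internal_docs=True.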
Sentiment Analysis Process
[Architecture diagram. Components: Admin Tool, Analysis Tool, Miner, Rulebase Editor, Controller, Rule Loader, Classifier, Reporter, Display, Rule DB, and Data Collector (Message Board Crawler, Blog Collector). The NLP System consists of a Morphological Analyzer, POS Tagger, Parser, and Semantic Analyzer, backed by a Morpheme Dictionary, Syntactic Dictionary, Ontology Dictionary, and Statistical Data. Input: Analysis Data (Text). The example target is SM3.]
Example rules (Korean morphemes lost in extraction; POS tags retained):
[[ /p+ /j] [<<SM3/n>>+ /j] [ /v+ /pe+/ec+,/cm]]
[[ /n+ /j] [ /aj+/ec]]
[[ /n+ /j] [ /ad] [/aj+/et]]
Sentiment Analysis Process (Continued)
[Architecture diagram. Components: Admin Tool, Analysis Tool, Miner, Rulebase Editor, Controller, Reporter, Rule Loader, Classifier, Rule DB, Display, Data Collector (Message Board Crawler, Blog Collector), and an NLP System (Morphological Analyzer, POS Tagger, Parser, Semantic Analyzer) with Morpheme Dictionary, Syntactic Dictionary, Ontology Dictionary, and Statistical Data. Input: Analysis Data (Text). Classifier output for the target SM3: P ( ) or N ( ).]
Rules:
[[ /p+ /j] [<<SM3/n>>+ /j] [ /v+ /pe+/ec+,/cm]]
[[ /n+ /j] [ /aj+/ec]]
[[ /n+ /j] [ /ad] [/aj+/et]]
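The rule patterns above pair morphemes with POS tags and mark the analysis target with <<...>>. A minimal sketch of how such rules could drive a classifier follows. It makes several assumptions: the slot encoding, the TARGET placeholder, the English stand-in tokens, and the "P"/"N"/"neutral" labels are all hypothetical, and the real rule syntax is richer (for example, the `+` joins and chunk brackets are not modeled here).

```python
from typing import List, Optional, Sequence, Tuple

Token = Tuple[str, str]            # (surface form, POS tag), e.g. ("SM3", "n")
Slot = Tuple[Optional[str], str]   # (required surface or None for "any", POS tag)
Rule = Tuple[Sequence[Slot], str]  # (pattern, polarity label)

TARGET = "<<TARGET>>"  # hypothetical placeholder standing in for <<SM3/n>>


def slot_matches(slot: Slot, token: Token, target: str) -> bool:
    """A slot matches when the tag agrees and the surface constraint holds."""
    surface, tag = slot
    if tag != token[1]:
        return False
    if surface is None:
        return True
    if surface == TARGET:
        return token[0] == target
    return token[0] == surface


def pattern_occurs(tokens: Sequence[Token], pattern: Sequence[Slot], target: str) -> bool:
    """Slide the pattern over the token sequence and test each window."""
    n = len(pattern)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(slot_matches(s, t, target) for s, t in zip(pattern, window)):
            return True
    return False


def classify(tokens: Sequence[Token], rules: List[Rule], target: str) -> str:
    """Return the polarity of the first rule that fires, else 'neutral'."""
    for pattern, polarity in rules:
        if pattern_occurs(tokens, pattern, target):
            return polarity
    return "neutral"


# Illustrative rules only; the words are English stand-ins for lost Korean morphemes.
RULES: List[Rule] = [
    ([(TARGET, "n"), (None, "j"), ("pretty", "aj")], "P"),
    ([(TARGET, "n"), (None, "j"), ("noisy", "aj")], "N"),
]

SENTENCE: List[Token] = [("SM3", "n"), ("is", "j"), ("pretty", "aj"), ("and", "ec")]
```

Calling classify(SENTENCE, RULES, "SM3") fires the first rule; the same sentence evaluated for a different target matches nothing and falls through to "neutral".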
3. Marketing
3-1. Brand Communication
3-1-1.
[Figures: keyword network and sentiment pie chart; labels lost in extraction. Source: 2010.01.01 ~ 2011.04.06]
3-1-1.
[Figure: transition over time, with marked points at 2010.1, 2010.4, 2010.11, and 2011.3; labels lost in extraction. Source: SOCIALmetrics 2010.1.1 ~ 2011.4.6]
3-1-1.
[Text lost in extraction. Source: SOCIALmetrics 2010.1.1 ~ 2011.4.7]
3-1-1.
[Table; column labels lost in extraction except "ipod". The footnote definitions of *, **, and *** were also lost.]
Row   Values
*     7,682   17,838   7,892   23,498   3,218   4,972   2,930   153,646
**    1,524   3,484    496     1,920    514     424     280     23,068
***   20%     19%      6%      8%       16%     9%      10%     15%
Source: SOCIALmetrics 2010.12.1 ~ 2010.12.31
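The row definitions did not survive, but the figures above are consistent with the *** row being the ** row divided by the * row. The short check below treats that relationship as an assumption; the variable names are illustrative.

```python
# Figures copied from the three rows above.
row_star = [7682, 17838, 7892, 23498, 3218, 4972, 2930, 153646]    # row marked *
row_double_star = [1524, 3484, 496, 1920, 514, 424, 280, 23068]    # row marked **
row_triple_star = [20, 19, 6, 8, 16, 9, 10, 15]                    # row marked ***, in percent


def shares(part, whole):
    """Percentage of each `whole` value accounted for by the matching `part` value."""
    return [100.0 * p / w for p, w in zip(part, whole)]


computed = shares(row_double_star, row_star)

# Largest disagreement between the recomputed share and the printed percentage.
max_gap = max(abs(c - printed) for c, printed in zip(computed, row_triple_star))

for c, printed in zip(computed, row_triple_star):
    print(f"computed {c:5.1f}%   printed {printed}%")
```

Every recomputed share lands within one percentage point of the printed value, which supports the reading but does not prove it.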
3-1-1.
[Figure; surviving labels: MP3, Social Network. Source: SOCIALmetrics 2010.1.1 ~ 2010.12.31]
3-1-1. Decision Tree
[Figure: battle field and consideration factors; labels lost in extraction.]
3-1-2. Communication: Category Analysis
POSITIVE/NEGATIVE RATE OF OPTIMUS 2X BY CATEGORY
No.  Category        # of buzz  Positive buzz rate  Negative buzz rate  Question buzz rate
1    Camera/DMB      48         96%                 2%                  2%
2    Battery         44         91%                 7%                  -
3    Connectivity    42         98%                 2%                  -
4    OS/Application  39         77%                 21%                 3%
5    Display         24         83%                 17%                 -
6    Design          18         39%                 61%                 -
7    Call            16         88%                 13%                 -
8    Touch/Button    9          100%                -                   -
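Rates of this kind can be derived from individually labeled posts. The sketch below is one plausible way to do it, not the tool's actual method. The label names are assumptions, and the example counts for Design (7 positive and 11 negative out of 18) are inferred here because they reproduce the 39% and 61% shown above; the slide itself reports only totals and rates.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

LABELS = ("positive", "negative", "question")  # assumed label vocabulary


def category_rates(posts: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Aggregate (category, label) pairs into buzz counts and rounded percentage rates."""
    counts = defaultdict(Counter)
    for category, label in posts:
        counts[category][label] += 1

    table = {}
    for category, label_counts in counts.items():
        n = sum(label_counts.values())
        row = {"n": n}
        for label in LABELS:
            row[label] = round(100 * label_counts[label] / n)
        table[category] = row
    return table


# Inferred example: 18 Design posts split 7 positive / 11 negative.
design_posts = [("Design", "positive")] * 7 + [("Design", "negative")] * 11
design_row = category_rates(design_posts)["Design"]
```

With small totals such as 9 or 16 posts, a single post moves a rate by several points, so the percentages in the table are best read together with the # of buzz column.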
3-1-2. Communication: Keywords by Category
FEATURE KEYWORDS BY CATEGORY (n=2,035)
Share of buzz by category: OS/Application 29%, Camera 17%, Connectivity 13%, Marketing/Retail 11%, Design 9%, Memory 5%, Display 4%, Message 4%, Battery 3%, Touch/Button 1%, Call 1%, UI 1%, Others 2%

Top 10 features per category (% of buzz). Feature names written in Korean were lost in extraction and are shown as [...].

OS/APPLICATION: 1 dualcore 28%; 2 [...] 13%; 3 [...] 2 9%; 4 gingerbread 7%; 5 CPU 7%; 6 froyo 7%; 7 [...] 6%; 8 [...] 5%; 9 Android OS 5%; 10 RAM 3%
CAMERA: 1 HD [...] 18%; 2 [...] 13%; 3 [...] 13%; 4 [...] 9%; 5 [...] 9%; 6 [...] 8%; 7 [...] 7%; 8 [...] 5%; 9 [...] 4%; 10 [...] 3%
CONNECTIVITY: 1 HDMI 25%; 2 [...] DMB 17%; 3 DMB 14%; 4 [...] 8%; 5 [...] 7%; 6 [...] 7%; 7 DLNA 4%; 8 [...] 4%; 9 [...] 3%; 10 TV OUT 2%
MARKETING/RETAIL: 1 [...] 54%; 2 CES 11%; 3 [...] 10%; 4 [...] 7%; 5 CES 2011 6%; 6 [...] 3%; 7 [...] 2%; 8 [...] 2%; 9 [...] 1%; 10 [...] 1%
DESIGN: 1 [...] 36%; 2 [...] 13%; 3 [...] 10%; 4 [...] 7%; 5 [...] 7%; 6 [...] 6%; 7 [...] 4%; 8 [...] 3%; 9 [...] 1%; 10 [...] 1%
3-1-2. Communication: Key Messages by Issue Keywords
ISSUE KEYWORDS: HD, HDMI
RELATED KEYWORDS (surviving): LG, 1080P, HDMI, DMB, TV, 3D
APPEALING POINT (surviving fragments): 1080p; HDMI to TV; 3D
PAIN POINT (surviving fragments): LG
[Korean text lost in extraction.]
3-1-2. Communication: Issue Keyword: PC
[Quoted posts were in Korean and were lost in extraction; dates and source URLs are retained. Surviving fragments: 2X, PC, SMS, 1GHz, 3G, TV, CPU, 42%, 1500mAh, LG.]
Expectation
- 2010-12-16 http://blog.naver.com/ishine75?redirect=log&logno=50101392525
- 2010-12-13 http://cpu29k.blog.me/90102111020
- 2010-12-13 http://cafe.naver.com/articleread.nhn?clubid=14006524&menuid=724&boardtype=L&articleid=1440727&referrerAllArticles=false
- 2010-12-17 http://smartnbiz.blog.me/140119956758
- 2010-12-13 http://www.clubcity.kr/news/articleview.html?idxno=68453
Worry
- 2010-12-13 http://www.neoearly.net/2464465
- 2010-12-15 http://mstoresnet.tistory.com/69
- 2010-12-16 http://rmawkjhd.blog.me/50101429570
- 2010-12-17 http://idgreat7.blog.me/50101459672
- 2010-12-17 http://everyharu.tistory.com/46
3-1-2. Communication: Category Analysis
[Summary text lost in extraction; it mentions Touch/Button and Battery.]
POSITIVE/NEGATIVE RATE BY CATEGORY
Each measure appears in two columns; the labels distinguishing them were lost in extraction.
Category        % of buzz  % of buzz  Pos.  Pos.
OS/Application  25%        20%        96%   97%
Camera          18%        17%        78%   84%
Design          11%        12%        78%   80%
Connectivity    10%        9%         94%   95%
Display         8%         7%         86%   83%
Marketing       6%         7%         14%   14%
General         5%         4%         90%   94%
Touch/Button    4%         6%         94%   83%
Call            3%         2%         91%   90%
Others          2%         4%         91%   90%
Memory          2%         4%         100%  92%
Battery         2%         2%         63%   79%
UI              1%         2%         89%   89%
Sound/Vib.      1%         1%         94%   100%
Service         0%         1%         50%   86%
[Quoted posts dated 2011-01-18 to 2011-01-26, plus one dated 2010-01-25; text lost in extraction. Surviving fragments: 1500mAh, Different Tastes Ltd.]
3-1-2. Communication: Transition
[Chart: DAILY BUZZ TRANSITION (OPTIMUS 2X), # of Buzz on the y-axis from 0 to 3,000. Annotated dates: 2011.1.7; (1) 2011.1.17~18; (2) 2011.1.21~22; (3) 2011.1.25~26. Labeled daily values range from 51 to 2,659. Source: SOCIALmetrics 2011.1.1 ~ 2011.2.26]
3-1-3. Brand Perception: Brand Positioning by Boutique
[Figure; surviving label: SK-II. * Boutique 3% share]
3-1-4. Spokes Person: Spokesmodel Evaluation of Competitors
[Top 5 spokesmodels; names lost in extraction.]
* % = positive sentiment word ratio for each spokesmodel, compared with the average positive sentiment word ratio of the 5 spokesmodels
3-2.
3-2-1. Consumer U&A
[Charts: four monthly trend panels, x-axis 200803 to 200912; series labels lost in extraction. Source: SOCIALmetrics 2008.3.1 ~ 2010.2.28]
3-2-1. Consumer U&A
[Four ranked keyword lists (count and %); list titles and keywords lost in extraction. The lists are labeled A to D here in the order they appear.]
Rank  A count  A %   B count  B %   C count  C %   D count  D %
1     210      5.5%  357      4.2%  165      6.7%  3360     15.4%
2     150      3.9%  281      3.3%  113      4.6%  1478     6.8%
3     137      3.6%  268      3.1%  112      4.5%  983      4.5%
4     119      3.1%  247      2.9%  95       3.8%  739      3.4%
5     112      2.9%  242      2.8%  88       3.6%  655      3.0%
6     109      2.9%  241      2.8%  74       3.0%  608      2.8%
7     98       2.6%  222      2.6%  62       2.5%  578      2.7%
8     95       2.5%  211      2.5%  58       2.3%  570      2.6%
9     94       2.5%  206      2.4%  58       2.3%  489      2.2%
10    91       2.4%  188      2.2%  58       2.3%  435      2.0%
11    86       2.2%  178      2.1%  53       2.1%  421      1.9%
12    78       2.0%  178      2.1%  52       2.1%  390      1.8%
13    77       2.0%  154      1.8%  51       2.1%  371      1.7%
Source: SOCIALmetrics 2008.3.1 ~ 2010.2.28
3-2-1. Consumer U&A
[Four ranked keyword lists (count and %); list titles and keywords lost in extraction. The lists are labeled A to D here in the order they appear.]
Rank  A count  A %  B count  B %  C count  C %  D count  D %
1     615      16%  2,688    16%  547      6%   239      15%
2     237      6%   2,543    15%  538      5%   153      10%
3     204      5%   2,478    15%  476      5%   125      8%
4     172      4%   926      5%   337      3%   107      7%
5     165      4%   614      4%   324      3%   98       6%
6     157      4%   592      3%   323      3%   64       4%
7     138      4%   523      3%   318      3%   54       3%
8     129      3%   467      3%   309      3%   51       3%
9     126      3%   395      2%   268      3%   44       3%
10    124      3%   288      2%   243      2%   41       3%
11    105      3%   262      2%   231      2%   29       2%
12    87       2%   235      1%   230      2%   27       2%
13    84       2%   231      1%   207      2%   27       2%
14    81       2%   226      1%   197      2%   27       2%
15    68       2%   198      1%   189      2%   25       2%
Source: SOCIALmetrics 2008.3.1 ~ 2010.2.28
3-2-1. Consumer U&A
[Two figures; labels lost in extraction. Source: SOCIALmetrics 2008.3.1 ~ 2010.2.28]
3-2-2. Intention, Behavior
[Keyword names lost in extraction. Each year shows a frequency and its index relative to 2008 (= 100%); the last column is the mean index.]
Keyword  2008 Freq.  Index  2009 Freq.  Index  2010 Freq.  Index  2011 1H Freq.  Index  Mean
[...]    98.6        100%   178.8       181%   246.9       250%   340.0          345%   259%
[...]    341.1       100%   406.6       119%   428.1       125%   433.9          127%   124%
[...]    707.5       100%   805.4       114%   848.1       120%   789.1          112%   115%
[...]    6726.7      100%   6974.4      104%   7853.3      117%   7184.4         107%   109%
[...]    485.2       100%   539.3       111%   566.6       117%   461.4          95%    108%
[...]    502.7       100%   562.4       112%   525.2       104%   506            101%   106%
[...]    137.9       100%   157.6       114%   140.9       102%   131.7          96%    104%
[...]    1124.2      100%   1192.9      106%   1133.2      101%   1054.7         94%    100%
Source: SOCIALmetrics 2008.1.1 ~ 2011.8.31
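The percentages next to each frequency read as an index against the 2008 value, and the final percentage in each row appears to be the mean of the three later indices. The sketch below recomputes the first row under that reading; the function names are illustrative.

```python
from typing import List


def growth_index(freqs: List[float], base: int = 0) -> List[float]:
    """Express each frequency as a percentage of the base-period frequency."""
    return [100.0 * f / freqs[base] for f in freqs]


def mean_after_base(index: List[float]) -> float:
    """Average the index over all periods after the base period."""
    later = index[1:]
    return sum(later) / len(later)


# First row of figures: 2008, 2009, 2010, 2011 1H.
first_row = [98.6, 178.8, 246.9, 340.0]

index = growth_index(first_row)
rounded_index = [round(x) for x in index]
mean_index = round(mean_after_base(index))

print(rounded_index, mean_index)
```

For the first row this reproduces the printed 100%, 181%, 250%, 345%, and the final 259%.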
3-2-2. Intention, Behavior
[Chart: frequency over time, y-axis 0 to 600, with a peak noted at "8"; labels lost in extraction. Source: SOCIALmetrics 2008.7.1 ~ 2011.6.30]
3-2-2. Intention, Behavior
[Three ranked keyword lists; titles and keywords lost in extraction, counts retained. One surviving keyword: Refresh.]
Rank  List 1  List 2  List 3
1     19364   42128   31281
2     5135    10186   6461
3     4417    4375    4380
4     3255    3506    4033
5     2084    3452    2101
6     2001    2476    2081
7     1862    1706    2075
8     1735    1239    1647
9     1687    1238    1640
10    1588    1128    1256
11    1559    1038    1130
12    1325    979     1043
13    1151    942     1034
14    1110    936     1034
15    1102    914     995
16    892     797     969
17    891     769     964
18    887     716     957
Source: SOCIALmetrics 2008.1.1 ~ 2011.6.30
3-3.
3-3-1. Trend & NPD
[TOP10 list; text lost in extraction. Source: SOCIALmetrics 2008.3.1 ~ 2010.2.28]
3-3-1. Trend & NPD
[Two-year keyword comparison; keywords lost in extraction, counts and shares retained.]
      20080301-20090228     20090301-20100228
No.   Count   %             Count   %
1     60998   8.4%          84301   8.6%
2     52712   7.3%          72349   7.3%
3     49504   6.8%          69255   7.0%
4     39496   5.4%          55243   5.6%
5     34051   4.7%          41420   4.2%
6     20845   2.9%          31541   3.2%
7     18912   2.6%          23530   2.4%
8     18554   2.6%          23366   2.4%
9     15938   2.2%          21606   2.2%
10    15901   2.2%          20941   2.1%
11    14805   2.0%          20735   2.1%
12    14640   2.0%          20204   2.0%
13    14624   2.0%          19785   2.0%
14    14329   2.0%          19316   2.0%
15    14212   2.0%          18905   1.9%
Largest increases in share (2009 minus 2008): 1 +3.2%; 2 +0.2%; 3 +0.2%; 4 +0.2%; 5 +0.2%; 6 +0.2%; 7 +0.1%
Largest decreases in share (2009 minus 2008): 1 -2.5%; 2 -0.5%; 3 -0.2%; 4 -0.2%; 5 -0.2%; 6 -0.2%; 7 -0.2%
Source: SOCIALmetrics 2008.3.1 ~ 2010.2.28
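The "2009-2008%" figures read as differences in share between the two periods, in percentage points. A small sketch of that calculation follows. The keyword names and counts are invented for illustration because the real keywords were lost, and the sketch assumes each share is taken against the sum of the counts passed in, which may differ from the denominator the tool uses.

```python
from typing import Dict, List, Tuple


def share_change(prev: Dict[str, int], curr: Dict[str, int]) -> List[Tuple[str, float]]:
    """Return (keyword, change in share in percentage points), largest rise first."""
    prev_total = sum(prev.values())
    curr_total = sum(curr.values())
    changes = {}
    for keyword in set(prev) | set(curr):
        prev_share = 100.0 * prev.get(keyword, 0) / prev_total
        curr_share = 100.0 * curr.get(keyword, 0) / curr_total
        changes[keyword] = curr_share - prev_share
    # Sort by change (descending), then by name so ties are deterministic.
    return sorted(changes.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical counts for two consecutive periods.
period_1 = {"keyword_a": 50, "keyword_b": 30, "keyword_c": 20}
period_2 = {"keyword_a": 40, "keyword_b": 40, "keyword_c": 20}

ranked = share_change(period_1, period_2)
risers = [kw for kw, delta in ranked if delta > 0]
fallers = [kw for kw, delta in ranked if delta < 0]
```

Comparing shares rather than raw counts matters here because total volume grew between the two periods, so nearly every count rose even where the share fell.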
3-3-1. Trend & NPD
[Chart: frequency trend, y-axis 0 to 3,500, with four sample posts (TITLE / CONTENT); text lost in extraction. Source: SOCIALmetrics 2009.1.1 ~ 2010.7.17]
3-3-1. Trend & NPD
[Ranked keyword list; keywords lost in extraction. The slide text highlights ranks 32, 42, and 45.]
Rank  Count   %
1     291265  9.6%
2     288772  9.5%
3     249003  8.2%
31    13460   0.4%
32    12965   0.4%
33    12733   0.4%
42    11064   0.4%
43    11041   0.4%
44    10560   0.3%
45    10346   0.3%
[Chart: frequency trend, y-axis 0 to 1,800.]
Source: SOCIALmetrics 2009.1.1 ~ 2010.7.17
3-3-1. Trend & NPD: Workflow
PROCESS: IDEA GENERATION, CONCEPT DEVELOPMENT, PRODUCT DEVELOPMENT
[EXAMPLE and INSIGHT panels; text lost in extraction.]
3-3-1. Trend & NPD: 1.
[Two top-10 keyword lists; titles and keywords lost in extraction.]
No.  List 1 count  List 1 %  List 2 count  List 2 %
1    293           13.0%     430           26.1%
2    244           10.8%     409           24.8%
3    194           8.6%      168           10.2%
4    194           8.6%      142           8.6%
5    167           7.4%      110           6.7%
6    156           6.9%      98            6.0%
7    98            4.3%      61            3.7%
8    95            4.2%      53            3.2%
9    90            4.0%      49            3.0%
10   83            3.7%      28            1.7%
[Five sample posts (TITLE / CONTENT); text lost in extraction.]
Source: SOCIALmetrics 2009.1.1 ~ 2010.7.17
3-3-1. Trend & NPD: 2.
[Two top-10 keyword lists; titles and keywords lost in extraction.]
No.  List 1 count  List 1 %  List 2 count  List 2 %
1    1840          49.5%     430           26.1%
2    498           13.4%     409           24.8%
3    179           4.8%      168           10.2%
4    103           2.8%      142           8.6%
5    82            2.2%      110           6.7%
6    75            2.0%      98            6.0%
7    73            2.0%      61            3.7%
8    63            1.7%      53            3.2%
9    58            1.6%      49            3.0%
10   56            1.5%      28            1.7%
Source: SOCIALmetrics 2009.1.1 ~ 2010.7.17
3-3-2.
Six categories: Beauty & Fashion, Eating & Drinking, Housing, Active Culturing, Inactive Culturing, Technology (e.g., mp3)
3-3-2.
[Table: top 3 keywords per category across six periods from 2008 to 2010, for Beauty & Fashion, Eating & Drinking, Housing, Active Culturing, Inactive Culturing, and Technology; keywords lost in extraction. Source: SOCIALmetrics 2008.1.1 ~ 2010.12.31]
[Ranking table, ranks 1 to 18 across six periods from 2008 to 2010, with sample posts; text lost in extraction. Source: SOCIALmetrics 2008.1.1 ~ 2010.12.31]
3-3-2.
[Ranking table, ranks 1 to 12 across six periods from 2008 to 2010; most keywords lost in extraction. Surviving entries: TV and MP3, in ranks 7 to 11. Source: SOCIALmetrics 2008.1.1 ~ 2010.12.31]
4. SocialMetrics
People Are Connected and Start to Talk!
Daumsoft: The Mining Company
History of Product Analysis Services

Brand & Communication
- Customer Service / Issue Monitoring: See customers' common complaints and potential issues early by looking for negative buzz online.
- Product Analysis: Understand how customers feel about specific product attributes on an extremely granular level.
- Brand Analysis: Know what customers think about your brand in comparison to competitors and see who is winning in terms of share of buzz.
- Campaign Evaluation: Measure how well a marketing campaign has the intended effect among your target audience and make midstream adjustments along the way.
Implemented for Client: 2004, 2006, 2009, 2010

Market & Trend
- New Product Development: Recognize unmet customer needs that can lead to new, previously unimagined products.
- Trend Spotting: Find new lifestyle trends by observing people's behavior and apply them to innovative ways of thinking.
Implemented for Client: 2011, 2011
Our Clients
Thank you
www.daumsoft.com / www.some.co.kr
SOCIALmetrics products: Product Analysis, TRENDMAP, Hub, Brand Analysis, www.some.co.kr, CAMPAIGN EDITION