Data-driven Industry Reinvention
All Things Data Con 2016, Opening speech
Choi Jin-sung, Head of the SKT Corporate R&D Center
Big Data Landscape Expansion
Evolution of Big Data Tech/Biz
SK Telecom Big Data Activities
Lesson Learned and Other Topics
Industry Reinvention
Big Data Landscape Expansion
The Big Data tech/biz ecosystem is expanding rapidly through stronger intelligence and adoption across diverse industries.
(Figure: Big Data landscape, 2012 vs. 2016)
Reference. Evolution of the Hadoop ecosystem
The stack is continually evolving and growing:
Core Hadoop (HDFS, MR), Solr, Pig, HBase, ZooKeeper, Hive, Mahout, Sqoop, Avro, Flume, Bigtop, Oozie, HCatalog, Hue, Spark, Tez, Impala, Kafka, Drill, Parquet, Sentry, Falcon, Knox, Flink, Kudu, RecordService, Ibis
Reference. Data analysis process
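Evolution of Big Data Tech/Biz (1/3)
Big Data technology is evolving in response to the 3Vs (velocity, volume, variety), which are intensifying as IoT spreads.
Variety: text, web behavior logs -> multimedia, machine logs
Velocity: batch -> real time
Volume: GB, TB -> PB, ZB
Drivers: introduction and spread of IoT-dedicated networks; spread of self-driving/connected cars; cellular networks (2G, 3G, 4G, 5G); explosive growth in IoT devices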
Evolution of Big Data Tech/Biz (2/3)
Big Data is moving from use as cost-effective infrastructure (computing/storage) that supports legacy analytics tools toward a form that also accommodates intelligent analytics based on distributed ML/DL.
Big Data Technology Evolution:
Big Data 1.0: Birth (HDFS, MapReduce, Hive, etc.)
Big Data 2.0: Growth in Storage/Computing (HBase, Spark, YARN, etc.)
Big Data 3.0 & beyond: Integration Era of Infra and Intelligence (SparkML, Mahout, H2O, TensorFlow, etc.)
Chart labels: Enterprise-level features/performance; Advanced Analytics (ML/DL)
Reference. Life cycle of individual Big Data technologies
Source: TechRadar: Big Data, Q1 2016, Forrester Research report
Reference. Data Science Value Chain
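Evolution of Big Data Tech/Biz (3/3)
Big Data / AI technologies are being applied across diverse industries and are triggering business reinvention.
As-was: use centered on the ICT industry
As-is: adoption across various industries
Aircraft engine manufacturing -> remote engine management service
AI-based stock trading systems, robo-advisers
Figure values: 5.5 trillion < 80 trillion; 24 trillion < 35 trillion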
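SK Telecom Big Data Activities (1/3)
Alongside internal use, SKT is pursuing external commercialization of Big Data B2B solutions from a technology supplier's perspective.
Internal use at SKT:
Marketing efficiency: VoC analysis, target marketing, etc.
Network quality optimization: OSS, APOLLO large-scale fine-grained quality analysis
External commercialization (hi-tech manufacturing, etc.):
Real-time FDC (Fault Detection & Classification) pattern analysis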
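Reference. Big Data technology use case in hi-tech manufacturing
Big Data technology adoption: Before & After
Quantitative (Before -> After):
Data storage size: 100TB -> 1.5PB
Analysis processing time: 10 hours -> 20 minutes
Retention period: 3 months -> 12 months
Data volume per analysis: 500GB -> 15TB
Number of parameters analyzed: 300 -> 150,000
Qualitative (Before -> After):
Batch -> (Near) Real Time
Single-process / single-equipment analysis -> Integrated, linked analysis
IT supported & Static Analysis -> Self & Dynamic Analysis
Descriptive Analysis -> Predictive Analysis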
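The quantitative before/after figures on this slide can be read as improvement factors. The short sketch below only does that arithmetic. It assumes that the smaller storage, retention, volume, and parameter values and the longer processing time describe the "before" state, and that 1 PB = 1,000 TB (decimal units); the slide text states neither explicitly. All names in the code are illustrative.

```python
# Before/after values as read from the slide. The before/after pairing and
# the decimal unit conversion (1 PB = 1000 TB, 1 TB = 1000 GB) are assumptions.
METRICS = {
    "storage size (TB)": (100, 1500),        # 100 TB -> 1.5 PB
    "retention period (months)": (3, 12),
    "data per analysis (GB)": (500, 15000),  # 500 GB -> 15 TB
    "parameters analyzed": (300, 150000),
    "processing time (min)": (600, 20),      # 10 hours -> 20 minutes
}

# Metrics where a smaller value is the improvement.
LOWER_IS_BETTER = {"processing time (min)"}


def improvement_factor(before, after, lower_is_better=False):
    """Return how many times better 'after' is than 'before'."""
    return before / after if lower_is_better else after / before


def summarize(metrics):
    """Map each metric name to its improvement factor."""
    return {
        name: improvement_factor(before, after, name in LOWER_IS_BETTER)
        for name, (before, after) in metrics.items()
    }


if __name__ == "__main__":
    for name, factor in summarize(METRICS).items():
        print(f"{name}: {factor:g}x")
```

Under these assumptions the largest change is in the number of parameters analyzed, while processing time and per-analysis data volume improve by the same factor.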
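SK Telecom Big Data Activities (2/3)
SKT is working toward intelligent B2C services that combine Big Data, speech/image recognition, natural language processing, and related technologies.
Diagram columns: [Developers] [Intelligence Platform] [Smart Device] [Service]
Diagram labels: ML/DL, speech recognition, image recognition, natural language processing; Backend Platform, Frontend Platform, User Interface; R&D, ML Library, API / SDK, 3rd party Tech.; Infra Software (Hadoop, GPUs), Knowledge; T map, T Phone, Melon
Data: Text, Speech, Video, Sensor Data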
SK Telecom Big Data Activities (3/3)
To accelerate IoT adoption, SKT is building a nationwide IoT-dedicated network (LoRa) and an IoT platform (ThingPlug).
Layers: Service, Platform, Network, Device
Lesson Learned and Other Topics
Lesson Learned:
Big Data is an emerging area that requires trial and error, so a "Small start, Scale later" approach is preferred.
Most Big Data projects start with BI and then move on to new value creation.
The ideal structure stores raw data in a data lake and feeds each unit system the processed data it needs.
Big Data is an enabling technology whose potential grows when it is combined with diverse applications and industries. Ex.: self-driving cars, Industry 4.0, O2O, etc.
Other Topics:
Machine Learning & Big Data
In-Memory Processing
Data Discovery & Self-Service
LOD & Government 3.0
Spark 2.0 & DataFrame, Dataset and Structured Streaming
Key-Value & Distributed File Stores
Decision support vs. Decision making
Definition of Intelligent Data
H2O & Druid
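The data lake structure named under Lesson Learned (raw data stored once, each unit system fed only the processed data it needs) can be sketched in a few lines. This is a minimal in-memory illustration, not SKT's implementation: the DataLake class, the record fields, and the two consuming systems ("marketing", "network") are all hypothetical.

```python
from collections import defaultdict


class DataLake:
    """Hypothetical in-memory stand-in for a data lake.

    Raw records are stored unchanged; each consuming system registers a
    transform and receives only its own derived view.
    """

    def __init__(self):
        self._raw = []
        self._feeds = {}  # system name -> transform function

    def ingest(self, record):
        # Store a copy so later changes by the caller do not alter raw data.
        self._raw.append(dict(record))

    def register_feed(self, system, transform):
        self._feeds[system] = transform

    def feed(self, system):
        # Hand the transform copies, so raw data stays untouched.
        return self._feeds[system]([dict(r) for r in self._raw])


def events_per_customer(records):
    """Derived view for a hypothetical marketing system."""
    counts = defaultdict(int)
    for r in records:
        counts[r["customer"]] += 1
    return dict(counts)


def mean_latency_per_cell(records):
    """Derived view for a hypothetical network-quality system."""
    samples = defaultdict(list)
    for r in records:
        samples[r["cell"]].append(r["latency_ms"])
    return {cell: sum(v) / len(v) for cell, v in samples.items()}


# Illustrative records only; not real measurements.
lake = DataLake()
for rec in [
    {"customer": "c1", "cell": "A", "latency_ms": 40},
    {"customer": "c1", "cell": "B", "latency_ms": 60},
    {"customer": "c2", "cell": "A", "latency_ms": 50},
]:
    lake.ingest(rec)

lake.register_feed("marketing", events_per_customer)
lake.register_feed("network", mean_latency_per_cell)

if __name__ == "__main__":
    print(lake.feed("marketing"))
    print(lake.feed("network"))
```

Keeping the transforms outside the store is the point of the design: a new unit system can be added later by registering another transform, without re-ingesting or reshaping the raw data.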
Industry Reinvention
Industry reinvention comes through collaboration between ICT technology suppliers and domain-side customers.
Industry customer: domain knowledge
SK Telecom & biz partners: ICT tech expertise
Enabling Industry Reinvention: Digitalization, Connectivity, Automation & Intelligence
Appendix 1. Top 10 Machine Learning Algorithms
1. Naïve Bayes Classifier
2. K-Means Clustering
3. Support Vector Machine
4. Apriori Algorithm
5. Linear Regression
6. Logistic Regression
7. Artificial Neural Networks
8. Random Forests
9. Decision Trees
10. Nearest Neighbors
Appendix 2. Scikit-learn algorithm cheat-sheet
Appendix 3. Machine Learning Algorithms Mindmap
SK Telecom's Big Data Forum
SK Telecom's Open Innovation channel: https://developers.sktelecom.com
Thank you