

Big Data in Practice: Recommendation System using Mahout
2014.12.23
IT Merchant Development Team, 이태영

Mahout installation

1) Download Mahout 0.9
   Go to http://mahout.apache.org and download the distribution.

2) Move the archive to your home directory
   $ mv mahout-distribution-0.9.tar.gz ~

3) Extract the archive and create a symbolic link named mahout
   $ tar xzf mahout-distribution-0.9.tar.gz
   $ ln -s mahout-distribution-0.9 mahout

4) Add MAHOUT_HOME and PATH to .bash_profile

   # .bash_profile

   # Get the aliases and functions
   if [ -f ~/.bashrc ]; then
       . ~/.bashrc
   fi

   # User specific environment and startup programs

   export JAVA_HOME=$HOME/java
   export HADOOP_HOME=$HOME/hadoop
   export PYTHON_HOME=$HOME/python
   export MAHOUT_HOME=$HOME/mahout

   PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PYTHON_HOME/:$MAHOUT_HOME/bin

   export PATH

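Collaborative filtering algorithms

1. User based (first, find similar users)
   Find a user B whose tastes are similar to yours, check which items B purchased, and recommend those items.

2. Item based (first, find similar items)
   Starting from the items you have purchased, recommend items that are related to them.

[Figure: User based Recommendation / Item based Recommendation]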
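The two approaches can be illustrated with a minimal plain-Java sketch. It does not use Mahout. The class name, the method names, and the use of "number of shared purchases" as the similarity measure are all assumptions made for illustration. The item-based variant is deliberately crude: it treats two items as related if any other user bought them together.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Mahout.
class CollaborativeFilteringSketch {

    /** Number of items that both purchase sets have in common. */
    static int overlap(Set<Integer> a, Set<Integer> b) {
        Set<Integer> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    /** User based: pick the most similar other user, suggest what they bought that the target has not. */
    static Set<Integer> userBased(Map<String, Set<Integer>> purchases, String target) {
        Set<Integer> mine = purchases.get(target);
        String best = null;
        int bestOverlap = -1;
        for (Map.Entry<String, Set<Integer>> e : purchases.entrySet()) {
            if (e.getKey().equals(target)) continue;
            int o = overlap(mine, e.getValue());
            if (o > bestOverlap) {
                bestOverlap = o;
                best = e.getKey();
            }
        }
        Set<Integer> result = new TreeSet<>();
        if (best != null) {
            result.addAll(purchases.get(best));
            result.removeAll(mine);
        }
        return result;
    }

    /** Item based: suggest every item that was bought together with one of the target's items. */
    static Set<Integer> itemBased(Map<String, Set<Integer>> purchases, String target) {
        Set<Integer> mine = purchases.get(target);
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<String, Set<Integer>> e : purchases.entrySet()) {
            if (e.getKey().equals(target)) continue;
            // This basket shares at least one item with the target, so its items are "related".
            if (overlap(mine, e.getValue()) > 0) {
                result.addAll(e.getValue());
            }
        }
        result.removeAll(mine);
        return result;
    }
}
```

With purchases A = {1, 2, 3}, B = {1, 2, 4}, C = {5, 6}, D = {3, 7}, the user-based variant picks B as the nearest user to A and suggests item 4, while the item-based variant suggests items 4 and 7 because each was bought together with something A already owns.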

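Similarity

1. Euclidean Distance
   Compute the distance between the preferences of two objects; the smaller the distance, the more similar their tastes.

2. Cosine Similarity (equivalent to Pearson correlation when the preference values are mean-centered)
   Treat the preferences of two objects as vectors; the smaller the angle between the vectors, the more similar they are.

3. Jaccard Similarity
   The proportion of elements shared by both objects (the intersection) among all elements of the two objects (the union).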
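The three measures above can be written out in a few lines of plain Java. This is a sketch for intuition only, not Mahout's implementation; the class and method names are hypothetical, and the vector methods assume both arrays have the same length.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of Mahout.
class SimilaritySketch {

    /** Euclidean distance: smaller means more similar. */
    static double euclideanDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Cosine of the angle between two preference vectors: closer to 1 means more similar. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Jaccard similarity: size of the intersection divided by size of the union. */
    static double jaccard(Set<Integer> a, Set<Integer> b) {
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }
}
```

For example, the sets {1, 2, 3} and {2, 3, 4} share two of four distinct elements, so their Jaccard similarity is 0.5.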

Mahout

What is Mahout?
The Apache Mahout project's goal is to build a scalable machine learning library.

Clustering
Classification
Recommendation (including collaborative filtering)
Pattern Mining
Regression
Evolutionary Algorithms
Dimension reduction

Mahout is written in Java.
The Mahout core library can be used directly from ordinary Java programs; it is not Hadoop-only.

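Recommendation

1. MovieLens dataset (http://grouplens.org/datasets/movielens)
   Rating datasets are provided in three sizes: 100 thousand, 1 million, and 10 million ratings.
   The data consists of user ratings of movies released in the United States.
   It is training data for recommendation algorithms, collected by a computer science lab at the University of Minnesota between 1997/9/19 and 1998/4/22.
   Download (approx. 4.8 MB)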

Recommendation

2. SLF4J library (http://www.slf4j.org)
   Required for compatibility with the Mahout libraries.
   Download (approx. 4.3 MB)

Recommendation

3. Guava library (https://code.google.com/p/guava-libraries)
   Mahout's data objects depend on the Guava library.
   Download (approx. 4.3 MB)
   Download (approx. 2.2 MB)

Recommendation

4. Apache Commons Math library (http://commons.apache.org/proper/commons-math)
   A library for numerical computation.
   Download (approx. 14.3 MB)

Recommendation

Final list of libraries
   commons-math3-3.4.jar
   guava-18.0.jar
   mahout-core-0.9.jar
   mahout-integration-0.9.jar
   mahout-math-0.9.jar
   slf4j-api-1.7.7.jar
   slf4j-nop-1.7.7.jar

Item Based Recommendation

Create the project
   Project name: ItemRecommender
   Java 5 SDK or later

Item Based Recommendation

Prepare the project for development
   Copy all of the libraries into the lib folder.
   Unzip ml-100k.zip (the MovieLens data) and copy u.data into the data folder.

Description from the README file:
   u.data -- The full u data set, 100000 ratings by 943 users on 1682 items.
   Each user has rated at least 20 movies. Users and items are numbered consecutively from 1.
   The data is randomly ordered. This is a tab separated list of
   user id | item id | rating | timestamp.
   The time stamps are unix seconds since 1/1/1970 UTC.

Recommendation

Prepare the project for development
   Convert u.data to CSV format and save it as movies.csv.
   Replace the field delimiter in u.data, the tab character (\t), with a comma (,).
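The conversion can be done in any text editor, or with a short Java program such as the sketch below. The class name TabToCsv is hypothetical, and the paths data/u.data and data/movies.csv assume the folder layout described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; converts the tab separated u.data into a comma separated file.
class TabToCsv {

    /** Replace every tab in one line with a comma. */
    static String toCsvLine(String tabSeparated) {
        return tabSeparated.replace('\t', ',');
    }

    /** Read the whole input file, convert each line, and write the result. */
    static void convert(Path in, Path out) throws IOException {
        List<String> converted = new ArrayList<>();
        for (String line : Files.readAllLines(in, StandardCharsets.UTF_8)) {
            converted.add(toCsvLine(line));
        }
        Files.write(out, converted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumed locations, following the project layout on the previous slide.
        convert(Paths.get("data/u.data"), Paths.get("data/movies.csv"));
    }
}
```

Each output line keeps the same four fields in the same order: user id, item id, rating, timestamp.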

User Based Recommendation

Add a class
   Package: recommend.item
   Class: UserRecommend
   Add the libraries to the Java Build Path.

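UserRecommend class

package recommend.item;

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserRecommend {
    public static void main(String[] args) throws Exception {
        /* Create the data model */
        DataModel dm = new FileDataModel(new File("data/movies.csv"));

        /* Create the similarity model */
        UserSimilarity sim = new PearsonCorrelationSimilarity(dm);

        /* Neighborhood: all users whose similarity to the given user meets or exceeds the threshold */
        UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.1, sim, dm);

        /* Create the user-based recommender */
        UserBasedRecommender recommender = new GenericUserBasedRecommender(dm, neighborhood, sim);

        int x = 1;
        /* Step through the users in the data model and produce recommendations for each */
        for (LongPrimitiveIterator users = dm.getUserIDs(); users.hasNext();) {
            long userID = users.nextLong(); /* current user ID */

            /* Recommend 5 items for the current user ID */
            List<RecommendedItem> recommendations = recommender.recommend(userID, 5);

            for (RecommendedItem recommendation : recommendations) {
                System.out.println(userID + "," + recommendation.getItemID() + "," + recommendation.getValue());
            }

            if (++x > 5) break; /* print only up to user ID 5 */
        }
    }
}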

UserRecommend class: execution results

Each line is <user ID, recommended item ID, preference strength>.

sim = new PearsonCorrelationSimilarity(dm);    sim = new LogLikelihoodSimilarity(dm);
1,1558,5.0                                     1,1500,5.0
1,1500,5.0                                     1,1467,5.0
1,1467,5.0                                     1,1189,5.0
1,1189,5.0                                     1,1293,5.0
1,1293,5.0                                     1,1367,4.7517056
2,1643,5.0                                     2,1500,5.0
2,1467,5.0                                     2,1293,5.0
2,1500,5.0                                     2,1189,5.0
2,1293,5.0                                     2,1449,4.608227
2,1189,5.0                                     2,1594,4.5082903
3,1189,5.0                                     3,1500,5.0
3,1500,5.0                                     3,1189,5.0
3,1302,5.0                                     3,1293,5.0
3,1368,5.0                                     3,1449,4.76954
3,1398,4.759591                                3,1450,4.686902
4,1104,4.7937207                               4,1467,5.0
4,853,4.729132                                 4,1500,5.0
4,169,4.655577                                 4,1189,5.0
4,1449,4.60582                                 4,1293,5.0
4,408,4.582672                                 4,1594,4.566541
5,1500,5.0                                     5,1500,5.0
5,1233,5.0                                     5,1467,5.0
5,851,5.0                                      5,1189,5.0
5,1189,5.0                                     5,1293,5.0
5,119,5.0                                      5,1642,4.66432

Item Based Recommendation

Add a class
   Package: recommend.item
   Class: ItemRecommend
   Add the libraries to the Java Build Path.

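ItemRecommend class

package recommend.item;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class ItemRecommend {
    public static void main(String args[]) {
        DataModel dm;
        try {
            /* Create the data model */
            dm = new FileDataModel(new File("data/movies.csv"));

            /* Choose the similarity model */
            ItemSimilarity sim = new PearsonCorrelationSimilarity(dm);

            /* Choose the recommender: item based */
            GenericItemBasedRecommender recommender = new GenericItemBasedRecommender(dm, sim);

            int x = 1;
            /* Step through the items in the data model and produce similar items for each */
            for (LongPrimitiveIterator items = dm.getItemIDs(); items.hasNext();) {
                long itemID = items.nextLong(); /* current item ID */

                /* Find the 5 items most similar to the current item ID */
                List<RecommendedItem> recommendations = recommender.mostSimilarItems(itemID, 5);

                /* Print similar items as "current item ID, similar item ID, similarity" */
                for (RecommendedItem recommendation : recommendations) {
                    System.out.println(itemID + "," + recommendation.getItemID() + "," + recommendation.getValue());
                }

                x++; /* 5 similar items each, up to item ID 5 */
                if (x > 5) System.exit(0);
            }
        } catch (IOException | TasteException e) { /* multi-catch requires Java 7 or later */
            e.printStackTrace();
        }
    }
}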

ItemRecommend class: execution results

Each line is <base item ID, compared item ID, similarity>.

sim = new PearsonCorrelationSimilarity(dm);    sim = new LogLikelihoodSimilarity(dm);
1,973,1.0                                      1,117,0.9953521
1,885,1.0                                      1,151,0.9953065
1,920,1.0                                      1,121,0.9952347
1,757,1.0                                      1,405,0.99500656
1,341,1.0                                      1,50,0.99491894
2,341,1.0                                      2,403,0.9964998
2,119,1.0                                      2,233,0.9964557
2,308,1.0                                      2,161,0.9961404
2,75,1.0                                       2,231,0.9960143
2,74,1.0                                       2,385,0.9959657
3,560,1.0                                      3,405,0.99037176
3,422,1.0                                      3,235,0.9893157
3,344,1.0                                      3,121,0.9880421
3,400,1.0                                      3,250,0.9880041
3,115,1.0                                      3,100,0.98773706
4,1038,1.0                                     4,56,0.99627966
4,868,1.0                                      4,174,0.99601305
4,927,1.0                                      4,204,0.9959589
4,643,1.0                                      4,202,0.99582237
4,360,1.0                                      4,385,0.9957967
5,348,1.0                                      5,218,0.99432045
5,34,1.0                                       5,98,0.9922024
5,113,1.0                                      5,234,0.99179345
5,35,1.0                                       5,56,0.99115413
5,6,1.0                                        5,53,0.9909523
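The results above include a run of similarities that are exactly 1.0. One possible contributing factor, stated here as an assumption and not verified against this dataset, is a general property of Pearson correlation: when only two data points are available (for example, only two users rated both items) and the ratings differ, the correlation is always exactly +1 or -1. The plain-Java sketch below shows the calculation; the class and method names are hypothetical and this is not Mahout's implementation.

```java
// Hypothetical illustration class; not part of Mahout.
class PearsonSketch {

    /** Pearson correlation of two equally long arrays of co-rated values. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        // Undefined (NaN) when either array has zero variance.
        return sxy / Math.sqrt(sxx * syy);
    }
}
```

With just two co-rated points, such as ratings (3, 5) for one item and (2, 4) for the other, the function returns 1.0 even though very little evidence supports the match. With four points, such as (1, 2, 3, 4) and (2, 1, 4, 3), it returns 0.6.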
