Open Cloud Engine
Open Source Big Data Platform: Flamingo Project
Flamingo Project Leader: 김병곤 (byounggon.kim@opence.org)
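Big Data Analysis and Service Platform

Information catalog (searchable from mobile devices and browsers)

  Information type       Security level   Generation schedule   Format
  User affinity          1                Daily at 2 AM         XML
  Item recommendation    2                Daily at 1 AM         JSON
  Purchase propensity    3                Daily at 8 PM         XML/JSON
  Opinion leader score   2                Daily at 10 AM        XML/JSON

Diagram labels:
  Service request systems and data consumers: use the catalog through the Open API
  Open API: analysis results are exposed through an Open API so they can be provided externally, and analysis results are reused
  Designer (browser, search): collection, log data, MapReduce analysis modules (graph analysis, leader selection, analysis result validation, morphological analysis, per-user evaluation)
  Data analysts: log data, opinion leader score, charts for data visualization
  Workflow design: data analysts, service planners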
Reusable components, UI composition
<?xml version="1.0" encoding="utf-8"?>
<collector xmlns="http://www.openflamingo.org/schema/collector"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openflamingo.org/schema/collector flamingo-log-collector-1.0.xsd">
  <description> </description>
  <globalvariables>
    <globalvariable name="currentdate" value="${dateformat('yyyymmdd-hhmmss')}" description="string"/>
  </globalvariables>
  <job name="seoul_rain" description=" - ">
    <schedule>
      <cronexpression>0 * * * * ?</cronexpression>
    </schedule>
    <policy>
      <ingress>
        <fromhttp>
          <url>http://openapi.seoul.go.kr:8088/sample/xml/searchtrafficaccidentservice/1/5</url>
          <method type="get"/>
        </fromhttp>
      </ingress>
      <egress>
        <tohdfs cluster="dev">
          <targetpath>/seoul/traffic/accident/${dateformat('yyyy')}/${dateformat('mm')}/${dateformat('dd')}/accident_${dateformat('yyyymmdd-hhmmss')}.txt</targetpath>
        </tohdfs>
      </egress>
    </policy>
  </job>
</collector>
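As a rough illustration of how a job definition like the one above could be consumed, the following Python sketch reads the schedule, the HTTP ingress, and the HDFS egress of each job. It is not Flamingo's own collector code: the function name `read_jobs` and the dictionary keys are assumptions made for this example, the element names are taken as they appear above, and the `${dateformat(...)}` placeholders are left unexpanded because their exact semantics are not specified here.

```python
import xml.etree.ElementTree as ET

# Namespace used by the collector configuration shown above.
NS = {"c": "http://www.openflamingo.org/schema/collector"}

# Trimmed copy of the job definition (illustrative input only).
CONFIG = """
<collector xmlns="http://www.openflamingo.org/schema/collector">
  <job name="seoul_rain">
    <schedule>
      <cronexpression>0 * * * * ?</cronexpression>
    </schedule>
    <policy>
      <ingress>
        <fromhttp>
          <url>http://openapi.seoul.go.kr:8088/sample/xml/searchtrafficaccidentservice/1/5</url>
          <method type="get"/>
        </fromhttp>
      </ingress>
      <egress>
        <tohdfs cluster="dev">
          <targetpath>
            /seoul/traffic/accident/${dateformat('yyyy')}/${dateformat('mm')}
            /${dateformat('dd')}/accident_${dateformat('yyyymmdd-hhmmss')}.txt
          </targetpath>
        </tohdfs>
      </egress>
    </policy>
  </job>
</collector>
"""


def read_jobs(xml_text):
    """Return one dict per <job>, describing where data comes from and goes to."""
    root = ET.fromstring(xml_text.strip())
    jobs = []
    for job in root.findall("c:job", NS):
        http = job.find("c:policy/c:ingress/c:fromhttp", NS)
        hdfs = job.find("c:policy/c:egress/c:tohdfs", NS)
        raw_path = hdfs.findtext("c:targetpath", default="", namespaces=NS)
        jobs.append({
            "name": job.get("name"),
            "cron": job.findtext("c:schedule/c:cronexpression", default="", namespaces=NS).strip(),
            "url": http.findtext("c:url", default="", namespaces=NS).strip(),
            "method": http.find("c:method", NS).get("type"),
            "cluster": hdfs.get("cluster"),
            # Drop the line breaks and indentation inside <targetpath>;
            # the ${dateformat(...)} placeholders stay untouched.
            "target_path": "".join(raw_path.split()),
        })
    return jobs


if __name__ == "__main__":
    for job in read_jobs(CONFIG):
        print(job["name"], job["cron"], job["method"].upper(), job["url"])
        print("  ->", job["cluster"], job["target_path"])
```

Each returned dictionary pairs one source (an HTTP GET against the Seoul Open API sample URL) with one destination (a date-partitioned path on the `dev` HDFS cluster), which is the ingress/egress pairing that the `<policy>` element expresses.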
Workflow Designer (Java, MR, Hive, Pig, basic statistics and algorithms)
HDFS File System Browser (file and directory management)
HDFS Audit Log
Pig Editor (Run, History, Save)
Hive Editor (Run, Download, History, Explain)
Dashboard
Hive Metastore Browser (DB, Table, Column, Partition)
Multilingual support (English, Korean)
User Management (Login, Logout, Signup, Quota, User home)
Hive Editor (Save, Load)
Apache UIMA Integration (unstructured → structured) for Exobrain
Apache Mahout/Giraph Integration
Hive Metastore Manager (Create, Drop)
File System Browser Drag And Drop + Multi File Upload
Linux File System Browser
File Viewer
Workflow Designer (Copy, Folder Management)
Workflow Designer (R, MapReduce ETL)
System Monitoring (CPU, RAM, Disk)
Preference UI Refactoring
Eclipse Integration (MapReduce, Hive, Pig Job Submit)
Hadoop Clustering Provisioning on OpenStack

Goal: provide basic functionality
More Designer modules
Hive Editor improvements
File Browser improvements
Monitoring added
Hive & User Management Integration
Apache Sqoop Integration for Designer
FTP Integration (Active/Passive Mode) for Designer
Log Collector UI Refactoring (Cron and Ingress/Egress UI still to be developed)
Apache Tika Integration for Designer (text extraction from PPT, PDF, email, Excel, Word, and similar files)
Metadata Management + Workflow Designer
Compression
Apache Spark Integration
Altibase & CUBRID Integration for Designer
Metadata Management + MapReduce
Hortonworks Tez, Stinger Integration (under consideration)
Pivotal HD HAWQ Integration (under consideration)

Designer enhancement
Metadata
The Future of the Flamingo Project
Big Data on Cloud: Netra (OpenStack based Hadoop Provisioning) + Flamingo (Hadoop based Workspace)
Open Source based Big Data Platform
Apache Hadoop EcoSystem
Big Data Management Using Flamingo
Apache Hadoop PaaS (Platform as a Service)
Big Data All In One Package
Participate and share!! www.opencloudengine.org