Microsoft PowerPoint - T4S3_허준영.ppt

Transcription:

In-Database Analytics
허준영, DW & BI Business Development Manager, Technology Consulting Division, Oracle Korea
Page 1

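Agenda
- Overview of information analysis
- A look at efficient approaches to information analysis
- Oracle's information analysis strategy: In-Database Analytics
- Approaches to information analysis in the database
  - In-Database Statistics
  - OLAP Option
  - Data Mining Option
- Summary and Q&A
Page 2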


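Overview of Information Analysis
- Query and Reporting (information): extracting detailed information. Example: Who bought funds over the past three years?
- OLAP (analysis): summaries and trend analysis. Example: What is the average gain of fund buyers by region and by year?
- Data Mining (insight & prediction): acquiring knowledge by discovering hidden patterns. Example: Who is predicted to buy funds over the next six months, and why?
Business Intelligence
Page 4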

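Key Trends in Information Analysis
- The volume of information keeps growing
  - Largest DW in the world three years ago: 30 TB
  - Largest DW in the world last year: 100 TB
  - PB-scale DWs are expected within two to three years
- Storing information is no longer the big problem; the real problem is how to analyze it
  - How do you analyze 2 TB of information with 4,000 dimensions?
[2005 Winter Corp]
Page 5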

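The Current Information Analysis Process
- Analysis tasks are separated: analysis runs in different places, on different systems, handled by different people
- Separate analysis applications are used: specialized packages for each task offer depth, but what about integration?
- Key issues to consider: build and maintenance cost, real-time responsiveness
[Diagram: Data Integration Engine, Data Warehouse, OLAP Engine, Mining Engine]
Page 6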

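Problems with the Current Process
- Frequent data movement
  - Not much of a problem while the data is small. But what if the data to be analyzed grows (for example, analyzing TB-scale customer information)?
  - Becomes a more important issue as the data grows:
    - Cost of storing duplicate data
    - Time lost moving data
    - Scalability and performance issues of specialized packages
- Separated business processes
  - The overall analysis process is delayed
  - Real-time analysis and response are impossible
Page 7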

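A Desirable Information Analysis Process
- Manage and analyze information in one place
  - Minimize data movement to eliminate unnecessary delays from server-to-server transfers
  - No duplicate storage needed
  - Secure and efficient information management: Security, Scalability, Availability
- Connect analysis tasks organically and differentiate them
  - Make routine analysis always-on and real-time: handle analysis tasks with a single SQL statement
  - Perform advanced analysis through specialized packages when needed
Page 8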

Oracle's Information Strategy: In-Database Analytics
- Supports integrated information analysis within a single DB
  - Data Warehouse
  - Built-in Statistics
  - OLAP Option
  - Data Mining Option
[Diagram: Oracle 10g DB with Data Warehousing, ETL, OLAP, Statistics, Data Mining]
Page 9

Oracle Business Intelligence: Know More, Do More, Spend Less!
- Query & Reporting: Oracle BI Solution, BI Beans, Oracle Reports
- Drill for Detail: OLAP Option, Spreadsheet Add-In (PRODUCT, TIME, REGION)
- Mine for New Insights: Oracle Data Mining Option, Spreadsheet Add-In, Statistics, Text Mining
- Access & Assemble Data: Oracle Warehouse Builder
[Diagram: Oracle 10g DB with Data Warehousing, OLAP, ETL, Statistics, Data Mining]
Page 10

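Advantages of In-Database Analytics
- Technical
  - Data always resides in the DB under proper control
  - Compound queries allow intuitive analysis
  - Easy to scale, with strong processing performance
  - Fast scoring: on a single-CPU system, 2.5 million records were scored in just 6 seconds
- Business
  - Real-time information analysis becomes possible
  - TCO can be reduced
[Diagram: Oracle 10g DB with Data Warehousing, OLAP, ETL, Data Mining, Statistics]
Page 11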

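In-Database Analytics: Case Study
Example: statistically testing the results of a DVD marketing campaign
- Given a response model already built with a predefined classification method, use it to predict which customers will respond to the campaign
- Analyze how many DVDs each customer bought in the 3 months before and the 3 months after the campaign
- Compare the campaign success rate of the predicted responders with the purchase rate of non-responders across regions and companies, and test whether the results are statistically significant
Page 12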

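In-Database Analytics: Case Study, the Conventional Approach
Step 1: Data mining program
- Receive customer data from the DB
- Run the prediction inside the program
- Send the predicted customer information back to the DB
Step 2: DB queries
- Load the predicted customer information
- Query those customers' purchases before and after the campaign
- Query and organize the campaign success information
Step 3: Statistics package
- Receive the campaign success-rate information from the DB
- Perform the statistical test
Page 13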

In-Database Analytics: Case Study, the Approach in Oracle
Can be done with a single SQL statement:

    select responder, cust_region, count(*) as cnt,
           sum(post_purch - pre_purch) as tot_increase,
           avg(post_purch - pre_purch) as avg_increase,
           -- statistics: significance test
           stats_t_test_paired(pre_purch, post_purch) as significance
    from (
        select cust_name, cust_region,
               -- mining: campaign response prediction
               prediction(campaign_model using *) as responder,
               sum(case when purchase_date < date '2005-04-15'
                        then purchase_amt else 0 end) as pre_purch,
               sum(case when purchase_date >= date '2005-04-15'
                        then purchase_amt else 0 end) as post_purch
        -- basic DB query
        from customers, sales, products@proddb
        where sales.cust_id = customers.cust_id
          and purchase_date between date '2005-01-15' and date '2005-07-14'
          and sales.prod_id = products.prod_id
          and contains(prod_description, 'DVD') > 0
        group by customers.cust_id, cust_name, cust_region,
                 prediction(campaign_model using *)
    )
    group by rollup (responder, cust_region)
    order by 4 desc;

Page 14
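To make the significance step concrete, here is a minimal Python sketch of the statistic behind a paired t-test, the kind of test that stats_t_test_paired applies to pre_purch and post_purch. The purchase amounts are hypothetical, and the sketch stops at the t statistic and degrees of freedom; it does not reproduce the significance value that the database function returns, and the sign convention here (after minus before) is an assumption of this example.

```python
import math


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test on two equal-length samples."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the per-customer differences (after minus before).
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical purchase amounts for four customers, before and after a campaign.
pre_purch = [10.0, 12.0, 9.0, 11.0]
post_purch = [12.0, 15.0, 10.0, 14.0]
t_stat, dof = paired_t_statistic(pre_purch, post_purch)
print(f"t = {t_stat:.4f}, degrees of freedom = {dof}")
```

A large absolute t relative to the t distribution with n - 1 degrees of freedom indicates that the before/after difference is unlikely to be due to chance.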

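In-Database Analytics: Case Study
Advantages of the Oracle approach, as seen in the case
- No data movement at all (pipelining inside SQL)
- The analysis process becomes simpler
- Real-time analysis is possible
Considerations
- Requires an expert who knows the DB, data mining, and statistics
- Separate the routine and advanced analysis processes
  - Routine analysis: always-on and real-time
  - Advanced analysis: specialized
Page 15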

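Agenda
- Overview of information analysis
- A look at efficient approaches to information analysis
- Oracle's information analysis strategy: In-Database Analytics
- Approaches to information analysis in the database
  - In-Database Statistics
  - OLAP Option
  - Data Mining Option
- Summary and Q&A
Page 16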

Statistical Functions Provided by 10g
- Ranking functions: rank, dense_rank, cume_dist, percent_rank, ntile
- Window aggregate functions (moving and cumulative): avg, sum, min, max, count, variance, stddev, first_value, last_value
- LAG/LEAD functions: direct inter-row reference using offsets
- Reporting aggregate functions: sum, avg, min, max, variance, stddev, count, ratio_to_report
- Statistical aggregates: correlation, linear regression family, covariance
- Linear regression: fitting of an ordinary-least-squares regression line to a set of number pairs; frequently combined with the COVAR_POP, COVAR_SAMP, and CORR functions
- Descriptive statistics: average, standard deviation, variance, min, max, median (via percentile_cont), mode, group-by & roll-up
- DBMS_STAT_FUNCS: summarizes numerical columns of a table and returns count, min, max, range, mean, stats_mode, variance, standard deviation, median, quantile values, +/- n sigma values, top/bottom 5 values
- Correlations: Pearson's correlation coefficients, Spearman's and Kendall's (both nonparametric)
- Cross tabs: enhanced with % statistics: chi-squared, phi coefficient, Cramer's V, contingency coefficient, Cohen's kappa
- Hypothesis testing: Student t-test, F-test, binomial test, Wilcoxon signed ranks test, chi-square, Mann-Whitney test, Kolmogorov-Smirnov test, one-way ANOVA
- Distribution fitting: Kolmogorov-Smirnov test, Anderson-Darling test, chi-squared test, Normal, Uniform, Weibull, Exponential
- Pareto analysis (documented): 80:20 rule, cumulative results table
Note: Statistics and SQL Analytics are included in Oracle Database Standard Edition
Page 17
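As an illustration of what the linear regression and correlation aggregates in the list compute, the following Python sketch fits an ordinary-least-squares line to a set of number pairs and returns the Pearson correlation alongside it. The function name and the sample data are hypothetical; in the database the corresponding results would come from aggregates such as REGR_SLOPE, REGR_INTERCEPT, and CORR rather than from client-side code.

```python
import math


def ols_and_correlation(xs, ys):
    """Return (slope, intercept, pearson_r) for paired observations xs, ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant input has no defined slope or correlation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Hypothetical number pairs lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept, pearson_r = ols_and_correlation(xs, ys)
print(slope, intercept, pearson_r)
```

Because the sample points lie exactly on a line, the slope and intercept recover that line and the correlation is 1 up to floating-point rounding.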

In-Database Statistics
- Simple statistical analysis can be done without moving data to a statistics package
- Example: hypothesis testing
Note: Statistics and SQL Analytics are included in Oracle Database Standard Edition
Page 18

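OLAP Overview
- Why OLAP matters
  - Efficient handling of ad hoc queries that are hard to process in SQL
  - Efficient handling of multidimensional information models
  - Fast processing performance
- Drawbacks of running a separate OLAP server
  - High cost of building and maintaining it
  - Availability and scalability problems
  - Application compatibility problems caused by proprietary APIs
[Diagram labels: Customer, Product, Time; Month, Quarter, Year; Manufacturer, Brand, Item; Product Share, Sales Year to Date, Profit, Sales, Average Selling Price; Aggregation Rules, Forecast Rules, Allocation Rules]
Page 19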

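Oracle 10g OLAP Option
- OLAP implemented together with the DW
  - Supports large relational data and multidimensional data sets side by side in one DB
  - Multidimensional cubes can be built quickly through mapping alone, with no separate data build
  - Fast ad hoc query processing through compression, partitioning, and parallel processing
  - Standard SQL interface to multidimensional data types
  - Optimization and extension through the OLAP API
- Key advantages
  - Fast data processing: both cube builds and query execution get faster
  - Ease of use: easy development and querying through SQL and the OLAP API
Page 20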

Integrated RDBMS-MDDS: OLAP in Oracle
- Single RDBMS-MDDS process
- Single Data Store
- Single Metadata Repository
- Single Set of Management Tools
- Single Security Model
[Diagram labels: OLAP APIs, SQL Query]
Page 21

Oracle OLAP Platform
- Oracle HTML DB
- OracleBI Reports
- OracleBI Discoverer OLAP
- OracleBI Spreadsheet Add-In
- Oracle BI Beans
- Oracle Demand Planning
- Oracle Enterprise Planning & Budgeting
- Database OLAP Option: Query, Analysis, Planning
- Oracle Warehouse Builder
- Analytic Workspace Manager
Page 22

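OLAP in the Oracle DB
Advantages of the multidimensional model
- Dimensional data is aggregated easily and quickly
- Query speed is fast and consistent
- Business analysis is easy to provide to users
  - Calculations that compare things, e.g. last year to now
  - Advanced data selections using many combined criteria
  - More sophisticated analytical calculations
Page 23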

Aggregating Dimensional Data Easily and Quickly
1A. Aggregations are easy to define.
Page 24

Aggregating Dimensional Data Easily and Quickly
1B. Aggregation is fast.
[Bar chart: build time in minutes for the Oracle Financials dataset, 6 million input rows. Materialized Views (partially aggregated*): 480. OLAP (fully aggregated**): 9.]
* MV aggregated 1 dimension and 1 measure
** OLAP aggregated 7 dimensions and 11 measures
Page 25

Query Speed Is Fast and Consistent
[Chart: query response (slower to faster) for Relational and OLAP, plotted against the nature of the query. One end: standardized queries, few calculations, simple calculations. Other end: more ad hoc, many calculations, multi-level and multi-step calculations, planning applications.]
See http://www.oracle.com/technology/products/bi/olap/1450_olap10g_enhance_content_performance.pdf
Page 26

Query Speed Is Fast and Consistent
[Chart: time to prepare data for query (less time to more time), without OLAP and with OLAP, plotted against the ad hoc nature of the application and query patterns. One end: less ad hoc, predictable queries, simple calculations. Other end: more ad hoc, unpredictable query patterns, sophisticated calculations.]
Page 27

Case Study
- 10-dimensional model
- 4,608 level combinations
- 7.54 * 10^20 cells

Dimension           Levels   Members
Shipping Location   2        52
Market Maker        2        166
Buyer               2        38
Customer            3        4,998
Supplier            2        3
Product             3        7,099
Time                4        764
Area                2        13
Shipping Location   2        53
Shipped From        2        41

Page 28
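The two headline figures of the case study follow directly from the table: the number of level combinations is the product of the Levels column, and the number of cells is the product of the Members column. The short Python check below reproduces both from the values as listed; the variable names are illustrative only.

```python
import math

# (dimension, levels, members) exactly as listed in the case-study table.
dimensions = [
    ("Shipping Location", 2, 52),
    ("Market Maker", 2, 166),
    ("Buyer", 2, 38),
    ("Customer", 3, 4998),
    ("Supplier", 2, 3),
    ("Product", 3, 7099),
    ("Time", 4, 764),
    ("Area", 2, 13),
    ("Shipping Location", 2, 53),
    ("Shipped From", 2, 41),
]

# One aggregation level chosen per dimension gives one level combination.
level_combinations = math.prod(levels for _, levels, _ in dimensions)

# One member chosen per dimension addresses one cell of the cube.
cells = math.prod(members for _, _, members in dimensions)

print(f"level combinations: {level_combinations:,}")
print(f"cells: {cells:.2e}")
```

The size of the cell count is the reason a fully materialized relational aggregate for every combination is impractical, which is what the build-time comparisons in the case study measure.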

Case Study: Simple Queries
[Bar chart: time to build and time to execute simple queries for Analytic Workspace, 14 MVs, 214 MVs, and 518 MVs. Data labels as extracted: 98, 17, 17, 14, 16, 14, 10, 23.]
Page 29

Case Study: OLAP Queries
[Bar chart: time to build and time to execute OLAP queries for Analytic Workspace, 14 MVs, 214 MVs, and 518 MVs. Data labels as extracted: 411, 126, 98, 147, 17, 23, 10, 17.]
Page 30

Providing Business Analysis to Users Easily
3A. OLAP End-User Tool
Page 31

Providing Business Analysis to Users Easily
3B. The best technology for analysis
Question: What were Oracle's sales in the UK in 2004?

Plain SQL:

    select sum(f.sales)
    from fact f, time t, prod p, geog g
    where f.time_id = t.time_id
      and f.prod_id = p.prod_id
      and f.geog_id = g.geog_id
    group by t.year, p.prod, g.country
    having t.year = 2004 and p.prod = 'RDBMS' and g.country = 'UK'

OLAP:

    Limit prod to RDBMS
    Limit geog to UK
    Limit time to 2004
    Show sales

Page 32

Providing Business Analysis to Users Easily
3B. The best technology for analysis
Question: What were Oracle's sales in the UK in 2004? Is that better than last year?

Plain SQL:

    select lag(sum(f.sales), 1)
             over (partition by p.prod, g.country order by t.year)
           / sum(f.sales)
    from fact f, time t, prod p, geog g
    where f.time_id = t.time_id
      and f.prod_id = p.prod_id
      and f.geog_id = g.geog_id
    group by t.year, p.prod, g.country
    having t.year in (2003, 2004) and p.prod = 'RDBMS' and g.country = 'UK'

OLAP:

    Show lagpct( sales, 1, time )

Page 33
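The year-over-year comparison on this slide rests on one idea: pair each period's value with the value one period earlier. The Python sketch below shows that idea with hypothetical yearly sales. It computes the change as (current - previous) / previous, which is an assumption about what lagpct reports; the SQL on the slide instead divides the previous year's total by the current one.

```python
def lag(values, offset=1):
    """Shift values back by `offset` positions, padding the start with None (like SQL LAG)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    padded = [None] * offset + list(values)
    return padded[: len(values)]


def lag_pct(values, offset=1):
    """Fractional change of each value relative to the value `offset` positions earlier."""
    changes = []
    for current, previous in zip(values, lag(values, offset)):
        if previous is None or previous == 0:
            # No earlier period (or a zero base): the change is undefined.
            changes.append(None)
        else:
            changes.append((current - previous) / previous)
    return changes


# Hypothetical yearly sales for one product in one country.
yearly_sales = {2003: 100.0, 2004: 125.0}
years = sorted(yearly_sales)
sales = [yearly_sales[year] for year in years]
growth = dict(zip(years, lag_pct(sales)))
print(growth)
```

The first year has no predecessor, so its change is None, mirroring the NULL that LAG returns for the first row of a partition.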

Providing Business Analysis to Users Easily
3B. The best technology for analysis
Question: What were Oracle's sales in the UK in 2004? Is that better than last year? How does the profit contribution compare with last year?

Plain SQL:

    select lag(sum(f.sales) / sum(f.sales) over (partition by g.country), 1)
             over (partition by g.country, p.prod order by t.year)
    from fact f, time t, prod p, geog g
    where f.time_id = t.time_id
      and f.prod_id = p.prod_id
      and f.geog_id = g.geog_id
    group by t.year, p.prod, g.country
    having t.year in (2003, 2004) and p.prod = 'RDBMS' and g.country = 'UK'

OLAP:

    Show lagpct( sales / sales( prod ALL ), 1, time )

Page 34

Providing Business Analysis to Users Easily
3B. The best technology for analysis
Accessible via simplified SQL:

    select time_desc, channel_desc, product_desc, geography_desc,
           sales, sales_ly_pct, sales_pp_pct, sales_shr_parentprod
    from sales_cubeview
    where product_colour = 'RED'
      and geography_level = 'REGION'
      and time_level = 'MONTH'
      and time_parent = '2002';

- Simple SQL
- No joins
- No SQL calcs
- No SQL aggregations
- Very clever, very fast
Page 35

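Data Mining Overview
The process of finding hidden patterns and new insight in large volumes of data
Value that data mining can provide
- Identify the factors most closely associated with a target attribute (Attribute Importance)
- Predict customer behavior (Classification)
- Build profiles of target customers or items (Decision Trees)
- Segment sample information (Clustering)
- Explore important relationships that exist among the subjects (Associations)
- Identify rare events such as fraud (Anomaly Detection)
Page 36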

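Data Mining Application Examples
- Finance: customer loss to competitors (churn), fraud detection, loan default (Basel II), identifying sales opportunities
- Telecommunications: predicting churning customers and finding target customers with lifetime value, identifying cross-sell opportunities
- Database marketing: product campaigns aimed at target customers, identifying cross-sell and up-sell opportunities
- Insurance and public sector: accounting anomaly checks (Sarbanes-Oxley), cost reduction by auditing suspicious activity
- Retail: loyalty programs, cross-selling, market basket analysis, fraud detection
- Life sciences: analyzing suspected factors associated with patients, discovering target genes and proteins, identifying lead compounds for new drug development
Page 37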

Oracle Data Mining 10gR2
- Oracle in-database mining engine
- Oracle Data Miner (GUI): simplified, guided data mining
- Spreadsheet Add-In for Predictive Analytics: 1-click data mining from a spreadsheet
- PL/SQL API & Java (JDM) API: develop advanced analytical applications
Supported algorithms
- Anomaly detection
- Attribute importance
- Association rules
- Clustering
- Classification & regression
- Nonnegative matrix factorization
- Structured & unstructured data (text mining)
- BLAST (life sciences similarity search algorithm)
Page 38

Oracle Data Mining
- Oracle Data Mining provides summary statistical information prior to data mining
Page 39

Oracle Data Mining
- Oracle Data Mining provides model performance and evaluation viewers
- Oracle Data Mining's Activity Guides simplify & automate data mining for business users
Page 40

Oracle Data Mining
- Apply model viewers
- Additional model evaluation viewers
Page 41

Integration with Oracle BI EE: Administration in BI EE
- Oracle BI EE defines results for end user presentation
- Oracle Data Mining results available to Oracle BI EE administrators
Page 42

Oracle BI EE Reports
- Likelihood to buy
Page 43

Oracle BI EE Reports
- Create categories of customers
- Oracle Data Mining reveals important relationships, patterns, predictions & insights to the business users
Page 44

Use Results in Discover Reports
Copyright 2006 Oracle Corporation
Page 45

Oracle Spreadsheet Add-In for Predictive Analytics
- Lets Excel users easily apply predictive analytics to Oracle or Excel data
- The user specifies a table or view in Excel and selects the attributes
Page 46

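Oracle Data Mining Algorithms and Example Applications
Attribute Importance
- Identifies the attributes that most strongly influence a target attribute
- Example: finding the factors most closely associated with high cost
Classification & Prediction
- Predict the customers most likely to:
  - Respond to a campaign or offer
  - Provide the most profit
- Identify the best customers and develop their profiles
Regression
- Performs numeric prediction
- Example: predicting average purchase amount and cost
[Diagram: attribute importance chart for attributes A1 to A7; decision tree with splits on Income (>$50K, <=$50K), Gender (M, F), Age (>35, <=35), Status (Married, Single), and HH Size (>4, <=4), ending in leaves Buy = 0 and Buy = 1]
Page 47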

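Oracle Data Mining Algorithms and Example Applications
Clustering
- Discovers naturally occurring groups
- Examples: market segmentation, identifying disease-prone groups, separating normal from abnormal behavior
Association Rules
- Identify items that occur together in a market basket
- Suggest product bundles
- Support more effective product displays
Feature Extraction
- Condenses large data into a few representative attributes
- Used for clustering and text mining
[Diagram labels: F1, F2, F3, F4]
Page 48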
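The market-basket idea behind association rules can be shown in a few lines. The Python sketch below counts how often pairs of items appear together and reports support (share of baskets containing both items) and confidence (share of baskets containing the first item that also contain the second). The baskets, the function name, and the 0.5 support threshold are all hypothetical, and the brute-force pair counting is only an illustration of the two measures, not the algorithm used by Oracle Data Mining.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair
        for basket in baskets
        for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields one rule in each direction.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


# Hypothetical market baskets.
baskets = [
    {"dvd", "popcorn"},
    {"dvd", "popcorn", "soda"},
    {"dvd"},
    {"soda"},
]
rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data only the dvd/popcorn pair clears the threshold, and the rule is stronger from popcorn to dvd than the reverse, because every popcorn basket also contains a dvd.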

[Figure: METAspectrum 60.1. Copyright 2004 META Group, Inc. All rights reserved. metagroup.com]
Page 49

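Benefits of Oracle's Information Analysis Strategy
In-Database Analytics and the benefit of each point:
- Provides a platform for analytic applications: eliminates data movement, minimizes exposure to security issues, and provides a fast information management chain
- Provides a wide range of mining and statistical algorithms: offers solutions to most information analysis problems
- Runs on multiple H/W and O/S platforms: analytic applications can run in a variety of operating environments
- Makes full use of Oracle DB technology (Grid, RAC, integrated BI, SQL & PL/SQL): existing DB technology is used to the fullest
Page 50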

Oracle Advanced Analytics
Know More!
- Leverage your data and discover new hidden information and valuable insights
Do More!
- Build applications that automate the extraction and dissemination of data mining's insights
- Move from Tool to Enterprise BI Application
Spend Less!
- Option to Oracle 10g Database Enterprise Edition
- Eliminates need for new servers, new software, and new support skills/resources
[Diagram: Oracle 10g DB with Data Warehousing, OLAP, ETL, Data Mining, Statistics]
Page 51