An IBM Company. © 2005 IBM Corporation
Discover, Prepare, Transform & Deliver: Time To Value
DISCOVER: ProfileStage
PREPARE: QualityStage
TRANSFORM and DELIVER: DataStage
Service-Oriented Architecture, Event Management, Enterprise Connectivity
Connectivity IT (XML, EDI, JMS, JCA)
Complementary to BPM, EAI, and EII technologies
WebSphere Data Integration Suite
Ascential Enterprise Integration Suite: Discover, Prepare, Transform
ROI, Time-to-Value, Impact on Strategic Objectives
Sources: Custom Apps on Mainframe; Custom Apps on UNIX; 3rd Party SW (SAP, Oracle, Siebel, PeopleSoft); External Data
Inputs: Data Requirements; Data Extraction Criteria; Functional / Technical Input Template; Common Target Format

Initial Extraction: Extract & Load into Stage 1
Data Assessment (Discover): Assess & Validate individual sources
Data Alignment: Cleanse, Standardize individual sources
Data Harmonization: Consolidate, Integrate, Cleanse, Normalize & Harmonize across sources
Solution Implementation: Target System (Prepare, Transform, Load)

Targets: OLTP, SAP BW, SAP R/3, EDW
Tools (in phase order): DataStage; ProfileStage & DataStage; QualityStage & DataStage; QualityStage; DataStage
Measure Analyze Improve
Why so many project overruns or failures?
Training or Skill
ProfileStage!
- Doug Laney (4/3/2002)
What does ProfileStage do?
Interactive analysis of all data sources (Source 1, Source 2)
Analyses: Column Analysis; Table & Primary Key Analysis; Cross-Table & Relationship Analysis
Provides comprehensive understanding of the data
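The analyses listed on this slide can be illustrated with a minimal sketch. This is not ProfileStage's implementation or API; the function names and sample rows below are hypothetical, and the code only shows the kind of checks that column, primary key, and cross-table analysis involve.

```python
def column_profile(rows, column):
    """Column analysis: row count, null count, and distinct non-null values."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
    }


def is_key_candidate(rows, column):
    """Primary key analysis: a candidate key has no nulls and no duplicates."""
    profile = column_profile(rows, column)
    return (
        profile["rows"] > 0
        and profile["nulls"] == 0
        and profile["distinct"] == profile["rows"]
    )


def orphan_values(child_rows, child_col, parent_rows, parent_col):
    """Cross-table analysis: child values that have no matching parent value."""
    parent_values = {row[parent_col] for row in parent_rows}
    child_values = {row[child_col] for row in child_rows}
    return sorted(child_values - parent_values)


# Hypothetical sample data standing in for Source 1 and Source 2.
CUSTOMERS = [
    {"id": 1, "city": "A"},
    {"id": 2, "city": "A"},
    {"id": 3, "city": None},
]
ORDERS = [
    {"order": 10, "cust": 1},
    {"order": 11, "cust": 4},
]
```

With this sample, `id` qualifies as a key candidate, `city` does not, and customer `4` in the orders table has no matching customer record.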
Minimize Project Resources and Risk
Projects with extensive manual source analysis: Source System Analysis, Transform. Risk!!!
Projects with limited manual source analysis: Analysis, Transform, Test, Deploy!!!
Projects using ProfileStage's automated data profiling: Analysis, Transform, Test, Deploy. No surprises!!
Analysis OR Project cancelled!!!
QualityStage
Data Harmonization with QualityStage: Data Re-Engineering
Data: Customers, Transaction, Materials, Vendor/Supplier
Localization Rule Set
Assessment
1. Investigation
2. Standardization
3. Matching
4. Survivorship
DB
Data Preparation: Data Re-Engineering, Step 1) Investigation

Pattern        Count   Percentage   Example
=======================================================
bbbbbbbbbbbb   5657    56.570%
nnnbnnnbnnnn   3554    35.540%      011 232 2323
nnnbnnnnnnnn   781     7.810%       011 99152365
nnnbnnbnnnnb   2       0.020%       011 88 9941

Value          Count   Percentage
=======================================================
C_SG_BK        5151    51.510%
C_ST_BK        1663    16.630%
C_B_BK         1048    10.480%
C_ST_DG_BK     862     8.620%
C_SG_SK        339     3.390%

Rule Set
^J,^?^?^B,?,^?,^?^?^E,N     6   1.364%   15 TEETH,1-3/4" BORE,TAPERLOCK,50 PITCH,3-5/16" OD,D
Rule Set
?M??^?^?^?^?M??^?^?^?^?^    6   1.364%   BUTTON HEAD SOCKET SCREW 6-32 X 1/2"
                            5   1.136%   HEX HEAD CAP SCREW 1/2-13 X 5 GRADE 5
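A minimal sketch of how a pattern frequency report like the one on this slide could be produced. The slide's patterns appear to code digits as `n` and blanks as `b`; coding letters as `a` is an assumption, and the function names are hypothetical rather than QualityStage calls.

```python
from collections import Counter


def char_pattern(value):
    """Replace each character by a class code: n = digit, a = letter, b = blank.

    Any other character (punctuation, for example) is kept unchanged.
    The 'a' code for letters is an assumption made for this sketch.
    """
    codes = []
    for ch in value:
        if ch.isdigit():
            codes.append("n")
        elif ch.isalpha():
            codes.append("a")
        elif ch == " ":
            codes.append("b")
        else:
            codes.append(ch)
    return "".join(codes)


def pattern_report(values):
    """Return (pattern, count, percentage) tuples, most frequent pattern first."""
    counts = Counter(char_pattern(v) for v in values)
    total = sum(counts.values())
    return [(p, c, 100.0 * c / total) for p, c in counts.most_common()]
```

Rare patterns at the bottom of such a report are the usual starting point for investigation, since they tend to mark malformed or misplaced values.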
Data Preparation: Data Re-Engineering, Step 2) Standardization
Reference DB
Data Preparation: Data Re-Engineering, Step 2) Standardization

Before:
ID     Description
========================================================
A0059  GASKET,UPPER STEAM CYL,1929;;
A0060  GASKET,DOOR,MG018;CASE;REACH IN
A0061  MOTOR,BLOWER,3M726B;1/50 HP;SHADE POLE
A0062  MOTOR,DRIVE,GOLD MEDAL;82085;POPCORN
A0063  MOTOR,MAIN,NSK;110VOLTS;
A0064  16 TEETH,2-1/4" BORE,TAPERLOCK,50 PITCH,3-9/16",
A0065  MOTOR,MAIN:20029;110VOLTS;
A0066  19 TEETH,1610 BORE,50 PITCH,MARTIN,

After:
ID     Name # Val1 Val2 Val3 Diameter Threads...
========================================================
A0059  1929 GASKET UPPER STEAM CYL
A0060  GASKET DOOR MG018
A0061  SHADE MOTOR BLOWER
A0062  MEDAL 82085 MOTOR DRIVE GOLD
A0063  NSK 110 MOTOR MAIN VOLTS
A0064  MARTIN 16 TEETH 2-1/4 IN BORE...
A0065  29920 MOTOR MAIN 110VOLTS
A0066  MARTIN 19 TEETH 1610 BORE 50 PITCH...

Reference DB: Description
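A minimal sketch of one standardization step on free-text descriptions like those above: split on the delimiters seen in the sample (`,`, `;`, `:`), drop empty tokens, and rewrite an inch mark as `IN`, as in `2-1/4 IN BORE`. The rules and the `standardize` function are illustrative assumptions, not a QualityStage rule set, and assigning tokens to the target columns is not attempted here.

```python
import re

# Delimiters observed in the sample descriptions.
TOKEN_SPLIT = re.compile(r"[,;:]")

# A leading whole or mixed-fraction number followed by an inch mark,
# e.g. 2-1/4" BORE or 3-9/16" (assumed notation for this sketch).
INCH = re.compile(r'^(\d+(?:-\d+/\d+)?)"\s*(.*)$')


def standardize(description):
    """Split a description into trimmed tokens and normalize inch marks."""
    tokens = [t.strip() for t in TOKEN_SPLIT.split(description)]
    tokens = [t for t in tokens if t]  # drop empties left by ';;' or a trailing ','
    result = []
    for token in tokens:
        match = INCH.match(token)
        if match:
            size, rest = match.group(1), match.group(2)
            result.append(f"{size} IN {rest}".strip())
        else:
            result.append(token)
    return result
```

For example, `standardize('MOTOR,MAIN,NSK;110VOLTS;')` yields four clean tokens with the empty trailing field removed.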
Data Preparation: Data Re-Engineering, Step 3) Matching
Matching in 3 steps:
1. Blocking
2. Scoring
3. Cutoff

Blocking  ID  Score  Record
========================================================
37        MP  41.09  02 76X 700X .. 2 191 XXX
37        DA  41.09  02 76X 700X .. 2 191 XXX
37        DA  31.09  02 76X 700X .. 3 191 XXX
Cutoff (20)
37        DA  11.09  02 76X 700X .. 2 193 XXX
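The three matching steps can be sketched as follows. This is a simplified illustration, not QualityStage's scoring model: the field names, agreement and disagreement weights, and sample records are hypothetical, and only the cutoff of 20 is taken from the slide.

```python
from collections import defaultdict
from itertools import combinations


def block(records, key):
    """Blocking: group records by a blocking key so only likely pairs are compared."""
    blocks = defaultdict(list)
    for record in records:
        blocks[record[key]].append(record)
    return blocks


def score(a, b, weights):
    """Scoring: add the agreement weight when a field matches, else the disagreement weight."""
    total = 0.0
    for field, (agree, disagree) in weights.items():
        total += agree if a[field] == b[field] else disagree
    return total


def match(records, block_key, weights, cutoff):
    """Cutoff: keep only pairs within a block whose score reaches the cutoff."""
    matches = []
    for group in block(records, block_key).values():
        for a, b in combinations(group, 2):
            s = score(a, b, weights)
            if s >= cutoff:
                matches.append((a["id"], b["id"], s))
    return matches


# Hypothetical sample records and weights.
RECORDS = [
    {"id": 1, "block": "37", "name": "DA", "size": "700X"},
    {"id": 2, "block": "37", "name": "DA", "size": "700X"},
    {"id": 3, "block": "37", "name": "MP", "size": "700X"},
    {"id": 4, "block": "41", "name": "DA", "size": "700X"},
]
WEIGHTS = {"name": (15.0, -5.0), "size": (10.0, -5.0)}
CUTOFF = 20.0
```

Record 4 is never compared with the others because it falls in a different block, which is how blocking keeps the number of comparisons manageable.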
Data Preparation: Data Re-Engineering, Step 4) Survivorship
Matching result: 1/1/03, 10/10/02, 6/3/99; 3 XG
Survivorship rules: SOURCE (10/10/02); RECENCY (1/1/03); FREQUENCY; LENGTH (3 XG)
Best-of-breed (Consolidated view)
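A minimal sketch of how the four survivorship rules named on this slide (source, recency, frequency, length) could build a best-of-breed record from a group of matched records. The record layout, field names, and sample values are hypothetical, and this is not QualityStage's rule syntax.

```python
from collections import Counter
from datetime import date


def by_recency(records, field):
    """RECENCY: take the value from the most recently updated record."""
    candidates = [r for r in records if r.get(field)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["updated"])[field]


def by_frequency(records, field):
    """FREQUENCY: take the value that occurs most often."""
    values = [r[field] for r in records if r.get(field)]
    return Counter(values).most_common(1)[0][0] if values else None


def by_length(records, field):
    """LENGTH: take the longest value, assumed to be the most complete."""
    values = [r[field] for r in records if r.get(field)]
    return max(values, key=len) if values else None


def by_source(preferred_sources):
    """SOURCE: take the value from the first preferred source that has one."""
    def rule(records, field):
        for source in preferred_sources:
            for r in records:
                if r["source"] == source and r.get(field):
                    return r[field]
        return None
    return rule


def survive(records, rules):
    """Apply one rule per field to produce the consolidated record."""
    return {field: rule(records, field) for field, rule in rules.items()}


# Hypothetical group of matched records.
RECORDS = [
    {"source": "CRM", "updated": date(2003, 1, 1), "name": "J SMITH",
     "phone": "555-0100", "city": "SEOUL", "account": "A-1"},
    {"source": "BILLING", "updated": date(2002, 10, 10), "name": "JOHN SMITH",
     "phone": "555-0199", "city": "SEOUL", "account": "B-7"},
    {"source": "LEGACY", "updated": date(1999, 6, 3), "name": "JOHN A SMITH",
     "phone": "", "city": "BUSAN", "account": ""},
]
RULES = {
    "name": by_length,
    "phone": by_recency,
    "city": by_frequency,
    "account": by_source(["BILLING", "CRM"]),
}
```

Choosing the rule per field rather than per record is what makes the result a consolidated view: each surviving value can come from a different source record.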
(P) MES Data Quality Assessment
PI
Problem, Solution, Result
ProfileStage, AuditStage
10
(P) As-is
Source1: SAM files
Source2: SAM files
DB
LOGIC & Eye Check; Eye Checking of Source
(P) To-be
Source1: SAM files
Source2: SAM files
DB
AuditStage, ProfileStage
(P)
ProfileStage: Column Analysis; Table Analysis; Primary Key Analysis; Cross-Table Analysis; Relationship Analysis; Normalization Analysis
AuditStage: Domain Analysis; Completeness & Validity Assessment; Structural Integrity Assessment
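The completeness and validity assessment listed for AuditStage can be illustrated with a minimal sketch. The `assess` function, the `status` column, and its allowed domain are hypothetical, and this is not AuditStage's API; it only shows what the two measures mean.

```python
def assess(rows, column, is_valid):
    """Measure completeness and validity of one column.

    completeness = share of rows where the column is populated
    validity     = share of populated values accepted by is_valid
    """
    total = len(rows)
    present = [r[column] for r in rows if r.get(column) not in (None, "")]
    valid = [v for v in present if is_valid(v)]
    return {
        "completeness": len(present) / total if total else 0.0,
        "validity": len(valid) / len(present) if present else 0.0,
        "invalid_values": sorted(set(present) - set(valid)),
    }


# Hypothetical domain rule and sample rows.
ALLOWED_STATUS = {"OPEN", "CLOSED"}
ROWS = [
    {"status": "OPEN"},
    {"status": "CLOSED"},
    {"status": "open?"},
    {"status": ""},
]
RESULT = assess(ROWS, "status", lambda v: v in ALLOWED_STATUS)
```

Listing the invalid values alongside the ratios matters in practice, because the values themselves usually point to the upstream cause.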
(Telstra ADBOR) Telstra Address DBOR
Australia's largest Telco (one of the world's top 20) creates unique address verification system
Problem, Solution, Result
170 Front-end / 100 Back-end Billing
QualityStage: Standardization, Matching
1 2 6.5
Telstra Address Database of Record (ADBoR)
Real Time API
(Telstra ADBOR)
OS/390: QualityStage RealTime Server
HTTP, MQSeries
Data Services, Search (C++)
Windows NT
DB
QualityStage RealTime Architecture
Presentation Layer: QualityStage Application APIs (Standardization, Verification, Matching, Survivorship, etc.). Supported APIs: Java, C, C++, COM, COBOL
Server Layer: QualityStage Real Time Server API (Application Server). Server platforms: UNIX, OS/390, NT
Target Layer: Indices, Target DB (Database Server)