Bayesian Networks April 30, 2012 Sung-Bae Cho Dept. of Computer Science, Yonsei University
Outline Bayesian Network Inference of Bayesian Network Modeling of Bayesian Network Bayesian Network Application Application Example Summary & Review
Probabilistic Paradigm. Advantages: can accommodate inaccurate models; can accommodate imperfect sensors; robust in real-world applications. Pitfalls: computationally demanding; false assumptions; approximate.
Probabilistic State Estimation. Axioms of Probability Theory: Pr(A) denotes the probability that proposition A is true. 0 <= Pr(A) <= 1; Pr(True) = 1, Pr(False) = 0; Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B). Utilization: Pr(A or not A) = Pr(True) = 1, and Pr(A or not A) = Pr(A) + Pr(not A) - Pr(A and not A) = Pr(A) + Pr(not A) - Pr(False) = Pr(A) + Pr(not A) - 0, hence Pr(not A) = 1 - Pr(A).
Joint & Conditional Probability. Joint probability: P(X=x and Y=y) = P(x,y); if X and Y are independent, then P(x,y) = P(x) P(y). Conditional probability: P(x|y) is the probability of x given y, P(x|y) = P(x,y) / P(y), i.e., P(x,y) = P(x|y) P(y); if X and Y are independent, then P(x|y) = P(x). Law of Total Probability, marginals. Discrete case: Sum_x P(x) = 1; P(x) = Sum_y P(x,y); P(x) = Sum_y P(x|y) P(y). Continuous case: Int p(x) dx = 1; p(x) = Int p(x,y) dy; p(x) = Int p(x|y) p(y) dy.
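These identities can be checked numerically on a small discrete joint distribution; the table values below are invented for illustration, not taken from the slides.

```python
# Hypothetical joint distribution P(X, Y) over two binary variables.
joint = {
    (0, 0): 0.30, (0, 1): 0.20,
    (1, 0): 0.10, (1, 1): 0.40,
}

# Marginal: P(x) = sum_y P(x, y)
def p_x(x):
    return sum(p for (xv, _), p in joint.items() if xv == x)

def p_y(y):
    return sum(p for (_, yv), p in joint.items() if yv == y)

# Conditional: P(x | y) = P(x, y) / P(y)
def p_x_given_y(x, y):
    return joint[(x, y)] / p_y(y)

print(p_x(1))             # 0.10 + 0.40 = 0.5
print(p_x_given_y(1, 1))  # 0.40 / 0.60 = 0.666...
```

Note that X and Y here are not independent: P(x=1|y=1) = 0.667 differs from the marginal P(x=1) = 0.5.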
Probability Calculation. Bayes formula: P(x|y) = P(y|x) P(x) / P(y) = likelihood x prior / evidence. Normalization: P(x|y) = eta P(y|x) P(x), with eta = 1/P(y) = 1 / Sum_x P(y|x) P(x). Conditioning (total probability): P(x|y) = Int P(x|y,z) P(z|y) dz. Bayes rule with background knowledge z: P(x|y,z) = P(y|x,z) P(x|z) / P(y|z).
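The normalization trick can be sketched in a few lines: compute the unnormalized products P(y|x) P(x) for every x and divide by their sum, so P(y) never has to be computed separately. The prior and likelihood values below are hypothetical.

```python
# Hypothetical state estimation: x is a door state, y is a sensor reading.
prior = {"open": 0.5, "closed": 0.5}        # P(x), assumed
likelihood = {"open": 0.6, "closed": 0.3}   # P(y | x), assumed

# Unnormalized posterior: P(x|y) proportional to P(y|x) P(x)
unnorm = {x: likelihood[x] * prior[x] for x in prior}

# eta = 1 / P(y) = 1 / sum_x P(y|x) P(x)
eta = 1.0 / sum(unnorm.values())
posterior = {x: eta * v for x, v in unnorm.items()}

print(posterior["open"])   # 0.30 / 0.45 = 0.666...
```

The posterior sums to 1 by construction, which is exactly what the normalizer eta guarantees.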
Outline Bayesian Network Inference of Bayesian Network Modeling of Bayesian Network Bayesian Network Application Application Example Summary & Review
Rules of Probability. Product rule: P(X,Y) = P(X|Y) P(Y) = P(Y|X) P(X). Marginalization: P(X) = Sum_Y P(X,Y); for binary Y: P(X) = P(X,Y) + P(X, not Y). Bayes rule: P(H|E) = P(E|H) P(H) / P(E). Chain rule of probability: P(x_1, ..., x_n) = P(x_1) P(x_2|x_1) P(x_3|x_1,x_2) ... P(x_n|x_1,...,x_{n-1}).
The Joint Distribution Recipe for making a joint distribution of M variables Make a truth table listing all combinations of values of your variables For each combination of values, say how probable it is
Using the Joint. Once you have the joint distribution, you can ask for the probability of any logical expression involving your attributes: P(E) = Sum over rows matching E of P(row).
Using the Joint. P(Poor and Male) = 0.4654; P(Poor) = 0.7604. In each case, P(E) = Sum over rows matching E of P(row).
Inference with the Joint. P(E1 | E2) = P(E1, E2) / P(E2) = (Sum over rows matching E1 and E2 of P(row)) / (Sum over rows matching E2 of P(row)).
Inference with the Joint. P(Male | Poor) = 0.4654 / 0.7604 = 0.612, using P(E1 | E2) = P(E1, E2) / P(E2) = (Sum over rows matching E1 and E2 of P(row)) / (Sum over rows matching E2 of P(row)).
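Inference by enumerating joint-table rows can be sketched directly. Only P(Poor and Male) = 0.4654 and P(Poor) = 0.7604 come from the slides; the split of the remaining probability mass over the Rich rows is invented so the table sums to 1.

```python
# Joint over (wealth, sex). The two "poor" entries reproduce the slide's
# P(Poor and Male) = 0.4654 and P(Poor) = 0.7604; the "rich" split is assumed.
joint = {
    ("poor", "male"):   0.4654,
    ("poor", "female"): 0.2950,
    ("rich", "male"):   0.1300,
    ("rich", "female"): 0.1096,
}

def prob(match):
    """P(E): sum P(row) over rows matching the predicate E."""
    return sum(p for row, p in joint.items() if match(row))

def cond(match1, match2):
    """P(E1 | E2) = P(E1, E2) / P(E2), both by row enumeration."""
    return prob(lambda r: match1(r) and match2(r)) / prob(match2)

p = cond(lambda r: r[1] == "male", lambda r: r[0] == "poor")
print(round(p, 3))  # 0.612, matching the slide
```

This is exactly the expensive enumeration discussed later: every query scans the whole table, which grows exponentially in the number of attributes.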
Joint Distributions Good news Once you have a joint distribution, you can ask important questions about stuff that involves a lot of uncertainty Bad news Impossible to create for more than about ten attributes because there are so many numbers needed when you build them
Bayesian Networks 1. In general, P(X_1, ..., X_n) needs at least 2^n - 1 numbers to specify the joint probability: exponential storage and inference. Bayesian networks overcome the problem of exponential size by exploiting conditional independence. Real joint probability distribution (2^n - 1 numbers) + conditional independence (domain knowledge, or derived from data) -> graphical representation of the joint probability distribution: a Bayesian network.
Bayesian Networks 2. Example: the Asia network. Visit to Asia (A) -> Tuberculosis (T); Smoking (S) -> Lung Cancer (L) and Bronchitis (B); T and L -> Chest X-ray (C); T, L, B -> Dyspnoea (D). BN = (G, Theta): the DAG G plus the local conditional distributions Theta: P(A), P(S), P(T|A), P(L|S), P(B|S), P(C|T,L), P(D|T,L,B). Example CPT for P(D|T,L,B):
T L B | D=0  D=1
0 0 0 | 0.1  0.9
0 0 1 | 0.7  0.3
0 1 0 | 0.8  0.2
0 1 1 | 0.9  0.1
...
The conditional independencies give the efficient representation P(A,S,T,L,B,C,D) = P(A) P(S) P(T|A) P(L|S) P(B|S) P(C|T,L) P(D|T,L,B) [Lauritzen & Spiegelhalter, 95].
Bayesian Networks 3. Structured, graphical representation of probabilistic relationships between several random variables. Explicit representation of conditional independencies: missing arcs encode conditional independence. Efficient representation of the joint probability distribution. Allows arbitrary queries to be answered, e.g., P(lung cancer = yes | smoking = no, dyspnoea = yes) = ?
Bayesian Networks 4. Also called belief networks and directed acyclic graphical models. A Bayesian network is a directed acyclic graph whose nodes are variables (discrete or continuous) and whose arcs indicate dependence between variables, together with conditional probabilities (local distributions).
Bayesian Networks 5. Example: Smoking -> Cancer, with S in {no, light, heavy} and C in {none, benign, malignant}.
P(S): S = no: 0.80; S = light: 0.15; S = heavy: 0.05.
P(C|S):
            S=no   S=light  S=heavy
C = none    0.96   0.88     0.60
C = benign  0.03   0.08     0.25
C = malig   0.01   0.04     0.15
Product Rule (Smoking -> Cancer). P(C,S) = P(C|S) P(S):
S \ C    none   benign  malig  total
no       0.768  0.024   0.008  .80
light    0.132  0.012   0.006  .15
heavy    0.035  0.010   0.005  .05
total    0.935  0.046   0.019
Bayes Rule. P(S|C) = P(C|S) P(S) / P(C) = P(C,S) / P(C):
S \ C    none        benign      malig
no       0.768/.935  0.024/.046  0.008/.019
light    0.132/.935  0.012/.046  0.006/.019
heavy    0.035/.935  0.010/.046  0.005/.019
Resulting P(S|C):
           C=none  C=benign  C=malignant
S = no     0.821   0.522     0.421
S = light  0.141   0.261     0.316
S = heavy  0.037   0.217     0.263
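The product-rule and Bayes-rule steps can be sketched by computing directly from the P(S) and P(C|S) tables given on the earlier slide. Because the slides' printed joint table rounds some entries, the posteriors computed here differ slightly in the third decimal from the slide's table (e.g., 0.826 vs. 0.821 for P(S=no | C=none)).

```python
# P(S) and P(C|S) as given on the Smoking -> Cancer slide.
p_s = {"no": 0.80, "light": 0.15, "heavy": 0.05}
p_c_given_s = {
    "no":    {"none": 0.96, "benign": 0.03, "malignant": 0.01},
    "light": {"none": 0.88, "benign": 0.08, "malignant": 0.04},
    "heavy": {"none": 0.60, "benign": 0.25, "malignant": 0.15},
}

# Product rule: P(C, S) = P(C|S) P(S)
joint = {(s, c): p_c_given_s[s][c] * p_s[s]
         for s in p_s for c in ("none", "benign", "malignant")}

# Bayes rule: P(S|C) = P(C, S) / P(C), with P(C) obtained by summing out S
def p_s_given_c(s, c):
    p_c = sum(joint[(s2, c)] for s2 in p_s)
    return joint[(s, c)] / p_c

print(round(joint[("no", "none")], 3))      # 0.768, as in the joint table
print(round(p_s_given_c("no", "none"), 3))  # 0.826 (slide rounds to 0.821)
```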
Missing Arcs Represent Conditional Independence. Battery -> Engine Turns Over -> Start. Start and Battery are independent given Engine Turns Over: p(s|t,b) = p(s|t), so p(b,t,s) = p(b) p(t|b) p(s|t,b) = p(b) p(t|b) p(s|t). General product (chain) rule for Bayesian networks: p(x_1, ..., x_n) = Product over i = 1..n of p(x_i | parents(X_i)).
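The factored joint for this three-node chain can be sketched directly. All CPT numbers below are invented for illustration; only the structure (Start depends on Battery only through Engine Turns Over) comes from the slide.

```python
# Hypothetical CPTs for the Battery -> Engine Turns Over -> Start chain.
p_b = {True: 0.95, False: 0.05}                   # P(b), assumed
p_t_given_b = {True:  {True: 0.97, False: 0.03},  # P(t|b), assumed
               False: {True: 0.02, False: 0.98}}
p_s_given_t = {True:  {True: 0.99, False: 0.01},  # P(s|t): Start depends only
               False: {True: 0.00, False: 1.00}}  # on Engine Turns Over

# p(b, t, s) = p(b) p(t|b) p(s|t): the missing Battery -> Start arc
# means Battery drops out once t is known.
def joint(b, t, s):
    return p_b[b] * p_t_given_b[b][t] * p_s_given_t[t][s]

# Sanity check: the factored product defines a proper joint distribution.
total = sum(joint(b, t, s) for b in (True, False)
            for t in (True, False) for s in (True, False))
print(total)  # 1.0
```

The chain needs 1 + 2 + 2 = 5 parameters here instead of the 2^3 - 1 = 7 a full joint would require; the savings grow dramatically with more variables.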
Bayesian Network. Example: Age (A) and Gender (G) -> Smoking (S); A -> Exposure to Toxics (E); E and S -> Cancer (C); C -> Serum Calcium (SC) and Lung Tumor (L). Factorization: P(A, G, E, S, C, L, SC) = P(A) P(G) P(E|A) P(S|A,G) P(C|E,S) P(SC|C) P(L|C).
Outline Bayesian Network Inference of Bayesian Network Modeling of Bayesian Network Bayesian Network Application Application Eample Summary & Review
Inference. We now have compact representations of probability distributions: Bayesian networks. A network describes a unique probability distribution P. How do we answer queries about P? We use inference as a name for the process of computing answers to such queries.
Inference: The Good & Bad News. We can do inference: we can compute any conditional probability P(some variables | some other variable values), P(E1 | E2) = P(E1, E2) / P(E2) = (Sum over joint entries matching E1 and E2) / (Sum over joint entries matching E2). The sad, bad news: computing conditional probabilities by enumerating all matching entries in the joint is expensive, exponential in the number of variables. Sadder and worse news: general querying of Bayesian networks is NP-complete. Hardness does not mean we cannot solve inference; it implies that we cannot find a general procedure that works efficiently for all networks. For particular families of networks, we can have provably efficient procedures.
Queries: Likelihood. There are many types of queries we might ask; most of these involve evidence. Evidence e is an assignment of values to a set E of variables in the domain. Without loss of generality, E = {X_{k+1}, ..., X_n}. Simplest query: compute the probability of the evidence, P(e) = Sum over x_1, ..., x_k of P(x_1, ..., x_k, e). This is often referred to as computing the likelihood of the evidence.
Queries: A Posteriori Belief. Often we are interested in the conditional probability of a variable X given the evidence e: P(X = x | e). This is the a posteriori belief in X, given evidence e. A related task is computing the term P(X = x, e), i.e., the likelihood of e and X = x, for each value x of X; we can recover the a posteriori belief by P(X = x | e) = P(X = x, e) / P(e) = P(X = x, e) / Sum over x' of P(X = x', e).
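Both query types, the likelihood of the evidence and the a posteriori belief, can be sketched on a tiny two-variable network. The Rain -> WetGrass structure and all CPT numbers are invented for illustration.

```python
# A hypothetical two-node network Rain -> WetGrass.
p_rain = {True: 0.2, False: 0.8}                    # P(Rain), assumed
p_wet_given_rain = {True:  {True: 0.9, False: 0.1}, # P(Wet | Rain), assumed
                    False: {True: 0.1, False: 0.9}}

def joint(rain, wet):
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# Likelihood of evidence e = {WetGrass: True}: sum out the hidden variable.
likelihood = sum(joint(r, True) for r in (True, False))

# A posteriori belief: P(Rain = r | e) = P(Rain = r, e) / sum_r' P(Rain = r', e)
posterior = {r: joint(r, True) / likelihood for r in (True, False)}

print(round(likelihood, 3))       # 0.26
print(round(posterior[True], 3))  # 0.692
```

Note the direction of the query: the arc points from Rain to WetGrass, yet the query runs against the arc, which is the diagnostic reasoning pattern described on the next slide.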
A Posteriori Belief. This query is useful in many cases. Prediction: what is the probability of an outcome given the starting condition? (The target is a descendant of the evidence.) Diagnosis: what is the probability of a disease/fault given symptoms? (The target is an ancestor of the evidence.) As we shall see, the direction of the arcs between variables does not restrict the direction of the queries: probabilistic inference can combine evidence from all parts of the network.
Approaches to Inference. Exact inference: inference in simple chains; variable elimination; clustering / join tree algorithms. Approximate inference: stochastic simulation / sampling methods; Markov chain Monte Carlo methods; mean field theory.
Inference in Simple Chains. X1 -> X2: how do we compute P(X2)? P(x2) = Sum_{x1} P(x1, x2) = Sum_{x1} P(x1) P(x2|x1). X1 -> X2 -> X3: how do we compute P(X3)? We already know how to compute P(X2): P(x3) = Sum_{x2} P(x2, x3) = Sum_{x2} P(x2) P(x3|x2), where P(x2) = Sum_{x1} P(x1) P(x2|x1).
Inference in Simple Chains. X1 -> X2 -> X3 -> ... -> Xn: how do we compute P(Xn)? Compute P(X1), P(X2), P(X3), ... in turn, using each term to compute the next: P(x_{i+1}) = Sum_{x_i} P(x_i) P(x_{i+1}|x_i). Complexity: each step costs O(|Val(X_i)| x |Val(X_{i+1})|) operations. Compare to naive evaluation, which requires summing over the joint values of n - 1 variables: P(x_n) = Sum_{x_1, x_2, ..., x_{n-1}} P(x_1, x_2, ..., x_n).
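The forward pass over a chain can be sketched as repeated one-step marginalization. The initial distribution and the shared transition CPT below are invented for illustration.

```python
STATES = (0, 1)

def step(p_prev, cpt):
    """One chain step: P(x_next) = sum_x P(x) P(x_next | x).
    Cost is O(|Val|^2), here 2 x 2 = 4 multiply-adds."""
    return {x2: sum(p_prev[x1] * cpt[x1][x2] for x1 in STATES)
            for x2 in STATES}

# P(X1) and a shared transition CPT P(X_{i+1} = x2 | X_i = x1), both assumed.
p1 = {0: 0.6, 1: 0.4}
cpt = {0: {0: 0.7, 1: 0.3},
       1: {0: 0.2, 1: 0.8}}

# Propagate P(X1) -> P(X2) -> ... -> P(X5): n - 1 cheap steps instead of
# summing the full joint over 2^(n-1) assignments of X1..X4.
p = p1
for _ in range(4):
    p = step(p, cpt)
print(p)  # P(X5); sums to 1
```

Each iteration reuses the previous marginal, which is exactly why the chain computation is linear in n rather than exponential.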
Outline Bayesian Network Inference of Bayesian Network Modeling of Bayesian Network Bayesian Network Application Application Eample Summary & Review
Why Learning? Knowledge-based (expert systems): Answer Wizard (Office 95, 97 & 2000); Troubleshooters (Windows 98 & 2000). Data-based: causal discovery; data visualization; concise model of data; prediction.
Why Learning? Knowledge acquisition is the bottleneck: it is an expensive process, and often we don't have an expert. Data is cheap: the amount of available information is growing rapidly, and learning allows us to construct models from raw data. Conditional independencies and the graphical language capture the structure of many real-world distributions. The graph structure provides much insight into the domain and allows knowledge discovery. A learned model can be used for many tasks. Supports all the features of probabilistic learning: model selection criteria; dealing with missing data and hidden variables.
Why Struggle for Accurate Structure? Example: Earthquake and Burglary -> Alarm Set -> Sound. Adding an arc: increases the number of parameters to be fitted, and encodes wrong assumptions about causality and domain structure. Missing an arc: cannot be compensated for by accurate fitting of parameters, and also misses causality and domain structure.
Learning Bayesian Networks from Data. Data, e.g.:
X1     X2  X3
True   1   0.7
False  5   -1.6
False  3   5.9
True   2   6.3
Data + prior/expert information -> Bayesian network learner -> Bayesian network (structure and parameters over X1, ..., X9).
Learning Bayesian Networks: four settings. Known structure, complete data: the network structure is specified; the inducer needs to estimate the parameters; the data contain no missing values. Unknown structure, complete data: the network structure is not specified; the inducer needs to select arcs and estimate parameters; the data contain no missing values. Known structure, incomplete data: the network structure is specified; the data contain missing values; need to consider assignments to missing values. Unknown structure, incomplete data: the network structure is not specified; the data contain missing values; need to consider assignments to missing values.
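For the easiest of the four settings, known structure with complete data, parameter estimation reduces to counting. The sketch below uses a hypothetical complete dataset for a two-node structure S -> C and computes the maximum-likelihood CPT entries P(c|s) = N(s,c)/N(s); the records and variable names are invented for illustration.

```python
from collections import Counter

# Hypothetical complete dataset for a known structure S -> C:
# each record is (s, c) with no missing values.
data = ([("no", "none")] * 8 + [("no", "benign")] * 2 +
        [("heavy", "none")] * 3 + [("heavy", "malignant")] * 2)

# Maximum-likelihood estimate: P(c | s) = N(s, c) / N(s)
n_sc = Counter(data)                    # joint counts N(s, c)
n_s = Counter(s for s, _ in data)       # marginal counts N(s)
p_c_given_s = {(s, c): n / n_s[s] for (s, c), n in n_sc.items()}

print(p_c_given_s[("no", "none")])      # 8 / 10 = 0.8
print(p_c_given_s[("heavy", "none")])   # 3 / 5  = 0.6
```

The harder settings sketched on the slide (missing values, unknown structure) need more than counting, e.g., EM for incomplete data or a search over structures guided by a score.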
Two Types of Methods for Learning BNs. Constraint-based: finds a Bayesian network structure whose implied independence constraints match those found in the data. Scoring methods (Bayesian, MDL, MML): find the Bayesian network structure that can best represent distributions matching the data, i.e., that could have generated the data. Practical considerations: the number of possible BN structures is super-exponential in the number of variables; how do we find the best graphs?
Outline Bayesian Network Inference of Bayesian Network Modeling of Bayesian Network Bayesian Network Application Application Eample Summary & Review
What are BNs useful for? Prediction: P(symptom | cause) = ? (predictive inference, cause -> effect). Diagnosis: P(cause | symptom) = ? (diagnostic reasoning, effect -> cause). Classification: argmax over class of P(class | data). Decision-making (given a cost function). Data mining: induce the best model from data. Application domains: medicine, speech recognition, bioinformatics, stock market, text classification, computer troubleshooting.
Why use BNs? Explicit management of uncertainty. Predictive and diagnostic reasoning. Modularity: modular specification of a joint distribution implies maintainability. Better, flexible, and robust decision making: MEU (maximization of expected utility), VOI (value of information). Can be used to answer arbitrary queries, including multiple-fault problems. General-purpose inference algorithms. Easy to incorporate prior knowledge. Easy to understand.
Why Bayesian Network? Uncertainty handling: partial evidence handling; conversion between inputs and outputs.
Why Bayesian Network? Utilization of probability evidence: probabilistic (soft) evidence handling. [Figure: a network in which input node probabilities (e.g., 0.60, 0.65, 0.78) propagate to output node probabilities.]
Why Bayesian Network? Predictive reasoning: the direction of reasoning runs from the evidence (causes) forward to their effects.
Why Bayesian Network? Diagnostic reasoning: the direction of reasoning runs from the evidence (effects) back to their causes.
Why Bayesian Network? Predictive-diagnostic reasoning: can obtain the probability distribution of multiple target nodes at once, from input to output.
App. Example: BN-Based Robot Intelligence. Planner: where is the door? Domain knowledge: along the wall, rectangular shape. Planner: can the robot pass through the door? Vision filter: door type (push door), door state (slightly open), distance to the door. Inference: judge passability. Planner: move toward the door; obstacle detected. Inference: passability decreases. (Door-passability inference and obstacle judgment using a BN.) Inference: given the obstacle's movement direction, passability increases. Planner: pass through the door and move: living room -> living-room door -> kitchen.
App. Example: BN-Based Robot Intelligence 2. Planner: where is the refrigerator expected to be? Domain knowledge: along the wall. Inference: near a detected object (the sink), along the wall. (Finding the refrigerator's location.) Vision filter: a chair is detected in front of the refrigerator. Planner: can the robot move to the front of the refrigerator? Inference: cannot move because of the obstacle; it must be removed. [Vision filter]: obstacle = chair. Inference: the robot can push the chair aside. Inference: ask the user whether it is okay to move it. (Inference for moving to the front of the refrigerator.)
App. Eam: BN Based Robot Intelligence 3 로봇 : " 의자를치우겠습니다. 괜찮습니까?" 유저 : " 그래치워라." 플래너 : 의자옆으로밀기 추론 : 유저위치 = 주방, 의자가있을위치 = 식탁옆 추론 : 식탁이가깝고, 식탁옆에공간있으므로그고스로이동하시오. 플래너 : 의자를식탁으로이동 플래너 : 냉장고앞으로이동, 냉장고문열기수행 [ 플래너 ]: 냉장고문열기중실패 추론 : 냉장고옆에다른장애물이있는것으로추정됨 장애물치우기 냉장고앞의다른장애물추론
1. Door Passability. Given: moderately far away; the door is a hinged door; it is open; no obstacle is visible. Output: door passability 70%. (Inputs and query.)
2. Approaching & Obstacle Detection. Given: approaching the door; obstacle detected. Output: door passability 55%. (Inputs and query.)
3. Inference of Obstacle's Status. Given: the obstacle is on the right side and is moving to the right. Output: door passability 68%. (Inputs and query.)
4. Approaching Refrigerator. When an obstacle is near the right side of the refrigerator and fairly close to it: request to remove the object, 89%. (Inputs and query.)
5. Removing Obstacle. Given: an object on the left side of the refrigerator; a chair in front of the refrigerator; the table is close. Output: remove the obstacle! Recommended locations: 1. near the table (85%) > 2. right side of the refrigerator (76%). (Inputs and query.)
Applications. Industrial: processor fault diagnosis (by Intel); auxiliary turbine diagnosis (GEMS, by GE); diagnosis of space shuttle propulsion systems (VISTA, by NASA/Rockwell); situation assessment for nuclear power plants (NRC). Military: automatic target recognition (MITRE); autonomous control of unmanned underwater vehicles (Lockheed Martin); assessment of intent. Medical diagnosis: internal medicine; pathology diagnosis (Intellipath, by Chapman & Hall); Breast Cancer Manager with Intellipath. Commercial: financial market analysis; information retrieval; software troubleshooting and advice (Windows 95 & Office 97); pregnancy and child care (Microsoft); software debugging; American Airlines' SABRE online reservation system.
Outline Bayesian Network Inference of Bayesian Network Modeling of Bayesian Network Bayesian Network Application Application Example 1 Application Example 2 Summary & Review
Summary & Review. Motivation: explicit representation of uncertainty using the calculus of probability theory. Bayesian networks: inference and modeling methods; applications. Ongoing research: how to use Bayesian intelligence for robot control; how to design probabilistic models; how to combine the deliberative method with the reactive method.
Upcoming schedule. 5/2: "Latest Trends and Prospects in IT Standardization" (Director 진병문, 5 o'clock, B039). *** Review paper due. 5/7: 허윤진, 김용중. 5/9: 정원섭. 5/14: 이명춘, 양견모. 5/16: 이시혁. 5/21: 박성기, 정광복. 5/23: 박광일. 5/30: 이석준. 6/4: 윤성재. 6/11: Term project final presentation.