5. hamks@dongguk.ac.kr (regular expression): (recognizer) : F(, scanner) CFG(context-free grammar): : PD(, parser) CFG 1
CFG form : N. Chomsky type 2 α, where V N and α V *. recursive construction ) E E OP E (E) -E id OP + / V N = { E, OP } V T = { (, ),, id, +,, /} ) <if_statement> 'if' <condition> 'then' <statement> V N V T > symbol. ' symbol. (Derivation) α 1 α 2 symbol nonterminal nonterminal rhs. (1) : derives in one step. if γ P, α, β V * then αβ αγβ. (2) * : derives in zero or more steps. 1. α V *, α * α 2. if α * β and β γthen α * γ (3) + : derives in one or more steps. L(G) : G = {ω S * ω, ω V T* } : sentence : S * ω, ω V * T terminal. sentential form : S * ω, ω V *. nonterminal? sentential form nonterminal? α, where α V *. leftmost derivation: nonterminal. rightmost derivation: nonterminal. 2
) E E + E E * E (E) a (a+a) E lm (E) lm (E+E) lm (a+e) lm (a+a) E rm (E) rm (E+E) rm (E+a) rm (a+a) (parse) : parser. (left parse) :. - top-down parsing - symbol (right parse) :. - bottom-up parsing - nonterminal reduce symbol reduce. ) (a+a)*a. 1. E E + E 2. E E * E 3. E (E) 4. E a E 2 E * E 3 (E) * E 1 (E + E) * E 4 (a + E) * E 4 (a + a) * E 4 (a + a) * a : 231444 E 2 E * E 4 E * a 3 (E) * a 1 (E + E) * a 4 (E + a) * a 4 (a + a) * a : 441342 (derivation). CFG G = (V N,V T,P,S) ω V T*. 1 2... n 3
) G : E E + T T T T * F F F ( E ) a ω : a + a * a a + a * a : E / \ E + T / / \ T T * F / / \ F F a / / a a derivation tree., tree unique. context-free grammar G is ambiguous if and only if it produces more than one derivation trees for some sentence. ) dangling else : G: S if C then S else S if C then S a C b ω: if b then if b then a else a S S if C then S else S if C then S b if C then S a b if C then S else S b a b a a? 2.. α. sentential form : αα tree form : or 4
ω 1) + > 2) > + E E E E E + E a E + E E E a a a a a nonterminal G : E E + T T T T * F F F a E E + T T T * F (ambiguity) algorithm, formal. F F a a a expression : expression expression + term expression term term term term * factor term / factor factor factor primary factor primary primary - primary element element ( exp ) id id * id + id derivation tree: expression expression + term term factor term * factor primary factor primary element primary element id element id id 5
: L(G1) = L(G2) G1 G2 (equivalent ). (i) (Substitution) : if αbγ, B β1 β2 β3 βn P, then P' = ( P - { αbγ } ) { αβ1γ αβ2γ... αβnγ }. (ii) (Expansion) : αβ <=> αx, X β or Xβ, X α ) P : S a bb bb b B a a. (Useless Productions) : G=(V N,V T,P,S), S * uxv * ω, ω V T * X. : (i) Terminating nonterminal : α, α * ω, where V N and ω V * T. (ii) ccessible symbol : S * µxω, where X V and µ,ω V *. CFG Terminating nonterminal ccessible symbol Useful productions Terminating nonterminal algorithm terminating; begin V N ' := { ω P, ω V T* }; repeat V N ' := V N ' { α P, α ( V N ' U V T ) * } until no change end. ccessible symbol algorithm accessible; begin V' := { S }; repeat V' := V' { X some αxβ P, V' } until no change end. 6
ε- ε, V N ε- (ε -production). : CFG G = (V N, V T, P, S ) ε -free. (1) P ε-, (2) S ε-, S. ε -free : algorithm ε -free; begin P' := P { ε V N }; V Nε := { =>* ε, V N }; for α 0 B 1 α 1 B 2...B K α K P',where α i ε and B i V Nε do if B i β P' then P' = P' { α 0 X 1 α 1 X 2...X K α K X i = B i or X i = ε} else P' = P' { α 0 X 1 α 1 X 2...X K α K X i = ε} end for if S V Nε then P' := P' { S' ε S } end. V Nε : algorithm Compute_V Nε ; begin V Nε := { ε P}; repeat V Nε := V Nε {B B α P, α V Nε* } until no change end. ) S asbs bsas ε -free P' = {S asbs bsas } V Nε = {S} P' rhs S S : S asbs S asbs abs asb ab S bsas S bsas bas bsa ba P' = {S asbs abs asb ab bsas bas bsa ba}, S V Nε S' S P' ε -free S' S S asbs abs asb ab bsas bas bsa ba : B, where,b V N. algorithm Remove_Single_Production; begin P' := P { B, B V N }; for each V N do V N = { B + B } ; for each B V N do for each B α P' do (* not single production *) P' := P' { } end for end for end for end. algorithm Compute_V N ; begin V N := {B B P}; repeat V N := V N {C B C P, B V N } until no change end. 7
) E E + T T T T * F F F (E) a : CFG G = ( V N, V T, P, S ) is said to be cycle-free if there is no derivation of the form + for any in V N. G is said to be proper if it is cycle-free, is ε-free, and has no useless symbols. : CFG G = (V N,V T,P,S) is said to be in Chomsky Normal Form(CNF) if each production in P is one of the forms (1) BC with,b, and C in V N, or (2) a with a V T, or (3) if ε L(G), then S ε is a production, and S does not appear on the right side of any production. : CFG G = (V N,V T,P,S) is said to be in Greibach Normal Form(GNF) if G is ε-free and each non-ε-production in P is of the form aα with a V T and α V N *. < > ::= <S> ::= <> <B> <> ::= a <> a <B> ::= <B> b b <compound-statement> ::= begin <statement-list> end <statement-list> ::= <statement> <statement-list> ; <statement> Extended BNF(EBNF) meta symbol. meta symbol (repetitive part): { } (optional part): [ ] (alternative): ( ) ) <compound-statement> ::= begin <statement> {;<statement>} end ) <if-statement> ::= if <condition> then <statement> [else <statement>] ) <exp> ::= <exp> + <exp> <exp> - <exp> <exp> <exp> <exp> / <exp> <exp> ::= <exp> ( + / ) <exp> 8
(Syntax diagram) syntax diagram : :terminal symbol :nonterminal symbol : syntax diagram : terminal a nonterminal a ::= X 1 X 2... X n (1) X i nonterminal : X 1 X 2... X n (2) X i terminal : X 1 X 2... X n ::= α 1 α 2... α n α 1 α 2.. α 3 EBNF ::= {α} EBNF ::= [α] α α EBNF ::= (α 1 α 2 )β α 1 α 2 β 9
push-down automata PD: CFG -- push-down list(stack),, (finite state control). top. a 1 a 2... a n Finite state control Z 1 Z 2 Z n stack push-down automata : PD P = (Q, Σ, Γ, δ, q 0, Z 0, F), where, Q :. Σ :. Γ :. δ : Q (Σ {ε}) Γ Q Γ*, q 0 Q : (start state), Z 0 F :, F Q : (final state). δ : Q (Σ {ε}) Γ Q Γ* δ(q,a,z) ={(p 1, α 1 ), (p 2, α 2 ),...,(p n, α n )} : q a top Z, n (p i,α i ). (1) q a p i. (2) top Z α i. push-down automata P configuration : (q, ω, α) where, q : current state ω : input symbol α : contents of stack P (move): 1) a ε: (q, aω, Zα) ( q', ω, γα) 2) a = ε : (q, ε, Z) (q', ε, γ) <===> ε-move * : zero or more moves, + : one or more moves L(P) : the language accepted by PD P. - start state : (q 0, ω, Z 0 ) - final state : (q, ε, ), where q F, α Γ * L(P) = { (q 0, ω, Z 0 ) * (q, ε, α), q F, α Γ * }. 10
push-down automata ) P = ({q 0,q 1,q 2 }, {0, 1}, {Z, 0}, δ, q 0, Z, {q 0 }), where, δ(q 0, 0, Z) = {(q 1, 0Z)} δ(q 1, 0, 0) = {(q 1, 00)} δ(q 1, 1, 0) = {(q 2, ε)} δ(q 2, 1, 0) = {(q 2, ε)} δ(q 2, ε, Z) = {(q 0, ε)} - 0011 : (q 0, 0011, Z) (q 1, 011, 0Z) (q 1, 11, 00Z) (q 2, 1, 0Z) (q 2, ε, Z) (q 0, ε, ε) - 0 n 1 n (n 1) : (q 0, 0 n 1 n, Z) (q 1, 0 n-1 1 n, 0Z) n-1 (q 1, 1 n, 0 n Z) (q 2, 1 n-1, 0 n-1 Z) n-1 (q 2, ε, Z) (q 0, ε, ε) L(P) = {0 n 1 n n 1}. push-down automata PD : Q ( {ε}) Γ * Q Γ * - move stack top string string. (q, aω, αγ) (q', ω, βγ) -stack empty move ) PD = ({q, p}, {a, b}, {a, b, S, Z}, δ, q, Z, {p}) where, δ(q, a, ε) = {(q, a)} δ(q, b, ε) = {(q, b)} δ(q, ε, ε) = {(q, S)} S : center mark δ(q, ε, asa) = {(q, S)} δ(q, ε, bsb) = {(q, S)} δ(q, ε, SZ ) = {(p, ε)} push-down automata aabbaa : (q, aabbaa, Z) (q, abbaa, az) (q, bbaa, aaz) (q, baa, baaz) (q, baa, SbaaZ) (q, aa, bsbaaz) (q, aa, SaaZ) (q, a, asaaz) (q, a, SaZ) (q, ε, asaz) (q, ε, SZ) (q, ε, ε) L = { ωω R ω {a, b} + }. 11
push-down automata Le(P) : stack empty PD string., empty string. Le(P) = { ω (q 0, ω, Z 0 ) * (q, ε, ε), q Q}. Le(P') = L(P) P' : P = (Q, Σ, Γ, δ, q 0, Z 0, F) ===> P' = (Q {q e, q'}, Σ, Γ {Z'}, δ', q', Z', φ ), where δ ' : 1) q Q, a Σ {ε}, Z Γ, δ'(q, a, Z) = δ(q, a, Z). 2) δ'(q', ε, Z') = {(q 0, Z 0 Z')}. Z' : bottom marker 3) q F, Z Γ {Z'}, δ'(q, ε, Z) (q e, ε). 4) Z Γ {Z'}, δ'(q e, ε, Z) = {(q e, ε)} context-free, PD,. L(CFG) = L(PD) CFG <===> PD (i) CFG ===> PD (for a given context-free grammar, we can construct a PD accepting L(G).) Top-Down Method --- leftmost derivation, α Bottom-Up Method --- rightmost derivation, α ==> (ii) PD ===> CFG Top-Down Method CFG G Le(R)=L(G) PD R For a given G = (V N, V T, P, S), construct R = ({q}, V T, V N V T, δ, q, S, φ ), where δ : 1) if α P, then δ(q,ε,) (q,α). 2) a V T, δ(q, a, a) = {(q, ε)}. ex) G = ({E, T, F}, {a,, +, (, )}, P, E), P : E E + T T T T F F F (E) a ===> R = ({q}, Σ, Γ, δ, q, E, φ) where δ : 1) δ(q, ε, E) = {(q, E + T), (q, T)} 2) δ(q, ε, T) = {(q, T F), (q, F)} 3) δ(q, ε, F) = {(q, (E)), (q, a)} 4) δ(q, t, t) = {(q, ε)}, t {a, +,, (, )}. 12
a + a a : (q, a + a a, E) (q, a + a a, E + T) (q, a + a a, T + T) (q, a + a a, F + T) (q, a + a a, a + T) (q, + a a, + T) (q, a a, T) (q, a a, T F) (q, a a, F F) (q, a a, a F) (q, a, F) (q, a, F) (q, a, a) (p, ε, ε) top. Bottom-Up Method CFG ===> extended PD(rightmost derivation) ex) G = ({E, T, F}, {a,, +, (, )}, P, E), P : E E + T T T T F F F (E) a ===> R = ({q, r}, V T, V N V T {$}, δ, q, $, {r}) δ : 1) δ(q, t, ε) = {(q, t)}, t {a, +,, (, )} shift 2) δ(q, ε, E + T) = {(q, E)} δ(q, ε, T) = {(q, E)} δ(q, ε, T * F) = {(q, T)} δ(q, ε, F) = {(q, T)} δ(q, ε, (E)) = {(q, F)} δ(q, ε, a) = {(q, F)} 3) δ(q, ε, $E) = {(r, ε)} a + a a (q, a + a a, $) (q, + a a, $ a) (q, + a a, $ F) (q, + a a, $ T) (q, + a a, $ E) (q, a a, $ E +) (q, a, $ E + a) (q, a, $ E + F) (q, a, $ E + T) (q, a, $ E + T ) (q, ε, $ E + T a) (q, ε, $ E + T F) (q, ε, $ E + T) (q, ε, $ E) (r, ε, ε) 13
PD P L(G) = Le(P) CFG G Given PD P = (Q, Σ, Γ, δ, q 0, Z 0, F) ===> Construct cfg G = (V N, V T, P, S), where (1) V N = {[qzr] q, r Q, Z Γ} {S}. (2) V T = Σ. (3) P : δ(q, a, Z) k 1 (r, X 1...X k ) [qzs k ] a[rx 1 s 1 ][s 1 X 2 s 2 ]... [s k-1 X k s k ] P. s 1, s 2,..., s k Q. δ(q, a, Z) (r, ε) [qzr] a P. q Q S [q 0 Z 0 q] P. (4) S : start symbol. 1) L CFG G L(G). 2) L PD P L(P). 3) L PD P Le(P). 4) L extended PD L(P). 14