# Grammar Rules

$S$ or $R$ is used as the root node in the following discussion, so $O(S)=O(R)=1$.

Binary Rules (w) Unary Rules (w) Lexicons (w) Possible Chain Rules
D->BC (2)
E->BC (3)
F->BC (4)
S->AD (2)
S->AE (3)
S->AF (3)
S->AG (2)
F->E (2)
E->D (3)
G->E (3)
A->w0 (2)
B->w1 (2)
C->w2 (1)
null
F->E
G->E
E->D
F->E->D
G->E->D

When you draw out all the possible parse trees spanning the sentence w0,w1,w2, you could find the only difference among these parse trees is about the choices of chain rules. We only consider chain unary rules of length less than or equal to 2 in weighted context free grammar, and define 3 levels for inside scores and outside scores of nonterminals involved in such chains. Level index that ranges from 0 to 2 increases from the bottom up (top down) for inside scores (outside scores). Each level L contains nonterminals that are exactly the upmost (bottommost) ones in the chain unary rules of length L for inside scores (outside scores).

# Inside Scores & Outside Scores

Levels Inside Scores (w) Outside Scores
0 I0(D) = w(D->BC)I(B)I(C)=4
I0(E) = w(E->BC)I(B)I(C)=6
I0(F) = w(F->BC)I(B)I(C)=8
O0(D) = I(A)w(S->AD)O(S)=4
O0(E) = I(A)w(S->AE)O(S)=6
O0(F) = I(A)w(S->AF)O(S)=6
O0(G) = I(A)w(S->AG)O(S)=4
1 I1(E) = w(E->D)I0(D)=12
I1(F) = w(F->E)I0(E)=12
I1(G) = w(G->E)I0(E)=18
O1(D) = w(E->D)O0(E)=18
O1(E) = w(F->E)O0(F) + w(G->E)O0(G)=24
2 I2(F) = w(F->E)I1(E)=24
I2(G) = w(G->E)I1(E)=36
O2(D) = w(E->D)O1(E)=72

From inside scores in the chain unary rules in the above table, we can compute the score of the sentence $w_{0}w_{1}w_{2}$ as

\begin{align} I(S) = I(A)\sum_{X \in \mathbf{X}} w(S \rightarrow AX)\sum_{i = 0, 1, 2} I_{i}(X), \end{align}

where $S$ is the root node, $\mathbf{X}$ is the nonterminal set, $w(S \rightarrow AX) = I_{i}(X) = 0$ if such rules or inside scores do not exist. And

\begin{align*} I(S) &= 2 \cdot ((4 \times 2 + 6 \times 3 + 8 \times 3) \\ &+ (12 \times 3 + 12 \times 3 + 18 \times 2) + (24 \times 3 + 36 \times 2)) \\ &= ((16 + 36 + 46) + (72 + 72 + 72) + (144 + 144)) \end{align*}

We introduce a constant M to represent the weight of the partial parse trees that dominate $w_{2}w_{3}$ and replace $S$ with $J, K, L$. Also we add the additional grammar rules presented in the following table. The chain unary rules starting with $R$ are not included in possible chain rules.

Binary Rules (w) Unary Rules (w) Lexicons (w) Possible Chain Rules
J->AM (2)
K->AM (1)
L->AM (3)
K->J (2)
L->K (3)
R->J (3)
R->K (2)
R->L (1)
null
K->J
L->K
L->K->J
Levels Inside Scores (w) Outside Scores
0 I0(J) = w(J->AM)I(A)I(M)=4M
I0(K) = w(K->AM)I(A)I(M)=2M
I0(L) = w(L->AM)I(A)I(M)=6M
O0(J) = w(R->J)O(R)=3
O0(K) = w(R->K)O(R)=2
O0(L) = w(R->L)O(R)=1
1 I1(K) = w(K->J)I0(J)=8M
I1(L) = w(L->K)I0(K)=6M
O1(J) = w(K->J)O0(K)=4
O1(K) = w(L->K)O0(L)=3
2 I2(L) = w(L->K)I1(K)=24M O2(J) = w(K->J)O1(K)=6
\begin{align*} I(R) &= ((3 \times 4M + 2 \times 2M + 1 \times 6M) + (2 \times 8M + 1 \times 6M) + (1 \times 24M)) \\ &= ((12M + 4M + 6M) + (16M + 6M) + (24M)) \end{align*}

## Counts of Binary Rule $c(S \rightarrow AX)$

For any nonterminal $X$, we need to find all its occurances in three levels. When $X = D$,

\begin{align*} c(S \rightarrow AD) &= I(A) O(S) w(S \rightarrow AD) I(D) \\ &= I(A) w(S \rightarrow AD) \sum_{j = 0, 1, 2}O_{j}(S) \sum_{i = 0, 1, 2} I_{i}(D) \\ \end{align*} \begin{align*} c(J \rightarrow AM) &= w(R \rightarrow J)w(J \rightarrow AM)w(A \rightarrow w_{1})M \\ &+ w(R \rightarrow K)w(K \rightarrow J)w(J \rightarrow AM)w(A \rightarrow w_{1})M \\ &+ w(R \rightarrow L)w(L \rightarrow K)w(K \rightarrow J)w(J \rightarrow AM)w(A \rightarrow w_{1})M \\ &= 12M + 16M + 24M = 52M \\ &= \color{red}{ I(A)w(J \rightarrow AM)O(J)M} \\ &= \color{red}{ I(A)w(J \rightarrow AM)\sum_{i = 0,1,2}O_{i}(J)M} \\ &= 2 \times 2 \times (3 + 4 + 6)M = 52M \end{align*}

## Counts of Unary Rule $c(X \rightarrow Y)$

\begin{align*} c(X \rightarrow Y) &= \{c(X \rightarrow Y) | X \rightarrow Y \text{ is the chain rule of length 1}\} \\ &+ \{\sum_{Z \in \mathbf{X}} c(X \rightarrow Y \rightarrow Z) + c(Z \rightarrow X \rightarrow Y) | X \rightarrow Y \text{ is involved in the chain rule of length 2}\} \\ &= w(X \rightarrow Y)(O_{0}(X)I_{0}(Y) + O_{0}(X)I_{1}(Y) + O_{1}(X)I_{0}(D)) \end{align*}

Let $X = E, Y = D$, we can verify the above formula.

The score of all the parse trees containing the unary rule $E \rightarrow D$ is

\begin{align*} c(E \rightarrow D) &= \sum_{T \sim w_{0}w_{1}w_{2} \text{ and } E \rightarrow D \in T} w(T) \\ &= w(A \rightarrow w_{0}) w(S \rightarrow AE) w(E \rightarrow D) w(D \rightarrow BC) w(B \rightarrow w_{1}) w(C \rightarrow w_{2}) \\ &+ w(A \rightarrow w_{0}) w(S \rightarrow AE) w(F \rightarrow E) w(E \rightarrow D) w(D \rightarrow BC) w(B \rightarrow w_{1}) w(C \rightarrow w_{2}) \\ &+ w(A \rightarrow w_{0}) w(S \rightarrow AE) w(G \rightarrow E) w(E \rightarrow D) w(D \rightarrow BC) w(B \rightarrow w_{1}) w(C \rightarrow w_{2}) \\ &= 2 \times 3 \times 3 \times 2 \times 2 \times 1 \\ &+ 2 \times 3 \times 2 \times 3 \times 2 \times 2 \times 1 \\ &+ 2 \times 2 \times 3 \times 3 \times 2 \times 2 \times 1 \\ &= 72 + 144 + 144 \\ &= \color{red}{w(E \rightarrow D)(O_{0}(E)I_{0}(D) + O_{1}(E)I_{0}(D))} \\ &= 3 \times (6 \times 4 + 24 \times 4) \\ &= 72 + 288 \end{align*}
0%