$S$ or $R$ is used as the root node in the following discussion, so $O(S)=O(R)=1$.

Binary Rules (w) | Unary Rules (w) | Lexicons (w) | Possible Chain Rules |
---|---|---|---|

`D->BC` (2)`E->BC` (3)`F->BC` (4)`S->AD` (2)`S->AE` (3)`S->AF` (3)`S->AG` (2) | `F->E` (2)`E->D` (3)`G->E` (3) | `A->w0` (2)`B->w1` (2)`C->w2` (1) | `null` `F->E` `G->E` `E->D` `F->E->D` `G->E->D` |

When you draw out all the possible parse trees spanning the sentence `w0,w1,w2`

, you could find the only difference among these parse trees is about the choices of chain rules. We only consider chain unary rules of length less than or equal to 2 in weighted context free grammar, and define 3 levels for inside scores and outside scores of nonterminals involved in such chains. Level index that ranges from 0 to 2 increases from the bottom up (top down) for inside scores (outside scores). Each level `L`

contains nonterminals that are exactly the upmost (bottommost) ones in the chain unary rules of length `L`

for inside scores (outside scores).

Levels | Inside Scores (w) | Outside Scores |
---|---|---|

0 | `I0(D) = w(D->BC)I(B)I(C)=4` `I0(E) = w(E->BC)I(B)I(C)=6` `I0(F) = w(F->BC)I(B)I(C)=8` | `O0(D) = I(A)w(S->AD)O(S)=4` `O0(E) = I(A)w(S->AE)O(S)=6` `O0(F) = I(A)w(S->AF)O(S)=6` `O0(G) = I(A)w(S->AG)O(S)=4` |

1 | `I1(E) = w(E->D)I0(D)=12` `I1(F) = w(F->E)I0(E)=12` `I1(G) = w(G->E)I0(E)=18` | `O1(D) = w(E->D)O0(E)=18` `O1(E) = w(F->E)O0(F) + w(G->E)O0(G)=24` |

2 | `I2(F) = w(F->E)I1(E)=24` `I2(G) = w(G->E)I1(E)=36` | `O2(D) = w(E->D)O1(E)=72` |

From inside scores in the chain unary rules in the above table, we can compute the score of the sentence $w_{0}w_{1}w_{2}$ as

\begin{align}I(S) = I(A)\sum_{X \in \mathbf{X}} w(S \rightarrow AX)\sum_{i = 0, 1, 2} I_{i}(X),\end{align}where $S$ is the root node, $\mathbf{X}$ is the nonterminal set, $ w(S \rightarrow AX) = I_{i}(X) = 0 $ if such rules or inside scores do not exist. And

\begin{align*}I(S) &= 2 \cdot ((4 \times 2 + 6 \times 3 + 8 \times 3) \\&+ (12 \times 3 + 12 \times 3 + 18 \times 2) + (24 \times 3 + 36 \times 2)) \\&= ((16 + 36 + 46) + (72 + 72 + 72) + (144 + 144))\end{align*}We introduce a constant M to represent the weight of the partial parse trees that dominate $ w_{2}w_{3} $ and replace $S$ with $J, K, L$. Also we add the additional grammar rules presented in the following table. The chain unary rules starting with $R$ are not included in possible chain rules.

Binary Rules (w) | Unary Rules (w) | Lexicons (w) | Possible Chain Rules |
---|---|---|---|

`J->AM` (2)`K->AM` (1)`L->AM` (3) | `K->J` (2)`L->K` (3)`R->J` (3)`R->K` (2)`R->L` (1) | `null` `K->J` `L->K` `L->K->J` |

Levels | Inside Scores (w) | Outside Scores |
---|---|---|

0 | `I0(J) = w(J->AM)I(A)I(M)=4M` `I0(K) = w(K->AM)I(A)I(M)=2M` `I0(L) = w(L->AM)I(A)I(M)=6M` | `O0(J) = w(R->J)O(R)=3` `O0(K) = w(R->K)O(R)=2` `O0(L) = w(R->L)O(R)=1` |

1 | `I1(K) = w(K->J)I0(J)=8M` `I1(L) = w(L->K)I0(K)=6M` | `O1(J) = w(K->J)O0(K)=4` `O1(K) = w(L->K)O0(L)=3` |

2 | `I2(L) = w(L->K)I1(K)=24M` | `O2(J) = w(K->J)O1(K)=6` |

For any nonterminal $X$, we need to find all its occurances in three levels. When $X = D$,

\begin{align*}c(S \rightarrow AD) &= I(A) O(S) w(S \rightarrow AD) I(D) \\&= I(A) w(S \rightarrow AD) \sum_{j = 0, 1, 2}O_{j}(S) \sum_{i = 0, 1, 2} I_{i}(D) \\\end{align*}\begin{align*}c(J \rightarrow AM) &= w(R \rightarrow J)w(J \rightarrow AM)w(A \rightarrow w_{1})M \\&+ w(R \rightarrow K)w(K \rightarrow J)w(J \rightarrow AM)w(A \rightarrow w_{1})M \\&+ w(R \rightarrow L)w(L \rightarrow K)w(K \rightarrow J)w(J \rightarrow AM)w(A \rightarrow w_{1})M \\&= 12M + 16M + 24M = 52M \\&= \color{red}{ I(A)w(J \rightarrow AM)O(J)M} \\&= \color{red}{ I(A)w(J \rightarrow AM)\sum_{i = 0,1,2}O_{i}(J)M} \\&= 2 \times 2 \times (3 + 4 + 6)M = 52M\end{align*}Let $X = E, Y = D$, we can verify the above formula.

The score of all the parse trees containing the unary rule $E \rightarrow D$ is

\begin{align*}c(E \rightarrow D) &= \sum_{T \sim w_{0}w_{1}w_{2} \text{ and } E \rightarrow D \in T} w(T) \\&= w(A \rightarrow w_{0}) w(S \rightarrow AE) w(E \rightarrow D) w(D \rightarrow BC) w(B \rightarrow w_{1}) w(C \rightarrow w_{2}) \\&+ w(A \rightarrow w_{0}) w(S \rightarrow AE) w(F \rightarrow E) w(E \rightarrow D) w(D \rightarrow BC) w(B \rightarrow w_{1}) w(C \rightarrow w_{2}) \\&+ w(A \rightarrow w_{0}) w(S \rightarrow AE) w(G \rightarrow E) w(E \rightarrow D) w(D \rightarrow BC) w(B \rightarrow w_{1}) w(C \rightarrow w_{2}) \\&= 2 \times 3 \times 3 \times 2 \times 2 \times 1 \\&+ 2 \times 3 \times 2 \times 3 \times 2 \times 2 \times 1 \\&+ 2 \times 2 \times 3 \times 3 \times 2 \times 2 \times 1 \\&= 72 + 144 + 144 \\&= \color{red}{w(E \rightarrow D)(O_{0}(E)I_{0}(D) + O_{1}(E)I_{0}(D))} \\&= 3 \times (6 \times 4 + 24 \times 4) \\&= 72 + 288\end{align*}]]>

**The University of Edinburgh**

My research interest is in structured prediction and latent variable models. Check out my recent work or email me.

I am a Ph.D. student in the ILCC at the University of Edinburgh, under the supervision of Prof. Ivan Titov and Mirella Lapata. I am also a member of the Edinburgh NLP. Before moving to Edinburgh, I spent three wonderful years in the SIST at ShanghaiTech University and worked with Prof. Kewei Tu.

- First EMNLP paper received review scores (4.5/5, 4.5/5, 4.5/5)
*“A clear step forward with novel methods and strong improvement.”*– Anonymous Reviews - First ACL paper received review scores (6/6, 5/6, 5/6)
*“The first general framework of using latent vectors for grammars.”*– Anonymous Reviews

- Visually Grounded Compound PCFGs
**Yanpeng Zhao**and Ivan Titov.

EMNLP 2020. [paper], [code], [reviews].**Best Paper Honorable Mention**[link].

*Preprints*

- An Empirical Study of Compound PCFGs
**Yanpeng Zhao**.

Preprint 2020. [paper], [code]. - Unsupervised Transfer of Semantic Role Models from Verbal to Nominal Domain
**Yanpeng Zhao**and Ivan Titov.

Preprint 2020. [paper], [code].

*Before 2018*

- Gaussian Mixture Latent Vector Grammars
**Yanpeng Zhao**, Liwen Zhang, Kewei Tu.

ACL 2018 (oral). [paper], [code], [reviews].**Nominated for the Best Paper Award.** - Structured Attentions for Visual Question Answering

Chen Zhu,**Yanpeng Zhao**, Shuaiyi Huang, Kewei Tu, and Yi Ma.

ICCV 2017 (poster). [paper], [code], [reviews]. - Learning Bayesian Network Structures Under Incremental Construction Curricula
**Yanpeng Zhao**, Yetian Chen, Kewei Tu, and Jin Tian.

Neurocomputing 2017. [paper]. - Sequence Prediction Using Neural Network Classifiers
**Yanpeng Zhao**, Shanbo Chu, Yang Zhou, and Kewei Tu.

ICGI 2016 (workshop track). [paper], [slides]. - Curriculum Learning of Bayesian Network Structures
**Yanpeng Zhao**, Yetian Chen, Kewei Tu, and Jin Tian.

ACML 2015 (oral). [paper], [code], [poster], [reviews]. - Language Style Transfer from Non-Parallel Text with Arbitrary Styles
**Yanpeng Zhao**, Victoria W. Bi, Deng Cai, Xiaojiang Liu, Kewei Tu, Shuming Shi.

Technical report, 2018. [paper].

- The avatar photo was taken during my road bike trip around the Qinghai Lake in the summer of 2012

Weka Wiki, Codes of BNSL in Weka lie in `weka.classifiers.bayes.net.search`

. You can refer to to be added about the usage of Weka BNSL.

Developed by Kevin Murphy, Refer to How to use the Bayes Net Toolbox, Codes of BNSL lie in *SLP*. By the way, SLP is separately developed based on BNT, and it has been already included in BNT. Another package developed based on BNT is Mateda2.0. Kevin Murphy has also developed BDAGL:Bayesian DAG learning, but I never tried it.

Source codes of the algorithm described in this paper are not provided (only *.p* files of the Matlab are available). My implementation of this algorithm in java can be accessed from MMPC. Here is MMHC paper home.

I tried it for interchanging file formats of BN. The Interchange Format for Bayesian Networks is to summarize and distribute information on an effort to standardize formats for representation of Bayesian networks and other related graphical models, but it seems to have stalled.

If you wanna get ideas of the implementation of Sparse Candidate, you can look through this one. But the documentation is not that good. E.g., Dev Docs doesn’t give the clear update infos.

Simple and clear, could be used as the reference in developments.

The project is hosted on Bitbucket.

Leaded by Peter Spirtes. Here is the induction from it’s home page

…is to develop, analyze, implement, test and apply practical, provably correct computer programs for inferring causal structure under conditions where this is possible.

Some classical algorithms such as PC, K2 are included.

It perfectly explains OOP (Object Oriented Programming) in matlab.

I have never tried them. Here are some other summaries of software packages of PGM: Graphical Models Software Tools, Bayes Nets.

BNLearn and Software Packages for Graphical Models and GalElidan.

]]>`_`

in Markdown conflicts with that in Latex, here is the solution to it:Hexo uses Nunjucks to render posts (Swig was used in older version, which share a similar syntax). Content wrapped with

`{{ }}`

or`{% %}`

will get parsed and may cause problems. You can wrap sensitive content with the raw tag plugin.

1 | {% raw %} |

Results

$$e^{i \theta_{t}} = \cos \theta_{t} + i \sin \theta_{t}$$Ref: https://hexo.io/docs/troubleshooting.html#Escape-Contents

]]>