In this talk I will present a discriminative learning model based on
the perceptron algorithm and dynamic programming search for predicting the syntactic
structure of natural language sentences. I will describe the general
form of discriminative methods for predicting structures, and I will
argue that syntactic analysis poses several important challenges for
this class of learning algorithms. I will focus on aspects related to
representation of syntactic structures and efficiency of parsing
algorithms, which are critical for training discriminative models.
I will present a model based on Tree Adjoining Grammars that can
represent syntactic structures with a very rich set of features and is
also efficient. Key to the efficiency of our approach is the use of
lower-order models to prune the space of full parse structures, a
strategy that performs remarkably well in practice. Experiments on the
Penn WSJ Treebank, the de facto standard dataset, show that our model
obtains state-of-the-art results.
Joint work with Michael Collins and Terry Koo.
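As background, the structured-perceptron training loop the abstract alludes to can be sketched as follows. This is a toy illustration on sequence labeling rather than full parsing, using exhaustive search in place of dynamic programming; all names (`phi`, `decode`, `train`, the tag set) are hypothetical and not taken from the talk.

```python
# Toy sketch of structured-perceptron learning (in the spirit of
# Collins-style discriminative training): decode with the current
# weights, and on a mistake add the gold features and subtract the
# predicted features. Illustrative only; a real parser would use a
# dynamic-programming decoder over parse structures.
from collections import defaultdict
from itertools import product

def phi(x, y):
    """Feature map: counts of (token, tag) and (prev-tag, tag) events."""
    feats = defaultdict(int)
    prev = "<s>"
    for tok, tag in zip(x, y):
        feats[("emit", tok, tag)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    return feats

def score(w, feats):
    """Linear model: dot product of weights and feature counts."""
    return sum(w[f] * c for f, c in feats.items())

def decode(w, x, tags):
    """Argmax over all tag sequences (exhaustive here; DP in practice)."""
    best, best_s = None, float("-inf")
    for y in product(tags, repeat=len(x)):
        s = score(w, phi(x, y))
        if s > best_s:
            best, best_s = y, s
    return list(best)

def train(data, tags, epochs=5):
    """Structured-perceptron updates: w += phi(gold) - phi(prediction)."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, y in data:
            y_hat = decode(w, x, tags)
            if y_hat != y:
                for f, c in phi(x, y).items():
                    w[f] += c
                for f, c in phi(x, y_hat).items():
                    w[f] -= c
    return w
```

The key design point carried over to parsing is that `decode` is the only search-dependent component: making training tractable reduces to making the argmax efficient, which is where the grammar representation and pruning with lower-order models come in.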
