Structure of the Enju system



Structure of Sign

The grammar of Enju is based on the theory of Head-driven Phrase Structure Grammar (HPSG). In HPSG, constraints on the structure of a language are represented with typed feature structures. The LiLFeS Home Page gives a brief introduction to typed feature structures.
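As a rough illustration of how feature structures combine constraints, the following Python sketch unifies two untyped feature structures written as nested dicts. It is a simplification made for this page only: it ignores the type hierarchy and structure sharing that typed feature structures have, and the function name unify and the features POS and AGR are hypothetical, not taken from Enju or LiLFeS.

```python
def unify(a, b):
    """Unify two untyped feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None when two atomic values clash.
    Assumption: no types and no structure sharing, unlike real typed
    feature structures.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # clash somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values unify only when they are equal.
    return a if a == b else None


# Compatible constraints are merged into one structure.
compatible = unify({"HEAD": {"POS": "noun"}}, {"HEAD": {"AGR": "3sg"}})

# Conflicting constraints fail to unify.
clash = unify({"HEAD": {"POS": "noun"}}, {"HEAD": {"POS": "verb"}})
```

Here compatible carries both POS and AGR under HEAD, while clash is None because the two POS values differ.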

One of the characteristics of HPSG is that most of the constraints on syntax and semantics are represented in lexical entries. Only a few grammar rules (corresponding to CFG rules) are defined, and they represent general constraints that do not depend on specific words. This is because the constraints on the structure of a sentence are mostly introduced by words.

Syntactic/semantic constraints of words/phrases are represented in a data structure called a sign. In the current implementation of Enju, the structure of the sign basically follows [Pollard and Sag, 1994] and the LinGO English Resource Grammar (ERG), while the type hierarchy is much simplified and modified so that it uses neither complex constraints nor Minimal Recursion Semantics (MRS).

PHON: A sequence of words governed by the phrase
SYNSEM
  LOCAL
    CAT
      HEAD: Constraints inherited from the head daughter
        MOD: Constraints of a modifying phrase
        POSTHEAD: Whether this phrase modifies a preceding phrase
      VAL: Subcategorization frame
        SUBJ: Constraints of left phrases that will be subcategorized
        COMPS: Constraints of right phrases that will be subcategorized
        SPR: Constraints of specifiers that will be subcategorized
        SPEC: Constraints of a specifying phrase
        CONJ: Constraints of conjuncts
    CONT: Predicate-argument structure
    CONX: Currently not used
  NONLOCAL: Constraints of long-distance dependency
    INHER: Constraints inherited from the daughters
      QUE: Currently not used
      REL: Constraints of an antecedent phrase of a relative clause
      SLASH: Constraints of a phrase in long-distance dependencies
      F_REL: Currently not used
    TO_BIND: Constraints bound in this phrase
      QUE: Currently not used
      REL: Constraints of an antecedent phrase of a relative clause
      SLASH: Constraints of a phrase in long-distance dependencies
      F_REL: Currently not used

Constraints of phrases include various syntactic features (part-of-speech, agreement, tense, etc.).

The CONT feature holds the predicate-argument structure of the phrase. Predicate-argument structures represent logical subject/object relations and modification relations. The CONT feature of the sign at the top node gives the predicate-argument structure of the whole sentence.

Types and features used in the Enju grammar are all defined in "enju/types.lil". For details, see the source file.
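To make the nesting of the sign concrete, here is a small Python sketch that writes a sign as a nested dict and reads a value by its feature path. The feature names follow the outline in this section; everything else is an assumption made for illustration: the values are placeholder strings, the labels PRED, ARG1 and ARG2 inside CONT are hypothetical, and the helper get_path does not exist in Enju. The real definitions are LiLFeS types in "enju/types.lil".

```python
# Illustrative only: a sign for a transitive verb as a nested dict.
# Values are placeholders, not Enju's actual types.
sign = {
    "PHON": ["loves"],
    "SYNSEM": {
        "LOCAL": {
            "CAT": {
                "HEAD": {"MOD": [], "POSTHEAD": False},
                "VAL": {
                    "SUBJ": ["NP"],
                    "COMPS": ["NP"],
                    "SPR": [],
                    "SPEC": [],
                    "CONJ": [],
                },
            },
            # Hypothetical labels for a predicate-argument structure.
            "CONT": {"PRED": "love", "ARG1": "x", "ARG2": "y"},
        },
        "NONLOCAL": {
            "INHER": {"REL": [], "SLASH": []},
            "TO_BIND": {"REL": [], "SLASH": []},
        },
    },
}


def get_path(fs, path):
    """Follow a dot-separated feature path such as 'SYNSEM.LOCAL.CONT'."""
    for feature in path.split("."):
        fs = fs[feature]
    return fs


comps = get_path(sign, "SYNSEM.LOCAL.CAT.VAL.COMPS")
cont = get_path(sign, "SYNSEM.LOCAL.CONT")
```

Reading SYNSEM.LOCAL.CAT.VAL.COMPS returns the list of complements the word still expects, and SYNSEM.LOCAL.CONT returns its predicate-argument structure.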


System architecture

The Enju system uses UP (included in the MAYZ package), a general-purpose parser for unification grammars. UP parses a sentence with provided lexical entries and grammar rules. Enju creates the data passed to UP in the following way.

[Figure: Architecture of the grammar]

An input sentence is passed to sentence_to_word_lattice/2, which converts it to a word lattice, i.e., a list of extents (pairs of a word position and word information). To do so, sentence_to_word_lattice/2 first applies a POS tagger to the input sentence (external_tagger/2), then splits the result into words and applies stemming to each word. The resulting word lattice is passed to UP, and parsing starts. sentence_to_word_lattice/2 is implemented in "enju/grammar.lil". The default POS tagger is "uptagger"; it can be changed with the "-t" option of "enju" (or with initialize_external_tagger/2).
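The conversion from a sentence to a word lattice can be pictured with the following Python sketch. It is not the LiLFeS implementation: toy_tagger and toy_stem are hypothetical stand-ins for the external tagger and the stemmer, and the shape of an extent (left and right positions plus a word record with surface, pos and base fields) is an assumption made for illustration.

```python
def toy_tagger(sentence):
    """Hypothetical stand-in for the external POS tagger.

    Splits on whitespace and returns (word, POS) pairs from a tiny table.
    """
    toy_pos = {"dogs": "NNS", "bark": "VBP"}
    return [(w, toy_pos.get(w.lower(), "NN")) for w in sentence.split()]


def toy_stem(word, pos):
    """Hypothetical stand-in for stemming: strip a plural 's' from NNS."""
    w = word.lower()
    if pos == "NNS" and w.endswith("s"):
        return w[:-1]
    return w


def sentence_to_word_lattice(sentence):
    """Tag, split and stem a sentence, then build a list of extents."""
    lattice = []
    for i, (word, pos) in enumerate(toy_tagger(sentence)):
        lattice.append({
            "left": i,        # position before the word
            "right": i + 1,   # position after the word
            "word": {"surface": word, "pos": pos, "base": toy_stem(word, pos)},
        })
    return lattice


lattice = sentence_to_word_lattice("Dogs bark")
```

Each element of lattice covers one word of the input and carries the information a parser would need to look up lexical entries for it.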

The predicates lexical_entry/2 and lexical_entry_sign/2 make lexical entries by using two databases. One maps a word/POS pair to a list of the names of the lexical entry templates assigned to the word (lookup_lexicon/2). The other maps the name of a template to the feature structure of the template (lookup_template/2). Lexical entries are constructed by adding word-specific information (e.g., the PHON feature) to lexical entry templates. These predicates are implemented in "enju/grammar.lil".
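The two-step lookup described above can be sketched in Python as follows. The function names mirror lookup_lexicon/2 and lookup_template/2, but their Python signatures, the template names, and the contents of both tables are assumptions made for illustration; the real predicates are written in LiLFeS and operate on typed feature structures.

```python
import copy

# Hypothetical database 1: word/POS pair -> names of lexical entry templates.
LEXICON = {
    ("dog", "NN"): ["noun_template"],
    ("bark", "VBP"): ["intrans_verb_template", "trans_verb_template"],
}

# Hypothetical database 2: template name -> feature structure of the template.
TEMPLATES = {
    "noun_template": {"HEAD": "noun", "SUBJ": [], "COMPS": []},
    "intrans_verb_template": {"HEAD": "verb", "SUBJ": ["noun"], "COMPS": []},
    "trans_verb_template": {"HEAD": "verb", "SUBJ": ["noun"], "COMPS": ["noun"]},
}


def lookup_lexicon(word_pos):
    """Return the template names assigned to a (base form, POS) pair."""
    return LEXICON.get(word_pos, [])


def lookup_template(name):
    """Return the feature structure stored for a template name."""
    return TEMPLATES[name]


def lexical_entries(base, pos, surface):
    """Build one lexical entry per template by adding word-specific PHON."""
    entries = []
    for name in lookup_lexicon((base, pos)):
        entry = copy.deepcopy(lookup_template(name))  # keep the template intact
        entry["PHON"] = [surface]
        entries.append(entry)
    return entries


entries = lexical_entries("bark", "VBP", "bark")
```

Because one word can be assigned several templates, a single word may yield several lexical entries; the templates themselves are shared and left unmodified.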

Grammar rules (i.e., schemas) are implemented in "enju/schema.lil". They define phrase structure rules to make a mother from its daughters.
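As a toy picture of what a schema does, the Python sketch below builds a mother from a head daughter and a complement daughter. It is not one of the schemas in "enju/schema.lil": the function head_complement_schema, the flat sign layout, and the use of plain string equality in place of unification are all simplifying assumptions.

```python
def head_complement_schema(head_dtr, comp_dtr):
    """Toy rule: combine a head with the first complement it expects.

    Returns the mother sign, or None when the rule does not apply.
    Assumption: signs are flat dicts and matching is string equality.
    """
    comps = head_dtr["COMPS"]
    if not comps or comps[0] != comp_dtr["HEAD"]:
        return None  # the head expects no complement, or a different one
    if comp_dtr["COMPS"]:
        return None  # the complement itself must be saturated
    return {
        "PHON": head_dtr["PHON"] + comp_dtr["PHON"],
        "HEAD": head_dtr["HEAD"],   # inherited from the head daughter
        "SUBJ": head_dtr["SUBJ"],
        "COMPS": comps[1:],         # one complement has been consumed
    }


verb = {"PHON": ["chase"], "HEAD": "verb", "SUBJ": ["noun"], "COMPS": ["noun"]}
noun = {"PHON": ["cats"], "HEAD": "noun", "SUBJ": [], "COMPS": []}

vp = head_complement_schema(verb, noun)
```

The mother keeps the HEAD value of the head daughter and has a shorter COMPS list, which reflects the idea that a rule states a general constraint while the word-specific requirements come from the lexical entry.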


MIYAO Yusuke (yusuke@is.s.u-tokyo.ac.jp)