treetrans: Tool for tree transformation
This is a tool for the conversion of Penn Treebank-style trees using
pattern rules.
treetrans [options] rule_module input_file output_database
  rule_module     | lilfes program in which pattern rules are implemented
  input_file      | Input treebank (Penn Treebank style)
  output_database | Output treebank (lildb format)
Options
  -v  | print debug messages
  -vv | print more verbose debug messages
This tool reads Penn Treebank-style trees from a text file, applies
tree conversion rules to each input tree, and writes the results to
a lildb-style database. Pattern rules are implemented as lilfes
programs that use the interfaces defined in "treetrans.lil". Parse trees are
represented as feature structures of the types defined in "treetypes.lil". For
example, the following pattern rule converts a tree like
"(... than/IN XXX)" into "(... (PP than/IN XXX:argument))".
    tree_transform_class("than", "topdown", "weak").
    tree_subst_pattern("than",
                       TREE_NODE\$Node & TREE_DTRS\$Dtrs,
                       TREE_NODE\$Node & TREE_DTRS\$NewDtrs) :-
        $Dtrs = [$Left & tree_any & ANY_TREES\[_|_],
                 $Than & tree & TREE_NODE\(SYM\"IN" & WORD\SURFACE\"than"),
                 $Right & tree & TREE_NODE\HEAD_MARK\argument],
        $NewDtrs = [$Left,
                    TREE_NODE\(SYM\"PP" & MOD\[] & ID\[] & HEAD_MARK\modifier) &
                    TREE_DTRS\[$Than, $Right]].
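To make the effect of this rule concrete, here is a minimal Python sketch of the same rewrite on a toy tree representation. It is not LiLFeS and not part of treetrans: the `Node` class, its `head_mark` field, and the function `wrap_than` are hypothetical names used only for illustration. The sketch assumes, as the `[_|_]` pattern in the rule suggests, that at least one daughter must precede "than".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy stand-in for a parse tree node (hypothetical, not the treetypes.lil types)."""
    sym: str
    word: Optional[str] = None
    head_mark: Optional[str] = None
    dtrs: List["Node"] = field(default_factory=list)


def wrap_than(node: Node) -> Optional[Node]:
    """Rewrite (... than/IN XXX:argument) into (... (PP than/IN XXX:argument)).

    Returns None when the pattern does not match.
    """
    if len(node.dtrs) < 3:          # need a non-empty left context, "than", and XXX
        return None
    *left, than, right = node.dtrs
    if than.sym != "IN" or than.word != "than":
        return None
    if right.head_mark != "argument":
        return None
    # The new PP node is marked as a modifier and dominates "than" and XXX.
    pp = Node(sym="PP", head_mark="modifier", dtrs=[than, right])
    return Node(sym=node.sym, word=node.word,
                head_mark=node.head_mark, dtrs=left + [pp])


tree = Node("ADJP", dtrs=[
    Node("JJR", word="bigger"),
    Node("IN", word="than"),
    Node("NP", head_mark="argument", dtrs=[Node("NN", word="expected")]),
])
result = wrap_than(tree)
print([d.sym for d in result.dtrs])  # ['JJR', 'PP']
```

The mother node is kept unchanged and only its daughter list is replaced, which mirrors how the rule reuses `$Node` on both sides.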
How to write tree conversion rules
First, write "tree_transform_class/3" to specify the
name of a conversion rule, the order in which the rule is applied, and the
behavior when rule application fails.
tree_transform_class(+$Name, +$Direction, +$Strict)
  +$Name      | The name of the conversion rule
  +$Direction | The order in which the rule is applied
                - "topdown": from the root to the leaves
                - "bottomup": from the leaves to the root
                - "rootonly": only to the root of the tree
  +$Strict    | The behavior when rule application fails
                - "strict": fail the conversion of the whole tree
                - "weak": ignore the failure of this rule
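The direction and strictness settings can be illustrated with a small Python model. This is only a sketch of the behavior described above, not the treetrans implementation: the tuple-based tree, `apply_rule`, and `RuleFailed` are hypothetical, and it assumes that a rule failing at any visited node counts as a failure of that rule.

```python
from typing import Callable, List, Optional, Tuple

# Toy tree: (label, [child trees]); a leaf has an empty child list.
Tree = Tuple[str, list]
Rule = Callable[[Tree], Optional[Tree]]  # returns None when the rule fails


class RuleFailed(Exception):
    """A "strict" rule failed, so the whole tree is rejected."""


def apply_rule(tree: Tree, rule: Rule, direction: str, strict: str) -> Tree:
    def visit(node: Tree) -> Tree:
        out = rule(node)
        if out is not None:
            return out
        if strict == "strict":
            raise RuleFailed(node[0])
        return node  # "weak": keep the node as it was

    def walk(node: Tree) -> Tree:
        if direction == "topdown":
            label, dtrs = visit(node)              # mother first, then daughters
            return (label, [walk(d) for d in dtrs])
        label, dtrs = node                         # "bottomup": daughters first
        return visit((label, [walk(d) for d in dtrs]))

    if direction not in ("topdown", "bottomup", "rootonly"):
        raise ValueError(direction)
    return visit(tree) if direction == "rootonly" else walk(tree)


tree = ("S", [("NP", [("DT", []), ("NN", [])]), ("VP", [("VBZ", [])])])

visited: List[str] = []


def log_rule(node: Tree) -> Tree:
    """Always succeeds; records the order in which nodes are visited."""
    visited.append(node[0])
    return node


def fails_on_vp(node: Tree) -> Optional[Tree]:
    """Fails on VP nodes and succeeds (without changes) elsewhere."""
    return None if node[0] == "VP" else node


apply_rule(tree, log_rule, "topdown", "weak")
topdown = visited[:]    # ['S', 'NP', 'DT', 'NN', 'VP', 'VBZ']
visited.clear()
apply_rule(tree, log_rule, "bottomup", "weak")
bottomup = visited[:]   # ['DT', 'NN', 'NP', 'VBZ', 'VP', 'S']
```

With `fails_on_vp`, the "weak" setting returns the tree unchanged, while "strict" raises `RuleFailed` in this model, standing in for the failure of the whole tree's conversion.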
Next, write conversion rules with the following interfaces. In all
the interfaces, the first argument is the name of a rule that has been
specified in "tree_transform_class/3".
The treetrans tool traverses each node in a parse tree and
applies conversion rules in the order in which their
"tree_transform_class/3" declarations appear in the program file.
tree_ignore(+$Name, ?$Tree)
  +$Name | rule name
  ?$Tree | tree: parse tree
  Remove a subtree that is unifiable with $Tree.
tree_transform_rule(+$Name, +$InTree, -$OutTree)
  +$Name    | rule name
  +$InTree  | tree: input parse tree
  -$OutTree | tree: output parse tree
  Convert $InTree into $OutTree.
tree_subst_pattern(+$Name, +$InPattern, +$OutPattern)
  +$Name       | rule name
  +$InPattern  | tree: pattern of an input tree
  +$OutPattern | tree: pattern of an output tree
  Convert a parse tree that matches $InPattern (using
  "tree_match/2") into $OutPattern.
tree_unify(+$Name, ?$Tree)
  +$Name | rule name
  ?$Tree | tree: parse tree
  Unify $Tree with the target tree.
tree_match_pattern(+$Name, +$Pattern)
  +$Name    | rule name
  +$Pattern | tree: pattern on a parse tree
  Unify $Pattern with the target tree using
  "tree_match/2".
Conversion rules are applied in the following order: tree_ignore/2,
tree_transform_rule/3, tree_subst_pattern/3, tree_unify/2, and
tree_match_pattern/2.
Additionally, the following interfaces may be used for formatting an
input tree before applying conversion rules.
delete_tree(+$Tree)
  +$Tree | tree: parse tree
  Remove a subtree that is unifiable with $Tree.
nonterminal_mapping(+$InSym, -$OutSym)
  +$InSym  | nonterminal symbol of an input tree
  -$OutSym | nonterminal symbol of an output tree
  Convert nonterminal symbol $InSym into $OutSym.
preterminal_mapping(+$InSurface, +$InSym, -$OutSurface, -$OutSym)
  +$InSurface  | input word (surface form)
  +$InSym      | input nonterminal symbol
  -$OutSurface | output word (surface form)
  -$OutSym     | output nonterminal symbol
  Convert a word, $InSurface/$InSym, into $OutSurface/$OutSym.
preterminal_projection(+$InSym, -$NewSym)
  +$InSym  | preterminal symbol
  -$NewSym | nonterminal symbol
  Insert the nonterminal symbol $NewSym as the mother of
  the preterminal $InSym.
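The four formatting interfaces above can be pictured with the following Python sketch. It only models their described effects on a toy tree and is not how treetrans implements them. The tables `DELETE_SYMS`, `NONTERMINAL_MAP`, `PRETERMINAL_MAP`, and `PROJECTION`, their example entries, and the order in which the steps are combined are all assumptions made for illustration; the manual does not state that order.

```python
from typing import Optional, Tuple, Union

# Toy tree: a preterminal is (symbol, word); a nonterminal is (symbol, [children]).
Tree = Tuple[str, Union[str, list]]

# Hypothetical example tables, one per interface.
DELETE_SYMS = {"-NONE-"}                            # delete_tree
NONTERMINAL_MAP = {"PRT": "ADVP"}                   # nonterminal_mapping
PRETERMINAL_MAP = {("n't", "RB"): ("not", "RB")}    # preterminal_mapping
PROJECTION = {"PRP": "NP"}                          # preterminal_projection


def preprocess(tree: Tree) -> Optional[Tree]:
    """Return the formatted tree, or None if the subtree is deleted."""
    sym, body = tree
    if sym in DELETE_SYMS:
        return None
    if isinstance(body, str):
        # Preterminal: map word/symbol, then optionally insert a mother node.
        word, sym = PRETERMINAL_MAP.get((body, sym), (body, sym))
        leaf = (sym, word)
        mother = PROJECTION.get(sym)
        return (mother, [leaf]) if mother else leaf
    # Nonterminal: format the children, dropping deleted ones, then rename.
    # (A nonterminal whose children are all deleted is kept as-is in this sketch.)
    children = [c for c in (preprocess(d) for d in body) if c is not None]
    return (NONTERMINAL_MAP.get(sym, sym), children)


tree = ("S", [
    ("PRP", "it"),
    ("VP", [("VBZ", "does"), ("RB", "n't"),
            ("PRT", [("RP", "up")]), ("-NONE-", "*")]),
])
formatted = preprocess(tree)
# formatted ==
# ("S", [("NP", [("PRP", "it")]),
#        ("VP", [("VBZ", "does"), ("RB", "not"), ("ADVP", [("RP", "up")])])])
```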
See the developers' manual of "treetrans.lil" for details. In
conversion rules, you can use several tools, such as "tree_binarize/2"
(implemented in "binarizer.lil"
to binarize a tree) and "mark_head/1" and "mark_modifier/1" (defined
in "markhead.lil" to annotate
head/modifier/argument marks).
MIYAO Yusuke (yusuke@is.s.u-tokyo.ac.jp)