Lexicrunch : an expert system for word morphology
Golding, Andrew Robert
Natural language programs typically store words like pig and pigs as independent entries in their dictionaries, thus neglecting the obvious morphological relationship between them. Lexicrunch tries to induce such relationships from examples of root forms of words and the corresponding inflected forms. The program collates the examples into classes according to the difference between the inflected form and its root -- e.g. the classes for the plural noun inflection in English might include "root forms to which an -s is added" (pig, apple, etc.) and "root forms which take -es" (fox, box, etc.). It then characterizes each class using a modified version of Quinlan's ID3 procedure. The resulting rule will be along the lines of, "If a noun ends in -x, form its plural by adding -es; otherwise, add -s." The program then needs to store only root forms in its dictionary; it can reconstruct plurals on demand by applying its rule. It thereby eliminates redundancy and compacts the lexicon. Lexicrunch's formalism for representing morphological rules was influenced by the two-level model of Koskenniemi. The program was tested on the past tense inflection in English, the first person singular present indicative of Finnish, and the past participle in French. It appeared to pick up most of the regularities in the data successfully. However, a meta-level extension to the program is indicated to enable it to capture regularities across its rules.
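The collation step described in the abstract can be illustrated with a minimal Python sketch. It is not Lexicrunch's implementation: the function names (suffix_change, collate, pluralize) are hypothetical, the difference between root and inflected form is assumed to be a simple suffix change, and the final rule is written by hand from the abstract's own example rather than induced by ID3.

```python
from collections import defaultdict


def suffix_change(root, inflected):
    """Return (removed, added): the ending dropped from the root and the
    ending added, found by stripping the longest common prefix.
    Assumption: the inflection only affects the end of the word."""
    i = 0
    while i < min(len(root), len(inflected)) and root[i] == inflected[i]:
        i += 1
    return root[i:], inflected[i:]


def collate(pairs):
    """Group root forms into classes keyed by their suffix change."""
    classes = defaultdict(list)
    for root, inflected in pairs:
        classes[suffix_change(root, inflected)].append(root)
    return dict(classes)


def pluralize(root):
    """Hand-written stand-in for the rule quoted in the abstract:
    nouns ending in -x take -es; all others take -s."""
    return root + ("es" if root.endswith("x") else "s")


examples = [("pig", "pigs"), ("apple", "apples"),
            ("fox", "foxes"), ("box", "boxes")]
classes = collate(examples)

# Only root forms need to be stored; plurals are rebuilt on demand.
lexicon = [root for root, _ in examples]
plurals = [pluralize(root) for root in lexicon]
```

Here collate yields two classes, one for roots that take -s and one for roots that take -es, and pluralize reconstructs every plural in the example set from the stored roots alone.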