Alternative Phrases: Theoretical Analysis and Practical Application
"All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, the fresh-water system and public health, what have the Romans ever done for us?" (Monty Python, The Life of Brian)
Alternative phrases identify selected elements from a set and subject them to particular scrutiny with respect to the sentence's predicate. For instance, in the above example, sanitation, medicine, etc. are all identified as elements in the set of "things the Romans have done for us" that should not be included in the response to the question. They are alternative responses to the desired ones. Alternative phrases come in a variety of constructions and perform a variety of tasks: excluding elements (apart from), expressing preference for particular elements (especially), and simply identifying representative examples (such as). Little work has been done on alternative phrases in general. Hearst (1992) used a pattern-matching analysis of certain alternative phrases to learn hyponyms from unannotated corpora. In addition, a few examples from a subset of alternative phrases, called exceptive phrases, have been studied, most recently by von Fintel (1993) and Hoeksema (1995). But not all constructions are amenable to pattern-matching techniques, and the work on exceptive phrases focuses on some very specific semantic points. The first goal of this thesis is to present a general program for analyzing a wide variety of alternative phrases, including their presuppositional and anaphoric properties. I perform my analyses in Combinatory Categorial Grammar, a lexicalized formalism. The semantic aspects of the analysis benefit greatly from the concept of alternative sets, sets of propositions that differ in one or more arguments (Karttunen and Peters, 1979; Rooth, 1985, 1992; Prevost and Steedman, 1994; Steedman, 2000a).
In addition, elegant solutions are made possible by separating the semantics into assertion and presupposition (Stalnaker, 1974; Karttunen and Peters, 1979; Stone and Doran, 1997; Stone and Webber, 1998; Webber et al., 1999b), with each performing quite different tasks.
My second goal is to demonstrate the practicality and importance of this analysis to real systems. Although it is relevant to many practical applications, I will focus primarily on natural language information retrieval (NLIR) as a case study. In such a domain, queries like "Where can I find other web browsers than Netscape for download?" and "Where can I find shoes made by Buffalino, such as the Bushwackers?" are often observed. I review several techniques for NLIR and demonstrate that implementations of those techniques perform poorly on such queries. I show that understanding alternative phrases can enable simple techniques which greatly improve precision.
To bridge the gap between these goals, I present Grok, a modular natural language system. Several general NLP issues necessary to support my linguistic analysis are discussed: anaphora resolution, processing of presuppositions, interface to knowledge representation, and the creation of a wide-coverage lexicon. Special attention is paid to the lexicon, which combines a hand-built lexicon with an acquired one.
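As a rough illustration of why the two example queries are hard for keyword-based retrieval, the following Python sketch shows one way an exclusion phrase ("other ... than X") and an example phrase ("such as X") might change how candidate documents are treated. It is not Grok and it does not use Combinatory Categorial Grammar; the regular expressions, the function names analyze and rerank, and the sample documents are all assumptions made for illustration. It also relies on surface pattern matching, which, as noted above, does not cover all constructions.

```python
import re

# Hypothetical surface patterns; a real analysis would use a grammar.
# "other <category> than <Element>" marks an element to exclude.
EXCLUDE = re.compile(r"\bother\b.*?\bthan\s+(?:the\s+)?([A-Z]\w*)")
# "such as <Element>" marks a representative example to prefer.
EXAMPLE = re.compile(r"\bsuch as\s+(?:the\s+)?([A-Z]\w*)")


def analyze(query):
    """Return (kind, element) for the first alternative phrase found."""
    match = EXCLUDE.search(query)
    if match:
        return ("exclude", match.group(1))
    match = EXAMPLE.search(query)
    if match:
        return ("example", match.group(1))
    return (None, None)


def rerank(query, docs):
    """Filter or reorder documents according to the alternative phrase."""
    kind, element = analyze(query)
    if kind == "exclude":
        # Drop documents that mention the excluded element.
        return [d for d in docs if element.lower() not in d.lower()]
    if kind == "example":
        # Stable sort: documents mentioning the example come first.
        return sorted(docs, key=lambda d: element.lower() not in d.lower())
    return list(docs)


# Made-up document titles, for illustration only.
BROWSER_DOCS = [
    "Download Netscape Navigator",
    "Download the Opera browser",
    "Lynx text browser download",
]

if __name__ == "__main__":
    query = "Where can I find other web browsers than Netscape for download?"
    for title in rerank(query, BROWSER_DOCS):
        print(title)
```

A plain keyword match would rank the Netscape page highly for the first query because the word "Netscape" appears in it, which is the opposite of what the user asked for. The sketch only handles capitalized single-word elements and ignores the presupposition that the named element belongs to the category.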