
dc.contributor.advisor: Lapata, Mirella
dc.contributor.advisor: Lavrenko, Victor
dc.contributor.author: Mitchell, Jeffrey John
dc.date.accessioned: 2011-06-22T09:29:29Z
dc.date.available: 2011-06-22T09:29:29Z
dc.date.issued: 2011-06-30
dc.identifier.uri: http://hdl.handle.net/1842/4927
dc.description.abstract: Distributional models of semantics have proven invaluable both in the cognitive modelling of semantic phenomena and in practical applications. For example, they have been used to model judgements of semantic similarity (McDonald, 2000) and association (Denhière and Lemaire, 2004; Griffiths et al., 2007) and have been shown to achieve human-level performance on synonymy tests (Landauer and Dumais, 1997; Griffiths et al., 2007) such as those included in the Test of English as a Foreign Language (TOEFL). This ability has been put to practical use in automatic thesaurus extraction (Grefenstette, 1994). However, while a considerable amount of research has been directed at the most effective ways of constructing representations for individual words, the representation of larger constructions, e.g., phrases and sentences, has received relatively little attention.

In this thesis we examine the issue of how to compose meanings within distributional models of semantics to form representations of multi-word structures. Natural language data typically consists of such complex structures, rather than just individual isolated words. Thus, a model of composition, in which individual word meanings are combined into phrases and phrases combine to form sentences, is of central importance in modelling this data. Commonly, however, distributional representations are combined by simple addition (Landauer and Dumais, 1997; Foltz et al., 1998), without any empirical evaluation of alternative choices. Constructing effective distributional representations of phrases and sentences requires both a theoretical foundation to direct the development of models of composition and a means of empirically evaluating those models. The approach we take is to first consider the general properties of semantic composition and, from that basis, define a comprehensive framework in which to consider the composition of distributional representations. The framework subsumes existing proposals, such as addition and tensor products, but also allows us to define novel composition functions.

We then show that the effectiveness of these models can be evaluated on three empirical tasks. The first involves modelling similarity judgements for short phrases gathered in human experiments. Distributional representations of individual words are commonly evaluated on their ability to model semantic similarity relations, e.g., synonymy or priming; thus, it seems appropriate to evaluate phrase representations in a similar manner. We then apply compositional models to language modelling, demonstrating that the issue of composition has practical consequences, and also providing an evaluation based on large amounts of natural data. In our third task, we use these language models in an analysis of reading times from an eye-movement study. This allows us to investigate the relationship between the composition of distributional representations and the processes involved in comprehending phrases and sentences. We find that these tasks do indeed allow us to evaluate and differentiate the proposed composition functions, and that the results show reasonable consistency across tasks. In particular, a simple multiplicative model is best for a semantic space based on word co-occurrence, whereas an additive model is better for the topic-based model we consider. More generally, employing compositional models to construct representations of multi-word structures typically yields improvements in performance over non-compositional models, which represent only individual words.
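As an aside for readers unfamiliar with these models: the abstract names three composition functions (addition, component-wise multiplication, and the tensor product). The following is a minimal illustrative sketch in Python/NumPy, assuming toy four-dimensional distributional vectors; it is not code from the thesis, and the vector values are made up.

    # Illustrative sketch of the composition functions named in the abstract;
    # not code from the thesis. The toy vectors below are hypothetical.
    import numpy as np

    def additive(u, v):
        # Additive composition: p = u + v
        return u + v

    def multiplicative(u, v):
        # Component-wise multiplicative composition: p_i = u_i * v_i
        return u * v

    def tensor_product(u, v):
        # Tensor-product composition: p_ij = u_i * v_j (dimensionality grows to n x n)
        return np.outer(u, v)

    # Toy distributional vectors for the two words of a phrase.
    u = np.array([0.2, 0.9, 0.1, 0.4])
    v = np.array([0.5, 0.3, 0.8, 0.1])

    print(additive(u, v))              # [0.7 1.2 0.9 0.5]
    print(multiplicative(u, v))        # [0.1 0.27 0.08 0.04]
    print(tensor_product(u, v).shape)  # (4, 4)

Note how the multiplicative model keeps only dimensions on which both words score highly, while addition accumulates mass from either word and the tensor product expands the dimensionality; this difference is what the thesis's three evaluation tasks are designed to tease apart.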
dc.contributor.sponsor: Economic and Social Research Council (ESRC)
dc.language.iso: en
dc.publisher: The University of Edinburgh
dc.relation.hasversion: Mitchell, J. and Lapata, M. (2008). Vector-based models of semantic composition. In Proceedings of ACL-08: HLT, pages 236–244, Columbus, Ohio. Association for Computational Linguistics.
dc.relation.hasversion: Mitchell, J. and Lapata, M. (2009). Language models based on semantic composition. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 430–439, Singapore. Association for Computational Linguistics.
dc.relation.hasversion: Mitchell, J. and Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science, 34(8):1388–1429.
dc.relation.hasversion: Mitchell, J., Lapata, M., Demberg, V., and Keller, F. (2010). Syntactic and semantic factors in processing difficulty: An integrated measure. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 196–206, Uppsala, Sweden. Association for Computational Linguistics.
dc.subject: semantics
dc.subject: computational linguistics
dc.subject: composition
dc.title: Composition in distributional models of semantics
dc.type: Thesis or Dissertation
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: PhD Doctor of Philosophy

