Knowledge-lean approaches to metonymy
Current approaches to metonymy recognition are mainly supervised and rely heavily on the manual annotation of training and test data, which considerably hinders their application on a wider scale. This dissertation therefore aims to relieve the knowledge acquisition bottleneck in metonymy recognition by examining knowledge-lean approaches that reduce this need for human effort. The investigation covers three algorithms that together span the spectrum of machine learning approaches: unsupervised, supervised and semi-supervised. Chapter 2 will discuss an unsupervised approach to metonymy recognition, and will show that promising results can be reached when the data are automatically annotated with grammatical information. Although the robustness of such unsupervised systems is limited, they can serve as a pre-processing step for the selection of useful training data, thereby reducing the workload for human annotators. Chapter 3 will investigate memory-based learning, a “lazy” supervised algorithm. This algorithm, which relies on an extremely simple learning stage, is able to replicate the results of more complex systems. Yet it will also become clear that its performance, like that of other algorithms in the literature, depends heavily on grammatical annotation. Finally, chapter 4 will present a semi-supervised algorithm that produces very promising results with only ten labelled training instances. In addition, it will be shown that less than half of the training data from chapter 3 can lead to the same performance as the entire set. Semantic information in particular will prove very useful in this respect. In short, this dissertation presents experimental results which indicate that the knowledge acquisition bottleneck in metonymy recognition can be relieved with unsupervised and semi-supervised methods. These approaches may make the extension of current algorithms to a wide-scale metonymy resolution system a much more feasible task.