Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/5318
| Title: | Towards the Development of a Web-based Alignment Platform |
| Authors: | Schnober, Carsten |
| Supervisor(s): | King, Simon; Arranz, Victoria |
| Issue Date: | 2010 |
| Publisher: | The University of Edinburgh |
| Abstract: | In this work, a platform is developed that makes existing sentence and word alignment tools available as web services. The tools implemented are Hunalign and GIZA++; after wrappers and format converters have been created, they are embedded into a pipeline that produces a collection of word-aligned sentences from a parallel corpus provided by the user. The individual components of the platform are independent of each other and can therefore be used in any order and by any web service client. The components are implemented in a generalisable way, so that additional modules from other sub-fields of natural language processing, or from completely different fields, can be developed using the methods shown in this work. After an overview of state-of-the-art alignment techniques, the development of the platform is demonstrated and evaluated, and examples of how to use it with different clients are presented. The alignment platform is implemented as a Java servlet class that is by design usable on any operating system. It is generated using the Soaplab software suite, whose tool acd2xml automatically creates web service descriptions from a definition written in the ACD format designed by the EMBOSS project (European Molecular Biology Open Software Suite). |
| Keywords: | web service; machine translation; alignment |
| URI: | http://hdl.handle.net/1842/5318 |
| Appears in Collections: | Linguistics and English Language Masters thesis collection |
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.