Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/1211
| Title: | Detecting subject boundaries within text: A language-independent statistical approach. |
| Authors: | Richmond, Korin; Smith, Andrew James; Amitay, Einat |
| Issue Date: | 1997 |
| Citation: | In Proc. The Second Conference on Empirical Methods in Natural Language Processing, pages 47-54, Brown University, Providence, USA, August 1997. |
| Abstract: | We describe here an algorithm for detecting subject boundaries within text based on a statistical lexical similarity measure. Hearst has already tackled this problem with good results (Hearst, 1994). One of her main assumptions is that a change in subject is accompanied by a change in vocabulary. Using this assumption, but by introducing a new measure of word significance, we have been able to build a robust and reliable algorithm which exhibits improved accuracy without sacrificing language independency. |
| URI: | http://hdl.handle.net/1842/1211 |
| Appears in Collections: | CSTR publications |
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.