
Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/1211


Files in This Item:

File                  Size        Format
Richmond_1997_a.pdf   174.02 kB   Adobe PDF
Richmond_1997_a.ps    702.65 kB   PostScript
Title: Detecting subject boundaries within text: A language-independent statistical approach.
Authors: Richmond, Korin; Smith, Andrew James; Amitay, Einat
Issue Date: 1997
Citation: In Proc. The Second Conference on Empirical Methods in Natural Language Processing, pages 47-54, Brown University, Providence, USA, August 1997.
Abstract: We describe here an algorithm for detecting subject boundaries within text based on a statistical lexical similarity measure. Hearst has already tackled this problem with good results (Hearst, 1994). One of her main assumptions is that a change in subject is accompanied by a change in vocabulary. Using this assumption, but by introducing a new measure of word significance, we have been able to build a robust and reliable algorithm which exhibits improved accuracy without sacrificing language independency.
URI: http://hdl.handle.net/1842/1211
Appears in Collections: CSTR publications
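
Illustrative sketch: the abstract's central assumption, that a change in subject is accompanied by a change in vocabulary, can be shown with a few lines of Python. This is not the authors' algorithm. It substitutes plain word-count cosine similarity for the paper's word-significance measure, which the abstract does not specify, and the function names, window size, and threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def gap_similarities(sentences, window=2):
    """Similarity across each gap between consecutive sentences.

    For the gap before sentence `gap`, compare the word counts of up to
    `window` sentences on the left with up to `window` on the right.
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    sims = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - window):gap], Counter())
        right = sum(bags[gap:gap + window], Counter())
        sims.append(cosine(left, right))
    return sims


def detect_boundaries(sentences, window=2, threshold=0.1):
    """Return indices i such that a subject boundary falls before sentence i.

    A gap is reported when its similarity is a local minimum and does not
    exceed `threshold` (an assumed, untuned value).
    """
    sims = gap_similarities(sentences, window)
    boundaries = []
    for k, s in enumerate(sims):
        prev_s = sims[k - 1] if k > 0 else float("inf")
        next_s = sims[k + 1] if k + 1 < len(sims) else float("inf")
        if s <= threshold and s <= prev_s and s <= next_s:
            boundaries.append(k + 1)
    return boundaries


if __name__ == "__main__":
    text = [
        "The cat chased the mouse.",
        "The mouse hid from the cat.",
        "Stocks fell as markets closed.",
        "Markets and stocks recovered later.",
    ]
    print(detect_boundaries(text))
```

On the four toy sentences, the only gap with no shared vocabulary is the one before the third sentence, so that is where the sketch places a boundary. Raw word counts let frequent function words dominate the similarity score; a weighting of word significance, such as the one the abstract says the paper introduces, would address that.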

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
