BFSU Sentence Segmenter 1.0
BFSU Sentence Segmenter 1.0 was programmed by Mr. Yunlong Jia and designed by Dr. Jiajin Xu. This tool converts loaded English plain-text files into one-sentence-per-line format. We have made every effort to avoid treating the periods in abbreviations such as Dr., Mr., Mrs. and B.C. as sentence-final marks, but it is almost impossible to anticipate every exceptional or ad hoc abbreviation that might be mistaken for a sentence-final position. We therefore allow users to customize the list of abbreviations in “Abbrev.ini”. The more abbreviations you add to the list, the fewer improper segmentations will occur. Please note that alphanumeric strings in the list are case-sensitive.
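To illustrate the general idea of abbreviation-aware segmentation, here is a minimal Python sketch. It is not the program's actual implementation: the function name `segment`, the hard-coded `ABBREVIATIONS` set (standing in for the contents of Abbrev.ini, whose file format is not described here) and the boundary rule (whitespace after ".", "!" or "?") are all assumptions made for the example.

```python
import re

# Hypothetical stand-in for the user-editable list in Abbrev.ini.
ABBREVIATIONS = {"Dr.", "Mr.", "Mrs.", "B.C."}

# A candidate boundary is any whitespace that follows ".", "!" or "?".
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def segment(text, abbreviations=ABBREVIATIONS):
    """Split text into sentences, skipping boundaries after listed abbreviations."""
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        # Look at the token that ends right before the candidate boundary.
        tokens = text[start:match.start()].split()
        last_token = tokens[-1] if tokens else ""
        # Case-sensitive lookup: "Dr." is protected, "dr." is not.
        if last_token in abbreviations:
            continue
        sentences.append(text[start:match.start()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    sample = "Dr. Xu designed the tool. Mr. Jia programmed it in 2010. It is free!"
    # Print one sentence per line.
    print("\n".join(segment(sample)))
```

In this sketch, adding an entry to the set removes one class of false boundaries, which mirrors the effect of extending the abbreviation list. Because the lookup is case-sensitive, "dr." would still trigger a split unless it is listed separately.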
After execution, the segmentation result is saved in the same directory as the source text(s), under the same filename(s) with the extension .seg.
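The output naming can be sketched in Python as follows. This is an illustration only: the helper names `seg_output_path` and `save_segmented` are hypothetical, and the sketch assumes that .seg is appended to the full source filename (for example, essay.txt becomes essay.txt.seg). The program may instead replace the original extension.

```python
from pathlib import Path


def seg_output_path(source):
    """Return the output path next to the source file (assumes .seg is appended)."""
    source = Path(source)
    return source.with_name(source.name + ".seg")


def save_segmented(source, sentences):
    """Write one sentence per line beside the source file and return the new path."""
    out = seg_output_path(source)
    out.write_text("\n".join(sentences) + "\n", encoding="utf-8")
    return out
```

For example, `seg_output_path("corpus/essay.txt")` gives `corpus/essay.txt.seg` under this assumption, so the segmented file always sits beside its source.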
Post-editing is always necessary for serious language work.
Please cite the program as:
Xu, Jiajin & Yunlong Jia. (2010). BFSU Sentence Segmenter 1.0. Beijing: National Research Center for Foreign Language Education, Beijing Foreign Studies University.
BFSU Sentence Segmenter 1.0 is freeware. The software comes on an “as is” basis, and the authors will accept no liability for any damage that results from using the software.
Bug reports will be highly appreciated and should be sent to WilliamJia@OpenCorpus.org.
Related discussions
Automatic Sentence Segmentation
Minor update: Change of words in About