Bug in WordSmith 4?

xiaoz

Permanent Super Administrator
Staff member
Here is a posting by tiger in the news publication section:

I used WordSmith 3.00.00 and WordSmith 4.0.0.106 (a temporary-use version) to produce wordlists from st2 of CLEC, with the tags <*> and [*] ignored. The difference between the results was appalling: the former produced a wordlist of 8185 types, while the latter produced a list of 8411 types, and some types enclosed in the tags <*> and [*] were clearly included in the latter list. Is it because WordSmith 4.0.0.106 is only for temporary use, and consequently not free of bugs?
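To make concrete what "ignoring the tags <*> and [*]" means for a type count, here is a minimal Python sketch. It is not how WordSmith works internally. The tag pattern, the word pattern, and the sample sentence are all assumptions made up for illustration, not taken from CLEC.

```python
import re
from collections import Counter

# Assumption: a tag is anything between <...> or [...], and the enclosed
# text is skipped together with the brackets.
TAG_PATTERN = re.compile(r"<[^<>]*>|\[[^\[\]]*\]")
# Assumption: a word is a run of letters, optionally with one apostrophe part.
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")

def count_types(text, ignore_tags=True):
    """Return a Counter mapping each lowercased type to its frequency."""
    if ignore_tags:
        text = TAG_PATTERN.sub(" ", text)
    tokens = [t.lower() for t in WORD_PATTERN.findall(text)]
    return Counter(tokens)

# Made-up sample line; the tag contents are hypothetical.
sample = "<ST 2> I have [fm1,-] a dream <SCORE 9> ."

with_tags = count_types(sample, ignore_tags=False)
without_tags = count_types(sample, ignore_tags=True)

print(len(with_tags), sorted(with_tags))
print(len(without_tags), sorted(without_tags))
```

If a tool fails to skip the tagged material, strings such as "st", "fm" and "score" end up in the wordlist as extra types, which is the kind of inflation described in the posting.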
 
Re: Bug in WordSmith 4?

I tried WST3 and WST4 on the same dataset, excluding <*> and [*]:
WST3: 8201 types
WST4: 8243 types

Some types appear in the WST3 list but not in the WST4 list; some appear in the WST4 list but not in the WST3 list; and a small number of types have slightly different frequencies in the two lists.

Here is a detailed comparison:
http://forum.corpus4u.org/upload/forum/2005080823572892.rar
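A comparison of this kind (types found in only one list, plus shared types with different frequencies) can be sketched in a few lines of Python. The function name and the two small dictionaries are hypothetical stand-ins. They are not the real WST3 or WST4 output.

```python
def compare_wordlists(list_a, list_b):
    """Compare two {type: frequency} mappings.

    Returns (types only in A, types only in B,
             {shared type: (freq in A, freq in B)} where the two differ).
    """
    types_a, types_b = set(list_a), set(list_b)
    only_a = sorted(types_a - types_b)
    only_b = sorted(types_b - types_a)
    freq_diff = {
        w: (list_a[w], list_b[w])
        for w in sorted(types_a & types_b)
        if list_a[w] != list_b[w]
    }
    return only_a, only_b, freq_diff

# Hypothetical wordlists, invented for illustration only.
wst3 = {"the": 120, "self-study": 3, "dont": 2}
wst4 = {"the": 121, "self": 3, "study": 3, "dont": 2}

only_a, only_b, freq_diff = compare_wordlists(wst3, wst4)
print("only in first list:", only_a)
print("only in second list:", only_b)
print("different frequencies:", freq_diff)
```

Looking at the items that fall into each of the three groups is usually more informative than the raw type totals, because it hints at which tokenization rule differs.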
 
I have also had the experience of getting different frequencies from different concordancers. This puzzles me a lot, although the differences are not very big.
 
Re: Bug in WordSmith 4?

The only factors I can think of are differences in settings
between the installed versions: hyphens, abbreviations,
upper/lower case, etc. I don't have WST 3 at hand, so I
cannot check all of its setting parameters.
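As a rough illustration of how such settings alone can change a type count, here is a small Python sketch. The parameter names hyphen_joins and case_sensitive are invented for this example and are not WordSmith options. The sample text is also made up.

```python
import re

def type_count(text, hyphen_joins=True, case_sensitive=False):
    """Count distinct types under two hypothetical tokenizer settings."""
    if hyphen_joins:
        # "well-known" is kept as a single token
        pattern = r"[A-Za-z]+(?:-[A-Za-z]+)*"
    else:
        # "well-known" is split into "well" and "known"
        pattern = r"[A-Za-z]+"
    tokens = re.findall(pattern, text)
    if not case_sensitive:
        tokens = [t.lower() for t in tokens]
    return len(set(tokens))

sample = "Self-study helps. The self-taught group met; the well-known teacher came."

print(type_count(sample))                       # hyphens join, case folded
print(type_count(sample, hyphen_joins=False))   # hyphens split words
print(type_count(sample, case_sensitive=True))  # "The" and "the" kept apart
```

The same text yields three different type counts here, so two versions with different defaults could disagree without either one having a bug.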
 
As can be seen from the attachment in my posting, the types that appear in the WST3 list but not in the WST4 list are themselves interesting "words".
 