[Recommended] Yoshikoder: free content analysis software

yinghuang

Senior Member
Yoshikoder is a cross-platform multilingual content analysis program developed as part of the Identity Project at Harvard's Center for International Affairs.
Yoshikoder allows you to load documents, construct and apply content analysis dictionaries, examine keywords-in-context, and perform basic content analyses, in any language.

In more detail: Yoshikoder works with text documents, whether in plain ASCII, Unicode (e.g. UTF-8), or a national encoding (e.g. Big5 Chinese). You can construct, view, and save keywords-in-context. You can write content analysis dictionaries using Perl-style regular expressions. Yoshikoder provides summaries of documents, either as word frequency tables or according to a content analysis dictionary. You can also compare documents by word frequency profile or with respect to a content dictionary. Yoshikoder's native file format is XML, so dictionaries and keyword-in-context files are non-proprietary and human-readable.
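To make the three ideas above concrete (word frequency tables, regex-based dictionaries, and keywords-in-context), here is a minimal Python sketch. It is not Yoshikoder's code or its file format; the function names, the sample sentence, and `hypothetical_dictionary` are all made up for illustration, and the simple `\w+` tokenizer is only reasonable for space-delimited languages (Chinese text would need a word segmenter first).

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into runs of word characters.
    return re.findall(r"\w+", text.lower())


def word_frequencies(text):
    # Word frequency table: token -> number of occurrences.
    return Counter(tokenize(text))


def apply_dictionary(text, dictionary):
    # dictionary maps a category name to a list of regex patterns.
    # A token counts toward a category if any pattern matches the whole token.
    tokens = tokenize(text)
    counts = {}
    for category, patterns in dictionary.items():
        compiled = [re.compile(p) for p in patterns]
        counts[category] = sum(
            1 for t in tokens if any(c.fullmatch(t) for c in compiled)
        )
    return counts


def kwic(text, keyword, window=3):
    # Keyword-in-context: (left context, keyword, right context) per hit.
    tokens = tokenize(text)
    hits = []
    for i, t in enumerate(tokens):
        if t == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, t, right))
    return hits


# Made-up sample text and dictionary, for illustration only.
sample = "The economy grew. Economic growth helps workers, but taxes hurt the economy."
hypothetical_dictionary = {
    "economy": [r"econom\w*", r"tax(es)?"],
    "labor": [r"workers?"],
}

if __name__ == "__main__":
    print(word_frequencies(sample).most_common(3))
    print(apply_dictionary(sample, hypothetical_dictionary))
    for left, kw, right in kwic(sample, "economy", window=2):
        print(f"{left:>20} [{kw}] {right}")
```

Comparing two documents would then amount to running `word_frequencies` or `apply_dictionary` on each and putting the counts side by side.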

Yoshikoder is open-source software, released under the GNU General Public License. This license means, among other things, that Yoshikoder is free for academic use.

http://people.iq.harvard.edu/~wlowe/CCA.html

[This post was edited by 动态语法 on 2006-07-05 at 22:40:54]
 
Re: [Recommended] Yoshikode

How to get it started:
(1) Download and install the Windows executable of Yoshikoder from http://people.iq.harvard.edu/~wlowe/CCA.html.
(2) Download and install the J2SE SDK, also available from this website, if you haven't installed it already.
(3) Run it.
(4) Open document, as shown below.
2006070622250313.gif

(5) Set the encoding (UTF-8 or Big5, depending on the document) and the font for Chinese, as illustrated below.
2006070622285382.gif

(6) Click Report: on document, as displayed below.
2006070622323631.gif

(7) See the reported results.
2006070622353173.gif

(8) Enjoy the results.
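
On step (5): if you are not sure whether a file is UTF-8 or Big5, one rough way to check outside Yoshikoder is to try decoding it with each candidate. The Python sketch below is only a heuristic under that assumption; `decode_with_fallback` is a made-up helper name, and trying UTF-8 first matters because Big5 can sometimes decode bytes that were never Big5.

```python
def decode_with_fallback(raw, encodings=("utf-8", "big5")):
    # Try each candidate encoding in order and return the decoded text
    # together with the name of the first encoding that worked.
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Illustrative sample: the same Chinese phrase stored in two encodings.
text = "內容分析"
big5_bytes = text.encode("big5")
utf8_bytes = text.encode("utf-8")

if __name__ == "__main__":
    print(decode_with_fallback(big5_bytes))
    print(decode_with_fallback(utf8_bytes))
```

For a real file you would read it in binary mode, e.g. `open(path, "rb").read()`, and pass the bytes to the helper; whichever encoding it reports is the one to select in the dialog.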


[This post was edited by the author on 2006-07-06 at 22:42:19]
 
Thank you for the screenshots.
The concordance feature is not its strongest part.
What I am interested in is its so-called "content analysis" (as I see it, semantic analysis and beyond).
 
I guess the software cannot do the semantic analysis and beyond that you are hoping for. For an overview of content analysis, especially what software can do, see Stemler (2001): http://pareonline.net/getvn.asp?v=7&n=17.
 