BFSU PowerConc: A freeware concordancer for Windows (a free, general-purpose corpus search and analysis tool)

xujiajin

Administrator
Staff member
BFSU PowerConc: A freeware concordancer for Windows http://www.bfsu-corpus.org/static/PowerConc.html

Developed at the National Research Centre for Foreign Language Education, Beijing Foreign Studies University

Please cite the programme as:
Xu, Jiajin, Maocheng Liang & Yunlong Jia. (2012). BFSU PowerConc 1.0. National Research Centre for Foreign Language Education, Beijing Foreign Studies University.

Publications and presentations based on BFSU PowerConc
1. Xu, Jiajin & Yunlong Jia. (2013). The design and development of the R-gram based corpus analysis tool 'PowerConc' (in Chinese). Computer-assisted Foreign Language Education (《外语电化教学》) 2013(1): 57-62.

2. Xu, Jiajin and Yunlong Jia. 2013. PowerConc: An R-gram based corpus analysis tool. Paper presented at AACL2013 (American Association for Corpus Linguistics), San Diego, CA, USA. Jan 19, 2013. (Slides used at the conference.)

3. Ms. Yufang Zhang of Nanjing No. 66 Middle School has prepared a quick user's guide in Chinese to share with everyone. It can be downloaded here.
 
Re: BFSU PowerConc A freeware concordancer for Windows

PowerConc can compute collocations for both Chinese and English. BFSU Collocator, which we developed earlier, could only handle English.

Attachments

  • 4grams of simplified pos categories.jpg (66.2 KB)
  • a sth of colligate.jpg (55.4 KB)
  • initial_interface.jpg (80.4 KB)
  • lemmatized trigram list.jpg (45.5 KB)
  • pos sequence.jpg (49.8 KB)
  • trigram list.jpg (49.6 KB)
  • wordlist.jpg (45 KB)
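Re: BFSU PowerConc A freeware concordancer for Windows

Thanks to the experts who developed the software, and thanks to Dr. Xu for sharing it. I tried it out and it is indeed smart and distinctive: it handles both raw and annotated corpora and is built around n-grams. Looking forward to the manual. It makes a fine New Year's gift for all our forum friends.
The scope of the wordlist has been extended, but the program struggles with somewhat larger corpora. Even so, it already has almost all the functions of third-generation corpus software.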
 
Re: BFSU PowerConc A freeware concordancer for Windows

Adding a keyness example:
 

Attachments

  • 2013-01-31_144405.jpg (48.2 KB)
  • 2013-01-31_150742.jpg (49.3 KB)
  • 2013-01-31_152413.jpg (45.8 KB)
  • 2013-01-31_152642.jpg (47 KB)
  • 2013-01-31_154429.jpg (48.2 KB)
  • 2013-01-31_154514.jpg (50 KB)
Re: BFSU PowerConc A freeware concordancer for Windows

PowerConc offers two ways to obtain keyness/keywords. The first is the classic approach, which compares two corpora; WordSmith and AntConc both work this way.
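
As a rough illustration of the two-corpus approach, the sketch below scores each word with the log-likelihood statistic, which is one commonly used keyness measure. The post does not say which statistic PowerConc applies, so the choice of log-likelihood, the function name `keyness_ll`, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter

def keyness_ll(study_tokens, ref_tokens):
    """Rank words by signed log-likelihood keyness (study vs. reference corpus).

    Hypothetical helper for illustration only; not PowerConc's actual code.
    """
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scores = {}
    for word, a in study.items():
        b = ref.get(word, 0)
        # Expected frequencies if the word were equally likely in both corpora.
        e1 = n_study * (a + b) / (n_study + n_ref)
        e2 = n_ref * (a + b) / (n_study + n_ref)
        ll = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0.0))
        # Sign the score: negative when the word is underused in the study corpus.
        if a / n_study < b / n_ref:
            ll = -ll
        scores[word] = ll
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

study = "the corpus tool builds a corpus index for the corpus".split()
ref = "the cat sat on the mat and the dog sat on the rug".split()
top = keyness_ll(study, ref)
print(top[:3])
```

Words at the top of the list are relatively overused in the study corpus; words with negative scores are relatively underused.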

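The second: in PowerConc, if you generate a word list and Save it, the saved list includes a column of tf-idf values, which is another way of computing keywords. This approach is very common in natural language processing. It does not rely on a reference corpus; it computes keywords within the corpus itself, based mainly on how each word is distributed across the different texts. For details see: http://en.wikipedia.org/wiki/TF_IDF

The TF-IDF approach is simpler and just as effective.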
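
A minimal sketch of the idea, assuming the common tf × log(N/df) formulation. The exact weighting PowerConc writes to its saved word list is not stated in this thread, and the function name `tfidf_per_text` and the toy texts are hypothetical.

```python
import math
from collections import Counter

def tfidf_per_text(texts):
    """Return one {word: tf-idf} dict per text, using tf * log(N / df)."""
    tokenized = [t.lower().split() for t in texts]
    n_texts = len(tokenized)
    # Document frequency: in how many texts does each word occur?
    df = Counter(word for tokens in tokenized for word in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        results.append({w: (c / len(tokens)) * math.log(n_texts / df[w])
                        for w, c in tf.items()})
    return results

texts = [
    "the corpus contains tagged texts",
    "the concordancer searches the corpus",
    "the weather is fine today",
]
scores = tfidf_per_text(texts)
print(sorted(scores[1].items(), key=lambda kv: kv[1], reverse=True))
```

A word that occurs in every text (here "the") gets a weight of zero, while a word concentrated in one text is ranked highest for that text. No reference corpus is involved.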
 
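Re: BFSU PowerConc A freeware concordancer for Windows

"Thanks to the experts who developed the software, and thanks to Dr. Xu for sharing it. I tried it out and it is indeed smart. Looking forward to the manual. It makes a fine New Year's gift for all our forum friends.
The scope of the wordlist has been extended, but the program struggles with somewhat larger corpora. Even so, it already has almost all the functions of third-generation corpus software."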

PC-based concordancers can't process multi-million-word corpora within a couple of minutes anyway, unless some sort of indexing is performed. Neither WordSmith Tools nor AntConc can.

We will consider adding an indexing feature in later releases in order to handle 'Big Data'.
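
To illustrate why indexing helps, here is a minimal sketch of a positional inverted index: the corpus is scanned once, and every later search jumps straight to the stored positions instead of rereading the texts. This shows the general technique only; it is not a description of how PowerConc, WordSmith Tools or AntConc work internally, and all names in it are hypothetical.

```python
from collections import defaultdict

def build_index(texts):
    """Map each word to a list of (text_id, position) pairs; built once."""
    index = defaultdict(list)
    tokenized = [t.lower().split() for t in texts]
    for text_id, tokens in enumerate(tokenized):
        for pos, word in enumerate(tokens):
            index[word].append((text_id, pos))
    return index, tokenized

def concordance(index, tokenized, node, width=2):
    """Return KWIC lines for `node` by looking up its stored positions."""
    lines = []
    for text_id, pos in index.get(node, []):
        tokens = tokenized[text_id]
        left = " ".join(tokens[max(0, pos - width):pos])
        right = " ".join(tokens[pos + 1:pos + 1 + width])
        lines.append(f"{left} [{node}] {right}".strip())
    return lines

texts = ["a large corpus needs an index", "the corpus is small"]
index, tokenized = build_index(texts)
hits = concordance(index, tokenized, "corpus")
print(hits)
```

The cost of building the index is paid once per corpus; after that, the time for a query depends on the number of hits rather than on the size of the corpus.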
 
Re: BFSU PowerConc A freeware concordancer for Windows

Yes, WordSmith Tools gives inferential statistics before the text files are indexed, whether the corpus is big or not.
 
Re: BFSU PowerConc A freeware concordancer for Windows

I don't think I understand what you mean by 'inferential statistics'.
 
Re: BFSU PowerConc A freeware concordancer for Windows

I see, this is one way to calculate the relationship between words by using a wordlist.
Another way is to index the texts first and then get the same result.
 
Re: BFSU PowerConc A freeware concordancer for Windows

Computing collocation measures for specified words or terms is much easier. Even so, showing the 'relationship' (the strength of collocation) still requires indexing.

If you look at the algorithms for word association, more often than not the joint frequency of the two words, the frequency of the search word and the frequency of the collocate, and sometimes the span, have to be computed. Clearly, a word list alone does not yield all the values needed to compute collocation measures.
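
The point can be made concrete with a small sketch. The frequencies of the node and the collocate can be read off a word list, but the joint frequency can only be counted from the running text, and it changes with the span. The function name, the span convention (that many words on each side of the node) and the toy sentence are assumptions for illustration.

```python
from collections import Counter

def cooccurrence_counts(tokens, node, collocate, span=4):
    """Return (f_node, f_collocate, f_joint, n_tokens) for one word pair.

    f_joint counts occurrences of `collocate` within `span` words
    to the left or right of each occurrence of `node`.
    """
    word_list = Counter(tokens)  # everything a plain word list can tell us
    f_joint = 0
    for i, word in enumerate(tokens):
        if word != node:
            continue
        # The window requires word positions, which a word list does not keep.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        f_joint += window.count(collocate)
    return word_list[node], word_list[collocate], f_joint, len(tokens)

tokens = "strong tea and strong coffee but powerful computers and weak tea".split()
narrow = cooccurrence_counts(tokens, "strong", "tea", span=1)
wide = cooccurrence_counts(tokens, "strong", "tea", span=4)
print(narrow, wide)
```

The first, second and fourth values are identical for both spans because they come from the word list; only the joint frequency differs, which is why the texts themselves (or an index over them) are needed.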
 
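Re: BFSU PowerConc A freeware concordancer for Windows

With the release of this series of programs, BFSU is showing a clear trend towards 'internationalisation' in corpus research. These programs greet the world and make our voice heard. I look forward to more and better software, and I fully support it! Only when China has more original concordancing software and more software developed in series will we have something of our own in the corpus field, instead of always relying on software from abroad. It is like national defence: we cannot keep buying foreign weapons; we need the capacity for independent research and development.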
 
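Re: BFSU PowerConc A freeware concordancer for Windows (a free, general-purpose corpus search and analysis tool)

I have put together a simple illustrated walkthrough of the Keywords procedure and am posting it here for now.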
 

Attachments

  • PowerConc_Keywords.doc (863 KB)
Re: BFSU PowerConc A freeware concordancer for Windows (a free, general-purpose corpus search and analysis tool)

very helpful and thanks a lot
 
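Re: BFSU PowerConc A freeware concordancer for Windows (a free, general-purpose corpus search and analysis tool)

The key strengths of this software are that it is simple and intelligent to operate and that it can handle categories as well as words. If a preprocessing function were added (indexing large corpora) so that it could handle large volumes of data, it would be among the best of its kind.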
 
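Re: BFSU PowerConc A freeware concordancer for Windows (a free, general-purpose corpus search and analysis tool)

I gave it a try and have a basic question. Running a concordance works fine, but how do I run the collocation and colligation statistics? Is a tag file required? Thank you, Dr. Xu ;)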
 
Re: BFSU PowerConc A freeware concordancer for Windows (a free, general-purpose corpus search and analysis tool)

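After running a Concordance, you will find a Coll button in the lower right corner for collocation and colligation analysis.
Both raw and tagged corpora work. You can test this with the five different types of text bundled with the software.

Collocation strength can currently be computed with seven different measures (MI, MI3, T, Z, LL, Dice and Log-Log, selectable in the collocation and colligation area of the initial window), which should be one of the fuller sets offered by any such software.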
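
For readers unfamiliar with these measures, the sketch below computes five of them (MI, MI3, T, Z and Dice) from four counts, using common textbook formulations without any span adjustment. PowerConc's exact definitions are not given in this thread and may differ; LL and Log-Log are omitted for brevity, and the function name and the example counts are hypothetical.

```python
import math

def association_scores(f_node, f_coll, f_joint, n_tokens):
    """Textbook collocation measures from four counts (no span adjustment)."""
    observed = f_joint
    expected = f_node * f_coll / n_tokens  # joint frequency expected by chance
    return {
        "MI": math.log2(observed / expected),
        "MI3": math.log2(observed ** 3 / expected),
        "T": (observed - expected) / math.sqrt(observed),
        "Z": (observed - expected) / math.sqrt(expected),
        "Dice": 2 * observed / (f_node + f_coll),
    }

scores = association_scores(f_node=200, f_coll=100, f_joint=20, n_tokens=100_000)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

The measures rank collocates differently: MI tends to favour rare, exclusive pairs, while the T-score favours frequent ones, which is one reason a tool may offer several side by side.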

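With a POS-tagged corpus, further colligation analyses become possible, something other software has not offered until now. A Japanese program called Co-occurence has a similar function.

The button is named Coll because it does both collocation and colligation: the name takes the first four letters the two words share.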
 