For <NN.*>{2,} try
pattern = r"""NP: {<NN.*><NN.*>+}"""
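For <NN.*>{2,5} try the following, with the longest rule first (rules are applied in order, and a later rule cannot re-chunk tokens that an earlier rule has already chunked):
pattern = r"""NP: {<NN.*><NN.*><NN.*><NN.*><NN.*>}
{<NN.*><NN.*><NN.*><NN.*>}
{<NN.*><NN.*><NN.*>}
{<NN.*><NN.*>}
"""
Ugly, but it works. :)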
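As a cross-check that does not rely on NLTK, the spans that a bounded quantifier such as <NN.*>{2,5} is meant to select can be emulated with Python's standard `re` module over a string of tags. This is only a sketch: the helper name `noun_runs` and the sample sentence are made up for illustration, and the tag string encoding is an assumption of this example, not NLTK's internal format.

```python
import re

def noun_runs(tagged, min_len=2, max_len=5):
    """Return runs of min_len..max_len consecutive NN* tokens, scanning left to right."""
    # Encode the tag sequence as one string, e.g. "<DT><NN><NNS>", one <...> per token.
    tag_string = "".join(f"<{tag}>" for _, tag in tagged)
    pattern = re.compile(r"(?:<NN[^>]*>){%d,%d}" % (min_len, max_len))
    runs = []
    for match in pattern.finditer(tag_string):
        # Map character offsets back to token indices by counting '<' characters.
        start = tag_string.count("<", 0, match.start())
        end = start + match.group().count("<")
        runs.append([word for word, _ in tagged[start:end]])
    return runs

# Hypothetical pre-tagged sentence (Penn Treebank style tags).
tagged = [
    ("the", "DT"), ("stock", "NN"), ("market", "NN"), ("index", "NN"),
    ("fell", "VBD"), ("sharply", "RB"), ("after", "IN"),
    ("rate", "NN"), ("hikes", "NNS"),
]
print(noun_runs(tagged))
# → [['stock', 'market', 'index'], ['rate', 'hikes']]
```

Comparing these spans with the NP chunks produced by the grammar above is a quick way to confirm that the enumerated rules behave like the intended quantifier.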
Run the following...
There's a very detailed explanation here:
http://www.linguistics.ucsb.edu/faculty/stgries/teaching/groningen/readme.txt
Just follow the instructions and you will get the result.
The key to getting the R script to work is organizing your data in the required format.
Take the collexeme_analysis as an example...
You can refer to the following book (http://gen.lib.rus.ec/):
An Introduction to Categorical Data Analysis (2nd Ed.)
Section 2.2, which introduces the odds ratio, and
Section 7.1 (p. 207), on how to interpret the results of a log-linear model
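For readers without the book at hand, here is a minimal sketch of the sample odds ratio for a 2x2 table, with an approximate 95% Wald interval computed on the log scale. The function name and the counts are hypothetical and chosen only for illustration; the association terms of a log-linear model for a 2x2 table are usually read through this same log odds ratio.

```python
from math import exp, log, sqrt

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Sample odds ratio and approximate Wald interval for the table [[a, b], [c, d]]."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this sketch")
    or_hat = (a * d) / (b * c)                   # cross-product ratio
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(odds ratio)
    lower = exp(log(or_hat) - z * se_log)
    upper = exp(log(or_hat) + z * se_log)
    return or_hat, lower, upper

# Hypothetical counts, not taken from any real study.
or_hat, lower, upper = odds_ratio_with_ci(20, 10, 5, 15)
print(f"odds ratio = {or_hat:.2f}, approx. 95% CI = ({lower:.2f}, {upper:.2f})")
```

An odds ratio of 1 means no association between the row and column variables; an interval that excludes 1 suggests an association.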
Re: mwetoolkit - The Multiword Expressions toolkit
Thanks!
Another interesting open-source NLP tool, though it was developed for Linux.
See here for installation on Windows.
R: Fisher's Exact Test script
1. Why Fisher's Exact Test?
Because the chi-squared test is unreliable when an expected cell frequency is below 5.
2. How to use the script compute_fisher.r?
It's very easy. Just copy all the code into R and change the first line, setwd(), to the directory...
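To see what the test computes, here is a minimal Python sketch of a two-sided Fisher's exact test for a 2x2 table, using only the standard library. It is not the contents of compute_fisher.r, which are not shown here; the function name is hypothetical, and the two-sided rule (sum the probabilities of all tables no more likely than the observed one) is one common convention.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over every table that is at most as probable as the observed one
    # (a small relative tolerance guards against floating-point ties).
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)

# Small table where every expected frequency is 2, well below 5.
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))
# → 0.4857
```

Because the p-value is computed exactly from the hypergeometric distribution, it does not depend on the large-sample approximation behind the chi-squared test.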
Re: Asking for help with data transformation for normal distribution
If the random variable is not normally distributed, how can you transform it to be so?
I think a transformation such as standardization only changes the location and scale of the variable, not the shape of its distribution.
Yes, I think...
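The point raised in the question can be checked numerically. The sketch below, on made-up right-skewed data, shows that standardizing leaves the skewness unchanged, while a nonlinear transformation such as the logarithm can change the shape. The data and helper names are assumptions for illustration only; a log transform helps only when the data are roughly log-normal, and it requires strictly positive values.

```python
from math import exp, log
from statistics import fmean, pstdev

def skewness(xs):
    """Population skewness: the mean of the cubed z-scores."""
    mu, sigma = fmean(xs), pstdev(xs)
    return fmean([((x - mu) / sigma) ** 3 for x in xs])

def standardize(xs):
    """Subtract the mean and divide by the standard deviation (z-scores)."""
    mu, sigma = fmean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]

# Made-up right-skewed data: the exponential of an evenly spaced symmetric grid.
x = [exp(i / 10) for i in range(-20, 21)]

z_scores = standardize(x)          # linear transformation: shape is preserved
logged = [log(v) for v in x]       # nonlinear transformation: shape changes

print(f"skewness of raw data:          {skewness(x):.3f}")
print(f"skewness after standardizing:  {skewness(z_scores):.3f}")
print(f"skewness after log transform:  {skewness(logged):.3f}")
```

The first two values agree, and the third is essentially zero here because the logged data are symmetric by construction.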