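I came across a piece of Chinese sentence-splitting code written in Python online. Running it on Windows XP with Python 2.7 fails with "NameError: global name 'FindTok' is not defined". I don't know how to fix it and would appreciate any help!
Also, how should I modify it to process all the files in a folder?
Thanks!
(The code in the attachment has the correct indentation. To make uploading easier, I changed the ".py" extension to ".txt"; rename it back to ".py" to run it.)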
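The code is as follows (indentation restored here; FindTok is still undefined, which is what triggers the error):
[xcode=python]
# -*- coding: gbk -*-
# Assumes this file is saved as GBK, to match the .decode('gbk') calls below.
def Cut(cutlist, lines):
    l = []
    line = []
    for i in lines:
        # FindTok is not defined anywhere in this snippet; this call raises the NameError
        if FindTok(cutlist, i):
            l.append("".join(line))
            l.append(i)
            line = []
        else:
            line.append(i)
    return l

# Delimiter characters; the stray unescaped double quote near the end is now escaped
cutlist = "[。,,!……!《》<>\"'::?\?、\|“”‘’;]{}(){}【】(){}():?!。,;、~——+%%`:“”\"'‘\n\r".decode('gbk')

# inputfilename is not defined in this snippet either
for lines in file(inputfilename):
    l = Cut(list(cutlist), list(lines.decode('gbk')))
    for line in l:
        if line.strip() != "":  # the piece may contain spaces
            li = line.strip().split()
            for sentence in li:
                print "se:", sentence
[/xcode]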
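For reference, here is a minimal sketch of what the missing function might look like. It assumes FindTok simply checked whether a character is in the delimiter list, which is a guess based on how Cut calls it, not something confirmed by the original page. The sketch is written for Python 3 rather than 2.7, and the names `find_tok`, `cut`, `sentences` and the shortened `CUT_CHARS` set are made up for illustration.

```python
# Python 3 sketch. find_tok is a guess at what the missing FindTok did.
# CUT_CHARS is an assumed, shortened delimiter set, not the original cutlist.
CUT_CHARS = set("。,,!!??;;::、…\n\r")


def find_tok(cutlist, char):
    """Return True if char is one of the delimiter characters."""
    return char in cutlist


def cut(cutlist, text):
    """Split text into pieces, keeping each delimiter as its own item."""
    pieces = []
    current = []
    for ch in text:
        if find_tok(cutlist, ch):
            pieces.append("".join(current))
            pieces.append(ch)
            current = []
        else:
            current.append(ch)
    # Assumption: keep trailing text that has no closing delimiter.
    # The posted Cut drops it instead.
    if current:
        pieces.append("".join(current))
    return pieces


def sentences(cutlist, text):
    """Drop empty pieces and split the rest on whitespace, as the posted loop does."""
    out = []
    for piece in cut(cutlist, text):
        stripped = piece.strip()
        if stripped:
            out.extend(stripped.split())
    return out


if __name__ == "__main__":
    for s in sentences(CUT_CHARS, "今天天气好。我们去公园!\n"):
        print("se:", s)
```

Like the posted loop, this prints the delimiters themselves as separate items. In Python 2.7 the same idea would be a three-line `def FindTok(cutlist, char): return char in cutlist` placed above Cut.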
Page the code comes from: http://www.oschina.net/code/snippet_16840_2007
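On the folder question, one possible approach is to walk the directory with `os.walk` and feed every line of every matching file to a callback. This is only a sketch under some assumptions: the files are GBK-encoded text with a ".txt" suffix, the code runs on Python 3, and `iter_text_files`, `read_lines`, `process_folder` and `handle_line` are hypothetical names chosen for illustration.

```python
import os


def iter_text_files(folder, suffix=".txt"):
    """Yield paths of files under folder (recursively) that end with suffix."""
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            if name.endswith(suffix):
                yield os.path.join(root, name)


def read_lines(path, encoding="gbk"):
    """Read one file and return its lines decoded with the given encoding."""
    with open(path, encoding=encoding) as handle:
        return handle.read().splitlines()


def process_folder(folder, handle_line, suffix=".txt", encoding="gbk"):
    """Call handle_line(path, line) for every line of every matching file.

    Returns the number of files processed.
    """
    count = 0
    for path in iter_text_files(folder, suffix):
        for line in read_lines(path, encoding):
            handle_line(path, line)
        count += 1
    return count
```

The sentence-splitting step would go inside `handle_line`, for example `process_folder("some_folder", lambda path, line: print(path, line))` with the print replaced by the splitting call. On Python 2.7, `io.open` accepts the same `encoding` argument and could stand in for `open`.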
Attachment