

1: How do I extract a frequency wordlist of nouns from the BNC?

If you have downloaded the BNC XML Edition, you can extract the genre or text-category information from each file's header with an XML parser such as BeautifulSoup or lxml in Python. Or save yourself some trouble and go to http://bncweb.lancs.ac.uk/bncwebSignup/user/login.php: BNCweb annotates your query results with register information.
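Here is a rough sketch of the header route. It uses the standard-library `xml.etree.ElementTree` instead of BeautifulSoup or lxml so it runs without extra installs. The assumption (check it against your own files) is that the genre label sits in a `<classCode>` element inside the header; `read_genre` and `HEADER_SAMPLE` are made-up names for illustration, and the sample is a cut-down imitation of a BNC file, not a real one.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Cut-down imitation of a BNC XML file (assumed layout, not a real text).
HEADER_SAMPLE = (
    '<bncDoc xml:id="A00"><teiHeader><profileDesc><textClass>'
    '<classCode scheme="DLEE">W nonAc: medicine</classCode>'
    '</textClass></profileDesc></teiHeader>'
    '<wtext type="OTHERPUB"></wtext></bncDoc>'
)

def read_genre(xml_text):
    """Return (document id, genre label) for one BNC-style XML document."""
    root = ET.fromstring(xml_text)
    doc_id = root.get(XML_ID)
    # Assumption: the first <classCode> in the header carries the genre label.
    code = root.find(".//classCode")
    genre = code.text.strip() if code is not None and code.text else None
    return doc_id, genre

print(read_genre(HEADER_SAMPLE))
```

For files on disk you would read each one and pass its contents to `read_genre`, or swap `ET.fromstring` for `ET.parse(path).getroot()`.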
I found that someone has already provided the frequency list. Thanks a lot~ By the way, do you have any advice on making concordances from the BNC XML Edition? Although Xaira is recommended for working with it, I failed to build the index with it. You can check the steps I took on my blog: https://i4language.wordpress.com/2016/04/01/how-to-use-xaira-to-deal-with-bnc_xml_edition/
Use regular expressions to strip the tags from the XML files and get plain text, or keep only the POS tags if you need them. BTW, WST (WordSmith Tools) does have a text converter that works with BNC XML files and converts them to plain text.
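A minimal sketch of the regex approach, in case it helps. It assumes words are marked up as `<w c5="TAG" ...>word </w>` and punctuation as `<c c5="PUN">.</c>`, with CLAWS5 tags in the `c5` attribute, so verify that against your own files. The function names and `BODY_SAMPLE` are invented for illustration. Since the original question was about nouns, the last function counts tokens whose tag starts with `NN` (common nouns); add `NP0` yourself if you want proper nouns too.

```python
import re
from collections import Counter

# Any tag at all, used for plain detagging.
ANY_TAG = re.compile(r"<[^>]+>")
# A <w> or <c> element: capture its c5 tag and its text.
TOKEN = re.compile(r'<(?:w|c)\b[^>]*?\bc5="([^"]+)"[^>]*>([^<]*)</(?:w|c)>')

# Cut-down imitation of one BNC sentence (assumed layout, not a real text).
BODY_SAMPLE = (
    '<s n="1"><w c5="AT0" hw="the" pos="ART">The </w>'
    '<w c5="NN2" hw="dog" pos="SUBST">dogs </w>'
    '<w c5="VVD" hw="chase" pos="VERB">chased </w>'
    '<w c5="AT0" hw="the" pos="ART">the </w>'
    '<w c5="NN1" hw="cat" pos="SUBST">cat</w><c c5="PUN">.</c></s>'
)

def detag(xml_text):
    """Drop every tag and collapse whitespace, leaving plain text."""
    return re.sub(r"\s+", " ", ANY_TAG.sub("", xml_text)).strip()

def tagged_tokens(xml_text):
    """Return (word, c5 tag) pairs for every <w> and <c> element."""
    return [(word.strip(), c5) for c5, word in TOKEN.findall(xml_text)]

def pos_text(xml_text):
    """Keep only the POS tags, written as word_TAG."""
    return " ".join(f"{word}_{c5}" for word, c5 in tagged_tokens(xml_text))

def noun_frequencies(xml_text):
    """Count lowercased tokens whose c5 tag starts with NN."""
    return Counter(
        word.lower() for word, c5 in tagged_tokens(xml_text) if c5.startswith("NN")
    )

print(detag(BODY_SAMPLE))
print(pos_text(BODY_SAMPLE))
print(noun_frequencies(BODY_SAMPLE).most_common())
```

On a real file the header text also survives `detag`, so you may want to cut everything before the text body first. Regexes are fine for this flat markup, but a proper XML parser is safer if the files contain anything unusual.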