BTW: Prof. Wen is currently affiliated with the National Research Center for Foreign Language Education, Beijing Foreign Studies University. Meanwhile, she is supervising the PhD theses of her Nanda students.
Data collection (including sampling strategies), transcription, and annotation are sure to be time-consuming and tedious, but they determine the quality and the usability (and even the limits of that usability) of any corpus.
People who are only corpus users/consumers will never understand the hardship, and even the exhaustion, of corpus building. So I think compiling a corpus of your own, whatever its size, is part of your "corpus life".
As far as I know, Prof. Wei Naixing is busy building a spoken corpus now, and he may have published an article related to it last year in one of the key foreign language journals in China. Unfortunately, I cannot remember the exact title of the article.