The BNC is commercially available, so it is not advisable to distribute it in this forum, for copyright reasons. Because the corpus is large and is annotated with extensive metadata and POS information in XML format, its overall size reaches 4.35 gigabytes, which is another obstacle to uploading and downloading it. What's more, even if you already know the names of the files containing spoken material, it would still be difficult to extract them and put them into a single folder without a specially designed tool.
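If you do have a licensed copy and a list of the spoken filenames, the extraction step itself can be roughly sketched in Python. This is only a minimal sketch, not a dedicated BNC tool: the function name `collect_files` and its parameters are hypothetical, and it assumes the files sit somewhere under one root folder and that each filename is unique within the corpus.

```python
import shutil
from pathlib import Path


def collect_files(corpus_root, wanted_names, dest_dir):
    """Copy every file under corpus_root whose name is in wanted_names into dest_dir.

    Hypothetical helper: assumes filenames are unique across the corpus.
    Returns the names of the copied files in sorted path order.
    """
    wanted = set(wanted_names)
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() builds the full list before any copying starts
    for path in sorted(Path(corpus_root).resolve().rglob("*")):
        if dest in path.parents:
            continue  # skip files already sitting in the destination folder
        if path.is_file() and path.name in wanted:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

You would call it with something like `collect_files("path/to/corpus", spoken_names, "path/to/spoken")`, where both paths and the list `spoken_names` are placeholders for your own setup.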