Using the BNC on a PC: a guideline


Staff member

Using the BNC on a PC

Some guidelines for anyone with an interest in using the BNC on a Windows PC


The British National Corpus is a potentially important resource for
teachers and researchers, but it was designed with the needs of a narrower
community in mind than the one that I belong to, and at the moment remains
intimidating and impenetrable for most PC users (the PC being the main IT
resource for the rest of us these days). Problems arise at a number of
levels - viz:

1. unpacking the files from the installation CDROM
2. identifying which files might be useful
3. working with the corpus

This note deals with the first two of these points. It is a bodger's guide
to setting up the BNC for people with a reasonable understanding of how to
use a Windows PC. It is not a definitive guide to all the things you can
do with BNC once you've got it set up.


The BNC comes on 3 CDs (at a cost of around £250 for all three - it may be
possible to negotiate the purchase of CD 1 only - I should have done this,
but did not realise you only need disk 1 to use BNC on a PC!). These CDs
contain the BNC data files and a whole host of other applications -
especially SARA, the search engine specifically designed for the corpus.
All of these appear to require Unix or a Unix-like operating system such as
Linux to be installed on the PC. My assumption is that the rest of us do
not want the learning curve involved in setting up Unix on our machines,
and that more teachers will use BNC if they can use it on their work PCs.

Requirements

The good news is that the corpus data can be unpacked and transferred to a
PC's hard disk without too much trouble. The prerequisites are:

- a PC with Windows 95 or better
- around 6 gigabytes (GB) of free disk space
- WinZip 32 (a shareware application that is widely available - if you
haven't got it you can download it from
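
The disk space requirement can be checked before you start. The following is
a minimal Python sketch, not part of the original procedure; it assumes
Python 3 is available on the PC, and the function names are hypothetical.
The 6 GB figure is the one quoted in the list above.

```python
import shutil

REQUIRED_GB = 6  # figure quoted in the requirements list


def free_space_gb(path):
    """Return the free space, in gigabytes (10**9 bytes), on the drive holding `path`."""
    return shutil.disk_usage(path).free / 10**9


def has_room_for_bnc(path, required_gb=REQUIRED_GB):
    """True if the drive holding `path` has at least `required_gb` GB free."""
    return free_space_gb(path) >= required_gb
```

For example, has_room_for_bnc("C:\\") reports whether drive C: (an assumed
drive letter) has enough room for the unpacked corpus.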

The steps involved in creating a copy of the corpus should be easy, but you
might meet a couple of problems because of an error in the creation of the
original CDROM - this may have been fixed in later versions, but users
should be aware of the potential glitch.

Unpacking

The procedure is as follows:

Step 1 - identify the BNC Files
Put Disk 1 of the three disk set in your PC. Windows Explorer will show
you that this contains the following compressed files (called folders in
the rest of this guide; sizes are in bytes):

A.TGZ 51,845,713
B.TGZ 22,991,840
C.TGZ 65,652,987
D.TGZ 321,820
DOC.TGZ 3,126,547
E.TGZ 33,186,120
F.TGZ 46,629,619
G.TGZ 40,933,113
H.TGZ 95,877,886
J.TGZ 27,978,331
K.TGZ 61,981,017
SARA.TGZ 394,824
SGML.TGZ 125,590

The folders that interest us are A.TGZ through to K.TGZ, together with
DOC.TGZ, which contains the BNC users' guide. The TGZ extension indicates
that the folders are compressed. The good news is that WinZip can uncompress
these files and transfer them from the CDROM on to your PC's hard disk. The
bad news is that folders A, B and C contain compressed folders which have
themselves been incorrectly named, and therefore need the extra handling
described in Step 2.
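
The selection described above can be sketched in Python. This is an
illustration only: the function names are hypothetical, and the file names
are taken from the Disk 1 listing above.

```python
from pathlib import PurePath

# Letters of the text folders as listed on Disk 1 (note that there is no "I").
TEXT_FOLDERS = set("ABCDEFGHJK")
# Folders whose inner file lacks the .tar extension, according to this guide.
MISNAMED = {"A", "B", "C"}


def wanted_archives(names):
    """Keep only the .TGZ files holding corpus texts or the documentation."""
    wanted = []
    for name in names:
        path = PurePath(name)
        if path.suffix.upper() != ".TGZ":
            continue
        stem = path.stem.upper()
        if stem in TEXT_FOLDERS or stem == "DOC":
            wanted.append(name)
    return wanted


def needs_renaming(name):
    """True for A.TGZ, B.TGZ and C.TGZ, which need the treatment in Step 2."""
    return PurePath(name).stem.upper() in MISNAMED
```

Passing the thirteen names from the listing to wanted_archives leaves out
SARA.TGZ and SGML.TGZ and keeps the other eleven.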

Step 2 - unpack and rename the contents of Folders A, B & C
With WinZip installed on your PC, when you double click on Folder A you will
see that it contains a folder called "a". This should be called "a.tar" -
tar being another archive format which WinZip can also unpack. So to make
this folder usable it has to be un-zipped to the hard drive and then
renamed. Do this in the following way:

- double click on Folder A
- in the WinZip window select "a" (this is the only folder)
- select "Extract" from the WinZip menu
- choose an appropriate folder on your hard disk drive to which you want to
send the folder (I have a folder called C:\ZIP_TEMP on my PC that I reserve
for this sort of activity). Extract the folder to this location. These are
BIG files ("A" is over 50 MB), so if it takes a few minutes, don't panic!
- using Windows Explorer open eg C:\ZIP_TEMP and right click on the file
you have transferred (you will see that it is now much bigger). Select
"Rename" from the menu and add the extension .TAR to the file name.
- You will now be able to uncompress this to an appropriate directory - eg
C:\BNC - by (1) double clicking on the folder, (2) choosing "Select All"
from the "Actions" menu, and then (3) selecting "Extract" and sending the
files to eg C:\BNC.
- Repeat these steps for folders B.TGZ and C.TGZ. This took me some time
to work out, but once you have understood the problem, it's easy to fix.
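
For readers who would rather script this than click through WinZip, the two
stages above (un-zip, then unpack the renamed .tar) can be sketched in
Python. This is an alternative to the WinZip route, not the procedure the
guide was written around; it assumes Python 3, and the function names are
hypothetical. Python's gzip module ignores the name stored inside the
compressed file, so the missing .tar extension does not get in the way.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def gunzip_to_tar(tgz_path, temp_dir):
    """Stage 1: decompress eg A.TGZ into temp_dir, naming the result a.tar."""
    tgz_path = Path(tgz_path)
    temp_dir = Path(temp_dir)
    temp_dir.mkdir(parents=True, exist_ok=True)
    tar_path = temp_dir / (tgz_path.stem.lower() + ".tar")
    with gzip.open(tgz_path, "rb") as src, open(tar_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # copies in chunks, so big files are fine
    return tar_path


def untar(tar_path, dest_dir):
    """Stage 2: unpack the uncompressed .tar file into dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Newer Python versions offer a safer extraction filter; use it if present.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(tar_path, "r:") as tar:
        tar.extractall(dest_dir, **extra)
    return dest_dir
```

With the CD in drive D: (an assumed drive letter), the equivalent of the
steps above would be untar(gunzip_to_tar(r"D:\A.TGZ", r"C:\ZIP_TEMP"),
r"C:\BNC").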

Step 3 - unpacking Folders D - K
- In Windows Explorer, double click on an appropriate folder on the BNC
CDROM (eg "D")
- When WinZip asks you "Should WinZip decompress it to a temporary folder
and open it?", select "No". A second WinZip window will open containing a
single .TAR folder.
- Double click on this folder to get a list of the folders contained in
the .TAR folder.
- In WinZip, select all these folders through Actions, Select All.
- Extract the folders to an appropriate directory (eg C:\BNC)
- Repeat this process for the remaining folders (ie E, F, G, H, J, K)
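
The repetition in Step 3 lends itself to a loop. The following Python sketch
opens each .TGZ file directly as a compressed tar archive and extracts it. It
is an illustration under assumptions: Python 3 is installed, the function
names are hypothetical, and the files on your CD are named as in the listing
in Step 1.

```python
import tarfile
from pathlib import Path


def extract_tgz(tgz_path, dest_dir):
    """Extract one gzip-compressed tar archive; return the names it contained."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Newer Python versions offer a safer extraction filter; use it if present.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(tgz_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest, **extra)
    return names


def extract_folders(cd_root, dest_dir, letters="DEFGHJK"):
    """Extract D.TGZ, E.TGZ, ... from cd_root; return entry counts per letter."""
    counts = {}
    for letter in letters:
        archive = Path(cd_root) / f"{letter}.TGZ"
        if not archive.is_file():
            print(f"Skipping {archive}: not found")
            continue
        counts[letter] = len(extract_tgz(archive, dest_dir))
    return counts
```

For example, extract_folders("D:\\", r"C:\BNC") would unpack folders D to K
from a CD in drive D: (an assumed drive letter) into C:\BNC.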

You will now have a full version of the text files in BNC on your hard disk.
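
A quick way to confirm that something sensible has arrived on the hard disk
is to count the extracted files and add up their sizes. This is a minimal
Python sketch with a hypothetical function name; it makes no claim about how
many files a complete copy should contain.

```python
import os


def summarise_tree(root):
    """Return (number of files, total size in bytes) for everything under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return file_count, total_bytes
```

Running summarise_tree(r"C:\BNC") after Steps 2 and 3 gives a rough check
that each folder unpacked; a count of zero means the extraction went
somewhere else.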

You can use the same procedure to unpack the BNC documentation. Select
DOC.TGZ and decompress it to an appropriate folder on your PC (eg
C:\BNCDOC). The information in the BNC documentation is invaluable as it
tells you what is contained in each file of BNC text.


Staff member
If you have encountered this...

Problem -

Dear colleagues and list members,

After having to buy a new laptop, I reinstalled the BNC. I have, unfortunately, forgotten how I opened the text files a few years ago and cannot find anything of any help in the manual.

In any case, when I try to open the tar.gz file (which I believe I have to do in order to access the texts? I tried to open it with WinZip), I get an error message which states 'texts.tar.gz is not a valid win32 application'. My operating system is Windows XP with Service Pack 2. Any help from those of you who are not as computer challenged as I am will be greatly appreciated.

Result -

Dear colleagues and list members,

Success! I have extracted the files. I downloaded 7zip, although I had some problems: not realizing that you can right click on the file and tell it a lot of things to do; patiently watching 7zip extract the files but then not being able to locate them (how stupid does that make you feel?); and finally finding the files and clicking on one but not being able to open it (I still can't do that; maybe I'm not supposed to be able to). Just as a test, I selected some files from 7zip and ran them through Wordsmith and it made a wordlist (whew!).

So a big thank you to Xiao Zhongua, Rob Raisch, Kolla Maeedhar (good to hear from you!), and Andy Roberts for all your prompt and helpful suggestions and advice.

And a very special thank you to Lou Burnard of BNC fame for taking the time to personally get back to me and reminding me that there is an 'install' file that does all the work for you. If I were in Houston, where my BNC discs are, instead of in Liverpool diligently (??) working on my thesis, I would indeed have noticed that (or perhaps not).

Best wishes (for this side of the pond), y'all take care (for the other side),


The problems are not solved yet. I now have the same problem installing the BNC. Advice from an expert is that the text files are not properly installed. My question is: how can I get Disk 1 (which includes the texts) installed? Any suggestions are appreciated.