The second release of the American National Corpus includes updated versions of all of the files in the first release plus an additional 10 million words. The second release also uses standoff annotations to a much greater extent than the first release did. All documents are now stored logically as annotation graphs, each with a node set and an edge set. The node set consists of a UTF-16 character stream with an implied node between each pair of characters and at the start and end of the stream. The edge set consists of one or more XML documents that describe the annotations.
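The node-and-edge model described above can be illustrated with a minimal sketch. It is not the corpus's actual file format: the XML element and attribute names (`annotations`, `edge`, `from`, `to`, `label`) and the helper functions are hypothetical, chosen only for illustration. It also assumes the text contains only characters from the Basic Multilingual Plane, so that Python string offsets coincide with UTF-16 code unit offsets.

```python
import xml.etree.ElementTree as ET

# Primary data: the raw character stream. Nodes are implied, not stored:
# one at each offset from 0 to len(text).
text = "The cat sat."

# Hypothetical standoff annotation document. The element and attribute
# names are illustrative and are NOT the actual ANC schema.
standoff_xml = """
<annotations>
  <edge from="0" to="3" label="DT"/>
  <edge from="4" to="7" label="NN"/>
  <edge from="8" to="11" label="VBD"/>
</annotations>
"""


def node_count(stream):
    """One node before the first character, one between each pair of
    characters, and one after the last character."""
    return len(stream) + 1


def load_edges(xml_source):
    """Parse the standoff document into (start_node, end_node, label) edges."""
    root = ET.fromstring(xml_source)
    return [
        (int(e.get("from")), int(e.get("to")), e.get("label"))
        for e in root.iter("edge")
    ]


def spans(stream, edges):
    """Resolve each edge to the stretch of text between its two nodes."""
    return [(label, stream[start:end]) for start, end, label in edges]


edges = load_edges(standoff_xml)
annotated = spans(text, edges)
for label, covered in annotated:
    print(label, repr(covered))
```

Because the annotations refer to the text only by node offsets, the character stream itself is never modified, and several annotation documents can be layered over the same text independently.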
See http://americannationalcorpus.org/2ndrelease.html#.