There are two sets of codes: one marks learners' errors, and the other marks parts of speech (POS). The POS codes are based on the TOSCA/LOB tagset and were assigned by this tagger. For the tagset, please visit http://english.htu.edu.cn/lingualsoft/index.htm. The COLEN corpus, which I built about three years ago, is also there.