1. JRC-Acquis: a large aligned parallel corpus in 21 languages, freely
available
SIZE AND FORMAT
- 21 languages (all 20 official EU languages plus Romanian)
- Average corpus size: 8.8 million words per language
- XML format following TEI P4, UTF-8 encoded
- Modular: download the...
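
As a rough illustration of what the TEI P4, UTF-8 format means for a user, the sketch below reads paragraphs from a TEI P4-style document and counts whitespace-separated words. The sample document, the `n` attribute on `<p>`, and the helper names `read_paragraphs` and `count_words` are assumptions made for this example; they are not taken from the corpus documentation, and real files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample. TEI P4 uses <TEI.2> as the root element;
# everything else here (ids, paragraph numbering, text) is invented.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<TEI.2 id="sample-en" lang="en">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample document</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p n="1">First paragraph.</p>
      <p n="2">Second paragraph with café.</p>
    </body>
  </text>
</TEI.2>
"""


def read_paragraphs(xml_bytes):
    """Return (n, text) pairs for every <p> under <text>/<body>."""
    root = ET.fromstring(xml_bytes)
    body = root.find("./text/body")
    if body is None:
        return []
    return [(p.get("n"), "".join(p.itertext()).strip()) for p in body.iter("p")]


def count_words(paragraphs):
    """Count whitespace-separated tokens across all paragraphs."""
    return sum(len(text.split()) for _, text in paragraphs)


if __name__ == "__main__":
    # Encode to bytes to mimic reading a UTF-8 file from disk.
    paragraphs = read_paragraphs(SAMPLE.encode("utf-8"))
    for n, text in paragraphs:
        print(n, text)
    print("words:", count_words(paragraphs))
```

Whitespace splitting is only a crude approximation of a word count, so it should not be expected to reproduce the per-language figures quoted above.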