Please use the following links to access the corpus collection.
- Roteiro - selective crawl of the most popular sites
- IA - broad crawls periodically made by the Internet Archive
- BN - selective crawls made by the Portuguese National Library
- AWP2 - exhaustive crawl of mostly the .pt domain
- AWP4 - exhaustive crawl of mostly the .pt domain