
Crawler-friendly homepage

For the Portuguese Web Archive to archive a site, the site must have a crawler-friendly homepage.

The Portuguese Web Archive crawler archives the web by crawling the homepages of sites (e.g. http://www.fccn.pt) first and then following links to the remaining contents.
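
The homepage-first behaviour described above can be illustrated with a minimal sketch. This is not the Portuguese Web Archive crawler itself: the `SITE` dictionary, its page paths and the `crawl` function are hypothetical, and a breadth-first traversal is assumed purely for illustration. The sketch shows that a page with no chain of links leading to it from the homepage is never visited.

```python
from collections import deque

# Hypothetical in-memory site: each page maps to the links it contains.
SITE = {
    "/": ["/about.html", "/news.html"],
    "/about.html": ["/"],
    "/news.html": ["/news/2010.html"],
    "/news/2010.html": [],
    "/orphan.html": [],  # no page links here
}


def crawl(start, site):
    """Visit pages breadth-first from `start`, following links once each."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in site.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


reached = crawl("/", SITE)
# Pages that exist on the site but cannot be found from the homepage.
unreachable = sorted(set(SITE) - set(reached))
print(reached)
print(unreachable)
```

In this toy site, `/orphan.html` ends up in `unreachable` because no page links to it.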

If the crawler cannot process the homepage of a site, it cannot find the links to the rest of the site's content. Therefore, to create crawler-friendly homepages:

  • Prefer HTML or XHTML formats;
  • Ensure that all content can be reached by following links from the homepage;
  • Do not create homepages composed exclusively of images or animations (e.g. Flash). If you must create a homepage of this kind, provide an alternative version of the homepage in HTML/XHTML format.
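
As a rough way to check a homepage against these recommendations, the sketch below extracts the `href` values of `<a>` tags using Python's standard `html.parser` module. The `LinkCollector` class, the `extract_links` function and the two sample homepages are hypothetical, and a real crawler may recognise more kinds of links than this one does. The point is only that a homepage made entirely of images or Flash gives a simple parser nothing to follow.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, as a very simple crawler might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the list of link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


# Hypothetical homepages: one with plain HTML links, one with only media.
friendly = (
    '<html><body><a href="/about.html">About</a> '
    '<a href="/news.html">News</a></body></html>'
)
unfriendly = (
    '<html><body><img src="splash.jpg">'
    '<object data="intro.swf"></object></body></html>'
)

print(extract_links(friendly))
print(extract_links(unfriendly))
```

The first homepage yields two links to follow; the second yields none, which is why an alternative HTML/XHTML version is recommended in that case.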