Texts published using textual formats

For archived content to be findable within a web archive, it is essential that text is published in a textual format.

Web archives process the text contained in archived content to make it searchable. However, it is frequently impossible to extract text from content in a non-textual format, such as images, executable programs or videos.

  • Publish texts using HTML/XHTML, because these are the most widely used and best supported textual formats on the Web.
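
To illustrate why the format matters, the sketch below shows roughly how an indexer might pull searchable text out of an HTML page. It is only an illustration built on Python's standard `html.parser` module, not the method used by any particular web archive; the `TextExtractor` class, the `extract_text` function and the two sample pages are hypothetical names and data introduced for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML page (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text found in an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical sample pages: the same information published as text and as an image.
textual_page = "<html><body><h1>Opening hours</h1><p>Monday to Friday, 9:00 to 17:00</p></body></html>"
image_page = '<html><body><img src="opening-hours.png"></body></html>'

print(repr(extract_text(textual_page)))  # 'Opening hours Monday to Friday, 9:00 to 17:00'
print(repr(extract_text(image_page)))    # ''
```

The page that publishes the opening hours as HTML text yields words that can be indexed, while the page that shows the same information only as an image yields nothing to search for.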

How to detect whether content is in a textual format

The following simple test checks whether text within a page was published in an adequate format:

  1. Select the text on the page;
  2. Choose Edit -> Copy in the browser;
  3. Choose Edit -> Paste in a text editor or word processor, such as Microsoft Word.

If you cannot successfully complete any one of these steps, the text was probably published in an inadequate format.
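
For authors with many files to check, a rough automated complement to the manual test is to look at the media type suggested by each file name. The sketch below is a different and much cruder technique than the copy and paste test: it uses Python's standard `mimetypes` module, and the `TEXTUAL_TYPES` set and `looks_textual` function are assumptions made for this example rather than part of the recommendation. An HTML page can still carry its text inside images, so the manual test remains the reference.

```python
import mimetypes

# Assumption for illustration: media types treated as "textual" here.
TEXTUAL_TYPES = {"text/html", "application/xhtml+xml", "text/plain"}


def looks_textual(filename):
    """Guess from the file name alone whether a file is in a textual format."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type in TEXTUAL_TYPES


# Hypothetical file names.
for name in ("index.html", "banner.png", "intro.mp4"):
    print(name, looks_textual(name))
```

Files flagged as non-textual by such a check are good candidates for the manual test, or for republishing their text as HTML.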

FCCN - Fundação para a Computação Científica Nacional UMIC - Agência para a Sociedade do Conhecimento POSC - Programa Operacional Sociedade do Conhecimento UE - União Europeia - FEDER - Fundo Europeu de Desenvolvimento Regional