Please use this identifier to cite or link to this record: http://hdl.handle.net/10400.26/470
Title: Introducing the Portuguese web archive initiative
Authors: Gomes, Daniel
Nogueira, André
Miranda, João
Costa, Miguel
Keywords: Archive
Date: Sep-2009
Publisher: Springer
Citation: Daniel Gomes, André Nogueira, João Miranda, Miguel Costa, Introducing the Portuguese web archive initiative, 8th International Web Archiving Workshop, Aarhus, Denmark, September 2008
Abstract: This paper introduces the Portuguese Web Archive initiative, presenting its main objectives and work in progress. Term search over web archive collections is a desirable feature that raises new challenges. The paper discusses how the size of the term index could be reduced without significantly decreasing the quality of search results. The results of the first crawl show that the Portuguese web comprises at least about 54 million contents, corresponding to 2.8 TB of data. The crawl of the Portuguese web was stored in 2 TB of disk space using the compressed ARC format.
URI: http://hdl.handle.net/10400.26/470
Appears in collections: FCCN - Fundação para a Computação Científica Nacional

Files in this record:
File: introducing-the-portuguese-web-archive-initiative.pdf
Size: 225.4 kB
Format: Adobe PDF

All records in the repository are protected by copyright law, with all rights reserved.