Publication

Introducing the Portuguese web archive initiative

dc.contributor.author: Gomes, Daniel
dc.contributor.author: Nogueira, André
dc.contributor.author: Miranda, João
dc.contributor.author: Costa, Miguel
dc.date.accessioned: 2009-11-25T13:41:40Z
dc.date.available: 2009-11-25T13:41:40Z
dc.date.issued: 2009-09
dc.description.abstract: This paper introduces the Portuguese Web Archive initiative, presenting its main objectives and work in progress. Term search over web archive collections is a desirable feature that raises new challenges. The paper discusses how the size of the term index could be reduced without significantly decreasing the quality of search results. Results from the first crawl show that the Portuguese web comprises at least about 54 million contents, corresponding to 2.8 TB of data. The crawl of the Portuguese web was stored in 2 TB of disk space using the compressed ARC format. [pt]
dc.identifier.citation: Daniel Gomes, André Nogueira, João Miranda, Miguel Costa, Introducing the Portuguese web archive initiative, 8th International Web Archiving Workshop, Aarhus, Denmark, September 2008 [pt]
dc.identifier.uri: http://hdl.handle.net/10400.26/470
dc.language.iso: eng [pt]
dc.publisher: Springer [pt]
dc.subject: Archive [pt]
dc.subject: Portugal [pt]
dc.subject: Preservation [pt]
dc.subject: History [pt]
dc.title: Introducing the Portuguese web archive initiative [pt]
dc.type: journal article
dspace.entity.type: Publication
oaire.citation.conferencePlace: Aarhus, Denmark [pt]
oaire.citation.title: 8th International Web Archiving Workshop [pt]
rcaap.rights: openAccess [pt]
rcaap.type: article [pt]

Files

Main
Name: introducing-the-portuguese-web-archive-initiative.pdf
Size: 225.4 KB
Format: Adobe Portable Document Format

License
Name: license.txt
Size: 1.91 KB
Format: Item-specific license agreed upon to submission