Publication

Improving NLTK for Processing Portuguese

dc.contributor.author: Ferreira, João
dc.contributor.author: Oliveira, Hugo Gonçalo
dc.contributor.author: Rodrigues, Ricardo
dc.date.accessioned: 2026-01-26T16:25:25Z
dc.date.available: 2026-01-26T16:25:25Z
dc.date.issued: 2019
dc.description.abstract: Python has a growing community of users, especially in the AI and ML fields. Yet, Computational Processing of Portuguese in this programming language is limited, in both available tools and results. This paper describes NLPyPort, a NLP pipeline in Python, primarily based on NLTK, and focused on Portuguese. It is mostly assembled from pre-existent resources or their adaptations, but improves over the performance of existing alternatives in Python, namely in the tasks of tokenization, PoS tagging, lemmatization and NER. [eng]
dc.identifier.citation: João Ferreira, Hugo Gonçalo Oliveira, and Ricardo Rodrigues. Improving NLTK for Processing Portuguese. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 18:1-18:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019) https://doi.org/10.4230/OASIcs.SLATE.2019.18
dc.identifier.doi: 10.4230/OASIcs.SLATE.2019.18
dc.identifier.uri: http://hdl.handle.net/10400.26/61201
dc.language.iso: eng
dc.peerreviewed: n/a
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: NLP
dc.subject: Tokenization
dc.subject: PoS tagging
dc.subject: Lemmatization
dc.subject: Named Entity Recognition
dc.title: Improving NLTK for Processing Portuguese [eng]
dc.type: conference object
dspace.entity.type: Publication
oaire.citation.endPage: 18:9
oaire.citation.startPage: 18:1
oaire.citation.title: 8th Symposium on Languages, Applications and Technologies (SLATE 2019)
oaire.citation.volume: 74
oaire.version: http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName: Rodrigues
person.givenName: Ricardo
person.identifier.ciencia-id: D31C-FB4A-FEAA
person.identifier.orcid: 0000-0002-6262-7920
relation.isAuthorOfPublication: c64ccf7c-eca2-43cf-a4a2-78e684499c00
relation.isAuthorOfPublication.latestForDiscovery: c64ccf7c-eca2-43cf-a4a2-78e684499c00

Files

Main
Name: OASIcs.SLATE.2019.18.pdf
Size: 450.74 KB
Format: Adobe Portable Document Format
License
Name: license.txt
Size: 1.85 KB
Format: Item-specific license agreed upon to submission
Description: