ESEC - Conference and Congress Communications
Recent submissions
- Stereotyping and Identity Struggles: From the Tyrannization of Saddam Hussein to the Westernization of the Iraqi People in the Newspaper Público
  Cerqueira Borges, Susana Maria
  We address the defense of war as the "lesser evil" that will free an oppressed people and guarantee international security, as argued in the editorials of the newspaper Público in March 2003. We confront their rhetoric of division between West and East, grounded in the dichotomy between "Us" and the "Other", with the premise that these terms lack ontological stability, being "made of human effort, partly affirmation, partly identification of the other" (Said, 2004: XIII). The influence of media communication is weighed between its potential as an instrument of ideological hegemony (Gramsci, 1974: 393) and its capacity for the "regular production of legitimate power through language (in a communicational sense of opening the media to the world, to life and to human experience)" (Esteves, 2005: 38). If "politics is the site par excellence of symbolic efficacy, action exercised through signs capable of producing social things and, above all, groups" (Bourdieu, 2001: 159), we problematize the political function of these editorials in constructing an "ideological us" that aims to manufacture a consensus legitimizing the war (Rojo, 1995: 75-76), serving the self-referential closure of the political system (Luhmann, 2006: 87) to the detriment of rational-critical debate (Habermas, 1998: 443).
- LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese
  Rodrigues, Ricardo; Oliveira, Hugo Gonçalo; Gomes, Paulo
  Although lemmatization is a very common subtask in many natural language processing tasks, there is a lack of truly cross-platform lemmatization tools specifically targeted at Portuguese, namely for integration in projects developed in Java. To address this issue, we have developed a lemmatizer, initially just for our own use, which we have since decided to make publicly available. The lemmatizer, presented in this document, yields an overall accuracy of over 98% when compared against a manually revised corpus.
- NLPPort: A Pipeline for Portuguese NLP
  Rodrigues, Ricardo; Oliveira, Hugo Gonçalo; Gomes, Paulo
  Although there are tools for some of the most common natural language processing tasks in Portuguese, there is a lack of end-to-end, cross-platform tools specifically targeted at Portuguese, namely for integration in projects developed in Java. To address this issue, we have developed and tweaked, over the last half-dozen years, NLPPort, a set of tools that can be used in a pipelined fashion and that we have made publicly available. In this paper, we present the major features of this set of tools.
- ASAPP 2.0: Advancing the state-of-the-art of semantic textual similarity for Portuguese
  Alves, Ana; Oliveira, Hugo Gonçalo; Rodrigues, Ricardo; Encarnação, Rui
  Semantic Textual Similarity (STS) aims at computing the proximity of meaning transmitted by two sentences. In 2016, the ASSIN shared task targeted STS in Portuguese and released training and test collections. This paper describes the development of ASAPP, a system that participated in ASSIN, has been improved since then, and now achieves the best results in this task. ASAPP learns an STS function from a broad range of lexical, syntactic, semantic and distributional features. This paper describes the features used in the current version of ASAPP, and how they are exploited in a regression algorithm to achieve the best published results for ASSIN to date, in both European and Brazilian Portuguese.
- Using Lucene for Developing a Question-Answering Agent in Portuguese
  Oliveira, Hugo Gonçalo; Filipe, Ricardo; Rodrigues, Ricardo; Alves, Ana
  Given the limitations of available platforms for creating conversational agents, and that a question-answering agent suffices in many scenarios, we take advantage of the Information Retrieval library Lucene for developing such an agent for Portuguese. The solution described answers natural language questions based on an indexed list of FAQs. Adapting it to a different domain is a matter of changing the underlying list. Different configurations of this solution, mostly at the language analysis level, resulted in different search strategies, which were tested for answering questions about economic activity in Portugal. In addition to comparing the different search strategies, we concluded that combining the results of different strategies with a voting method leads to better answers.
- NLPyPort: Named Entity Recognition with CRF and Rule-Based Relation Extraction
  Ferreira, João; Oliveira, Hugo Gonçalo; Rodrigues, Ricardo
  This paper describes the application of the NLPyPort pipeline to Named Entity Recognition (NER) and Relation Extraction in Portuguese, more precisely in the scope of the IberLEF-2019 evaluation task on the topic. NER was tackled with CRF, based on several features, and trained on the HAREM collection, but results were low. This was partly caused by an issue with the submitted model, which had been trained on lowercase text, but apparently also by the training data used, which highlights the different natures of HAREM, the source of the majority of the testing corpus, and SIGARRA. Relations were extracted with a set of rules bootstrapped from the examples provided by the organisation. Despite an F1-score of 0.72, we were the only participants in this task. We also express our doubts concerning the utility of the extracted relations.
- Improving NLTK for Processing Portuguese
  Ferreira, João; Oliveira, Hugo Gonçalo; Rodrigues, Ricardo
  Python has a growing community of users, especially in the AI and ML fields. Yet, Computational Processing of Portuguese in this programming language is limited, in both available tools and results. This paper describes NLPyPort, an NLP pipeline in Python, primarily based on NLTK, and focused on Portuguese. It is mostly assembled from pre-existing resources or their adaptations, but improves over the performance of existing alternatives in Python, namely in the tasks of tokenization, PoS tagging, lemmatization and NER.
- Assessing Factoid Question-Answer Generation for Portuguese
  Ferreira, João; Rodrigues, Ricardo; Oliveira, Hugo Gonçalo
  We present work on the automatic generation of question-answer pairs in Portuguese, useful, for instance, for populating the knowledge-base of question-answering systems. This includes: (i) a new corpus of close to 600 factoid sentences, manually created from an existing corpus of questions and answers, used as our benchmark; (ii) two approaches for the automatic generation of question-answer pairs, which can be seen as baselines; (iii) results of those approaches in the corpus.
- AIA-BDE: A Corpus of FAQs in Portuguese and their Variations
  Oliveira, Hugo Gonçalo; Ferreira, João; Santos, José; Fialho, Pedro; Rodrigues, Ricardo; Coheur, Luísa; Alves, Ana
  We present AIA-BDE, a corpus of 380 domain-oriented FAQs in Portuguese and their variations, i.e., paraphrases or entailed questions, created manually, by humans, or automatically, with Google Translate. It is intended as a benchmark for FAQ retrieval and automatic question-answering, but may be useful in other contexts, such as the development of task-oriented dialogue systems, or models for natural language inference in an interrogative context. We also report on two experiments. In the first, matching variations with their original questions was not trivial with a set of unsupervised baselines, especially for manually created variations. Besides the high performance obtained with ELMo and BERT embeddings, an Information Retrieval system was surprisingly competitive when considering only the first hit. In the second experiment, text classifiers were trained with the original questions, and tested when assigning each variation to one of three possible sources, or marking it as out-of-domain. Here, the difference between manual and automatic variations was not so significant.
- Masterblind: testing the usability of auditory feedback in a computer game for blind people
  Teixeira, Ana Rita; Carvalhal, Ana; Abrantes, Filipe; Lourenço, Vladimiro; Gomes, Anabela; Orvalho, Joao
  This study presents Masterblind, an adaptation of the Mastermind board game for blind users. Given the focus on visual information in the original game, the game mechanics were simplified and auditory feedback was introduced. The research objective was to understand what kind of sounds would work better to help blind people play the game. Three versions were presented to the subjects (pentatonic notes, animal sounds and vowels) to help users recall previous steps in the game. The main hypothesis predicted that blind users would consciously benefit from the auditory feedback provided. The second hypothesis predicted that users would benefit less from feedback that doesn't provide semantic information. The results were consistent with the hypotheses, although they revealed an important role for spatial awareness. Masterblind can be a usable, enjoyable and challenging experience for blind users as long as it provides semantically significant feedback.
