Nogueira Leitão Lima Grilo, Ana Margarida

Search Results

  • The BioVisualSpeech corpus of words with sibilants for speech therapy games development
    Publication . Cavaco, Sofia; Guimarães, Isabel; Ascensão, Mariana; Abad, Alberto; Anjos, Ivo; Oliveira, Francisco; Martins, Sofia; Marques, Nuno; Eskenazi, Maxine; Magalhães, João; Grilo, Ana Margarida
    Abstract: In order to develop computer tools for speech therapy that reliably classify speech productions, there is a need for speech production corpora that characterize the target population in terms of age, gender, and native language. Apart from including correct speech productions, in order to characterize the target population, the corpora should also include samples from people with speech sound disorders. In addition, the annotation of the data should include information on the correctness of the speech productions. Following these criteria, we collected a corpus that can be used to develop computer tools for speech and language therapy of Portuguese children with sigmatism. The proposed corpus contains European Portuguese children’s word productions in which the words have sibilant consonants. The corpus has productions from 356 children from 5 to 9 years of age. Some important characteristics of this corpus, that are relevant to speech and language therapy and computer science research, are that (1) the corpus includes data from children with speech sound disorders; and (2) the productions were annotated according to the criteria of speech and language pathologists, and have information about the speech production errors. These are relevant features for the development and assessment of speech processing tools for speech therapy of Portuguese children. In addition, as an illustration on how to use the corpus, we present three speech therapy games that use a convolutional neural network sibilants classifier trained with data from this corpus and a word recognition module trained on additional children data and calibrated and evaluated with the collected corpus.
  • The BioVisualSpeech European Portuguese sibilants corpus
    Publication . Grilo, Ana Margarida; Guimarães, Isabel; Ascensão, Mariana; Abad, Alberto; Anjos, Ivo; Magalhães, João; Cavaco, Sofia
    Abstract: The development of reliable speech therapy computer tools that automatically classify speech productions depends on the quality of the speech data set used to train the classification algorithms. The data set should characterize the population in terms of age, gender and native language, but it should also have other important properties that characterize the population that is going to use the tool. Thus, apart from including samples from correct speech productions, it should also have samples from people with speech disorders. Also, the annotation of the data should include information on whether the phonemes are correctly or wrongly pronounced. Here, we present a corpus of European Portuguese children's speech data that we are using in the development of speech classifiers for speech therapy tools for Portuguese children. The corpus includes data from children with speech disorders and in which the labelling includes information about the speech production errors. This corpus, which has data from 356 children from 5 to 9 years of age, focuses on the European Portuguese sibilant consonants and can be used to train speech recognition models for tools to assist the detection and therapy of sigmatism.