First, I discuss the advantages of having text corpora and of making them available, both for corpus providers and for corpus users and, more generally, for anyone interested in studying language or working with natural language processing.
I then suggest some reasons why so few corpora of Portuguese have so far been made generally available, describing legal and economic problems, technical difficulties, and cultural impediments.
Using my previous work with the Oslo Corpus of Bosnian Texts as an example, I list the advantages of making corpora available from three perspectives: general properties of WWW-based systems, specific advantages for corpus compilers, and specific advantages for corpus users.
Finally, I suggest that Evaluation Contests for Portuguese, based on available corpus data, be organized.