13 research outputs found

    Prosodic, syntactic, semantic guidelines for topic structures across domains and corpora

    This paper presents the annotation guidelines applied to naturally occurring speech, aiming at an integrated account of contrast and parallel structures in European Portuguese. These guidelines were defined to allow for the empirical study of interactions among intonation and syntax-discourse patterns in selected sets of different corpora (monologues and dialogues, by adults and teenagers). In this paper we focus on the multilayer annotation process of left periphery structures, using a small sample of highly spontaneous speech in which the distinct types of topic structures are displayed. The analysis of this sample provides fundamental training and testing material for further application in a wider range of domains and corpora. The annotation process comprises the following time-linked levels (manual and automatic): phone, syllable and word level transcriptions (including co-articulation effects); tonal events and break levels; part-of-speech tagging; syntactic-discourse patterns (construction type; construction position; syntactic function; discourse function); and disfluency events. Speech corpora with such a multi-level annotation are a valuable resource for examining relations among grammar modules in language use from an integrated viewpoint. Such a viewpoint is innovative for our language and has not often been adopted in studies of other languages.

    IDEST: International Database of Emotional Short Texts

    We introduce a database (IDEST) of 250 short stories rated for valence, arousal, and comprehensibility in two languages. The texts, with a narrative structure telling a story in the first person and controlled for length, were originally written in six different languages (Finnish, French, German, Portuguese, Spanish, and Turkish), and rated for arousal, valence, and comprehensibility in the original language. The stories were translated into English, and the same ratings for the English translations were collected via an internet survey tool (N = 573). In addition to the rating data, we also report readability indexes for the original and English texts. The texts have been categorized into different story types based on their emotional arc. The texts score high on comprehensibility and represent a wide range of emotional valence and arousal levels. The comparative analysis of the ratings of the original texts and English translations showed that valence ratings were very similar across languages, whereas correlations between the two pairs of language versions for arousal and comprehensibility were modest. Comprehensibility ratings correlated with only some of the readability indexes. The database is published at osf.io/9tga3 and is freely available for academic research.

    The national inventory of geological heritage: methodological approach and results

    The existence of a national inventory of geological heritage is of paramount importance for the implementation of geoconservation strategies. This paper presents the methodological approach used to produce the most complete inventory of geosites carried out in Portugal so far, together with the main results obtained. The inventory will be integrated into the Sistema de Informação do Património Natural and the Cadastro Nacional dos Valores Naturais Classificados, both managed by the Instituto de Conservação da Natureza e da Biodiversidade, the Portuguese authority for nature conservation. This work is supported by the Fundação para a Ciência e a Tecnologia through the multi-year funding of CGUP and the research project “Identificação, caracterização e conservação do património geológico: uma estratégia de geoconservação para Portugal” (PTDC/CTE-GEX/64966/2006).

    Mammals in Portugal: a data set of terrestrial, volant, and marine mammal occurrences in Portugal

    Mammals are threatened worldwide, with ~26% of all species being included in the IUCN threatened categories. This overall pattern is primarily associated with habitat loss or degradation and human persecution for terrestrial mammals, and with pollution, open net fishing, climate change, and prey depletion for marine mammals. Mammals play a key role in maintaining ecosystem functionality and resilience, and therefore information on their distribution is crucial to delineate and support conservation actions. MAMMALS IN PORTUGAL is a publicly available data set compiling unpublished georeferenced occurrence records of 92 terrestrial, volant, and marine mammals in mainland Portugal and the archipelagos of the Azores and Madeira. It includes 105,026 data entries between 1873 and 2021 (72% of the data occurring between 2000 and 2021). The methods used to collect the data were: live observations/captures (43%), sign surveys (35%), camera trapping (16%), bioacoustics surveys (4%), and radiotracking and inquiries, which represent less than 1% of the records. The data set includes 13 types of records: (1) burrows | soil mounds | tunnel, (2) capture, (3) colony, (4) dead animal | hair | skulls | jaws, (5) genetic confirmation, (6) inquiries, (7) observation of live animal, (8) observation in shelters, (9) photo trapping | video, (10) predators diet | pellets | pine cones/nuts, (11) scat | track | ditch, (12) telemetry, and (13) vocalization | echolocation. The spatial uncertainty of most records ranges between 0 and 100 m (76%). Rodentia (n = 31,573) has the highest number of records, followed by Chiroptera (n = 18,857), Carnivora (n = 18,594), Lagomorpha (n = 17,496), Cetartiodactyla (n = 11,568), and Eulipotyphla (n = 7,008). The data set includes records of species classified by the IUCN as threatened (e.g., Oryctolagus cuniculus [n = 12,159], Monachus monachus [n = 1,512], and Lynx pardinus [n = 197]). We believe that this data set may stimulate the publication of data sets for other European countries, which would certainly contribute to ecology and conservation-related research and thereby assist in the development of more accurate and tailored conservation management strategies for each species. There are no copyright restrictions; please cite this data paper when the data are used in publications.