297 research outputs found
Análisis de la alimentación de estadios larvarios de Sardina pilchardus (Walbaum, 1792) en el mar Cantábrico central
Diet analysis of larval fish may clarify the role that zooplankton (not only its abundance but also its size structure and taxonomic composition) may play in larval growth and subsequent recruitment levels. Because larvae increase in size throughout the larval stages, we followed a larval size-dependent approach to analyse prey size in the diet of larval sardine (Sardina pilchardus). Studies on feeding patterns of clupeid larvae are typically hampered by regurgitation and defecation of gut contents during capture. An alternative sampling method was therefore tested in this study, but no significant differences from conventional methods were found. Despite the low feeding incidence observed (23%), we found a circadian feeding pattern, with the highest mean gut contents after dawn, decreasing during the day, and the lowest values at night. The diet was mostly composed of copepod developmental stages, mainly nauplii, and prey size increased with larval size following a power function. Maximum and mean prey size were related to larval mouth gape, though other factors may restrict the maximum prey size ingested, since most prey width values were between 20 and 40% of larval mandible width.
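The power-function relationship between prey size and larval size mentioned in this abstract can be illustrated with a minimal sketch. It assumes the form y = a * x**b and fits it by ordinary least squares on log-transformed values. The abstract does not state the fitting procedure, and the function name `fit_power_law`, the variable names, and the numbers below are hypothetical, noise-free synthetic values rather than data from the study.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by linear least squares on (log x, log y).

    Returns the tuple (a, b). All inputs must be strictly positive.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    if any(x <= 0 for x in xs) or any(y <= 0 for y in ys):
        raise ValueError("power-law fitting requires positive values")

    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n

    sxx = sum((u - mean_x) ** 2 for u in log_x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))

    b = sxy / sxx                        # exponent = slope in log-log space
    a = math.exp(mean_y - b * mean_x)    # coefficient = exp(intercept)
    return a, b


# Synthetic illustration only: values generated from a known power law
# (a = 30, b = 0.6), NOT measurements from the sardine study.
larval_length_mm = [5.0, 8.0, 12.0, 16.0, 20.0]
prey_width_um = [30.0 * length ** 0.6 for length in larval_length_mm]

a, b = fit_power_law(larval_length_mm, prey_width_um)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Fitting in log-log space is a common choice because a power function becomes a straight line there. With noisy field data, a direct nonlinear fit could give somewhat different estimates.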
Building the Gold Standard for the surface syntax of Basque
In this paper, we present the construction process of SF-EPEC, a syntactically annotated 300,000-word corpus that aims to be a Gold Standard for the surface syntactic processing of Basque. First, the tagset designed for this purpose is described; since Basque is an agglutinative language, complex syntactic tags were sometimes needed. We also describe the different phases in the construction of SF-EPEC.
Patterns of Text Readability in Human and Predicted Eye Movements
It has been shown that multilingual transformer models are able to predict human reading behavior when fine-tuned on small amounts of eye tracking data. As the aggregated prediction results do not provide insights into the linguistic cues that the model acquires to predict reading behavior, we conduct a deeper analysis of the predictions from the perspective of readability. We try to disentangle the three-fold relationship between human eye movements, the capability of language models to predict these eye movement patterns, and sentence-level readability measures for English. We compare a range of model configurations to multiple baselines. We show that the models exhibit difficulties with function words and that pre-training provides only limited advantages for linguistic generalization.
Association between chloroplast DNA and mitochondrial DNA haplotypes in Prunus spinosa L. (Rosaceae) populations across Europe
Chloroplast DNA (cpDNA) and mitochondrial DNA (mtDNA) were studied in 24 populations of Prunus spinosa sampled across Europe. The cpDNA and mtDNA fragments were amplified using universal primers and subsequently digested with restriction enzymes to obtain the polymorphisms. Combinations of all the polymorphisms resulted in 33 cpDNA haplotypes and two mtDNA haplotypes. Strict association between the cpDNA haplotypes and the mtDNA haplotypes was detected in most cases, indicating conjoint inheritance of the two genomes. The most frequent and abundant cpDNA haplotype (C20; frequency, 51 %) is always associated with the more frequent and abundant mtDNA haplotype (M1; frequency, 84 %). All but two of the cpDNA haplotypes associated with the less frequent mtDNA haplotype (M2) are private haplotypes. These private haplotypes are phylogenetically related but geographically unrelated. They form a separate cluster on the minimum-length spanning tree.
We thank Dr Remy J. Petit for providing significant support as a coordinator during this project, and for helpful suggestions and valuable comments on the manuscript. The research was supported by the European Community research programme FAIR5‐CT97‐3795.
Natural Language Processing and Language Technologies for the Basque Language
The presence of a language in the digital domain is crucial for its survival, as online communication and digital language resources have become the standard in recent decades and will gain more importance in the coming years. In order to develop the advanced systems that are considered basic for efficient digital communication (e.g. machine translation systems, text-to-speech and speech-to-text converters, and digital assistants), it is necessary to digitalise linguistic resources and create tools. In the case of Basque, scholars have spent the last forty years studying the creation of digital linguistic resources and of the tools that allow those systems to be developed. In this paper, we present an overview of the natural language processing and language technology resources developed for Basque, their impact on the process of making Basque a “digital language”, and the applications and challenges in multilingual communication. More precisely, we present the well-known products for Basque, together with the basic tools and resources behind the products we use every day. We also hope that this survey will serve as a guide for other minority languages that are making their way towards digitalisation.
Received: 05 April 2022
Accepted: 20 May 202
This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models
Although large language models (LLMs) have apparently acquired a certain
level of grammatical knowledge and the ability to make generalizations, they
fail to interpret negation, a crucial step in Natural Language Processing. We
try to clarify the reasons for the sub-optimal performance of LLMs in
understanding negation. We introduce a large semi-automatically generated
dataset of circa 400,000 descriptive sentences about commonsense knowledge that
can be true or false, in which negation is present in different forms in about
2/3 of the corpus. We have used our dataset with the largest available open LLMs
in a zero-shot approach to grasp their generalization and inference capability
and we have also fine-tuned some of the models to assess whether the
understanding of negation can be trained. Our findings show that, while LLMs
are proficient at classifying affirmative sentences, they struggle with
negative sentences and lack a deep understanding of negation, often relying on
superficial cues. Although fine-tuning the models on negative sentences
improves their performance, the lack of generalization in handling negation is
persistent, highlighting the ongoing challenges of LLMs regarding negation
understanding and generalization. The dataset and code are publicly available.
Comment: Accepted at the 2023 Conference on Empirical Methods in Natural
Language Processing (EMNLP 2023)
Noisy Channel for Automatic Text Simplification
In this paper we present a simple re-ranking method for Automatic Sentence
Simplification based on the noisy channel scheme. Instead of directly computing
the best simplification given a complex text, the re-ranking method also
considers the probability that the simple sentence produces its complex
counterpart, as well as the probability of the simple text itself, according to
a language model. Our experiments show that combining these scores outperforms
the original system on three different English datasets, yielding the best
known result in one of them. Adopting the noisy channel scheme opens new ways
to infuse additional information into ATS systems, and thus to control
important aspects of them, a known limitation of end-to-end neural seq2seq
generative models.
Comment: 8 pages
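The re-ranking idea described in this abstract can be sketched as follows. Each candidate simplification carries three log-probabilities: the direct model score log P(simple | complex), the channel score log P(complex | simple), and the language-model score log P(simple). The abstract does not say how the scores are combined, so the weighted sum below is an assumption. The names `Candidate`, `noisy_channel_score`, and `rerank`, the weights, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    direct_logprob: float   # log P(simple | complex), from the original system
    channel_logprob: float  # log P(complex | simple), from a reverse model
    lm_logprob: float       # log P(simple), from a language model


def noisy_channel_score(c, w_direct=1.0, w_channel=1.0, w_lm=1.0):
    """Weighted sum of the three log-probabilities (assumed combination)."""
    return (w_direct * c.direct_logprob
            + w_channel * c.channel_logprob
            + w_lm * c.lm_logprob)


def rerank(candidates, **weights):
    """Return candidates sorted from best to worst combined score."""
    return sorted(candidates,
                  key=lambda c: noisy_channel_score(c, **weights),
                  reverse=True)


# Hypothetical n-best list with made-up scores, for illustration only.
candidates = [
    Candidate("copy of the complex sentence", -1.0, -0.5, -9.0),
    Candidate("short simple sentence", -1.5, -2.0, -4.0),
    Candidate("over-simplified sentence", -3.0, -6.0, -3.0),
]

best = rerank(candidates)[0]
print(best.text)
```

In this toy list the direct score alone would prefer the near-copy of the input, while the combined score prefers the candidate that balances all three terms. In practice the weights would be tuned on held-out data.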