41 research outputs found

    Basque-to-Spanish and Spanish-to-Basque machine translation for the health domain

    Get PDF
    This project presents the initial steps towards the objective of developing a machine translation system for the health domain between Basque and Spanish. In the absence of a sufficiently large bilingual corpus, several experiments were carried out to test different Neural Machine Translation parameters on an out-of-domain corpus, while performance on the health domain was evaluated with a manually translated corpus on systems trained with an increasing presence of health-domain corpora. The results obtained represent a first step toward the described objective.
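
    The "increasing presence of health-domain corpora" mentioned above is commonly realized by oversampling a small in-domain corpus when assembling each training set. A minimal sketch of that corpus-mixing step, assuming hypothetical file names and tab-separated sentence pairs (none of these details come from the thesis itself):

```python
import itertools

def build_training_corpus(out_domain_path, in_domain_path, oversample=1):
    """Concatenate an out-of-domain corpus with `oversample` copies of a
    small in-domain (health) corpus; one source/target pair per line."""
    with open(out_domain_path, encoding="utf-8") as f:
        out_domain = f.read().splitlines()
    with open(in_domain_path, encoding="utf-8") as f:
        in_domain = f.read().splitlines()
    # Repeating the in-domain data raises its relative weight during training.
    return out_domain + list(itertools.chain.from_iterable(
        itertools.repeat(in_domain, oversample)))

# e.g. train systems on corpora with 1x, 4x and 16x in-domain presence:
# for k in (1, 4, 16):
#     corpus = build_training_corpus("news.eu-es.tsv", "health.eu-es.tsv", k)
```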

    A method for estimation of elasticities in metabolic networks using steady state and dynamic metabolomics data and linlog kinetics

    Get PDF
    BACKGROUND: Dynamic modeling of metabolic reaction networks under in vivo conditions is a crucial step towards a better understanding of the (dis)functioning of living cells. So far, dynamic metabolic models have generally been based on mechanistic rate equations, which often contain so many parameters that their identifiability from experimental data poses a serious problem. Recently, approximative rate equations based on the linear logarithmic (linlog) format have been proposed as a suitable alternative with fewer parameters. RESULTS: In this paper we present a method for estimating the kinetic model parameters, which are equal to the elasticities defined in Metabolic Control Analysis, from metabolite data obtained from dynamic as well as steady-state perturbations, using the linlog kinetic format. Additionally, we address the question of parameter identifiability from dynamic perturbation data in the presence of noise. The method is illustrated using metabolite data generated with a dynamic model of the glycolytic pathway of Saccharomyces cerevisiae based on mechanistic rate equations. Elasticities are estimated from the generated data, and these define the complete linlog kinetic model of glycolysis. The effect of data noise on the accuracy of the estimated elasticities is presented. Finally, an identifiable subset of parameters is determined using information on the standard deviations of the estimated elasticities obtained through Monte Carlo (MC) simulations. CONCLUSION: The parameter estimation within the linlog kinetic framework as presented here allows the determination of the elasticities directly from experimental data from typical dynamic and/or steady-state experiments. These elasticities allow the reconstruction of the full kinetic model of Saccharomyces cerevisiae and the determination of the control coefficients. MC simulations revealed that certain elasticities are potentially unidentifiable from dynamic data alone; adding steady-state perturbations of enzyme activities solved this problem.
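
    In the linlog format, each normalized rate is linear in the logarithms of the normalized metabolite concentrations, so the elasticities can in principle be recovered by ordinary linear least squares. A minimal sketch of that regression step with synthetic data standing in for measured metabolite and flux values (variable names and numbers are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: 3 metabolites observed under 50 perturbations.
n_obs, n_met = 50, 3
true_eps = np.array([1.2, -0.8, 0.3])       # "true" elasticities
X = rng.normal(size=(n_obs, n_met))         # ln(x / x0): log-normalized levels
v_over_v0 = 1.0 + X @ true_eps + 0.05 * rng.normal(size=n_obs)

# Linlog rate law with the enzyme level fixed at its reference (e = e0):
#   v / v0 = 1 + sum_j eps_j * ln(x_j / x0_j)
# so the elasticities follow from a linear least-squares fit.
eps_hat, *_ = np.linalg.lstsq(X, v_over_v0 - 1.0, rcond=None)
print("estimated elasticities:", eps_hat)   # close to true_eps
```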

    When and Why Did Human Brains Decrease in Size? A New Change-Point Analysis and Insights From Brain Evolution in Ants

    Get PDF
    Human brain size nearly quadrupled in the six million years since Homo last shared a common ancestor with chimpanzees, but human brains are thought to have decreased in volume since the end of the last Ice Age. The timing and reason for this decrease are enigmatic. Here we use change-point analysis to estimate the timing of changes in the rate of hominin brain evolution. We find that hominin brains experienced positive rate changes at 2.1 and 1.5 million years ago, coincident with the early evolution of Homo and technological innovations evident in the archeological record. But we also find that human brain size reduction was surprisingly recent, occurring in the last 3,000 years. Our dating does not support hypotheses concerning brain size reduction as a by-product of body size reduction, a result of a shift to an agricultural diet, or a consequence of self-domestication. We suggest our analysis supports the hypothesis that the recent decrease in brain size may instead result from the externalization of knowledge and the advantages of group-level decision-making, due in part to the advent of social systems of distributed cognition and of the storage and sharing of information. Humans live in social groups in which multiple brains contribute to the emergence of collective intelligence. Although difficult to study in the deep history of Homo, the impacts of group size, social organization, collective intelligence and other potential selective forces on brain evolution can be elucidated using ants as models. The remarkable ecological diversity and species richness of ants encompass forms convergent with aspects of human sociality, including large group size, agrarian life histories, division of labor, and collective cognition. Ants provide a wide range of social systems to generate and test hypotheses concerning brain size enlargement or reduction and aid in interpreting patterns of brain evolution identified in humans. Although humans and ants represent very different routes in social and cognitive evolution, the insights ants offer can broadly inform us of the selective forces that influence brain size.
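
    A change-point analysis of an evolutionary rate can be illustrated with a deliberately simple variant: fit two independent linear segments and scan candidate breakpoints for the split that minimizes total squared error. The sketch below runs on synthetic data, not the hominin endocranial-volume dataset analyzed in the paper:

```python
import numpy as np

def best_changepoint(t, y, min_pts=5):
    """Fit two independent linear segments split at each candidate
    breakpoint; return the split minimizing the total squared error."""
    def sse(ts, ys):
        coeffs = np.polyfit(ts, ys, 1)
        return float(np.sum((np.polyval(coeffs, ts) - ys) ** 2))
    best_err, best_t = np.inf, None
    for k in range(min_pts, len(t) - min_pts):
        err = sse(t[:k], y[:k]) + sse(t[k:], y[k:])
        if err < best_err:
            best_err, best_t = err, t[k]
    return best_t

# Synthetic series with a rate change at t = 7.
t = np.linspace(0, 10, 120)
y = np.where(t < 7, 0.5 * t, 3.5 + 2.0 * (t - 7))
y += np.random.default_rng(1).normal(0, 0.2, t.size)
print("estimated change point near t =", best_changepoint(t, y))
```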

    Online Learning in Neural Machine Translation

    Full text link
    High quality translations are in high demand these days. Although machine translation offers acceptable performance, it is not sufficient in some cases and human supervision is required. In order to ease the human's translation task, machine translation systems take part in this process. When a sentence in the source language needs to be translated, it is fed to the system, which outputs a hypothesis translation. The human then corrects this hypothesis (a step known as post-editing) in order to obtain a high quality translation. Being able to transfer the knowledge that a human translator exhibits when post-editing a translation to the machine translation system is a desirable feature, as it has been proven that a more accurate machine translation system helps to increase the efficiency of the post-editing process. Because the post-editing scenario requires an already trained system, online learning techniques are suited for this task. In this work, three online learning algorithms have been proposed and applied to a neural machine translation system in a post-editing scenario. They rely on the Passive-Aggressive online learning approach, in which the model is updated after every sample in order to fulfil a correctness criterion while retaining previously learned information. The goal is to adapt and refine an already trained system with new samples on-the-fly as the post-editing process takes place (hence, the update time must be kept under control). Moreover, these new algorithms are compared with well-established online learning variants of the stochastic gradient descent algorithm. Results show improvements in the translation quality of the system after applying these algorithms, reducing human effort in the post-editing process. Cebrián Chuliá, L. (2017). Aprendizaje en línea en traducción automática basada en redes neuronales (TFG). http://hdl.handle.net/10251/86299
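
    The Passive-Aggressive approach mentioned above has a closed-form per-sample update: stay passive when the current model already satisfies the correctness criterion, and otherwise move just far enough to satisfy it. Below is a minimal sketch of the classic PA-I update for a linear classifier (Crammer et al., 2006); the thesis adapts this idea to the parameters of a neural translation model, which this toy example does not attempt to reproduce:

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """PA-I update for binary classification with labels y in {-1, +1}.
    Passive if the hinge loss is zero; otherwise aggressively move w
    just enough to correct the mistake, capped by aggressiveness C."""
    loss = max(0.0, 1.0 - y * float(w @ x))   # hinge loss on this sample
    tau = min(C, loss / float(x @ x))         # PA-I step size
    return w + tau * y * x

# Online pass over a stream of (x, y) samples:
rng = np.random.default_rng(0)
w = np.zeros(4)
for _ in range(100):
    x = rng.normal(size=4)
    y = 1 if x[0] + 0.5 * x[1] > 0 else -1    # hidden target concept
    w = pa_update(w, x, y)
print("learned weights:", w)
```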

    Snowmass Theory Frontier Report

    Full text link
    This report summarizes the recent progress and promising future directions in theoretical high-energy physics (HEP) identified within the Theory Frontier of the 2021 Snowmass Process. Comment: contribution to the US Community Study on the Future of Particle Physics (Snowmass 2021); v2: fixed typo.

    Failure-awareness and dynamic adaptation in data scheduling

    Get PDF
    Over the years, scientific applications have become more complex and more data intensive. Especially large-scale simulations and scientific experiments in areas such as physics, biology, astronomy and earth sciences demand highly distributed resources to satisfy excessive computational requirements. Increasing data requirements and the distributed nature of the resources have made I/O the major bottleneck for end-to-end application performance. Existing systems fail to address issues such as reliability, scalability, and efficiency in dealing with wide-area data access, retrieval and processing. In this study, we explore data-intensive distributed computing and study the challenges of data placement in distributed environments. After analyzing different application scenarios, we develop new data scheduling methodologies and the key attributes for reliability, adaptability and performance optimization of distributed data placement tasks. Inspired by techniques used in microprocessor and operating system architectures, we extend and adapt some of the known low-level data handling and optimization techniques to distributed computing. The two major contributions of this work are (i) a failure-aware data placement paradigm for increased fault tolerance, and (ii) adaptive scheduling of data placement tasks for improved end-to-end performance. Failure-aware data placement comprises early error detection, error classification, and use of this information in scheduling decisions for the prevention of and recovery from possible future errors. The adaptive scheduling approach dynamically tunes data transfer parameters over wide-area networks for efficient utilization of the available network capacity and optimized end-to-end data transfer performance.
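
    The failure-aware placement idea (detect an error early, classify it, and let the scheduler react accordingly) can be sketched as follows; the error taxonomy and retry policy here are illustrative assumptions, not the classification developed in the dissertation:

```python
from enum import Enum, auto

class ErrorClass(Enum):
    TRANSIENT = auto()    # e.g. timeout, connection reset: likely to succeed on retry
    RECOVERABLE = auto()  # e.g. remote service down: retry later or use another replica
    PERMANENT = auto()    # e.g. file not found, permission denied: retrying cannot help

def classify(error_message: str) -> ErrorClass:
    """Toy classifier mapping transfer-error text to a class (illustrative rules)."""
    msg = error_message.lower()
    if "timeout" in msg or "connection reset" in msg:
        return ErrorClass.TRANSIENT
    if "no such file" in msg or "permission denied" in msg:
        return ErrorClass.PERMANENT
    return ErrorClass.RECOVERABLE

def schedule_after_failure(task, error_message, max_retries=3):
    """Use the error class in the scheduling decision instead of blind retries."""
    kind = classify(error_message)
    if kind is ErrorClass.PERMANENT or task["retries"] >= max_retries:
        return "fail"     # do not waste transfer slots on hopeless tasks
    task["retries"] += 1
    return "retry_now" if kind is ErrorClass.TRANSIENT else "requeue_alternate_source"

task = {"src": "gsiftp://siteA/data.bin", "dst": "file:///scratch/", "retries": 0}
print(schedule_after_failure(task, "Connection reset by peer"))  # -> retry_now
```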

    gbeta - a Language with Virtual Attributes, Block Structure, and Propagating, Dynamic Inheritance

    Get PDF
    A language design development process is presented which leads to a language, gbeta, with a tight integration of virtual classes, general block structure, and a multiple inheritance mechanism based on coarse-grained structural type equivalence. From this emerges the concept of propagating specialization. The power lies in the fact that a simple expression can have far-reaching but well-organized consequences, e.g., in one step causing the combination of families of classes, then by propagation the members of those families, and finally by propagation the methods of those members. Moreover, classes are first-class values which can be constructed at run-time, and it is possible to inherit from classes whether or not they are compile-time constants, and whether or not they were created dynamically. It is also possible to change the class and structure of an existing object at run-time, preserving object identity. Even though such dynamism is normally not seen in statically type-checked languages, these constructs have been integrated without compromising the static type safety of the language.
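
    Two of the dynamic features mentioned above, classes as first-class run-time values and changing the class of a live object while preserving its identity, can be imitated in a dynamic language. The sketch below uses Python for illustration, with the important caveat that gbeta provides these constructs under static type checking, which Python does not:

```python
# Classes as first-class values constructed at run time.
def make_point_class(dims):
    """Build a class whose instances carry `dims` coordinates."""
    def __init__(self, *coords):
        assert len(coords) == dims
        self.coords = coords
    return type(f"Point{dims}D", (object,), {"__init__": __init__})

Point2D = make_point_class(2)   # a dynamically created class held in a variable
p = Point2D(1, 2)

# Changing the class of an existing object while preserving its identity.
class ColoredPoint2D(Point2D):
    def color(self):
        return getattr(self, "_color", "black")

ident = id(p)
p.__class__ = ColoredPoint2D          # specialize the live object
assert id(p) == ident                 # same object identity
print(type(p).__name__, p.color())    # -> ColoredPoint2D black
```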