    Data-to-text generation with neural planning

    In this thesis, we consider the task of data-to-text generation, which takes non-linguistic structures as input and produces textual output. The inputs can take the form of database tables, spreadsheets, charts, and so on. The main application of data-to-text generation is to present information in a textual format that makes it accessible to a layperson who may otherwise find numerical figures difficult to understand. The task can also automate routine document generation jobs, thus improving human efficiency. We focus on generating long-form text, i.e., documents with multiple paragraphs. Recent approaches to data-to-text generation have adopted the very successful encoder-decoder architecture or its variants. These models generate fluent (but often imprecise) text and perform quite poorly at selecting appropriate content and ordering it coherently. This thesis focuses on overcoming these issues by integrating content planning with neural models. We hypothesize that data-to-text generation will benefit from explicit planning, which manifests itself in (a) micro planning, (b) latent entity planning, and (c) macro planning. Throughout this thesis, we assume that the inputs to our generator are tables (with records) in the sports domain and that the outputs are summaries describing what happened in the game (e.g., who won/lost, ..., scored, etc.). We first describe our work on integrating fine-grained or micro plans with data-to-text generation. As part of this, we generate a micro plan highlighting which records should be mentioned and in which order, and then generate the document while taking the micro plan into account. We then show how data-to-text generation can benefit from higher-level latent entity planning. Here, we make use of entity-specific representations which are dynamically updated. The text is generated conditioned on entity representations and the records corresponding to the entities by using hierarchical attention at each time step. 
We then combine planning with the high-level organization of entities, events, and their interactions. Such coarse-grained macro plans are learnt from data and given as input to the generator. Finally, we present work on making macro plans latent while incrementally generating a document paragraph by paragraph. We infer latent plans sequentially with a structured variational model while interleaving the steps of planning and generation. Text is generated by conditioning on previous variational decisions and previously generated text. Overall, our results show that planning makes data-to-text generation more interpretable, improves the factuality and coherence of the generated documents, and reduces redundancy in the output document.

    A GNN-based multi-task learning framework for personalized video search

    Watching online videos has become increasingly popular, and users tend to watch videos based on their personal tastes and preferences. Providing a customized ranking list to maximize the user's satisfaction has become increasingly important for online video platforms. Existing personalized search methods (PSMs) train their models with user feedback information (e.g., clicks). However, we identified that such feedback signals may indicate attractiveness but do not necessarily indicate relevance in video search. Moreover, the click data and user historical information are usually too sparse to train a good PSM, unlike in conventional Web search, where rich historical information about users is available. To address these concerns, in this paper we propose a multi-task graph neural network architecture for personalized video search (MGNN-PVS) that can jointly model users' click behaviour and the relevance between queries and videos. To relieve the sparsity problem and learn better representations for users, queries, and videos, we develop an efficient and novel GNN architecture based on neighborhood sampling and a hierarchical aggregation strategy that leverages their neighbors at different hops in the user-query and query-document click graphs. Extensive experiments on a major commercial video search engine show that our model significantly outperforms state-of-the-art PSMs, which illustrates the effectiveness of our proposed framework.

    Development of ATP-PAGE for Protein Analysis


    Reforming the United Nations

    The thesis deals with the financial crisis that the United Nations faced starting in 1985 when the US Congress decided to withhold a significant part of the US contribution to the UN regular budget in order to force a greater say for the major contributors on budgetary issues, budgetary restraint and greater efficiency. The UN responded by the adoption of resolution 41/213 of 19 December 1986 that was based on the recommendations of a Group of High-level Intergovernmental Experts ("G-18") set up a year earlier. A new system was introduced regarding the formulation of the regular budget of the United Nations Organisation and a broader process of reform was initiated including a restructuring of the Secretariat and of the intergovernmental machinery in the economic and social fields. After an introductory chapter (Chapter I), the thesis examines the UN problems at the budgetary/financial and administrative/structural levels, the solutions proposed from within and without the United Nations established framework and the actual attempts at reform (Chapters II and III). The realisation that the implementation of reforms is rather disjointed and often unsuccessful (e.g. the failure to restructure the intergovernmental machinery) prompts a search for the deeper causes of the UN problems at the political level and the attitudes of the main actors, namely the USA, the USSR, some up-and-coming states, notably Japan, the Third World states and, finally, of the UN Secretary-General and the Secretariat (Chapter IV). Although the financial crisis may have subsided since 1988 and the USA seem committed to paying up their dues, the deeper UN crisis of identity has not been resolved and is expected to resurface if no bold steps are taken. In that direction, some possible alternative courses for the UN in the future are discussed drawing upon theory and practice (Chapte

    Identification of new regenerative therapies in reproductive medicine and their application as a future therapeutic approach for endometrial regeneration

    The uterus is one of the main internal organs of the female reproductive system. It is composed of three different tissue layers: perimetrium, myometrium, and endometrium. This last layer covers the intrauterine cavity and is directly responsible for embryo implantation (for which it needs a certain minimum endometrial thickness). 
Among the pathologies affecting the endometrium are endometrial atrophy (characterized by insufficient endometrial thickness) and Asherman's syndrome (a rare disease characterized by the presence of intrauterine adhesions and fibrotic tissue), which form the common thread of this thesis, composed of four original manuscripts. In both cases, the endometrial tissue is degenerated, which hinders correct embryo implantation and thus causes fertility problems. To date, neither of these pathologies has a totally effective cure. So far, one of the most promising therapeutic options is the injection of stem cells. Therefore, the first objective was to evaluate how the infusion of bone marrow-derived stem cells (isolated by detecting the CD133 antigen), which had proven effective in both a human and an animal model, was modifying the endometrium at the molecular level. This work thereby aimed to understand the paracrine mechanisms through which these cells exert their therapeutic and regenerative action on the endometrial tissue. This first study revealed that these stem cells appeared to be promoting endometrial regeneration by creating an immunomodulatory scenario (down-regulation of the CXCL8 gene), which would give way to the over-expression of genes (SERPINE1, IL4, and JUN) involved in tissue regeneration. Another treatment that has gained acceptance over the years is a blood derivative, platelet-rich plasma, which was the focus of the second manuscript. This work shows how this plasma, especially when derived from umbilical cord blood rather than adult peripheral blood, can promote cellular processes such as the migration and proliferation of different types of endometrial cells (from primary culture and from stem cell lines). These plasmas also triggered the over-expression of certain proteins involved in regenerative events in a mouse model with induced endometrial damage. 
Whatever the therapeutic approach of choice, it has been hypothesized that regeneration could arise from stimulation of the stem cell niche present in the endometrium. For this reason, the third objective involved reviewing published work, in both murine and human models, concerning this population of endometrial stem cells. This review concluded that there are still gaps in knowledge, both in the definition of specific endometrial stem cell markers and in the contribution of the bone marrow to this endogenous endometrial stem cell niche. Finally, given the current lack of a definitive therapy for patients with endometrial atrophy or Asherman's syndrome, the last objective involved reviewing the approaches that have been tested in animal models that simulate these human pathologies. This work concluded that although promising new therapies are emerging, such as those derived from bioengineering (e.g., the use of decellularized scaffolds or hydrogels), there is still a need to perfect and standardize both animal and in vitro models to allow a better clinical translation of these therapies.

    Machine learning and large scale cancer omic data: decoding the biological mechanisms underpinning cancer

    Many of the mechanisms underpinning cancer risk and tumorigenesis are still not fully understood. However, the next-generation sequencing revolution and the rapid advances in big data analytics allow us to study cells and complex phenotypes at unprecedented depth and breadth. While experimental and clinical data are still fundamental to validate findings and confirm hypotheses, computational biology is key for the analysis of system- and population-level data, for the detection of hidden patterns, and for the generation of testable hypotheses. In this work, I tackle two main questions regarding cancer risk and tumorigenesis that require novel computational methods for the analysis of system-level omic data. First, I focused on how frequent, low-penetrance inherited variants modulate cancer risk in the broader population. Genome-Wide Association Studies (GWAS) have shown that Single Nucleotide Polymorphisms (SNPs) contribute to cancer risk with multiple subtle effects, but they still fail to give further insight into the synergistic effects of these variants. I developed a novel hierarchical Bayesian regression model, BAGHERA, to estimate heritability at the gene level from GWAS summary statistics. I then used BAGHERA to analyse data from 38 malignancies in the UK Biobank. I showed that genes with high heritable risk are involved in key processes associated with cancer and are often localised in genes that are somatically mutated drivers. Heritability analysis, like many other omics analysis methods, studies the effects of DNA variants on single genes in isolation. However, we know that most biological processes require the interplay of multiple genes, and we often lack a broad perspective on them. For the second part of this thesis, I therefore worked on the integration of Protein-Protein Interaction (PPI) graphs and omics data, which bridges this gap and recapitulates these interactions at a system level. 
First, I developed a modular and scalable Python package, PyGNA, that enables robust statistical testing of the topological properties of genesets. PyGNA complements the literature with a tool that can be routinely introduced into automated bioinformatics pipelines. With PyGNA I processed multiple genesets obtained from genomics and transcriptomics data. However, topological properties alone have proven insufficient to fully characterise complex phenotypes. Therefore, I focused on a model that combines topological and functional data to detect multiple communities associated with a phenotype. Detecting cancer-specific submodules is still an open problem, but it has the potential to elucidate mechanisms detectable only by integrating multi-omics data. Building on recent advances in Graph Neural Networks (GNNs), I present a supervised geometric deep learning model that combines GNNs and Stochastic Block Models (SBMs). The model is able to learn multiple graph-aware representations of the attributed network, as multiple joint SBMs, accounting for nodes participating in multiple processes. The simultaneous estimation of structure and function provides an interpretable picture of how genes interact in specific conditions and allows the detection of novel putative pathways associated with cancer.