314 research outputs found

    Development of a text mining approach to disease network discovery

    Get PDF
    Scientific literature is one of the major sources of knowledge for systems biology, in the form of papers, patents and other types of written reports. Text mining methods aim at automatically extracting relevant information from the literature. The hypothesis of this thesis was that biological systems could be elucidated by the development of text mining solutions that can automatically extract relevant information from documents. The first objective consisted in developing software components to recognize biomedical entities in text, which is the first step to generate a network about a biological system. To this end, a machine learning solution was developed, which can be trained for specific biological entities using an annotated dataset, obtaining high-quality results. Additionally, a rule-based solution was developed, which can be easily adapted to various types of entities. The second objective consisted in developing an automatic approach to link the recognized entities to a reference knowledge base. A solution based on the PageRank algorithm was developed in order to match the entities to the concepts that most contribute to the overall coherence. The third objective consisted in automatically extracting relations between entities, to generate knowledge graphs about biological systems. Due to the lack of annotated datasets available for this task, distant supervision was employed to train a relation classifier on a corpus of documents and a knowledge base. The applicability of this approach was demonstrated in two case studies: microRNAgene relations for cystic fibrosis, obtaining a network of 27 relations using the abstracts of 51 recently published papers; and cell-cytokine relations for tolerogenic cell therapies, obtaining a network of 647 relations from 3264 abstracts. Through a manual evaluation, the information contained in these networks was determined to be relevant. Additionally, a solution combining deep learning techniques with ontology information was developed, to take advantage of the domain knowledge provided by ontologies. This thesis contributed with several solutions that demonstrate the usefulness of text mining methods to systems biology by extracting domain-specific information from the literature. These solutions make it easier to integrate various areas of research, leading to a better understanding of biological systems

    AI Knowledge Transfer from the University to Society

    Get PDF
    AI Knowledge Transfer from the University to Society: Applications in High-Impact Sectors brings together examples from the "Innovative Ecosystem with Artificial Intelligence for Andalusia 2025" project at the University of Seville, a series of sub-projects composed of research groups and different institutions or companies that explore the use of Artificial Intelligence in a variety of high-impact sectors to lead innovation and assist in decision-making. Key Features Includes chapters on health and social welfare, transportation, digital economy, energy efficiency and sustainability, agro-industry, and tourism Great diversity of authors, expert in varied sectors, belonging to powerful research groups from the University of Seville with proven experience in the transfer of knowledge to the productive sector and agents attached to the Andalucía TECH Campu

    Automated detection of reflection in texts. A machine learning based approach

    Get PDF
    Promoting reflective thinking is an important educational goal. A common educational practice is to provide opportunities for learners to express their reflective thoughts in writing. The analysis of such text with regard to reflection is mainly a manual task that employs the principles of content analysis. Considering the amount of text produced by online learning systems, tools that automatically analyse text with regard to reflection would greatly benefit research and practice. Previous research has explored the potential of dictionary-based approaches that automatically map keywords to categories associated with reflection. Other automated methods use manually constructed rules to gauge insight from text. Machine learning has shown potential for classifying text with regard to reflection-related constructs. However, not much is known of whether machine learning can be used to reliably analyse text with regard to the categories of reflective writing models. This thesis investigates the reliability of machine learning algorithms to detect reflective thinking in text. In particular, it studies whether text segments from student writings can be analysed automatically to detect the presence (or absence) of reflective writing model categories. A synthesis of the models of reflective writing is performed to determine the categories frequently used to analyse reflective writing. For each of these categories, several machine learning algorithms are evaluated with regard to their ability to reliably detect reflective writing categories. The evaluation finds that many of the categories can be predicted reliably. The automated method, however, does not achieve the same level of reliability as humans do

    Information retrieval and text mining technologies for chemistry

    Get PDF
    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo Garciá -Yoldi for useful feedback and discussions during the preparation of the manuscript.info:eu-repo/semantics/publishedVersio
    corecore