138,303 research outputs found

    From Kepler to Newton: Explainable AI for Science Discovery

    The Observation-Hypothesis-Prediction-Experimentation loop paradigm for scientific research has long been practiced by researchers in pursuit of scientific discoveries. However, with the explosion of data in both mega-scale and milli-scale scientific research, it has sometimes become very difficult to analyze the data manually and propose new hypotheses to drive the discovery cycle. In this paper, we discuss the role of Explainable AI in the scientific discovery process by demonstrating an Explainable AI-based paradigm for science discovery. The key is to use Explainable AI to help derive data or model interpretations, hypotheses, and scientific discoveries or insights. We show how computational and data-intensive methodology can be seamlessly integrated with experimental and theoretical methodology for scientific research. To demonstrate the AI-based science discovery process, and to pay our respects to some of the greatest minds in human history, we show how Kepler's laws of planetary motion and Newton's law of universal gravitation can be rediscovered by (Explainable) AI from Tycho Brahe's astronomical observation data; the work of these scientists led the scientific revolution of the 16th and 17th centuries. This work also highlights the important role of Explainable AI (as compared to black-box AI) in science discovery, helping humans prevent or better prepare for a possible future technological singularity, since science is not only about the know-how but also the know-why.
    Comment: Presented at ICML-AI4Science 202
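
    As a stand-in for the paper's explainable-AI pipeline, a plain power-law fit already shows the kind of law recovery described: fitting log-period against log-semi-major-axis for the planets known in Kepler's time returns an exponent near 1.5, i.e. Kepler's third law T^2 ∝ a^3. The snippet below is only an illustrative sketch, not the authors' method, and the orbital values are modern approximations rather than Brahe's measurements.

```python
# Minimal sketch (not the paper's method): recover the form of Kepler's third
# law, T^2 proportional to a^3, by fitting a power law in log-log space.
import numpy as np

# Semi-major axis a (AU) and orbital period T (years) for the six planets
# known in Kepler's time (modern approximate values).
a = np.array([0.387, 0.723, 1.000, 1.524, 5.203, 9.537])
T = np.array([0.241, 0.615, 1.000, 1.881, 11.862, 29.457])

# Fit log T = k * log a + c; Kepler's third law predicts k close to 1.5.
k, c = np.polyfit(np.log(a), np.log(T), deg=1)
print(f"fitted exponent k = {k:.3f}  (Kepler's third law: 1.5)")
```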

    How to Make the Dream Come True: The Astronomers' Data Manifesto

    Astronomy is one of the most data-intensive of the sciences. Data technology is accelerating the quality and effectiveness of its research, and the rate of astronomical discovery is higher than ever. As a result, many view astronomy as being in a 'Golden Age', and projects such as the Virtual Observatory are amongst the most ambitious data projects in any field of science. But these powerful tools will be impotent unless the data on which they operate are of matching quality. Astronomy, like other fields of science, therefore needs to establish and agree on a set of guiding principles for the management of astronomical data. To focus this process, we are constructing a 'data manifesto', which proposes guidelines to maximise the rate and cost-effectiveness of scientific discovery.
    Comment: Submitted to Data Science Journal. Presented at CODATA, Beijing, October 200

    Science Through the “Golden Security Triangle”: Information Security and Data Journeys in Data-intensive Biomedicine.

    37th International Conference on Information Systems, Dublin, Ireland, 11-14 December 2016. This is the author accepted manuscript; the final version is available from the Association for Information Systems via the URL in this record. This paper examines how infrastructure for biomedical data-intensive discovery is operationalized, focusing on information security solutions and how they shape the processes of scientific research conducted through data-intensive infrastructures. The implications of information security for big data biomedical research have not been discussed in depth in the extant IS literature. Yet information security might exert a strong influence on the processes and outcomes of data sharing efforts. In this research-in-progress paper, I present a developing, in-depth study of a leading information linkage infrastructure that is representative of the kind of opportunities that big data technologies are occasioning in the medical field. This research calls for IS to extend the discussion to consider, building on the empirical detail of intensive case studies, a whole range of relations between provisions for information security and the processes of scientific research and data work. This research is funded by the European Research Council under the European Union's 7th Framework Programme (FP7/2007-2013) / ERC grant agreement n° 335925.

    Bioinformatics and Medicine in the Era of Deep Learning

    Many of the current scientific advances in the life sciences have their origin in the intensive use of data for knowledge discovery. In no area is this so clear as in bioinformatics, driven by technological breakthroughs in data acquisition. It has been argued that bioinformatics could quickly become the field of research generating the largest data repositories, beating other data-intensive areas such as high-energy physics or astroinformatics. Over the last decade, deep learning has become a disruptive advance in machine learning, giving new life to the long-standing connectionist paradigm in artificial intelligence. Deep learning methods are ideally suited to large-scale data and, therefore, should be ideally suited to knowledge discovery in bioinformatics and biomedicine at large. In this brief paper, we review key aspects of the application of deep learning in bioinformatics and medicine, drawing from the themes covered by the contributions to an ESANN 2018 special session devoted to this topic.

    Decentralized replication strategies for P2P based scientific data grid

    A Scientific Data Grid provides geographically distributed resources for large-scale, data-intensive applications that generate large scientific data sets and mostly deal with large computational problems. Research in the grid area has produced various ideas and solutions to address these requirements. However, since the number of participants (scientists and institutes) involved in this kind of environment is increasing tremendously, scalability, availability and reliability have become the core problems for such systems. Peer-to-peer (P2P) is one architecture that promises a scalable and dynamic environment. In this paper, we present a P2P model for the Scientific Data Grid that utilizes P2P services to address these problems. For this study, we developed and used our own data grid simulator, written in PARSEC. We illustrate our P2P Scientific Data Grid model, our data grid simulation and the design of the proposed data replication strategies. We then analyze the performance of the data discovery service with and without replication strategies, in terms of success rate, response time, average number of hops and bandwidth consumption. Results from the simulation study show how the proposed replication strategies promote high data availability in the proposed Scientific Data Grid model and how they improve the discovery process.
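
    The abstract does not specify the replication strategies, overlay topology or lookup protocol, so everything below (topology, TTL, parameters) is an assumption; this is a minimal Python sketch rather than the authors' PARSEC simulator. It only illustrates the general effect being measured: replicating a dataset on more peers tends to raise discovery success rates and lower hop counts.

```python
# Hypothetical sketch (not the paper's PARSEC simulator): a tiny P2P lookup
# experiment showing how replicating a dataset on more peers improves the
# discovery success rate and hop count of a TTL-bounded random walk.
import random

def build_overlay(n_peers, degree=4, seed=1):
    """Random overlay: each peer links to `degree` randomly chosen peers."""
    rng = random.Random(seed)
    return {p: rng.sample([q for q in range(n_peers) if q != p], degree)
            for p in range(n_peers)}

def lookup(overlay, holders, start, ttl=32, rng=None):
    """Random-walk discovery; returns hops taken, or None on failure."""
    rng = rng or random
    node = start
    for hop in range(ttl):
        if node in holders:
            return hop
        node = rng.choice(overlay[node])
    return None

def run(replicas, n_peers=200, trials=2000, seed=7):
    """Measure success rate and average hops for a given replication level."""
    rng = random.Random(seed)
    overlay = build_overlay(n_peers)
    holders = set(rng.sample(range(n_peers), replicas))
    results = [lookup(overlay, holders, rng.randrange(n_peers), rng=rng)
               for _ in range(trials)]
    hits = [h for h in results if h is not None]
    success = len(hits) / trials
    avg_hops = sum(hits) / len(hits) if hits else float("nan")
    return success, avg_hops

for replicas in (1, 5, 20):
    success, avg_hops = run(replicas)
    print(f"replicas={replicas:3d}  success={success:.2f}  avg_hops={avg_hops:.1f}")
```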

    Enabling European archaeological research: The ARIADNE E-infrastructure

    Research e-infrastructures, digital archives and data services have become important pillars of a scientific enterprise that in recent decades has become ever more collaborative, distributed and data-intensive. The archaeological research community has been an early adopter of digital tools for data acquisition, organisation, analysis and presentation of the research results of individual projects. However, the provision of e-infrastructure and services for data sharing, discovery, access and re-use has lagged behind. This situation is being addressed by ARIADNE: the Advanced Research Infrastructure for Archaeological Dataset Networking in Europe. This EU-funded network has developed an e-infrastructure that enables data providers to register and provide access to their resources (datasets, collections) through the ARIADNE data portal, facilitating discovery, access and other services across the integrated resources. This article describes the current landscape of data repositories and services for archaeologists in Europe, and the issues that make interoperability between them difficult to realise. The results of the ARIADNE surveys on users' expectations and requirements are also presented. The main section of the article describes the architecture of the e-infrastructure, the core services (data registration, discovery and access) and various other extant or experimental services. The ongoing evaluation of the data integration and services is also discussed. Finally, the article summarises lessons learned, and outlines the prospects for wider engagement of the archaeological research community in sharing data through ARIADNE.
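
    The core services mentioned (data registration, discovery and access) can be pictured with a toy in-memory registry. Everything in the sketch below, including the class names, metadata fields and the discover filter, is hypothetical and is not the ARIADNE portal's actual data model or API; it only illustrates the register-then-discover pattern the article describes.

```python
# Hypothetical illustration only: these names are NOT the ARIADNE API, just a
# minimal picture of registration and discovery services over metadata records.
from dataclasses import dataclass

@dataclass
class DatasetRecord:
    title: str
    provider: str
    subject: str            # e.g. "burial mound", "pottery"
    period: str             # e.g. "Bronze Age"
    region: str             # e.g. "Scandinavia"
    access_url: str = ""    # where the provider exposes the data

class Registry:
    """In-memory stand-in for a registration + discovery service."""
    def __init__(self):
        self._records: list[DatasetRecord] = []

    def register(self, record: DatasetRecord) -> None:
        self._records.append(record)

    def discover(self, **filters: str) -> list[DatasetRecord]:
        """Return records whose fields match all of the given filters."""
        return [r for r in self._records
                if all(getattr(r, k) == v for k, v in filters.items())]

registry = Registry()
registry.register(DatasetRecord("Gravefield survey", "Provider A",
                                "burial mound", "Bronze Age", "Scandinavia"))
registry.register(DatasetRecord("Amphora finds", "Provider B",
                                "pottery", "Roman", "Italy"))
print([r.title for r in registry.discover(period="Bronze Age")])
```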