
    Collaborative platforms for streamlining workflows in Open Science

    Despite the internet’s dynamic and collaborative nature, scientists continue to produce grant proposals, lab notebooks, data files, conclusions, etc. that remain in static formats or are never published online, and are therefore not always easily accessible to the interested public. Because tools that seamlessly integrate all aspects of a research project (conception, data generation, data evaluation, peer review and publication of conclusions) have seen limited adoption, much effort is later spent reproducing or reformatting individual entities before they can be repurposed, either independently or as parts of articles.

We propose that workflows, performed both individually and collaboratively, could become more efficient if all steps of the research cycle were coherently represented online and the underlying data were formatted, annotated and licensed for reuse. Such a system would accelerate the progression of projects from conception to publication and would allow data sets and their interpretation to be continuously updated and integrated into other, independent projects.

A major advantage of such workflows is increased transparency, both with respect to the scientific process and to the contribution of each participant. The latter point matters for motivation: it enables the allocation of reputation, which creates incentives for scientists to contribute to projects. Workflow platforms that also let authors fine-tune the accessibility of their content could gradually pave the way from the current static mode of research presentation to a more coherent practice of open science.
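
As a concrete, purely hypothetical illustration of what "formatted, annotated and licensed for reuse" could mean for a single entity on such a platform, the Python sketch below shows the kind of machine-readable record that would let each step of the research cycle be reused, credited and access-controlled; none of the field names come from the text above.

```python
# Hypothetical record for one research entity on a workflow platform; every
# field name is invented for illustration, not taken from the text above.
research_entity = {
    "id": "proj-042/dataset-07",
    "stage": "data_generation",                # conception | data_generation | evaluation | review | publication
    "format": "text/csv",                      # data formatted for reuse
    "annotations": {"organism": "S. cerevisiae", "assay": "growth curve"},
    "license": "CC-BY-4.0",                    # explicit reuse terms
    "access": "collaborators",                 # fine-tuned visibility: private | collaborators | public
    "derived_from": ["proj-042/protocol-02"],  # provenance links across the workflow
    "contributors": [{"name": "A. Researcher", "role": "data collection"}],  # credit allocation
}
```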

    Representation of research hypotheses

    BACKGROUND: Hypotheses are now being produced automatically on an industrial scale by computers in biology; for example, the annotation of a genome is essentially a large set of hypotheses generated by sequence-similarity programs, and robot scientists enable the full automation of a scientific investigation, including the generation and testing of research hypotheses. RESULTS: This paper proposes a logically defined way of recording automatically generated hypotheses in a machine-amenable form. The proposed formalism allows complete hypothesis sets to be described as specified input and output of scientific investigations. The formalism supports the decomposition of research hypotheses into more specialised hypotheses where an application requires it. Hypotheses are represented in an operational way – it is possible to design an experiment to test them. The explicit formal description of research hypotheses promotes the explicit formal description of the results and conclusions of an investigation. The paper also proposes a framework for automated hypothesis generation. We demonstrate how the key components of the proposed framework are implemented in the Robot Scientist “Adam”. CONCLUSIONS: A formal representation of automatically generated research hypotheses can help to improve the way humans produce, record, and validate research hypotheses. AVAILABILITY: http://www.aber.ac.uk/en/cs/research/cb/projects/robotscientist/results
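
As a rough illustration of what a machine-amenable, operational hypothesis record might look like, here is a minimal Python sketch. It is not the paper's actual logic-based formalism; the class and field names are invented, but it captures the ideas of decomposing a hypothesis into more specialised ones and linking each hypothesis to an experiment that could test it.

```python
# Hypothetical sketch (not the paper's formalism) of an operational hypothesis
# record: a statement, its provenance, optional decomposition into more
# specialised sub-hypotheses, and a pointer to an experiment that could test it.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Experiment:
    description: str          # e.g. "in vitro assay with purified protein"
    observable: str           # the measurable outcome that bears on the hypothesis


@dataclass
class Hypothesis:
    statement: str                              # e.g. "ORF X has kinase activity"
    source: str                                 # e.g. "sequence-similarity annotation"
    test: Optional[Experiment] = None           # makes the hypothesis operational
    sub_hypotheses: List["Hypothesis"] = field(default_factory=list)

    def decompose(self, parts: List["Hypothesis"]) -> None:
        """Record a refinement of this hypothesis into more specialised ones."""
        self.sub_hypotheses.extend(parts)


# Example: a coarse annotation hypothesis refined into a testable sub-hypothesis.
h = Hypothesis(statement="ORF X has kinase activity", source="BLAST hit")
h.decompose([Hypothesis(
    statement="ORF X phosphorylates substrate S in vitro",
    source="refinement of parent hypothesis",
    test=Experiment(description="in vitro kinase assay with purified ORF X and S",
                    observable="phosphorylated S detected"),
)])
```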

    Representation of probabilistic scientific knowledge

    The theory of probability is widely used in biomedical research for data analysis and modelling. In previous work, the probabilities of research hypotheses have been recorded as experimental metadata. The ontology HELO is designed to support probabilistic reasoning, and provides semantic descriptors for reporting on research that involves operations with probabilities. HELO explicitly links research statements such as hypotheses, models, laws and conclusions to the probabilities of those statements being true. HELO enables the explicit semantic representation and accurate recording of probabilities in hypotheses, as well as of the inference methods used to generate and update those hypotheses. We demonstrate the utility of HELO on three worked examples: changes in the probability of the hypothesis that sirtuins regulate human life span; changes in the probability of hypotheses about gene functions in the S. cerevisiae aromatic amino acid pathway; and the use of active learning in drug design (quantitative structure-activity relationship learning), where a strategy for the selection of compounds with the highest probability of improving on the best known compound was used. HELO is open source and available at https://github.com/larisa-soldatova/HELO. This work was partially supported by grant BB/F008228/1 from the UK Biotechnology & Biological Sciences Research Council, by the European Commission under the FP7 Collaborative Programme UNICELLSYS, by KU Leuven GOA/08/008 and by ERC Starting Grant 240186.
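
The core pattern of attaching an explicit, updatable probability to a research statement can be illustrated with a short sketch. Note that HELO itself is an ontology of semantic descriptors, not executable code; the Python below uses invented names and a simple Bayes rule as one possible update, purely to mimic the recording of a statement, its current probability, and the updates it undergoes as evidence arrives.

```python
# Illustrative only: record a research statement, the probability of it being
# true, and a history of evidence-driven updates (here via Bayes' theorem).
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ProbabilisticStatement:
    text: str                       # e.g. "sirtuins regulate human life span"
    probability: float              # current degree of belief that the statement is true
    history: List[Tuple[str, float]] = field(default_factory=list)

    def bayes_update(self, evidence: str, p_e_given_h: float, p_e_given_not_h: float) -> None:
        """Update P(H) given evidence E via Bayes' theorem and record the step."""
        prior = self.probability
        posterior = (p_e_given_h * prior) / (
            p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
        )
        self.probability = posterior
        self.history.append((evidence, posterior))


h = ProbabilisticStatement("sirtuins regulate human life span", probability=0.5)
h.bayes_update("replication study fails to extend life span",
               p_e_given_h=0.2, p_e_given_not_h=0.7)
print(h.probability, h.history)
```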

    Inductive queries for a drug designing robot scientist

    It is increasingly clear that machine learning algorithms need to be integrated into an iterative scientific discovery loop, in which data are queried repeatedly by means of inductive queries and the computer provides guidance on the experiments being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it; and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.
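
A minimal sketch of the kind of loop described above, with invented data in place of real molecular descriptors and assays: a model is refitted to the compounds screened so far, all untested compounds are scored, and the one predicted to improve most on the best known activity is selected as the next experiment. It assumes NumPy and scikit-learn; the chapter's actual inductive-query framework is richer than this greedy strategy.

```python
# Toy iterative discovery loop for QSAR-style active learning.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 16))                 # stand-in molecular descriptors
true_activity = descriptors[:, 0] - 0.5 * descriptors[:, 1] + rng.normal(0, 0.1, 200)

tested = list(range(10))                                 # initially screened compounds
untested = [i for i in range(200) if i not in tested]

for round_ in range(5):
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(descriptors[tested], true_activity[tested])
    predictions = model.predict(descriptors[untested])
    best_known = true_activity[tested].max()
    # Greedy "inductive query": assay the untested compound with the highest
    # predicted improvement over the best compound found so far.
    pick = untested[int(np.argmax(predictions - best_known))]
    tested.append(pick)
    untested.remove(pick)
    print(f"round {round_}: assayed compound {pick}, "
          f"best activity so far {true_activity[tested].max():.3f}")
```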

    Re-imagining health and well-being in low resource African settings using an augmented AI system and a 3D digital twin

    In this paper, we discuss and explore the potential and relevance of recent developments in artificial intelligence (AI) and digital twins for health and well-being in low-resource African countries. Using an AI systems perspective, we review emerging trends in AI systems and digital twins and propose an initial augmented AI system architecture to illustrate how an AI system can work in conjunction with a 3D digital twin. We highlight scientific knowledge discovery, continual learning, pragmatic interoperability, and interactive explanation and decision-making as important research challenges for AI systems and digital twins. (Comment: submitted to the Workshop on AI for Digital Twins and Cyber-physical applications at IJCAI 2023, August 19–21, 2023, Macau, S.A.)
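
Purely as an illustration of the coupling the paper argues for (the proposed architecture itself is much richer, covering continual learning, interoperability and interactive explanation), here is a hypothetical Python sketch of an AI system reading state from a digital twin and returning a recommendation together with an explanation. All names and thresholds are invented.

```python
# Hypothetical coupling of a digital twin and an AI system: the twin mirrors
# observations from the physical setting; the AI system reads the twin's state,
# proposes an intervention, and attaches a short explanation.
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class DigitalTwin:
    state: Dict[str, float] = field(default_factory=dict)

    def ingest(self, observation: Dict[str, float]) -> None:
        """Mirror new observations from the physical system into the twin."""
        self.state.update(observation)


class AISystem:
    def recommend(self, twin: DigitalTwin) -> Tuple[str, str]:
        """Return (intervention, explanation) based on the twin's current state."""
        if twin.state.get("clinic_wait_hours", 0.0) > 4.0:
            return ("open additional triage point",
                    "predicted wait exceeds 4 h given current patient inflow")
        return ("no action", "current state within expected ranges")


twin = DigitalTwin()
twin.ingest({"clinic_wait_hours": 5.2, "staff_on_duty": 3})
print(AISystem().recommend(twin))
```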

    A Cable-based Manipulator for Chemistry Labs


    Automated Discovery of Integral with Deep Learning

    Recent advancements in deep learning, particularly in the development of large language models (LLMs), have demonstrated AI's ability to tackle complex mathematical problems and solve programming challenges. However, the capability to solve well-defined problems on the basis of extensive training data differs significantly from the nuanced process of making scientific discoveries. Trained on almost all available human knowledge, today's sophisticated LLMs essentially learn to predict sequences of tokens. They generate mathematical derivations and write code in much the same way as they write an essay, and do not have the ability to pioneer scientific discoveries in the manner a human scientist would. In this study we delve into the potential of using deep learning to rediscover a fundamental mathematical concept: integrals. By defining integrals as the area under a curve, we illustrate how AI can deduce the integral of a given function, exemplified by inferring $\int_{0}^{x} t^2\,dt = \frac{x^3}{3}$ and $\int_{0}^{x} a e^{bt}\,dt = \frac{a}{b} e^{bx} - \frac{a}{b}$. Our experiments show that deep learning models can approach the task of inferring integrals either through a sequence-to-sequence model, akin to language translation, or by uncovering the rudimentary principles of integration, such as $\int_{0}^{x} t^n\,dt = \frac{x^{n+1}}{n+1}$.
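
The power-rule example above can be reproduced numerically without any learned model, which helps clarify what "rediscovery from area-under-the-curve data" means. The sketch below (assuming NumPy; it is not the paper's method) computes $F(x)=\int_{0}^{x} t^n\,dt$ by the trapezoid rule for several values of $x$ and fits a power law, recovering an exponent of about $n+1$ and a coefficient of about $1/(n+1)$.

```python
# Fit a power law F(x) ~ c * x**k to numerically computed areas under t**n;
# the power rule predicts k = n + 1 and c = 1 / (n + 1).
import numpy as np

for n in range(1, 5):
    xs = np.linspace(0.5, 5.0, 50)
    areas = []
    for x in xs:
        t = np.linspace(0.0, x, 10_001)
        # trapezoid rule for the area under t**n between 0 and x
        areas.append(np.sum((t[:-1] ** n + t[1:] ** n) / 2.0 * np.diff(t)))
    k, log_c = np.polyfit(np.log(xs), np.log(areas), 1)
    print(f"n={n}: fitted exponent {k:.3f} (expected {n + 1}), "
          f"fitted coefficient {np.exp(log_c):.3f} (expected {1 / (n + 1):.3f})")
```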

    Scientific discovery as a combinatorial optimisation problem: How best to navigate the landscape of possible experiments?

    A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a ‘landscape’ representing a large search space of possible solutions or experiments, populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems ‘hard’, and it is why they are best treated as combinatorial optimisation problems and attacked with the heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms, such as those based on Darwinian evolution, providing guidance, based on existing knowledge, as to which experiment is ‘best’ to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes.
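
As a toy illustration of navigating such a landscape, the Python sketch below uses a simple evolutionary heuristic to choose which 'experiments' to run next on an invented binary landscape, evaluating only a few hundred of the 2^30 possible designs. It is not the paper's method; the fitness function, population size and mutation rate are arbitrary.

```python
# Toy evolutionary search over a binary design space: each candidate is a
# 30-bit "experiment", fitness is a stand-in experimental readout (agreement
# with a hidden optimum), and each generation keeps the best designs, then
# mutates copies of them to propose the next round of experiments.
import numpy as np

rng = np.random.default_rng(1)
L = 30
hidden_optimum = rng.integers(0, 2, L)              # the unknown best design

def fitness(candidate: np.ndarray) -> int:
    """Stand-in readout: number of bits matching the hidden optimum."""
    return int(np.sum(candidate == hidden_optimum))

population = rng.integers(0, 2, (20, L))
best_seen = 0
for generation in range(40):
    scores = np.array([fitness(c) for c in population])
    best_seen = max(best_seen, int(scores.max()))
    parents = population[np.argsort(scores)[-5:]]   # keep the 5 fittest designs
    children = parents[rng.integers(0, 5, 20)].copy()
    flips = rng.random(children.shape) < 0.05       # flip ~5% of bits
    children[flips] = 1 - children[flips]
    population = children

print(f"best fitness found: {best_seen}/{L} after ~{20 * 40} evaluations "
      f"out of 2**{L} possible designs")
```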