Where does good evidence come from?

Author: Cook Thomas
Gorard Stephen
Publication venue: Taylor & Francis
Publication date: 01/01/2007

This paper started as a debate between the two authors. Both authors present a series of propositions about quality standards in education research. Cook’s propositions, as might be expected, concern the importance of experimental trials for establishing the security of causal evidence, but they also include some important practical and acceptable alternatives such as regression discontinuity analysis. Gorard’s propositions, again as might be expected, tend to place experimental trials within a larger mixed method sequence of research activities, treating them as important but without giving them primacy. The paper concludes with a synthesis of these ideas, summarising the many areas of agreement and clarifying the few areas of disagreement. The latter include what proportion of available research funds should be devoted to trials, how urgent the need for more trials is, and whether the call for more truly mixed methods work requires a major shift in the community

Crossref

University of Birmingham Research Portal

Model selection via Bayesian information capacity designs for generalised linear models

Author: Lewis Susan M.
McGree James M.
Woods David C.
Publication date: 26/10/2016

The first investigation is made of designs for screening experiments where the response variable is approximated by a generalised linear model. A Bayesian information capacity criterion is defined for the selection of designs that are robust to the form of the linear predictor. For binomial data and logistic regression, the effectiveness of these designs for screening is assessed through simulation studies using all-subsets regression and model selection via maximum penalised likelihood and a generalised information criterion. For Poisson data and log-linear regression, similar assessments are made using maximum likelihood and the Akaike information criterion for minimally-supported designs that are constructed analytically. The results show that effective screening, that is, high power with moderate type I error rate and false discovery rate, can be achieved through suitable choices for the number of design support points and experiment size. Logistic regression is shown to present a more challenging problem than log-linear regression. Some areas for future work are also indicated

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Elsevier - Publisher Connector

Queensland University of Technology ePrints Archive

Bayesian Structural Causal Inference with Probabilistic Programming

Author: Witty Sam A
Publication venue: ScholarWorks@UMass Amherst
Publication date: 14/11/2023

Reasoning about causal relationships is central to the human experience. This evokes a natural question in our pursuit of human-like artificial intelligence: how might we imbue intelligent systems with similar causal reasoning capabilities? Better yet, how might we imbue intelligent systems with the ability to learn cause and effect relationships from observation and experimentation? Unfortunately, reasoning about cause and effect requires more than just data: it also requires partial knowledge about data generating mechanisms. Given this need, our task then as computational scientists is to design data structures for representing partial causal knowledge, and algorithms for updating that knowledge in light of observations and experiments. In this dissertation, I explore the Bayesian structural approach to causal inference in which probability distributions over structural causal models are one such data structure, and probabilistic inference in multi-world transformations of those models as the corresponding algorithmic task. Specifically, I demonstrate that this approach has two distinct advantages over the dominant computational paradigm of causal graphical models: (i) it expands the breadth of compatible assumptions; and (ii) it seamlessly integrates with modern Bayesian modeling and inference technologies to facilitate quantification of uncertainty about causal structure and the effects of interventions. Specifically, doing so allows the emerging and powerful technology of probabilistic programming to be brought to bear on a large and diverse set of causal inference problems. In Chapter 3, I present an example-driven pedagogical introduction to the Bayesian structural approach to causal inference, demonstrating how priors over structural causal models induce joint distributions over observed and latent counterfactual random variables, and how the resulting posterior distributions capture common motifs in causal inference. 
In particular, I show how various assumptions about latent confounding influence our ability to estimate causal effects from data and I provide examples of common observational and quasi-experimental designs expressed as probabilistic programs. In Chapter 4, I present an advanced application of the Bayesian structural approach for modeling hierarchical relational dependencies with latent confounders, and how to combine such assumptions with flexible Gaussian process models. In Chapter 5, I present a prototype software implementation for causal inference using probabilistic programming, accommodating a broad class of multi-source observational and experimental data. Finally, in Chapter 6, I present Simulation-Based Identifiability, a gradient-based optimization method for determining if any differentiable and bounded prior over structural causal models converges to a unique causal conclusion asymptotically

ScholarWorks@UMass Amherst

Toward data science in biophotonics: biomedical investigations-based study

Author: Ali Nairveen
Publication date: 01/01/2021

Biophotonics aims to grasp and investigate the characteristics of biological samples based on their interaction with incident light. Over the past decades, numerous biophotonic technologies have been developed, delivering various sorts of biological and chemical information from the studied samples. Such information is usually contained in high-dimensional data that need to be translated into high-level information like disease biomarkers. This data translation is not straightforward, but it can be achieved using advances in computer and data science. The scientific contributions presented in this thesis were established to cover two main aspects of data science in biophotonics: the design of experiments, and data-driven modeling and validation. For the design of experiments, the scientific contributions focus on estimating the sample size required for group differentiation and on evaluating the influence of experimental factors on unbalanced multifactorial designs. Both methods were designed for multivariate data and were checked on Raman spectral datasets. Thereafter, automatic detection and identification were checked on three diagnostic tasks by combining several image processing techniques with machine learning (ML) algorithms. In the first task, an improved ML pipeline to predict the antibiotic susceptibilities of E. coli bacteria was presented and evaluated based on bright-field microscopic images. Then, transfer learning-based classification of bladder cancer was demonstrated using blue light cystoscopic images. Finally, different ML techniques and validation strategies were combined to perform the automatic detection of breast cancer based on a small-sized dataset of nonlinear multimodal images. The obtained results exhibited the benefits of data science tools in improving the experimental planning and the translation of biophotonic-associated data into high-level information for various biophotonic technologies.

Digitale Bibliothek Thüringen

Transforming Graph Representations for Statistical Relational Learning

Author: Aha David W.
McDowell Luke K.
Neville Jennifer
Rossi Ryan A.
Publication date: 01/01/2012

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed

arXiv.org e-Print Archive

CiteSeerX

Estimation of bias in dose-response curve fitting and experimental strategies to its reduction

Author: Dias Diogo
Publication venue: Helsingfors universitet
Publication date: 01/01/2022

One of the biggest hurdles in cancer patient care is the lack of response to treatment. With the support of high-throughput drug screening, it is nowadays feasible to conduct vast numbers of drug sensitivity assays, aiding in the identification of samples that are sensitive or resistant to chemical perturbations. In an oncology setting, drug screening is the process by which patient cells are examined experimentally for response and activity to distinct drugs and analysed via dose-response curve fitting. However, reproducing and replicating drug screening outcomes with high confidence has proved to be a challenge that needs to be addressed. Inefficient experimental designs and a lack of standard protocols to control both biological and technical factors in such cell-based assays are at the core of a steep influx of experimental biases. Hence, additional effort is needed to provide less biased estimations of drug effects. This thesis work focuses on reducing erroneous inferences (i.e., bias) from dose-response data in the curve fitting step, thereby improving the reproducibility of drug sensitivity screening through efficient dose selection. A novel two-step experimental design is introduced which significantly improves the estimation of dose-response curves while keeping the amount of cellular and chemical materials feasible.

Helsingin yliopiston digitaalinen arkisto

Exploration of Reaction Pathways and Chemical Transformation Networks

Author: Reiher Markus
Simm Gregor N.
Vaucher Alain C.
Publication venue: American Chemical Society (ACS)
Publication date: 03/12/2018

For the investigation of chemical reaction networks, the identification of all relevant intermediates and elementary reactions is mandatory. Many algorithmic approaches exist that perform explorations efficiently and automatically. These approaches differ in their application range, the level of completeness of the exploration, as well as the amount of heuristics and human intervention required. Here, we describe and compare the different approaches based on these criteria. Future directions leveraging the strengths of chemical heuristics, human interaction, and physical rigor are discussed. Comment: 48 pages, 4 figures

arXiv.org e-Print Archive

Repository for Publications and Research Data

Research and Education in Computational Science and Engineering

Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments (including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers) is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade. Comment: Major revision, to appear in SIAM Review

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne