124 research outputs found

Bootstrap methods for the empirical study of decision-making and information flows in social systems

Author: DeDeo Simon; Hawkins Robert; Hitchcock Tim; Klingenstein Sara
Publication venue: MDPI AG
Publication date: 01/01/2013

Abstract: We characterize the statistical bootstrap for the estimation of information-theoretic quantities from data, with particular reference to its use in the study of large-scale social phenomena. Our methods allow one to preserve, approximately, the underlying axiomatic relationships of information theory (in particular, consistency under arbitrary coarse-graining) that motivate use of these quantities in the first place, while providing reliability comparable to the state of the art for Bayesian estimators. We show how information-theoretic quantities allow for rigorous empirical study of the decision-making capacities of rational agents, and the time-asymmetric flows of information in distributed systems. We provide illustrative examples by reference to ongoing collaborative work on the semantic structure of the British Criminal Court system and the conflict dynamics of the contemporary Afghanistan insurgency.

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

Sussex Research Online

Statistical Analysis and Design of Crowdsourcing Applications

Author: Kapelner Adam
Publication venue: ScholarlyCommons
Publication date: 01/01/2014

This thesis develops methods for the analysis and design of crowdsourced experiments and crowdsourced labeling tasks. Much of this document focuses on applications including running natural field experiments, estimating the number of objects in images, and collecting labels for word sense disambiguation. Observed shortcomings of the crowdsourced experiments inspired the development of methodology for running more powerful experiments via matching on-the-fly. Using the label data to estimate response functions inspired work on non-parametric function estimation using Bayesian Additive Regression Trees (BART). This work then inspired extensions to BART such as incorporation of missing data as well as a user-friendly R package.

ScholarlyCommons@Penn

Systems Biology in the Light of Uncertainty: The Limits of Computation

Author: MacLeod Miles Alexander James
Publication venue: Springer
Publication date: 01/01/2017

University of Twente Research Information

Informative Presence and Observation in Routinely Collected Health Data: Methods to Support the Development of Clinical Prediction Models

Author: Sisk Rose
Publication date: 01/08/2022

The University of Manchester - Institutional Repository

Diffusion of Electronic Health Records: Six Years of Empirical Data

Author: Andersen Stig Kjær; Bernstein Knut; Bruun-Rasmussen Morten; Nøhr Christian; Vingtoft Søren
Publication venue: IOS Press
Publication date: 01/01/2007

VBN

Designing m-Learning for Junior Registrars: Activation of a Theoretical Model of Clinical Knowledge

Author: Boye Niels; Kanstrup Anne Marie; Nøhr Christian
Publication venue: IOS Press
Publication date: 01/01/2007

VBN

Statistical Modelling

Author: Biggeri Annibale; Dreassi Emanuela; Lagazio Corrado; Marchi Marco
Publication venue: Firenze University Press
Publication date: 31/05/2022

The book collects the proceedings of the 19th International Workshop on Statistical Modelling, held in Florence in July 2004. Statistical modelling is an important cornerstone in many scientific disciplines, and the workshop has provided a rich environment for cross-fertilization of ideas from different disciplines. It consists of four invited lectures, 48 contributed papers and 47 posters. The contributions are arranged in sessions: Statistical Modelling; Statistical Modelling in Genomics; Semi-parametric Regression Models; Generalized Linear Mixed Models; Correlated Data Modelling; Missing Data, Measurement of Error and Survival Analysis; Spatial Data Modelling and Time Series and Econometrics.

Directory of Open Access Books (DOAB)

Accelerating Materials Discovery with Machine Learning

Author: Goodall Rhys Edward Andrew
Publication venue: University of Cambridge
Publication date: 16/03/2022

As we enter the data age, ever-increasing amounts of human knowledge are being recorded in machine-readable formats. This has opened up new opportunities to leverage data to accelerate scientific discovery. This thesis focuses on how we can use historical and computational data to aid the discovery and development of new materials. We begin by looking at a traditional materials informatics task: elucidating the structure-function relationships of high-temperature cuprate superconductors. One of the most significant challenges for materials informatics is the limited availability of relevant data. We propose a simple calibration-based approach to estimate the apical and in-plane copper-oxygen distances from more readily available lattice parameter data to address this challenge for cuprate superconductors. Our investigation uncovers a large, unexplored region of materials space that may yield cuprates with higher critical temperatures. We propose two experimental avenues that may enable this region to be accessed. Computational materials exploration is bottlenecked by our ability to provide input structures to feed our workflows. Whilst ab initio structure identification is possible, it is computationally burdensome and we lack design rules for deciding where to target searches in high-throughput setups. To address this, there is a need to develop tools that suggest promising candidates, enabling automated deployment and increased efficiency. Machine learning models are well suited to this task; however, current approaches typically use hand-engineered inputs. This means that their performance is circumscribed by the intuitions reflected in the chosen inputs. We propose a novel way to formulate the machine learning task as a set regression problem over the elements in a material. We show that our approach leads to higher sample efficiency than other well-established composition-based approaches.
Having demonstrated the ability of machine learning to aid in the selection of promising compound compositions, we next explore how useful machine learning might be for identifying fabrication routes. Using a recently released data-mined data set of solid-state synthesis reactions, we design a two-stage model to predict the products of inorganic reactions. We critically explore the performance of this model, showing that whilst the predictions fall short of the accuracy required to be chemically discriminative, the model provides valuable insights into understanding inorganic reactions. Through careful investigation of the model's failure modes, we explore the challenges that remain in the construction of forward inorganic reaction prediction models and suggest some pathways to tackle the identified issues. One of the principal ways that materials scientists understand and categorise materials is in terms of their symmetries. Crystal structure prototypes are assigned based on the presence of symmetrically equivalent sites known as Wyckoff positions. We show that a powerful coarse-grained representation of materials structures can be constructed from the Wyckoff positions by discarding information about their coordinates within crystal structures. One of the strengths of this representation is that it maintains the ability of structure-based methods to distinguish polymorphs whilst also allowing combinatorial enumeration akin to composition-based approaches. We construct an end-to-end differentiable model that takes our proposed Wyckoff representation as input. The performance of this approach is examined on a suite of materials discovery experiments showing that it leads to strong levels of enrichment in materials discovery tasks. The research presented in this thesis highlights the promise of applying data-driven workflows and machine learning in materials discovery and development.
This thesis concludes by speculating about promising research directions for applying machine learning within materials discovery.

Apollo (Cambridge)

Mining Causal Links Between Real-World Events and TV Content Viewing Patterns

Author: Orlando Duarte Rodrigues Ferreira de Melo
Publication date: 22/07/2021

Repositório Aberto da Universidade do Porto

Statistical Knowledge and Learning in Phonology

Author: Dunbar Ewan
Publication date: 01/01/2013

This thesis deals with the theory of the phonetic component of grammar in a formal probabilistic inference framework: (1) it has been recognized since the beginning of generative phonology that some language-specific phonetic implementation is actually context-dependent, and thus it can be said that there are gradient "phonetic processes" in grammar in addition to categorical "phonological processes." However, no explicit theory has been developed to characterize these processes. Meanwhile, (2) it is understood that language acquisition and perception are both really informed guesswork: the result of both types of inference can be reasonably thought to be a less-than-perfect commitment, with multiple candidate grammars or parses considered and each associated with some degree of credence. Previous research has used probability theory to formalize these inferences in implemented computational models, especially in phonetics and phonology. In this role, computational models serve to demonstrate the existence of working learning/perception/parsing systems assuming a faithful implementation of one particular theory of human language, and are not intended to adjudicate whether that theory is correct. The current thesis (1) develops a theory of the phonetic component of grammar and how it relates to the greater phonological system and (2) uses a formal Bayesian treatment of learning to evaluate this theory of the phonological architecture and to make predictions about how the resulting grammars will be organized. The coarse description of the consequence for linguistic theory is that the processes we think of as "allophonic" are actually language-specific, gradient phonetic processes, assigned to the phonetic component of grammar; strict allophones have no representation in the output of the categorical phonological grammar.

Digital Repository at the University of Maryland