A categorical foundation for Bayesian probability
Given two measurable spaces with countably generated
σ-algebras, a perfect prior probability measure on one of them, and a
sampling distribution, there is a corresponding inference
map which is unique up to a set of measure zero. Thus,
given a data measurement, a posterior probability
can be computed. This procedure is iterative: with
each updated probability, we obtain a new joint distribution, which in
turn yields a new inference map, and the process repeats with each
additional measurement. The main result uses an existence theorem for regular
conditional probabilities by Faden, which holds in more generality than the
setting of Polish spaces. This less stringent setting then allows for
non-trivial decision rules (Eilenberg--Moore algebras) on finite (as well as
non-finite) spaces, and also provides a common framework for decision
theory and Bayesian probability.
Comment: 15 pages; revised setting to more clearly explain how to incorporate
perfect measures and the Giry monad; to appear in Applied Categorical
Structure
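The iterative updating described in this abstract can be illustrated on finite spaces, where the inference map reduces to Bayes' rule. The following is a minimal sketch under that assumption, not the measure-theoretic construction of the paper; the function names, the hypotheses `fair` and `biased`, and the numbers are all hypothetical.

```python
def posterior(prior, likelihood, datum):
    """One Bayes update on finite spaces.

    prior: dict mapping hypothesis -> probability
    likelihood: dict mapping hypothesis -> dict of datum -> probability
                (a finite stand-in for the sampling distribution)
    """
    # Joint probability of each hypothesis together with the observed datum.
    joint = {h: prior[h] * likelihood[h][datum] for h in prior}
    evidence = sum(joint.values())
    if evidence == 0:
        # The posterior is not determined on data of probability zero.
        raise ValueError("datum has probability zero under the current prior")
    return {h: p / evidence for h, p in joint.items()}


def update_sequence(prior, likelihood, data):
    """Repeat the update, using each posterior as the next prior."""
    belief = dict(prior)
    for datum in data:
        belief = posterior(belief, likelihood, datum)
    return belief


# Hypothetical example: is a coin fair or biased towards heads?
prior = {"fair": 0.5, "biased": 0.5}
likelihood = {
    "fair": {"H": 0.5, "T": 0.5},
    "biased": {"H": 0.9, "T": 0.1},
}
belief = update_sequence(prior, likelihood, ["H", "H", "T"])
```

The zero-evidence branch is the finite counterpart of the statement that the inference map is unique only up to a set of measure zero.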
Which Distributions (or Families of Distributions) Best Represent Interval Uncertainty: Case of Permutation-Invariant Criteria
In many practical situations, we know only the interval containing the quantity of interest; we have no information about the probability of different values within this interval. In contrast to cases where we know the distributions and can thus use Monte-Carlo simulations, processing such interval uncertainty is difficult, crudely speaking, because we need to try all possible distributions on this interval. Sometimes the problem can be simplified: it is possible to select a single distribution (or a small family of distributions) whose analysis provides a good understanding of the situation. The best-known case is the Maximum Entropy approach, which yields the uniform distribution on the interval. Interestingly, sensitivity analysis, which has completely different objectives, leads to the selection of the same uniform distribution. In this paper, we provide a general explanation of why the uniform distribution appears in different situations: it appears every time we have a permutation-invariant objective function with a unique optimum. We also discuss what happens if there are several optima.
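The symmetry argument can be checked numerically on a discretized interval. The sketch below assumes the interval is split into a few equal cells and uses Shannon entropy as one example of a permutation-invariant objective; the helper name and the sample weights are hypothetical. If the optimum of such an objective is unique, every permutation must map it to itself, so all cell weights must be equal.

```python
import math
from itertools import permutations


def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


# Hypothetical non-uniform weights on three equal cells of the interval.
p = [0.5, 0.3, 0.2]
perms = list(permutations(p))

# Permutation invariance: reordering the cells does not change the objective.
values = [shannon_entropy(perm) for perm in perms]

# Averaging over all permutations gives the only permutation-fixed
# distribution, which is the uniform one.
n = len(p)
averaged = [sum(perm[i] for perm in perms) / len(perms) for i in range(n)]

# For entropy, the symmetric distribution scores at least as high as p.
gain = shannon_entropy(averaged) - shannon_entropy(p)
```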
Direct entropy determination and application to artificial spin ice
From thermodynamic origins, the concept of entropy has expanded to a range of
statistical measures of uncertainty, which may still be thermodynamically
significant. However, laboratory measurements of entropy continue to rely on
direct measurements of heat. New technologies that can map out myriads of
microscopic degrees of freedom suggest direct determination of configurational
entropy by counting in systems where it is thermodynamically inaccessible, such
as granular and colloidal materials, proteins and lithographically fabricated
nanometre-scale arrays. Here, we demonstrate a conditional-probability
technique to calculate entropy densities of translation-invariant states on
lattices using limited configuration data on small clusters, and apply it to
arrays of interacting nanometre-scale magnetic islands (artificial spin ice).
Models for statistically disordered systems can be assessed by applying the
method to relative entropy densities. For artificial spin ice, this analysis
shows that nearest-neighbour correlations drive longer-range ones.
Comment: 10 pages
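A one-dimensional analogue gives a feel for estimating an entropy density from conditional probabilities on small clusters. The sketch below is an illustration under simplifying assumptions (a 1D chain rather than a 2D lattice, a plug-in estimator, hypothetical function names), not the procedure used in the paper. It estimates the entropy of one site conditioned on its `k` predecessors as a difference of block entropies.

```python
import math
import random
from collections import Counter


def block_entropy(seq, k):
    """Plug-in entropy (in nats) of length-k blocks observed in seq."""
    blocks = [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum(c / total * math.log(c / total) for c in counts.values())


def conditional_entropy_density(seq, k):
    """Entropy of one site given the k preceding sites: H(k+1) - H(k)."""
    return block_entropy(seq, k + 1) - block_entropy(seq, k)


# A perfectly ordered chain: each site is determined by its neighbour.
ordered = [0, 1] * 500
h_ordered = conditional_entropy_density(ordered, 1)

# An uncorrelated chain of fair coin flips (fixed seed for repeatability).
rng = random.Random(0)
disordered = [rng.randint(0, 1) for _ in range(1000)]
h_disordered = conditional_entropy_density(disordered, 1)
```

The ordered chain gives an estimate near zero, while the uncorrelated chain gives an estimate near ln 2 per site.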
Recognizing Speech in a Novel Accent: The Motor Theory of Speech Perception Reframed
The motor theory of speech perception holds that we perceive the speech of
another in terms of a motor representation of that speech. However, when we
have learned to recognize a foreign accent, it seems plausible that recognition
of a word rarely involves reconstruction of the speech gestures of the speaker
rather than the listener. To better assess the motor theory and this
observation, we proceed in three stages. Part 1 places the motor theory of
speech perception in a larger framework based on our earlier models of the
adaptive formation of mirror neurons for grasping, and for viewing extensions
of that mirror system as part of a larger system for neuro-linguistic
processing, augmented by the present consideration of recognizing speech in a
novel accent. Part 2 then offers a novel computational model of how a listener
comes to understand the speech of someone speaking the listener's native
language with a foreign accent. The core tenet of the model is that the
listener uses hypotheses about the word the speaker is currently uttering to
update probabilities linking the sound produced by the speaker to phonemes in
the native language repertoire of the listener. This, on average, improves the
recognition of later words. This model is neutral regarding the nature of the
representations it uses (motor vs. auditory). It serves as a reference point for
the discussion in Part 3, which proposes a dual-stream neuro-linguistic
architecture to revisit claims for and against the motor theory of speech
perception and the relevance of mirror neurons, and extracts some implications
for the reframing of the motor theory.
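The core tenet of Part 2, using a word hypothesis to update probabilities that link heard sounds to native phonemes, can be sketched with a simple count-based table. The abstract does not specify the update rule, so everything below is an assumption made for illustration: the class name `AccentModel`, the smoothed-count update, and the toy sounds and phonemes.

```python
from collections import defaultdict


class AccentModel:
    """Toy table of P(native phoneme | heard sound) with smoothed counts."""

    def __init__(self, phonemes, pseudo_count=1.0):
        self.phonemes = list(phonemes)
        self.pseudo = pseudo_count
        # counts[sound][phoneme]: how often a sound was aligned to a phoneme.
        self.counts = defaultdict(lambda: defaultdict(float))

    def prob(self, phoneme, sound):
        """Smoothed estimate of P(phoneme | sound)."""
        row = self.counts[sound]
        total = sum(row.values()) + self.pseudo * len(self.phonemes)
        return (row[phoneme] + self.pseudo) / total

    def update(self, sounds, hypothesized_phonemes):
        """Credit each heard sound to the phoneme the word hypothesis predicts."""
        for sound, phoneme in zip(sounds, hypothesized_phonemes):
            self.counts[sound][phoneme] += 1.0


# Hypothetical example: the speaker produces "z" where the listener expects "th".
model = AccentModel(["th", "z", "s"])
before = model.prob("th", "z")
model.update(["z"], ["th"])
after = model.prob("th", "z")
```

After the update, the heard sound "z" is more strongly linked to the phoneme "th", which is the sense in which earlier word hypotheses can help recognition of later words.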
Combined SVM-CRFs for Biological Named Entity Recognition with Maximal Bidirectional Squeezing
Biological named entity recognition, the identification of biological terms in text, is essential for biomedical information extraction. Machine learning-based approaches have been widely applied in this area. However, the recognition performance of current approaches could still be improved. Our novel approach is to combine support vector machines (SVMs) and conditional random fields (CRFs), which can complement and facilitate each other. During the hybrid process, we use SVM to separate biological terms from non-biological terms, before we use CRFs to determine the types of biological terms, which makes full use of the power of SVM as a binary-class classifier and the data-labeling capacity of CRFs. We then merge the results of SVM and CRFs. To remove any inconsistencies that might result from the merging, we develop a useful algorithm and apply two rules. To ensure biological terms with a maximum length are identified, we propose a maximal bidirectional squeezing approach that finds the longest term. We also add a positive gain to rare events to reinforce their probability and avoid bias. Our approach also gradually extends the context so more contextual information can be included. We examined the performance of four approaches with the GENIA corpus and JNLPBA04 data. The combination of SVM and CRFs improved performance. The macro-precision, macro-recall, and macro-F1 of the SVM-CRFs hybrid approach surpassed conventional SVM and CRFs. After applying the new algorithms, the macro-F1 reached 91.67% with the GENIA corpus and 84.04% with the JNLPBA04 data.
Modelling Students’ Thematically Associated Knowledge: Networked Knowledge from Affinity Statistics
Peer reviewed
Phosphodiesterase type 5 inhibitors enhance chemotherapy in preclinical models of esophageal adenocarcinoma by targeting cancer-associated fibroblasts
Time-Course Analysis of Cyanobacterium Transcriptome: Detecting Oscillatory Genes
The microarray technique allows the simultaneous measurement of the expression levels of thousands of mRNAs. By mining these data one can identify the dynamics of the gene expression time series. The detection of genes that are periodically expressed is an important step that allows us to study the regulatory mechanisms associated with the circadian cycle. Finding periodicity in biological time series poses many challenges, because the observed time series usually exhibit non-idealities such as noise, short length, outliers and unevenly sampled time points. Consequently, the method for finding periodicity should preferably be robust against such anomalies in the data. In this paper, we propose a general and robust procedure for identifying genes with a periodic signature at a given significance level. This identification method is based on autoregressive models and information theory. Using simulated data, we show that the suggested method is capable of identifying rhythmic profiles even in the presence of noise and when the number of data points is small. Through our analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from the cyanobacterium Synechocystis.
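To show how an autoregressive model can expose a periodic signature, the sketch below fits an AR(2) model by least squares and reads the period from the complex roots of its characteristic polynomial. It is an illustration under stated assumptions (evenly sampled, zero-mean, noise-free synthetic data; hypothetical function names) and omits the significance testing and information-theoretic parts of the procedure the abstract describes.

```python
import math


def fit_ar2(x):
    """Least-squares fit of x[t] = a1*x[t-1] + a2*x[t-2] (no intercept)."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(x)):
        s11 += x[t - 1] * x[t - 1]
        s12 += x[t - 1] * x[t - 2]
        s22 += x[t - 2] * x[t - 2]
        b1 += x[t] * x[t - 1]
        b2 += x[t] * x[t - 2]
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("degenerate series: cannot fit AR(2)")
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (s11 * b2 - s12 * b1) / det
    return a1, a2


def ar2_period(a1, a2):
    """Period (in samples) implied by complex roots of z^2 - a1*z - a2, else None."""
    disc = a1 * a1 + 4.0 * a2
    if disc >= 0:
        return None  # real roots: no oscillatory component
    angle = math.atan2(math.sqrt(-disc), a1)
    return 2.0 * math.pi / angle


# Synthetic expression profile with a 24-sample cycle, two cycles long.
series = [math.cos(2.0 * math.pi * t / 24.0 + 0.3) for t in range(48)]
a1, a2 = fit_ar2(series)
period = ar2_period(a1, a2)
```

For a pure sinusoid the fitted coefficients approach `a1 = 2*cos(2*pi/24)` and `a2 = -1`, so the recovered period is close to 24 samples. With noisy or unevenly sampled data, a more robust fitting step would be needed.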
Two-year evaluation of an educational program for adult outpatients with asthma
Bayesian Action–Perception Computational Model: Interaction of Production and Recognition of Cursive Letters
In this paper, we study the collaboration of perception and action representations involved in cursive letter recognition and production. We propose a mathematical formulation for the whole perception–action loop, based on probabilistic modeling and Bayesian inference, which we call the Bayesian Action–Perception (BAP) model. As a model of both perception and action processes, it is intended for studying the interaction of these processes. More precisely, the model includes a feedback loop from motor production, which implements an internal simulation of movement. Motor knowledge can therefore be involved during perception tasks. In this paper, we formally define the BAP model and show how it solves the following six varied cognitive tasks using Bayesian inference: i) letter recognition (purely sensory), ii) writer recognition, iii) letter production (with different effectors), iv) copying of trajectories, v) copying of letters, and vi) letter recognition (with internal simulation of movements). We present computer simulations of each of these cognitive tasks, and discuss experimental predictions and theoretical developments.