Correlation-based Data Representation
The Dagstuhl Seminar 'Similarity-based Clustering and its
Application to Medicine and Biology' (07131), held March 25--30, 2007,
provided an excellent atmosphere for in-depth discussions
about the research frontier of computational methods
for relevant applications of biomedical clustering and beyond.
We address some highlighted issues about correlation-based data
analysis in this seminar contribution.
First, some prominent correlation measures are briefly revisited.
Then, a focus is put on Pearson correlation, because of its
widespread use in biomedical sciences and because of
its analytic accessibility.
A connection to the Euclidean distance of z-score transformed
data is outlined.
Cost function optimization of correlation-based data representation
is discussed; finally, applications to visualization
and clustering of gene expression data are given.
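The connection between Pearson correlation and Euclidean distance mentioned above can be made concrete: for n samples, the squared Euclidean distance between z-score transformed vectors satisfies ||z_x - z_y||^2 = 2n(1 - r), where r is the Pearson correlation. A minimal NumPy sketch (function and variable names are illustrative):

```python
import numpy as np

def pearson_via_zscores(x, y):
    # z-score transform: zero mean, unit (population) standard deviation
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    # squared Euclidean distance between z-scores relates to Pearson r:
    # ||zx - zy||^2 = 2n * (1 - r)
    n = len(x)
    d2 = np.sum((zx - zy) ** 2)
    return 1.0 - d2 / (2.0 * n)

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.7 * x + rng.normal(size=100)
r_direct = np.corrcoef(x, y)[0, 1]
# the distance-based value recovers Pearson r up to floating-point error
assert abs(pearson_via_zscores(x, y) - r_direct) < 1e-10
```

This identity is why clustering z-scored expression profiles with Euclidean distance is equivalent to clustering by Pearson correlation.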
Common Representation Learning Using Step-based Correlation Multi-Modal CNN
Deep learning techniques have been successfully used in learning a common
representation for multi-view data, wherein the different modalities are
projected onto a common subspace. In a broader perspective, the techniques used
to investigate common representation learning fall into two categories:
canonical correlation-based approaches and autoencoder-based approaches. In
this paper, we investigate the performance of deep autoencoder based methods on
multi-view data. We propose a novel step-based correlation multi-modal CNN
(CorrMCNN) which reconstructs one view of the data given the other while
increasing the interaction between the representations at each hidden layer or
every intermediate step. Finally, we evaluate the performance of the proposed
model on two benchmark datasets - MNIST and XRMB. Through extensive
experiments, we find that the proposed model achieves better performance than
the current state-of-the-art techniques on joint common representation learning
and transfer learning tasks. Comment: Accepted at the Asian Conference on Pattern Recognition (ACPR-2017).
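The per-step interaction between the two views' hidden representations can be sketched as a correlation objective between hidden activations. This is an illustrative NumPy sketch, not the authors' CorrMCNN implementation; it simply rewards per-dimension Pearson correlation between two batches of activations:

```python
import numpy as np

def correlation_loss(h1, h2, eps=1e-8):
    """Negative sum of per-dimension Pearson correlations between two
    activation batches of shape (batch_size, dim). Minimizing this loss
    pushes the two views' hidden representations to be correlated."""
    h1c = h1 - h1.mean(axis=0)
    h2c = h2 - h2.mean(axis=0)
    num = (h1c * h2c).sum(axis=0)
    den = np.sqrt((h1c ** 2).sum(axis=0) * (h2c ** 2).sum(axis=0)) + eps
    return -np.sum(num / den)

rng = np.random.default_rng(1)
a = rng.normal(size=(32, 8))
b = 2.0 * a + 1.0            # affine copy: per-dimension correlation is 1
assert correlation_loss(a, b) < -7.99   # close to -dim = -8
```

In a multi-modal network such a term would be added at each hidden layer alongside the view-reconstruction loss.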
Multiplicity dependence of identical particle correlations in the quantum optical approach
Identical particle correlations at fixed multiplicity are considered in the
presence of chaotic and coherent fields. The multiplicity distribution,
one-particle momentum density, and two-particle correlation function are
obtained based on the diagrammatic representation for cumulants in
semi-inclusive events. Our formulation is applied to the analysis of the
experimental data on the multiplicity dependence of correlation functions
reported by the UA1 and OPAL Collaborations. Comment: 14 pages, 7 figures.
H-matrix accelerated second moment analysis for potentials with rough correlation
We consider the efficient solution of partial differential equations for strongly elliptic operators with constant coefficients and stochastic Dirichlet data by the boundary integral equation method. The computation of the solution's two-point correlation is well understood if the two-point correlation of the Dirichlet data is known and sufficiently smooth. Unfortunately, the problem becomes much more involved in case of rough data. We will show that the concept of the H-matrix arithmetic provides a powerful tool to cope with this problem. By employing a parametric surface representation, we end up with an H-matrix arithmetic based on balanced cluster trees. This considerably simplifies the implementation and improves the performance of the H-matrix arithmetic. Numerical experiments are provided to validate and quantify the presented methods and algorithms.
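The H-matrix arithmetic mentioned above rests on replacing admissible off-diagonal blocks of a kernel matrix with low-rank factorizations organized over a cluster tree. A toy NumPy sketch (the kernel, length scale, and rank are illustrative choices, not from the paper): for a 1D exponential correlation kernel, the off-diagonal block between two disjoint point clusters factors as exp(s/l)·exp(-t/l), so a truncated SVD reproduces it to machine precision:

```python
import numpy as np

# Points on [0, 1] split into two disjoint clusters (a single level of
# a cluster tree); the off-diagonal block couples the two clusters.
x = np.linspace(0.0, 1.0, 200)
left, right = x[:100], x[100:]

# Exponential correlation kernel exp(-|s - t| / l) with length scale l = 0.2.
block = np.exp(-np.abs(left[:, None] - right[None, :]) / 0.2)

# Low-rank compression of the block via truncated SVD.
U, s, Vt = np.linalg.svd(block, full_matrices=False)
k = 4  # stored rank; here the block is exactly rank 1, so k = 4 is generous
approx = (U[:, :k] * s[:k]) @ Vt[:k]

rel_err = np.linalg.norm(block - approx) / np.linalg.norm(block)
assert rel_err < 1e-10
```

Storing such blocks in factored form reduces memory and matrix-arithmetic cost from quadratic toward near-linear, which is the point of the H-matrix approach.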
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Understanding predictions made by deep neural networks is notoriously
difficult, but also crucial to their dissemination. Like all ML-based methods,
they are only as good as their training data, and can also capture unwanted biases.
While there are tools that can help understand whether such biases exist, they
do not distinguish between correlation and causation, and might be ill-suited
for text-based models and for reasoning about high level language concepts. A
key problem of estimating the causal effect of a concept of interest on a given
model is that this estimation requires the generation of counterfactual
examples, which is challenging with existing generation technology. To bridge
that gap, we propose CausaLM, a framework for producing causal model
explanations using counterfactual language representation models. Our approach
is based on fine-tuning of deep contextualized embedding models with auxiliary
adversarial tasks derived from the causal graph of the problem. Concretely, we
show that by carefully choosing auxiliary adversarial pre-training tasks,
language representation models such as BERT can effectively learn a
counterfactual representation for a given concept of interest, and be used to
estimate its true causal effect on model performance. A byproduct of our method
is a language representation model that is unaffected by the tested concept,
which can be useful in mitigating unwanted bias ingrained in the data. Comment: Our code and data are available at:
https://amirfeder.github.io/CausaLM/ Under review for the Computational
Linguistics journal.
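The effect-estimation step can be caricatured as comparing a model's predictions under the original representation and under a concept-ablated counterfactual representation. The following is a hypothetical NumPy sketch, not the CausaLM implementation; the encoders, classifier head, and "concept dimension" are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def f(h):
    # hypothetical downstream classifier head (logistic over features)
    w = np.array([1.0, -0.5, 0.25, 0.0])
    return 1.0 / (1.0 + np.exp(-(h @ w)))

def phi(x):
    # hypothetical original encoder: identity for the sketch
    return x

def phi_cf(x):
    # hypothetical counterfactual encoder, invariant to the concept;
    # by construction the concept lives in dimension 0
    h = x.copy()
    h[:, 0] = 0.0
    return h

X = rng.normal(size=(500, 4))
# estimated concept effect: average prediction shift between the
# original and counterfactual representations
effect = np.mean(np.abs(f(phi(X)) - f(phi_cf(X))))
assert 0.0 < effect < 1.0
```

In CausaLM the counterfactual representation is obtained by adversarial fine-tuning of a contextualized embedding model rather than by zeroing a coordinate; the sketch only illustrates the comparison that yields the effect estimate.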