8,046 research outputs found
Proof-Pattern Recognition and Lemma Discovery in ACL2
We present a novel technique for combining statistical machine learning for
proof-pattern recognition with symbolic methods for lemma discovery. The
resulting tool, ACL2(ml), gathers proof statistics and uses statistical
pattern-recognition to pre-processes data from libraries, and then suggests
auxiliary lemmas in new proofs by analogy with already seen examples. This
paper presents the implementation of ACL2(ml) alongside theoretical
descriptions of the proof-pattern recognition and lemma discovery methods
involved in it
Benchmarking in cluster analysis: A white paper
To achieve scientific progress in terms of building a cumulative body of
knowledge, careful attention to benchmarking is of the utmost importance. This
means that proposals of new methods of data pre-processing, new data-analytic
techniques, and new methods of output post-processing, should be extensively
and carefully compared with existing alternatives, and that existing methods
should be subjected to neutral comparison studies. To date, benchmarking and
recommendations for benchmarking have been frequently seen in the context of
supervised learning. Unfortunately, there has been a dearth of guidelines for
benchmarking in an unsupervised setting, with the area of clustering as an
important subdomain. To address this problem, discussion is given to the
theoretical conceptual underpinnings of benchmarking in the field of cluster
analysis by means of simulated as well as empirical data. Subsequently, the
practicalities of how to address benchmarking questions in clustering are dealt
with, and foundational recommendations are made
ACL2(ml):machine-learning for ACL2
ACL2(ml) is an extension for the Emacs interface of ACL2. This tool uses
machine-learning to help the ACL2 user during the proof-development. Namely,
ACL2(ml) gives hints to the user in the form of families of similar theorems,
and generates auxiliary lemmas automatically. In this paper, we present the two
most recent extensions for ACL2(ml). First, ACL2(ml) can suggest now families
of similar function definitions, in addition to the families of similar
theorems. Second, the lemma generation tool implemented in ACL2(ml) has been
improved with a method to generate preconditions using the guard mechanism of
ACL2. The user of ACL2(ml) can also invoke directly the latter extension to
obtain preconditions for his own conjectures.Comment: In Proceedings ACL2 2014, arXiv:1406.123
Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset
Diabetes like many diseases and biological processes is not mono-causal. On the one hand multifactorial studies with complex experimental design are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for Bioinformatics
A Survey on Soft Subspace Clustering
Subspace clustering (SC) is a promising clustering technology to identify
clusters based on their associations with subspaces in high dimensional spaces.
SC can be classified into hard subspace clustering (HSC) and soft subspace
clustering (SSC). While HSC algorithms have been extensively studied and well
accepted by the scientific community, SSC algorithms are relatively new but
gaining more attention in recent years due to better adaptability. In the
paper, a comprehensive survey on existing SSC algorithms and the recent
development are presented. The SSC algorithms are classified systematically
into three main categories, namely, conventional SSC (CSSC), independent SSC
(ISSC) and extended SSC (XSSC). The characteristics of these algorithms are
highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201
Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation
We propose a novel deep learning model, which supports permutation invariant
training (PIT), for speaker independent multi-talker speech separation,
commonly known as the cocktail-party problem. Different from most of the prior
arts that treat speech separation as a multi-class regression problem and the
deep clustering technique that considers it a segmentation (or clustering)
problem, our model optimizes for the separation regression error, ignoring the
order of mixing sources. This strategy cleverly solves the long-lasting label
permutation problem that has prevented progress on deep learning based
techniques for speech separation. Experiments on the equal-energy mixing setup
of a Danish corpus confirms the effectiveness of PIT. We believe improvements
built upon PIT can eventually solve the cocktail-party problem and enable
real-world adoption of, e.g., automatic meeting transcription and multi-party
human-computer interaction, where overlapping speech is common.Comment: 5 page
Screening and metamodeling of computer experiments with functional outputs. Application to thermal-hydraulic computations
To perform uncertainty, sensitivity or optimization analysis on scalar
variables calculated by a cpu time expensive computer code, a widely accepted
methodology consists in first identifying the most influential uncertain inputs
(by screening techniques), and then in replacing the cpu time expensive model
by a cpu inexpensive mathematical function, called a metamodel. This paper
extends this methodology to the functional output case, for instance when the
model output variables are curves. The screening approach is based on the
analysis of variance and principal component analysis of output curves. The
functional metamodeling consists in a curve classification step, a dimension
reduction step, then a classical metamodeling step. An industrial nuclear
reactor application (dealing with uncertainties in the pressurized thermal
shock analysis) illustrates all these steps
- …