A survey of outlier detection methodologies
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise from mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error, or simply natural deviations in populations. Detecting them can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques drawn from the full gamut of Computer Science and Statistics are now used. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
Revealing quantum chaos with machine learning
Understanding properties of quantum matter is an outstanding challenge in
science. In this paper, we demonstrate how machine-learning methods can be
successfully applied for the classification of various regimes in
single-particle and many-body systems. We realize neural network algorithms
that perform a classification between regular and chaotic behavior in quantum
billiard models with remarkably high accuracy. We use the variational
autoencoder for autosupervised classification of regular/chaotic wave
functions, and we demonstrate that variational autoencoders can be used
as a tool for detecting anomalous quantum states, such as quantum scars. By
taking this method further, we show that machine learning techniques allow us
to pin down the transition from integrability to many-body quantum chaos in
Heisenberg XXZ spin chains. For both cases, we confirm the existence of
universal W shapes that characterize the transition. Our results pave the way
for exploring the power of machine learning tools for revealing exotic
phenomena in quantum many-body systems.
Comment: 12 pages, 12 figures
A novel two stage scheme utilizing the test set for model selection in text classification
Text classification is a natural application domain for semi-supervised learning, because labeling documents is expensive while unlabeled documents are usually abundant. We describe a novel, simple two-stage scheme based on dagging that allows the test set to be used in model selection. The dagging ensemble can also be used by itself in place of the original classifier. We evaluate the performance of a meta classifier that chooses between various base learners and their respective dagging ensembles. The selection process seems to perform robustly, especially when only a small percentage of labels is available for training.
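For readers unfamiliar with dagging, the abstract above relies on it as a building block: the labeled data are split into disjoint stratified folds, one copy of the base learner is trained per fold, and predictions are combined by majority vote. The following is a minimal sketch of plain dagging only, not of the authors' two-stage scheme or their use of the test set. The nearest-centroid base learner, the function names, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def train_centroid(samples):
    """Toy base learner (assumed for illustration): nearest class centroid."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        # Return the label whose centroid has the smallest squared distance to x.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def dagging_fit(samples, n_folds, train=train_centroid, seed=0):
    """Train one model per disjoint, stratified fold of the labeled data."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    folds = [[] for _ in range(n_folds)]
    for group in by_label.values():
        rng.shuffle(group)
        # Deal each class round-robin so every fold keeps the class balance.
        for i, sample in enumerate(group):
            folds[i % n_folds].append(sample)
    return [train(fold) for fold in folds if fold]


def dagging_predict(models, x):
    """Combine the per-fold models by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labeled "a" and "b".
data = [((0.1 * i, 0.0), "a") for i in range(6)]
data += [((5.0 + 0.1 * i, 5.0), "b") for i in range(6)]
models = dagging_fit(data, n_folds=3)
```

Because the folds are disjoint, each model sees only a fraction of the labels, which is why the number of folds has to stay small when few labeled documents are available.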
- …