Harnessing Data-Driven Insights: Predictive Modeling for Diamond Price Forecasting using Regression and Classification Techniques
In the multi-faceted world of gemology, understanding diamond valuations plays a pivotal role for traders, customers, and researchers alike. This study delves deep into predicting diamond prices in terms of exact monetary values and broader price categories. The purpose was to harness advanced machine learning techniques to achieve precise estimations and categorisations, thereby assisting stakeholders in informed decision-making. The research methodology adopted comprised a rigorous data preprocessing phase, ensuring the data's readiness for model training. A range of sophisticated machine learning models was employed, from traditional linear regression to more advanced ensemble methods like Random Forest and Gradient Boosting. The dataset was also transformed to facilitate classification into predefined price tiers, exploring the viability of models like Logistic Regression and Support Vector Machines in this context. The conceptual model encompasses a systematic flow, beginning with data acquisition, transitioning through preprocessing, regression, and classification analyses, and culminating in a comparative study of the performance metrics. This structured approach underscores the originality and value of our research, offering a holistic view of diamond price prediction from both regression and classification lenses. Findings from the analysis highlighted the superior performance of the Random Forest regressor in predicting exact prices, with an R² value of approximately 0.975. In contrast, for classification into price tiers, both Logistic Regression and Support Vector Machines emerged as frontrunners, with an accuracy exceeding 95%. These results provide invaluable insights for stakeholders in the diamond industry, emphasising the potential of machine learning in refining valuation processes.
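The two-pronged regression/classification workflow described in this abstract can be sketched with scikit-learn. The data below are synthetic stand-ins (the feature count, the price formula, and the tier cut points are illustrative assumptions, not the study's dataset):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, accuracy_score

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.2, 3.0, size=(n, 3))   # synthetic numeric attributes
# Illustrative price formula: dominated by the first feature ("carat").
price = 5000 * X[:, 0] ** 1.5 + 100 * X[:, 1] + rng.normal(0, 200, n)

# Regression: predict the exact price.
Xtr, Xte, ytr, yte = train_test_split(X, price, random_state=0)
reg = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xtr, ytr)
print("R^2:", round(r2_score(yte, reg.predict(Xte)), 3))

# Classification: bin prices into three tiers, then predict the tier.
tiers = np.digitize(price, np.quantile(price, [0.33, 0.66]))
_, _, ttr, tte = train_test_split(X, tiers, random_state=0)  # same split
clf = LogisticRegression(max_iter=1000).fit(Xtr, ttr)
print("tier accuracy:", round(accuracy_score(tte, clf.predict(Xte)), 3))
```

Reusing the same `random_state` in both `train_test_split` calls keeps the regression and classification splits aligned, so both tasks are evaluated on the same held-out rows.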
Statistical learning theory of structured data
The traditional approach of statistical physics to supervised learning
routinely assumes unrealistic generative models for the data: usually inputs
are independent random variables, uncorrelated with their labels. Only
recently, statistical physicists started to explore more complex forms of data,
such as equally-labelled points lying on (possibly low dimensional) object
manifolds. Here we provide a bridge between this recently-established research
area and the framework of statistical learning theory, a branch of mathematics
devoted to inference in machine learning. The overarching motivation is the
inadequacy of the classic rigorous results in explaining the remarkable
generalization properties of deep learning. We propose a way to integrate
physical models of data into statistical learning theory, and address, with
both combinatorial and statistical mechanics methods, the computation of the
Vapnik-Chervonenkis entropy, which counts the number of different binary
classifications compatible with the loss class. As a proof of concept, we focus
on kernel machines and on two simple realizations of data structure introduced
in recent physics literature: -dimensional simplexes with prescribed
geometric relations and spherical manifolds (equivalent to margin
classification). Entropy, contrary to what happens for unstructured data, is
nonmonotonic in the sample size, in contrast with the rigorous bounds.
Moreover, data structure induces a novel transition beyond the storage
capacity, which we advocate as a proxy of the nonmonotonicity, and ultimately a
cue of low generalization error. The identification of a synaptic volume
vanishing at the transition allows a quantification of the impact of data
structure within replica theory, applicable in cases where combinatorial
methods are not available, as we demonstrate for margin learning. Comment: 19 pages, 3 figures
Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior
We describe an approach to understand the peculiar and counterintuitive
generalization properties of deep neural networks. The approach involves going
beyond worst-case theoretical capacity control frameworks that have been
popular in machine learning in recent years to revisit old ideas in the
statistical mechanics of neural networks. Within this approach, we present a
prototypical Very Simple Deep Learning (VSDL) model, whose behavior is
controlled by two control parameters, one describing an effective amount of
data, or load, on the network (that decreases when noise is added to the
input), and one with an effective temperature interpretation (that increases
when algorithms are early stopped). Using this model, we describe how a very
simple application of ideas from the statistical mechanics theory of
generalization provides a strong qualitative description of recently-observed
empirical results regarding the inability of deep neural networks to avoid
overfitting training data, discontinuous learning and sharp transitions in the
generalization properties of learning algorithms, etc. Comment: 31 pages; added brief discussion of recent papers that use/extend
these ideas
Machine Learning for Condensed Matter Physics
Condensed Matter Physics (CMP) seeks to understand the microscopic
interactions of matter at the quantum and atomistic levels, and describes how
these interactions result in both mesoscopic and macroscopic properties. CMP
overlaps with many other important branches of science, such as Chemistry,
Materials Science, Statistical Physics, and High-Performance Computing. With
the advancements in modern Machine Learning (ML) technology, a keen interest in
applying these algorithms to further CMP research has created a compelling new
area of research at the intersection of both fields. In this review, we aim to
explore the main areas within CMP, which have successfully applied ML
techniques to further research, such as the description and use of ML schemes
for potential energy surfaces, the characterization of topological phases of
matter in lattice systems, the prediction of phase transitions in off-lattice
and atomistic simulations, the interpretation of ML theories with
physics-inspired frameworks and the enhancement of simulation methods with ML
algorithms. We also discuss in detail the main challenges and drawbacks of
using ML methods on CMP problems, as well as some perspectives for future
developments. Comment: 48 pages, 2 figures, 300 references. Review paper. Major Revision
Hierarchical learning in polynomial Support Vector Machines
We study the typical properties of polynomial Support Vector Machines within
a Statistical Mechanics approach that allows us to analyze the effect of
different normalizations of the features. If the normalization is adequately
chosen, there is a hierarchical learning of features of increasing order as a
function of the training set size. Comment: 22 pages, 7 figures, submitted to Machine Learning
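As a toy illustration of why higher-order features matter for polynomial SVMs, a degree-2 kernel can separate data whose labels depend on a second-order monomial, while a linear kernel cannot. This is a generic scikit-learn sketch, not the paper's statistical-mechanics calculation:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # label set by a second-order feature

scores = {}
for kernel, kw in [("linear", {}), ("poly", {"degree": 2, "coef0": 1.0})]:
    # 5-fold cross-validated accuracy for each kernel
    scores[kernel] = cross_val_score(SVC(kernel=kernel, **kw), X, y, cv=5).mean()
    print(kernel, round(scores[kernel], 2))
```

The linear kernel stays near chance on this task, while the degree-2 kernel, whose feature map contains the cross term x₁x₂, classifies almost perfectly.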
Statistical Global Modeling of Beta-Decay Halflives Systematics Using Multilayer Feedforward Neural Networks and Support Vector Machines
In this work, the beta-decay halflives problem is dealt with as a nonlinear
optimization problem, which is resolved in the statistical framework of Machine
Learning (ML). Continuing past similar approaches, we have constructed
sophisticated Artificial Neural Networks (ANNs) and Support Vector Regression
Machines (SVMs) for each class with even-odd character in Z and N to globally
model the systematics of nuclei that decay 100% by the beta-minus mode in their
ground states. The arising large-scale lifetime calculations generated by both
types of machines are discussed and compared with each other, with the
available experimental data, with previous results obtained with neural
networks, as well as with estimates coming from traditional global nuclear
models. Particular attention is paid on the estimates for exotic and halo
nuclei and we focus to those nuclides that are involved in the r-process
nucleosynthesis. It is found that statistical models based on LM can at least
match or even surpass the predictive performance of the best conventional
models of beta-decay systematics and can complement the latter. Comment: 8 pages, 1 figure, Proceedings of the 17th HNPS Symposium
signSGD with Majority Vote is Communication Efficient And Fault Tolerant
Training neural networks on large datasets can be accelerated by distributing
the workload over a network of machines. As datasets grow ever larger, networks
of hundreds or thousands of machines become economically viable. The time cost
of communicating gradients limits the effectiveness of using such large machine
counts, as may the increased chance of network faults. We explore a
particularly simple algorithm for robust, communication-efficient
learning---signSGD. Workers transmit only the sign of their gradient vector to
a server, and the overall update is decided by a majority vote. This algorithm
uses 32x less communication per iteration than full-precision,
distributed SGD. Under natural conditions verified by experiment, we prove that
signSGD converges in the large and mini-batch settings, establishing
convergence for a parameter regime of Adam as a byproduct. Aggregating sign
gradients by majority vote means that no individual worker has too much power.
We prove that unlike SGD, majority vote is robust when up to 50% of workers
behave adversarially. The class of adversaries we consider includes as special
cases those that invert or randomise their gradient estimate. On the practical
side, we built our distributed training system in Pytorch. Benchmarking against
the state of the art collective communications library (NCCL), our
framework---with the parameter server housed entirely on one machine---led to a
25% reduction in time for training resnet50 on Imagenet when using 15 AWS
p3.2xlarge machines.
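The server-side voting rule described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction of the update, not the authors' PyTorch implementation:

```python
import numpy as np

def majority_vote_sign_step(grads, lr=0.1):
    """Server-side update: elementwise majority vote over the workers'
    gradient signs, then a fixed-size step in the winning direction."""
    signs = np.sign(grads)             # each worker sends only sign bits
    vote = np.sign(signs.sum(axis=0))  # elementwise majority vote
    return -lr * vote                  # descent step along the vote

# Three honest workers and one adversary that inverts its gradient.
rng = np.random.default_rng(0)
true_grad = np.array([2.0, -1.0, 0.5])
workers = [true_grad + rng.normal(0, 0.1, 3) for _ in range(3)]
workers.append(-true_grad)             # adversarial inversion
step = majority_vote_sign_step(np.stack(workers))
print(step)  # the step still opposes the true gradient's signs
```

Because each coordinate of the update is decided by a vote, the single adversarial worker is outvoted three to one and the step remains a descent direction, matching the abstract's robustness claim for a minority of adversarial workers.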
Combining Multiple Time Series Models Through A Robust Weighted Mechanism
Improvement of time series forecasting accuracy through combining multiple
models is an important as well as a dynamic area of research. As a result,
various forecasts combination methods have been developed in literature.
However, most of them are based on simple linear ensemble strategies and hence
ignore the possible relationships between two or more participating models. In
this paper, we propose a robust weighted nonlinear ensemble technique which
considers the individual forecasts from different models as well as the
correlations among them while combining. The proposed ensemble is constructed
using three well-known forecasting models and is tested for three real-world
time series. A comparison is made among the proposed scheme and three other
widely used linear combination methods, in terms of the obtained forecast
errors. This comparison shows that our ensemble scheme provides significantly
lower forecast errors than each individual model as well as each of the three
linear combination methods. Comment: 6 pages, 3 figures, 2 tables, conference
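The idea of weighting forecasts while accounting for correlations among the models' errors can be illustrated with the classical minimum-variance combination. The weights below are a simple covariance-based stand-in, not the paper's robust nonlinear scheme:

```python
import numpy as np

def min_variance_weights(errors):
    """Combination weights that minimize the variance of the combined
    forecast error, using the full error covariance so that strongly
    correlated models share, rather than double-count, their weight."""
    cov = np.cov(errors)
    inv = np.linalg.inv(cov)
    ones = np.ones(cov.shape[0])
    return inv @ ones / (ones @ inv @ ones)   # weights sum to 1

rng = np.random.default_rng(0)
shared = rng.normal(0, 1.0, 500)
errors = np.stack([
    shared + rng.normal(0, 0.3, 500),  # models 1 and 2 make
    shared + rng.normal(0, 0.3, 500),  # strongly correlated errors
    rng.normal(0, 1.0, 500),           # model 3 errs independently
])
w = min_variance_weights(errors)
print(np.round(w, 2))
```

A simple average would give the two correlated models two thirds of the weight; the covariance-aware weights instead shift weight toward the independent third model, which is exactly the effect that purely linear, correlation-blind combinations miss.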
Modeling Nuclear Properties with Support Vector Machines
We have made initial studies of the potential of support vector machines
(SVM) for providing statistical models of nuclear systematics with demonstrable
predictive power. Using SVM regression and classification procedures, we have
created global models of atomic masses, beta-decay halflives, and ground-state
spins and parities. These models exhibit performance in both data-fitting and
prediction that is comparable to that of the best global models from nuclear
phenomenology and microscopic theory, as well as the best statistical models
based on multilayer feedforward neural networks. Comment: 15 pages; website with latest results added
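The SVM-regression ingredient of such global models can be sketched with scikit-learn on synthetic (Z, N) data. The liquid-drop-like target below is an illustrative assumption, not actual mass-table values:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
Z = rng.integers(20, 100, 300)      # proton numbers (synthetic sample)
N = Z + rng.integers(0, 40, 300)    # neutron numbers
A = Z + N
# Crude liquid-drop-like trend for binding energy per nucleon (MeV),
# used only as a smooth synthetic target.
y = 15.6 - 17.2 * A ** (-1 / 3) + rng.normal(0, 0.1, 300)

X = np.column_stack([Z, N]).astype(float)
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X, y)
resid = y - model.predict(X)
print("residual std (MeV):", resid.std().round(2))
```

Standardizing (Z, N) before the RBF kernel matters here: the kernel's length scale is shared across inputs, so features on very different scales would otherwise distort the fit.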
A Theory of Cheap Control in Embodied Systems
We present a framework for designing cheap control architectures for embodied
agents. Our derivation is guided by the classical problem of universal
approximation, whereby we explore the possibility of exploiting the agent's
embodiment for a new and more efficient universal approximation of behaviors
generated by sensorimotor control. This embodied universal approximation is
compared with the classical non-embodied universal approximation. To exemplify
our approach, we present a detailed quantitative case study for policy models
defined in terms of conditional restricted Boltzmann machines. In contrast to
non-embodied universal approximation, which requires an exponential number of
parameters, in the embodied setting we are able to generate all possible
behaviors with a drastically smaller model, thus obtaining cheap universal
approximation. We test and corroborate the theory experimentally with a
six-legged walking machine. The experiments show that the sufficient controller
complexity predicted by our theory is tight, which means that the theory has
direct practical implications. Keywords: cheap design, embodiment, sensorimotor
loop, universal approximation, conditional restricted Boltzmann machine. Comment: 27 pages, 10 figures