
    Harnessing Data-Driven Insights: Predictive Modeling for Diamond Price Forecasting using Regression and Classification Techniques

    In the multi-faceted world of gemology, understanding diamond valuations plays a pivotal role for traders, customers, and researchers alike. This study delves into predicting diamond prices both as exact monetary values and as broader price categories. The purpose was to harness advanced machine learning techniques to achieve precise estimations and categorisations, thereby assisting stakeholders in informed decision-making. The research methodology comprised a rigorous data preprocessing phase, ensuring the data's readiness for model training. A range of machine learning models was employed, from traditional linear regression to more advanced ensemble methods like Random Forest and Gradient Boosting. The dataset was also transformed to facilitate classification into predefined price tiers, exploring the viability of models like Logistic Regression and Support Vector Machines in this context. The conceptual model follows a systematic flow, beginning with data acquisition, transitioning through preprocessing, regression, and classification analyses, and culminating in a comparative study of the performance metrics. This structured approach underscores the originality and value of the research, offering a holistic view of diamond price prediction from both regression and classification lenses. The analysis highlighted the superior performance of the Random Forest regressor in predicting exact prices, with an R² value of approximately 0.975. For classification into price tiers, both Logistic Regression and Support Vector Machines emerged as frontrunners, with accuracy exceeding 95%. These results provide valuable insights for stakeholders in the diamond industry, emphasising the potential of machine learning in refining valuation processes.
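
    The abstract does not include an implementation; the sketch below is a minimal illustration of the described two-track workflow, assuming a scikit-learn pipeline, a hypothetical diamonds.csv file, and illustrative price-tier boundaries, none of which are specified by the paper.

        # Minimal sketch of the two-track pipeline described above; the file name,
        # feature columns, and tier boundaries are illustrative assumptions.
        import pandas as pd
        from sklearn.model_selection import train_test_split
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.linear_model import LogisticRegression
        from sklearn.preprocessing import StandardScaler
        from sklearn.metrics import r2_score, accuracy_score

        df = pd.read_csv("diamonds.csv")                      # hypothetical dataset path
        X = pd.get_dummies(df.drop(columns=["price"]))        # one-hot encode categorical features
        y_price = df["price"]
        y_tier = pd.cut(y_price, bins=[0, 1000, 5000, 1e9], labels=[0, 1, 2])  # assumed tiers

        X_tr, X_te, yp_tr, yp_te, yt_tr, yt_te = train_test_split(
            X, y_price, y_tier, test_size=0.2, random_state=0)

        # Regression track: exact prices.
        reg = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, yp_tr)
        print("R^2:", r2_score(yp_te, reg.predict(X_te)))

        # Classification track: price tiers.
        scaler = StandardScaler().fit(X_tr)
        clf = LogisticRegression(max_iter=1000).fit(scaler.transform(X_tr), yt_tr)
        print("accuracy:", accuracy_score(yt_te, clf.predict(scaler.transform(X_te))))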

    Statistical learning theory of structured data

    The traditional approach of statistical physics to supervised learning routinely assumes unrealistic generative models for the data: usually inputs are independent random variables, uncorrelated with their labels. Only recently have statistical physicists started to explore more complex forms of data, such as equally-labelled points lying on (possibly low-dimensional) object manifolds. Here we provide a bridge between this recently-established research area and the framework of statistical learning theory, a branch of mathematics devoted to inference in machine learning. The overarching motivation is the inadequacy of the classic rigorous results in explaining the remarkable generalization properties of deep learning. We propose a way to integrate physical models of data into statistical learning theory and address, with both combinatorial and statistical mechanics methods, the computation of the Vapnik-Chervonenkis entropy, which counts the number of different binary classifications compatible with the loss class. As a proof of concept, we focus on kernel machines and on two simple realizations of data structure introduced in recent physics literature: k-dimensional simplices with prescribed geometric relations and spherical manifolds (equivalent to margin classification). Contrary to what happens for unstructured data, the entropy is nonmonotonic in the sample size, at variance with the rigorous bounds. Moreover, data structure induces a novel transition beyond the storage capacity, which we advocate as a proxy of the nonmonotonicity and, ultimately, a cue of low generalization error. The identification of a synaptic volume vanishing at the transition allows a quantification of the impact of data structure within replica theory, applicable in cases where combinatorial methods are not available, as we demonstrate for margin learning. Comment: 19 pages, 3 figures
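
    The abstract does not reproduce the definition it works with; as a reference point, the annealed form of the Vapnik-Chervonenkis entropy for a sample of size p is commonly written as below. The notation follows standard statistical-mechanics conventions and is an assumption here, not taken from the paper.

        % Annealed VC entropy for p inputs \xi^1, ..., \xi^p (notation assumed).
        % \mathcal{N} counts the distinct dichotomies \{\sigma(\xi^\mu)\}_{\mu=1}^{p}
        % realizable by functions in the loss class.
        \mathcal{H}(p) \;=\; \ln \big\langle \mathcal{N}(\xi^1,\dots,\xi^p) \big\rangle_{\xi},
        \qquad \mathcal{N}(\xi^1,\dots,\xi^p) \le 2^{p}.

    The nonmonotonicity discussed above refers to this quantity as a function of the sample size p, a behavior that the classical distribution-independent bounds, which grow monotonically, do not capture.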

    Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior

    We describe an approach to understanding the peculiar and counterintuitive generalization properties of deep neural networks. The approach involves going beyond the worst-case theoretical capacity control frameworks that have been popular in machine learning in recent years, to revisit old ideas in the statistical mechanics of neural networks. Within this approach, we present a prototypical Very Simple Deep Learning (VSDL) model, whose behavior is governed by two control parameters: one describing an effective amount of data, or load, on the network (which decreases when noise is added to the input), and one with an effective temperature interpretation (which increases when algorithms are early stopped). Using this model, we describe how a very simple application of ideas from the statistical mechanics theory of generalization provides a strong qualitative description of recently-observed empirical results regarding the inability of deep neural networks to avoid overfitting training data, discontinuous learning and sharp transitions in the generalization properties of learning algorithms, etc. Comment: 31 pages; added brief discussion of recent papers that use/extend these ideas

    Machine Learning for Condensed Matter Physics

    Condensed Matter Physics (CMP) seeks to understand the microscopic interactions of matter at the quantum and atomistic levels, and describes how these interactions result in both mesoscopic and macroscopic properties. CMP overlaps with many other important branches of science, such as Chemistry, Materials Science, Statistical Physics, and High-Performance Computing. With the advancements in modern Machine Learning (ML) technology, a keen interest in applying these algorithms to further CMP research has created a compelling new area of research at the intersection of both fields. In this review, we explore the main areas within CMP in which ML techniques have been successfully applied to further research, such as the description and use of ML schemes for potential energy surfaces, the characterization of topological phases of matter in lattice systems, the prediction of phase transitions in off-lattice and atomistic simulations, the interpretation of ML theories with physics-inspired frameworks, and the enhancement of simulation methods with ML algorithms. We also discuss in detail the main challenges and drawbacks of using ML methods on CMP problems, as well as some perspectives for future developments. Comment: 48 pages, 2 figures, 300 references. Review paper. Major Revision

    Hierarchical learning in polynomial Support Vector Machines

    We study the typical properties of polynomial Support Vector Machines within a Statistical Mechanics approach that allows us to analyze the effect of different normalizations of the features. If the normalization is adequately chosen, features of increasing order are learned hierarchically as the training set size grows. Comment: 22 pages, 7 figures, submitted to Machine Learning
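
    The abstract fixes no notation, but the effect it studies can be made concrete with a small sketch: a polynomial kernel written as an explicit sum over orders, with per-order coefficients standing in for the feature normalization. The coefficients, data, and target below are illustrative assumptions, not the paper's setup.

        # Sketch: polynomial kernel as a sum over orders with per-order weights c_k,
        # exposing the feature normalization whose choice the paper analyzes.
        import numpy as np
        from sklearn.svm import SVC

        def poly_kernel(c):
            # K(x, y) = sum_k c_k * (x . y)^k
            def K(X, Y):
                G = X @ Y.T
                return sum(ck * G ** k for k, ck in enumerate(c, start=1))
            return K

        rng = np.random.default_rng(0)
        X = rng.standard_normal((200, 10)) / np.sqrt(10)   # inputs at O(1) scale
        y = np.sign(X[:, 0] * X[:, 1] + 0.1 * X[:, 2])     # mixed first- and second-order target

        # Down-weighting the quadratic features (c = [1, 0.1]) versus weighting orders equally:
        for c in ([1.0, 0.1], [1.0, 1.0]):
            clf = SVC(kernel=poly_kernel(c)).fit(X, y)
            print(c, clf.score(X, y))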

    Statistical Global Modeling of Beta-Decay Halflives Systematics Using Multilayer Feedforward Neural Networks and Support Vector Machines

    In this work, the beta-decay halflives problem is treated as a nonlinear optimization problem, which is resolved in the statistical framework of Machine Learning (ML). Continuing past similar approaches, we have constructed sophisticated Artificial Neural Networks (ANNs) and Support Vector Regression Machines (SVMs) for each class with even-odd character in Z and N, to globally model the systematics of nuclei that decay 100% by the beta-minus mode in their ground states. The resulting large-scale lifetime calculations generated by both types of machines are discussed and compared with each other, with the available experimental data, with previous results obtained with neural networks, as well as with estimates coming from traditional global nuclear models. Particular attention is paid to the estimates for exotic and halo nuclei, and we focus on those nuclides that are involved in the r-process nucleosynthesis. It is found that statistical models based on ML can at least match or even surpass the predictive performance of the best conventional models of beta-decay systematics and can complement the latter. Comment: 8 pages, 1 figure, Proceedings of the 17th HNPS Symposium
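
    As a concrete illustration of the per-class construction described above (one regressor for each even-odd combination of Z and N), the sketch below uses scikit-learn support vector regression; the data file, feature choice, and hyperparameters are hypothetical placeholders rather than the settings used in the paper.

        # Sketch of per-parity-class SVR models for log10 beta-minus halflives.
        # "beta_minus_halflives.dat" and the (Z, N, N-Z) feature set are assumptions.
        import numpy as np
        from sklearn.svm import SVR
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline

        Z, N, logT = np.loadtxt("beta_minus_halflives.dat", unpack=True)  # hypothetical file

        models = {}
        for z_par in (0, 1):          # even / odd Z
            for n_par in (0, 1):      # even / odd N
                mask = (Z % 2 == z_par) & (N % 2 == n_par)
                X = np.column_stack([Z[mask], N[mask], N[mask] - Z[mask]])
                pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
                models[(z_par, n_par)] = pipe.fit(X, logT[mask])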

    signSGD with Majority Vote is Communication Efficient And Fault Tolerant

    Training neural networks on large datasets can be accelerated by distributing the workload over a network of machines. As datasets grow ever larger, networks of hundreds or thousands of machines become economically viable. The time cost of communicating gradients limits the effectiveness of using such large machine counts, as may the increased chance of network faults. We explore a particularly simple algorithm for robust, communication-efficient learning---signSGD. Workers transmit only the sign of their gradient vector to a server, and the overall update is decided by a majority vote. This algorithm uses 32× less communication per iteration than full-precision, distributed SGD. Under natural conditions verified by experiment, we prove that signSGD converges in the large and mini-batch settings, establishing convergence for a parameter regime of Adam as a byproduct. Aggregating sign gradients by majority vote means that no individual worker has too much power. We prove that, unlike SGD, majority vote is robust when up to 50% of workers behave adversarially. The class of adversaries we consider includes as special cases those that invert or randomise their gradient estimate. On the practical side, we built our distributed training system in PyTorch. Benchmarking against the state-of-the-art collective communications library (NCCL), our framework---with the parameter server housed entirely on one machine---led to a 25% reduction in time for training ResNet-50 on ImageNet when using 15 AWS p3.2xlarge machines.
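
    The following is a minimal single-machine sketch of the sign-and-majority-vote update described in the abstract, written with PyTorch; it simulates the workers in one process rather than reproducing the authors' distributed parameter-server implementation.

        # Each simulated worker contributes only sign(gradient); the "server" takes an
        # element-wise majority vote and the model steps in that direction.
        import torch

        def majority_vote_step(model, worker_grads, lr=1e-3):
            # worker_grads: one list of per-parameter gradient tensors per worker
            for p_idx, param in enumerate(model.parameters()):
                signs = torch.stack([torch.sign(g[p_idx]) for g in worker_grads])
                vote = torch.sign(signs.sum(dim=0))      # element-wise majority vote
                param.data.add_(vote, alpha=-lr)

        # Usage sketch: each worker computes gradients on its own mini-batch.
        model = torch.nn.Linear(10, 1)
        loss_fn = torch.nn.MSELoss()
        worker_grads = []
        for _ in range(5):                                # 5 simulated workers
            x, y = torch.randn(32, 10), torch.randn(32, 1)
            loss = loss_fn(model(x), y)
            grads = torch.autograd.grad(loss, model.parameters())
            worker_grads.append([g.detach() for g in grads])
        majority_vote_step(model, worker_grads)

    In the real system only the sign tensors would cross the network, which is where the 32× per-iteration communication saving comes from.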

    Combining Multiple Time Series Models Through A Robust Weighted Mechanism

    Improvement of time series forecasting accuracy through combining multiple models is an important as well as a dynamic area of research. As a result, various forecast combination methods have been developed in the literature. However, most of them are based on simple linear ensemble strategies and hence ignore the possible relationships between two or more participating models. In this paper, we propose a robust weighted nonlinear ensemble technique which considers the individual forecasts from different models as well as the correlations among them while combining. The proposed ensemble is constructed using three well-known forecasting models and is tested on three real-world time series. A comparison is made among the proposed scheme and three other widely used linear combination methods, in terms of the obtained forecast errors. This comparison shows that our ensemble scheme provides significantly lower forecast errors than each individual model as well as each of the four linear combination methods. Comment: 6 pages, 3 figures, 2 tables, conference
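
    The abstract does not spell out the weighting formula, so the sketch below shows a classical covariance-aware (minimum-variance) combination purely as an illustration of using the correlations among the individual forecasts, contrasted with the naive linear average; it is not the paper's nonlinear mechanism.

        # Correlation-aware forecast combination vs. a simple average.
        # Forecasts and in-sample errors below are hypothetical numbers.
        import numpy as np

        def min_variance_weights(errors):
            # errors: (n_times, n_models) matrix of in-sample forecast errors
            S = np.cov(errors, rowvar=False)              # error covariance across models
            ones = np.ones(S.shape[0])
            w = np.linalg.solve(S, ones)
            return w / w.sum()                            # weights sum to 1

        forecasts = np.array([[10.2, 9.8, 10.5],          # hypothetical forecasts, 3 models
                              [11.0, 10.6, 11.3]])
        errors = np.random.default_rng(0).normal(size=(50, 3))  # hypothetical in-sample errors
        w = min_variance_weights(errors)
        combined = forecasts @ w                          # correlation-aware combination
        simple = forecasts.mean(axis=1)                   # naive linear average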

    Modeling Nuclear Properties with Support Vector Machines

    We have made initial studies of the potential of support vector machines (SVMs) for providing statistical models of nuclear systematics with demonstrable predictive power. Using SVM regression and classification procedures, we have created global models of atomic masses, beta-decay halflives, and ground-state spins and parities. These models exhibit performance in both data-fitting and prediction that is comparable to that of the best global models from nuclear phenomenology and microscopic theory, as well as the best statistical models based on multilayer feedforward neural networks. Comment: 15 pages; website with latest results added
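
    For the classification side mentioned above (ground-state spins and parities), a minimal scikit-learn sketch might look as follows; the data file and the feature set with even-odd flags are hypothetical placeholders, not the paper's setup.

        # Sketch of an SVM classifier for ground-state parity from (Z, N).
        import numpy as np
        from sklearn.svm import SVC
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline

        Z, N, parity = np.loadtxt("ground_state_parities.dat", unpack=True)  # hypothetical file
        X = np.column_stack([Z, N, Z % 2, N % 2])         # assumed features incl. even-odd flags
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=5.0)).fit(X, parity)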

    A Theory of Cheap Control in Embodied Systems

    We present a framework for designing cheap control architectures for embodied agents. Our derivation is guided by the classical problem of universal approximation, whereby we explore the possibility of exploiting the agent's embodiment for a new and more efficient universal approximation of behaviors generated by sensorimotor control. This embodied universal approximation is compared with the classical non-embodied universal approximation. To exemplify our approach, we present a detailed quantitative case study for policy models defined in terms of conditional restricted Boltzmann machines. In contrast to non-embodied universal approximation, which requires an exponential number of parameters, in the embodied setting we are able to generate all possible behaviors with a drastically smaller model, thus obtaining cheap universal approximation. We test and corroborate the theory experimentally with a six-legged walking machine. The experiments show that the sufficient controller complexity predicted by our theory is tight, which means that the theory has direct practical implications. Keywords: cheap design, embodiment, sensorimotor loop, universal approximation, conditional restricted Boltzmann machine. Comment: 27 pages, 10 figures
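
    To make the policy class concrete, the toy sketch below samples an action from a conditional restricted Boltzmann machine by alternating Gibbs updates of the hidden and actuator units with the sensor values clamped; the sizes and random couplings are illustrative and bear no relation to the walking-machine controller studied in the paper.

        # Toy conditional RBM policy: visible units split into clamped sensor units and
        # actuator units, coupled through a hidden layer. Weights and sizes are illustrative.
        import numpy as np

        rng = np.random.default_rng(0)
        n_sensor, n_hidden, n_actuator = 6, 4, 6
        W_sh = rng.normal(scale=0.5, size=(n_sensor, n_hidden))    # sensor-hidden couplings
        W_ah = rng.normal(scale=0.5, size=(n_actuator, n_hidden))  # actuator-hidden couplings

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def sample_action(sensor, n_gibbs=20):
            # Alternating Gibbs sampling of hidden and actuator units with the sensor
            # clamped, i.e. a draw from p(actuators | sensors) under the CRBM energy.
            a = rng.integers(0, 2, size=n_actuator).astype(float)
            for _ in range(n_gibbs):
                h = (rng.random(n_hidden) < sigmoid(sensor @ W_sh + a @ W_ah)).astype(float)
                a = (rng.random(n_actuator) < sigmoid(h @ W_ah.T)).astype(float)
            return a

        action = sample_action(rng.integers(0, 2, size=n_sensor).astype(float))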