
    Hints for families of GRBs improving the Hubble diagram

    As soon as their extragalactic origin was established, the hope of making Gamma-Ray Bursts (GRBs) standardizable candles to probe the very high-z universe opened the search for scaling relations between redshift-independent observable quantities and distance-dependent ones. Although some remarkable success has been achieved, the empirical correlations thus found are still affected by a significant intrinsic scatter, which degrades the precision of the inferred GRB Hubble diagram. We investigate here whether this scatter may come from fitting together objects belonging to intrinsically different classes. To this end, we rely on a cladistic analysis to partition GRBs into homogeneous families according to their rest-frame properties. Although the poor statistics prevent us from drawing a definitive answer, we find that both the intrinsic scatter and the coefficients of the $E_{peak}$-$E_{iso}$ and $E_{peak}$-$L$ correlations change significantly depending on which subsample is fitted. It turns out that the fit to the full sample leads to a scaling relation which approximately follows the diagonal of the region delimited by the fits to each homogeneous class. We therefore argue that a preliminary identification of the class a GRB belongs to is necessary in order to select the right scaling relation, so as not to bias the distance determination and hence the Hubble diagram.
    Comment: 10 pages, 6 figures, 4 tables, accepted for publication on MNRA
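
    The central effect described, where a single fit across heterogeneous classes inflates the intrinsic scatter, can be illustrated with a toy log-space fit. The data, slopes, and intercepts below are synthetic, not the paper's GRB sample:

```python
import numpy as np

def fit_scaling_relation(log_epeak, log_eiso):
    """Least-squares fit of log E_peak = a * log E_iso + b,
    returning slope, intercept, and intrinsic scatter (RMS residual)."""
    a, b = np.polyfit(log_eiso, log_epeak, 1)
    resid = log_epeak - (a * log_eiso + b)
    return a, b, float(np.sqrt(np.mean(resid ** 2)))

# Two hypothetical GRB families with different scaling relations:
# fitting them together inflates the scatter relative to a per-class fit.
rng = np.random.default_rng(0)
x1 = rng.uniform(51, 54, 40); y1 = 0.4 * x1 + 1.0 + rng.normal(0, 0.05, 40)
x2 = rng.uniform(51, 54, 40); y2 = 0.6 * x2 - 9.0 + rng.normal(0, 0.05, 40)

_, _, s1 = fit_scaling_relation(y1, x1)
_, _, s_all = fit_scaling_relation(np.concatenate([y1, y2]),
                                   np.concatenate([x1, x2]))
```

    The per-class scatter s1 stays at the noise level, while the joint fit's scatter s_all picks up the spread between the two families' relations.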

    Identifying users and activities from brain wave signals recorded from a wearable headband

    1 online resource (xii, 82 p.) : ill. Includes abstract. Includes bibliographical references (p. 79-82).
    This paper studies the supervised classification of electroencephalogram (EEG) brain signals to identify persons and their activities. The brain signals are obtained from a commercially available and modestly priced wearable headband. Such wearable devices generate a large amount of data and, due to their attractive pricing structure, are becoming increasingly commonplace. As a result, the data generated from such wearables will increase exponentially, leading to many interesting data mining opportunities. This paper proposes a representation that reduces variable-length signals to more manageable, uniformly fixed-length distributions, and then explores the effectiveness of a variety of data mining techniques on the biometric signals. The proposed approach is demonstrated through data collected from a wearable headband that recorded EEG brain signals. The brain signals are recorded for a number of participants performing various tasks. The experiments use a number of classification and clustering techniques, including decision trees, SVM, neural networks, random forests, K-means clustering, and semi-supervised crisp and rough K-medoid clustering. The results show that it is possible to identify both the persons and the activities with a reasonable degree of precision. Furthermore, for identifying persons, the evolutionary semi-supervised crisp and rough K-medoid clustering is shown to compare favourably with conventional unsupervised algorithms such as K-means.
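
    The idea of reducing variable-length signals to fixed-length distributions can be sketched as follows, under the assumption that a simple amplitude histogram is used as the distribution; the thesis may use a different representation:

```python
import numpy as np

def to_fixed_length(signal, n_bins=16, lo=-1.0, hi=1.0):
    """Reduce a variable-length 1-D signal to a fixed-length
    distribution: a normalised histogram of amplitude values."""
    hist, _ = np.histogram(signal, bins=n_bins, range=(lo, hi))
    return hist / max(hist.sum(), 1)

# Signals of different lengths map to vectors of identical size,
# which standard classifiers and clustering algorithms can consume.
short = np.sin(np.linspace(0, 4 * np.pi, 120))
long_ = np.sin(np.linspace(0, 4 * np.pi, 5000))
a, b = to_fixed_length(short), to_fixed_length(long_)
```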

    Importance subsampling: Improving power system planning under climate-based uncertainty

    Recent studies indicate that the effects of inter-annual climate-based variability on power system planning are significant and that long samples of demand and weather data (spanning multiple decades) should be considered. At the same time, modelling renewable generation such as solar and wind requires high temporal resolution to capture fluctuations in output levels. In many realistic power system models, using long samples at high temporal resolution is computationally infeasible. This paper introduces a novel subsampling approach, referred to as importance subsampling, allowing the use of multiple decades of demand and weather data in power system planning models at reduced computational cost. The methodology can be applied to a wide class of optimisation-based power system simulations. A test case is performed on a model of the United Kingdom, created using the open-source modelling framework Calliope, with 36 years of hourly demand and wind data. Standard data reduction approaches such as using individual years or clustering into representative days lead to significant errors in estimates of optimal system design. Furthermore, the resultant power systems suffer supply capacity shortages, raising questions of generation capacity adequacy. In contrast, importance subsampling leads to accurate estimates of optimal system design at greatly reduced computational cost, with the resultant power systems able to meet demand across all 36 years of demand and weather scenarios.
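
    The general idea, keeping the "important" extreme timesteps exactly and reweighting a random draw of the rest, can be sketched as below. The net-demand importance score and all parameters here are illustrative assumptions; the paper's actual scheme differs in detail:

```python
import numpy as np

def importance_subsample(demand, wind, n_keep, n_extreme, rng=None):
    """Keep the n_extreme timesteps with highest net demand
    (demand minus wind) plus a random draw of ordinary timesteps;
    return indices and weights so the subsample still represents
    the full record in expectation."""
    rng = rng or np.random.default_rng(0)
    net = demand - wind
    order = np.argsort(net)[::-1]
    extreme, rest = order[:n_extreme], order[n_extreme:]
    sampled = rng.choice(rest, size=n_keep - n_extreme, replace=False)
    idx = np.concatenate([extreme, sampled])
    weights = np.empty(n_keep)
    weights[:n_extreme] = 1.0                               # kept exactly
    weights[n_extreme:] = len(rest) / (n_keep - n_extreme)  # upweighted
    return idx, weights

demand = np.random.default_rng(1).normal(35, 5, 8760)  # GW, one year hourly
wind = np.random.default_rng(2).uniform(0, 10, 8760)
idx, w = importance_subsample(demand, wind, n_keep=500, n_extreme=50)
```

    The weights sum to the length of the full record, so weighted statistics over the subsample are unbiased estimates of statistics over all timesteps.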

    Dynamic Time Warping for Lead-Lag Relationships in Lagged Multi-Factor Models

    In multivariate time series systems, lead-lag relationships reveal dependencies between time series when they are shifted in time relative to each other. Uncovering such relationships is valuable in downstream tasks such as control, forecasting, and clustering: by understanding the temporal dependencies between different time series, one can better comprehend the complex interactions and patterns within the system. We develop a cluster-driven methodology based on dynamic time warping for robust detection of lead-lag relationships in lagged multi-factor models. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our algorithm robustly detects lead-lag relationships in financial markets, which can subsequently be leveraged in trading strategies with significant economic benefit.
    Comment: arXiv admin note: substantial text overlap with arXiv:2305.0670
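
    The building block the methodology rests on is the dynamic time warping distance, shown here in its textbook form; the paper's cluster-driven lag detection is built on top of this and is not reproduced:

```python
import numpy as np

def dtw_distance(x, y):
    """Classic O(len(x)*len(y)) dynamic time warping distance
    between two 1-D series, with absolute difference as local cost."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# A series and a lagged copy warp onto each other almost perfectly,
# while an unrelated noise series does not.
t = np.linspace(0, 2 * np.pi, 100)
leader = np.sin(t)
laggard = np.sin(t - 0.5)        # same signal, shifted in time
noise = np.random.default_rng(0).normal(0, 1, 100)
```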

    Clustering Data of Mixed Categorical and Numerical Type with Unsupervised Feature Learning

    Mixed categorical and numerical data are a challenge in many applications. Mixed-type data is among the frontier areas where computational intelligence approaches are often brittle compared with the capabilities of living creatures. In this paper, unsupervised feature learning (UFL) is applied to mixed-type data to achieve a sparse representation, which makes it easier for clustering algorithms to separate the data. Unlike other UFL methods that work with homogeneous data, such as image and video data, the presented UFL works with mixed-type data using fuzzy adaptive resonance theory (ART). UFL with fuzzy ART (UFLA) obtains a better clustering result by removing the differences in treating categorical and numeric features. The advantages of doing so are demonstrated on several real-world data sets with ground truth, including heart disease, teaching assistant evaluation, and credit approval. The approach is also demonstrated on noisy, mixed-type petroleum industry data. UFLA is compared with several alternative methods. To the best of our knowledge, this is the first time UFL has been extended to accomplish the fusion of mixed data types.
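
    The fuzzy ART step underlying UFLA can be sketched in its standard textbook form, with complement coding, a choice function, a vigilance test, and fast learning. The UFLA-specific handling of mixed types is not reproduced here, and inputs are assumed already scaled to [0, 1]:

```python
import numpy as np

def fuzzy_art(X, rho=0.75, alpha=0.001, beta=1.0):
    """Minimal fuzzy ART clustering: complement-coded inputs,
    choice function, vigilance test, fast learning when beta=1."""
    X = np.asarray(X, dtype=float)
    I = np.hstack([X, 1.0 - X])          # complement coding
    weights, labels = [], []
    for i_vec in I:
        # rank existing categories by the choice function
        scored = sorted(range(len(weights)),
                        key=lambda j: -np.minimum(i_vec, weights[j]).sum()
                                      / (alpha + weights[j].sum()))
        for j in scored:
            match = np.minimum(i_vec, weights[j]).sum() / i_vec.sum()
            if match >= rho:             # vigilance passed: resonate
                weights[j] = beta * np.minimum(i_vec, weights[j]) \
                             + (1 - beta) * weights[j]
                labels.append(j)
                break
        else:                            # no category matched: new one
            weights.append(i_vec.copy())
            labels.append(len(weights) - 1)
    return labels, weights

# Two tight groups of 2-D points, features scaled to [0, 1].
X = [[0.1, 0.1], [0.12, 0.09], [0.9, 0.9], [0.88, 0.91]]
labels, _ = fuzzy_art(X, rho=0.75)
```

    The vigilance parameter rho controls category granularity: higher values force tighter categories and hence more clusters.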

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet technologies, data volumes are doubling every two years, faster than predicted by Moore's Law, and Big Data analytics is becoming particularly important for enterprise business. Modern computational technologies provide effective tools to help understand this hugely accumulated data and leverage the information to gain insights into the finance industry. Since there are no physical products to manufacture in the finance industry, data has become the most valuable asset of financial organisations, and data mining techniques come to the rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering detection, marketing, and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared with loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of the data, nonlinear mapping techniques have been studied more deeply than linear ones. It has also been shown that hybrid methods are the most accurate in prediction, closely followed by neural network techniques. This survey provides an overview of applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area; in particular, it offers a good starting point for beginners who want to work in the field of computational finance.

    A General Spatio-Temporal Clustering-Based Non-local Formulation for Multiscale Modeling of Compartmentalized Reservoirs

    Representing the reservoir as a network of discrete compartments with neighbor and non-neighbor connections is a fast yet accurate method for analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale compartments with distinct static and dynamic properties is an integral part of such high-level reservoir analysis. In this work, we present a novel hybrid framework for reservoir analysis that couples a physics-based non-local multiscale modeling approach with data-driven techniques for automatic detection of clusters in space from spatial and temporal field data, providing fast and accurate multiscale modeling of compartmentalized reservoirs. This research also adds to the literature by presenting a comprehensive treatment of spatio-temporal clustering for reservoir studies that carefully considers the clustering complexities, the intrinsically sparse and noisy nature of the data, and the interpretability of the outcome.
    Keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal Clustering; Physics-Based Data-Driven Formulation; Multiscale Modeling
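
    As a toy illustration of spatio-temporal clustering for compartment detection, plain k-means can be run on well locations joined with coarse summaries of their production histories. The data, features, and algorithm choice here are illustrative assumptions; the paper's hybrid physics-based framework is far more involved:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means with deterministic farthest-first seeding,
    used here to partition wells into coarse compartments."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.stack(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Each row: (x, y) well location plus a 2-number summary of its
# production history, z-scored so neither the spatial nor the
# temporal part dominates the distance.
rng = np.random.default_rng(3)
loc = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
hist = np.vstack([rng.normal(1, 0.1, (20, 2)), rng.normal(4, 0.1, (20, 2))])
feats = np.hstack([loc, hist])
feats = (feats - feats.mean(0)) / feats.std(0)
labels = kmeans(feats, k=2)
```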