
    Hierarchical topological clustering learns stock market sectors

    The breakdown of financial markets into sectors provides an intuitive classification for groups of companies. Allocating a company to a sector is an expert task, in which the company is classified by the activity that most closely describes the nature of its business. Individual share price movement depends on many factors, but shares within a market sector are expected to move broadly together. We are interested in discovering whether share closing prices do move together, and whether groups of shares that move together are identifiable in terms of industrial activity. Applying TreeGNG, a hierarchical clustering algorithm, to a time series of share closing prices, we have identified clusters of companies that form clearly identifiable groups. These clusters compare favourably to a globally accepted sector classification scheme, and in our opinion, our method identifies sector structure more clearly than a statistical agglomerative hierarchical clustering method.
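
The abstract does not say how similarity between price series is measured, and TreeGNG itself is not reproduced here. As a minimal sketch of what "moving together" could mean, the snippet below computes a correlation-based dissimilarity between two closing-price series. The function names and the choice of simple returns with Pearson correlation are assumptions made for illustration, not the authors' method.

```python
import math
from typing import List, Sequence


def simple_returns(prices: Sequence[float]) -> List[float]:
    """Day-over-day relative changes of a closing-price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(prices_a: Sequence[float],
                         prices_b: Sequence[float]) -> float:
    """Hypothetical dissimilarity: 0 when returns move together, 2 when opposed."""
    return 1.0 - pearson(simple_returns(prices_a), simple_returns(prices_b))
```

A pairwise matrix of such distances is the kind of input a hierarchical clustering procedure could consume. Shares whose returns rise and fall together end up close to each other regardless of their absolute price levels.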

    Map-aided fingerprint-based indoor positioning

    The objective of this work is to investigate potential accuracy improvements in fingerprint-based indoor positioning by imposing map constraints on the positioning algorithms in the form of a priori knowledge. We propose a Route Probability Factor (RPF), which reflects the likelihood that a user is located at one position rather than at any other. The RPF affects not only the probabilities of the points along the pre-defined frequent routes, but also those of all neighbouring points that lie in the proximity of each frequent route. The evaluation indicates the validity of the RPF approach, demonstrated by a significant reduction in positioning error.
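
The abstract does not give the formula behind the RPF. One way to read "a priori knowledge" is as a multiplicative prior on the fingerprint-matching scores, and the sketch below follows that reading. The function name, the dictionary layout, and the default factor of 1.0 for positions away from any route are all assumptions for illustration.

```python
from typing import Dict, Hashable


def apply_route_prior(likelihoods: Dict[Hashable, float],
                      route_factor: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Re-weight fingerprint likelihoods by a per-position route factor.

    likelihoods:  candidate position -> score from fingerprint matching
    route_factor: candidate position -> hypothetical RPF-style weight;
                  positions not listed keep a neutral weight of 1.0
    Returns normalised probabilities that sum to 1.
    """
    weighted = {pos: score * route_factor.get(pos, 1.0)
                for pos, score in likelihoods.items()}
    total = sum(weighted.values())
    if total <= 0:
        raise ValueError("all weighted likelihoods are zero")
    return {pos: w / total for pos, w in weighted.items()}


# Two candidates are equally plausible from the radio fingerprint alone;
# position "A" lies on a frequent route and receives a larger factor.
posterior = apply_route_prior({"A": 0.5, "B": 0.5}, {"A": 3.0})
```

In this reading, points near a frequent route would receive a factor between the on-route value and the neutral value, which mirrors the abstract's statement that neighbouring points are also influenced.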

    A generalization of the Minkowski distance and a new definition of the ellipse

    In this paper, we generalize the Minkowski distance by defining a new distance function in n-dimensional space, and we show that this function also determines a family of metrics, as the Minkowski distance does. We then consider three special cases of this family, which generalize the taxicab, Euclidean and maximum metrics respectively, and determine their circles, together with some of their properties, in the real plane. While determining some properties of circles of the generalized Minkowski distance, we also discover a new definition of the ellipse. Comment: 18 pages, 18 figures
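
The abstract does not state the new distance function, so it is not reproduced here. For reference, the snippet below sketches the standard Minkowski distance that the paper takes as its starting point, including the three special cases the abstract names (taxicab for p = 1, Euclidean for p = 2, maximum for p = infinity). The function name is chosen for illustration.

```python
import math
from typing import Sequence


def minkowski_distance(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """Standard Minkowski distance of order p >= 1 between two points.

    p = 1 gives the taxicab metric, p = 2 the Euclidean metric,
    and p = math.inf the maximum (Chebyshev) metric.
    """
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    if p < 1:
        raise ValueError("p must be >= 1 for the result to be a metric")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs, default=0.0)
    return sum(d ** p for d in diffs) ** (1.0 / p)


origin, point = (0.0, 0.0), (3.0, 4.0)
taxicab = minkowski_distance(origin, point, 1)           # |3| + |4|
euclidean = minkowski_distance(origin, point, 2)         # sqrt(9 + 16)
maximum = minkowski_distance(origin, point, math.inf)    # max(|3|, |4|)
```

A "circle" in any of these metrics is the set of points at a fixed distance from a centre, which is why the shape of the circle changes with p: a diamond for the taxicab metric, a round circle for the Euclidean metric, and an axis-aligned square for the maximum metric.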

    Optimised meta-clustering approach for clustering Time Series Matrices

    The prognostics (health state) of multiple components, represented as time series data stored in vectors and matrices, were processed and clustered more effectively and efficiently using the newly devised ‘Meta-Clustering’ approach. The time series data were gathered from large applications and systems in diverse fields such as communication, medicine, data mining, audio and visual applications, and sensors. Time series data was chosen as the domain of this research because meaningful information can be extracted from it about the characteristics of systems and components found in large applications. For clustering, only time series data allows components to be grouped according to their life cycle, i.e. from the time at which they are healthy until the time at which they start to develop faults and ultimately fail. A technique that processes extracted time series data more effectively would therefore significantly cut down on space and time consumption, both of which are crucial factors in data mining. As a result, this approach will improve current state-of-the-art pattern recognition algorithms such as K-NM, as the clusters will be identified faster while consuming less space. The project also has practical implications: calculating the distance between similar components faster while consuming less space means that the prognostics of the clustered components can be realised and understood more efficiently. This was achieved by using the Meta-Clustering approach to process and cluster the time series data, first extracting and storing the data as a two-dimensional matrix, then implementing an enhanced K-NM clustering algorithm based on the notion of Meta-Clustering and using the Euclidean distance to measure the similarity between the different sets of failure patterns in space. This approach initially classifies and organises each component within its own refined individual cluster. This provides the most relevant set of failure patterns, those that show the highest level of similarity, and discards any unnecessary data that adds no value towards better understanding the failure/health state of the component. During the second stage, once these clusters have been obtained, the inner clusters initially formed are grouped into one general cluster that represents the prognostics of all the processed components. The approach was tested on multivariate time series data extracted from IGBT components within Matlab, and the results showed that the proposed optimised Meta-Clustering approach does indeed consume less time and space to cluster the prognostics of IGBT components than existing data mining techniques.