4,008 research outputs found

    You can't always sketch what you want: Understanding Sensemaking in Visual Query Systems

    Visual query systems (VQSs) empower users to interactively search for line charts with desired visual patterns, typically specified using intuitive sketch-based interfaces. Despite decades of past work on VQSs, these efforts have not translated to adoption in practice, possibly because VQSs are largely evaluated in unrealistic lab-based settings. To remedy this gap in adoption, we collaborated with experts from three diverse domains (astronomy, genetics, and material science) via a year-long user-centered design process to develop a VQS that supports their workflow and analytical needs, and evaluate how VQSs can be used in practice. Our study results reveal that ad-hoc sketch-only querying is not as commonly used as prior work suggests, since analysts are often unable to precisely express their patterns of interest. In addition, we characterize three essential sensemaking processes supported by our enhanced VQS. We discover that participants employ all three processes, but in different proportions, depending on the analytical needs in each domain. Our findings suggest that all three sensemaking processes must be integrated in order to make future VQSs useful for a wide range of analytical inquiries.
    Comment: Accepted for presentation at IEEE VAST 2019, to be held October 20-25 in Vancouver, Canada. Paper will also be published in a special issue of IEEE Transactions on Visualization and Computer Graphics (TVCG). IEEE VIS (InfoVis/VAST/SciVis) 2019. ACM 2012 CCS: Human-centered computing, Visualization, Visualization design and evaluation methods.

    Markovian model for forecasting financial time series

    The study aims to create a Markovian model for forecasting financial time series and to measure its effectiveness on stock prices. The new forecaster was inspired by several machine learning techniques and includes statistical approaches and conditional probabilities. Among these techniques, Markov chains and hidden Markov chains are the main inspiration. To process time series with a Markov-chain-like algorithm, a new transformation was developed using daily stock prices. Thirteen years of daily stock prices were used as the data feed. To measure the effectiveness of the new predictor, the obtained results are compared with predictions from conventional methods such as ARIMA, linear regression, decision tree regression, and support vector regression. The comparisons are based on Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE). According to the results, the new predictor performs better than decision tree regression, and ARIMA performs best among the compared methods.
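The two error measures named in this abstract can be sketched as follows. This is a minimal illustration of the standard MAPE and RMSE definitions, not code from the study; the function names and the sample prices are hypothetical and were invented for the example.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no value in `actual` is zero, since each error is divided
    by the corresponding actual value.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error, in the same units as the data."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


# Hypothetical closing prices and forecasts, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [98.0, 103.0, 101.0, 107.0]

print(f"MAPE: {mape(actual, predicted):.4f} %")
print(f"RMSE: {rmse(actual, predicted):.4f}")
```

MAPE is scale-free, which makes it convenient for comparing forecasts across stocks with different price levels, while RMSE penalizes large individual errors more heavily.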

    Direct least squares fitting of ellipses segmentation and prioritized rules classification for curve-shaped chart patterns

    In financial markets, appearances of chart patterns in time series are commonly considered potential signals of an imminent change in the direction of price movement. To identify chart patterns, time series data is usually segmented before it is processed by classification methods. However, existing segmentation methods are less effective in classifying 16 curve-shaped chart patterns from financial time series. In this paper, we propose three novel segmentation methods for the classification of curve-shaped chart patterns based on direct least squares fitting of ellipses. These methods are implemented based on the principles of sliding windows, turning points, and bottom-up piecewise linear approximation. To further enhance the efficiency of classifying chart patterns from real-time streaming data, we propose a novel algorithm called Accelerating Classification with Prioritized Rules (ACPR). Experiments on real datasets from financial markets reveal that the proposed approaches are effective in classifying curve-shaped patterns from time series. The experimental results also show that the proposed segmentation methods with ACPR can significantly reduce the total execution time.
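As a rough illustration of the turning-point principle mentioned in this abstract, the sketch below cuts a price series at its strict local extrema. It is a generic example under simple assumptions, not the paper's segmentation method (which additionally fits ellipses to the segments); the function names and the sample series are hypothetical.

```python
def turning_points(series):
    """Return the indices of strict local minima and maxima."""
    points = []
    for i in range(1, len(series) - 1):
        prev, cur, nxt = series[i - 1], series[i], series[i + 1]
        is_peak = cur > prev and cur > nxt
        is_trough = cur < prev and cur < nxt
        if is_peak or is_trough:
            points.append(i)
    return points


def segment_at_turning_points(series):
    """Split a series into segments that share their boundary points."""
    cuts = [0] + turning_points(series) + [len(series) - 1]
    return [series[a:b + 1] for a, b in zip(cuts, cuts[1:])]


# Hypothetical prices, for illustration only.
prices = [1, 3, 5, 4, 2, 3, 6]
print(turning_points(prices))
print(segment_at_turning_points(prices))
```

Each resulting segment is monotonic between two extrema, which is the kind of unit a later curve-fitting or classification step could operate on.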

    A Comprehensive Survey of Data Mining-based Fraud Detection Research

    This survey paper categorises, compares, and summarises almost all published technical and review articles on automated fraud detection from the last 10 years. It defines the professional fraudster, formalises the main types and subtypes of known fraud, and presents the nature of data evidence collected within affected industries. Within the business context of mining the data to achieve higher cost savings, this research presents methods and techniques together with their problems. Compared to all related reviews on fraud detection, this survey covers many more technical articles and is the only one, to the best of our knowledge, that proposes alternative data and solutions from related domains.
    Comment: 14 pages

    Deep Learning for Decision Making and Autonomous Complex Systems

    Deep learning consists of various machine learning algorithms that aim to learn multiple levels of abstraction from data in a hierarchical manner. It is a tool for constructing models from data that mimic a real-world process without exceedingly tedious modelling of the actual process. We show that deep learning is a viable solution to decision making in mechanical engineering problems and complex physical systems. In this work, we demonstrate the application of this data-driven method in the design of microfluidic devices, where it serves as a map between the user-defined cross-sectional shape of the flow and the corresponding arrangement of micropillars in the flow channel that contributes to the flow deformation. We also present how deep learning can be used in the early detection of combustion instability for prognostics and health monitoring of a combustion engine, so that appropriate measures can be taken to prevent the detrimental effects of unstable combustion. One of the applications in complex systems concerns robotic path planning via the systematic learning of policies and associated rewards. In this context, a deep architecture is implemented to infer the expected value of information gained by performing an action based on the states of the environment. We also apply deep learning-based methods to enhance natural low-light images in the context of a surveillance framework and autonomous robots. Further, we look at how machine learning methods can be used to perform root-cause analysis in cyber-physical systems subjected to a wide variety of operational anomalies. In all studies, the proposed frameworks showed promising feasibility and provided credible results for large-scale implementation in industry.

    Temporal-spatial recognizer for multi-label data

    Pattern recognition is an important artificial intelligence task with practical applications in many fields, such as medicine and species distribution. Such applications involve overlapping data points, as seen in multi-label datasets. Hence, there is a need for a recognition algorithm that can separate the overlapping data points in order to recognize the correct pattern. Existing recognition methods are sensitive to noise and overlapping points, as they cannot recognize a pattern when the position of the data points shifts. Furthermore, these methods do not incorporate temporal information in the recognition process, which leads to low-quality data clustering. In this study, an improved pattern recognition method based on Hierarchical Temporal Memory (HTM) is proposed to resolve overlapping data points in multi-label datasets. The imHTM (Improved HTM) method improves two of its components: feature extraction and data clustering. The first improvement is realized as the TS-Layer Neocognitron algorithm, which solves the shift-in-position problem in the feature extraction phase. The data clustering step has two improvements, TFCM and cFCM (TFCM with the limit-Chebyshev distance metric), which allow overlapping data points that occur in patterns to be separated correctly into the relevant clusters by temporal clustering. Experiments on five datasets were conducted to compare the proposed method (imHTM) against statistical, template, and structural pattern recognition methods. The results showed a recognition accuracy of 99% for imHTM in comparison with template matching methods (Featured-Based Approach, Area-Based Approach), statistical methods (Principal Component Analysis, Linear Discriminant Analysis, Support Vector Machines, and Neural Network), and the structural method (original HTM). The findings indicate that the improved HTM can deliver optimum pattern recognition accuracy, especially on multi-label datasets.
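For readers unfamiliar with the Chebyshev distance mentioned in this abstract, the sketch below shows its standard definition: the largest coordinate-wise difference between two points. It does not reproduce the cFCM clustering step or the "limit" variant the abstract refers to, whose details are not given here; the function name and sample points are hypothetical.

```python
def chebyshev_distance(a, b):
    """Largest absolute coordinate-wise difference between two points.

    Assumes `a` and `b` are non-empty sequences of equal length.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical feature vectors, for illustration only.
p = [1.0, 5.0, 2.0]
q = [4.0, 3.0, 2.5]
print(chebyshev_distance(p, q))
```

Unlike the Euclidean distance, this metric is determined entirely by the single coordinate in which two points differ most.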

    CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS

    The book collects the short papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS). The meeting has been organized by the Department of Statistics, Computer Science and Applications of the University of Florence, under the auspices of the Italian Statistical Society and the International Federation of Classification Societies (IFCS). CLADAG is a member of the IFCS, a federation of national, regional, and linguistically-based classification societies. It is a non-profit, non-political scientific organization, whose aims are to further classification research.