
    MintHint: Automated Synthesis of Repair Hints

    Full text link
    Being able to automatically repair programs is an extremely challenging task. In this paper, we present MintHint, a novel technique for program repair that is a departure from most of today's approaches. Instead of trying to fully automate program repair, which is often an unachievable goal, MintHint performs statistical correlation analysis to identify expressions that are likely to occur in the repaired code and generates, using pattern-matching based synthesis, repair hints from these expressions. Intuitively, these hints suggest how to rectify a faulty statement and help developers find a complete, actual repair. MintHint can address a variety of common faults, including incorrect, spurious, and missing expressions. We present a user study showing that developers' productivity can improve many-fold with the use of repair hints generated by MintHint, compared to having only traditional fault localization information. We also apply MintHint to several faults of a widely used Unix utility program to further assess the effectiveness of the approach. Our results show that MintHint performs well even in situations where (1) the repair space searched does not contain the exact repair, and (2) the operational specification obtained from the test cases for repair is incomplete or even imprecise.
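The correlation step described above can be sketched as follows. This is an illustrative reconstruction under assumed data shapes, not MintHint's actual implementation; the candidate expressions and function names are hypothetical.

```python
import statistics

def rank_candidate_expressions(candidates, expected):
    """Rank candidate expressions by how strongly their values, observed
    across test runs, correlate with the expected values of the faulty
    statement; higher-ranked expressions become repair hints."""
    def pearson(xs, ys):
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

    scored = {name: abs(pearson(vals, expected))
              for name, vals in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)

# "x + 1" tracks the expected values far better than "x % 3",
# so it would be surfaced to the developer as the stronger hint.
hints = rank_candidate_expressions(
    {"x + 1": [2, 3, 4, 5], "x % 3": [2, 0, 1, 2]},
    expected=[2, 3, 4, 5],
)
```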

    Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features

    Full text link
    One-class support vector machine (OC-SVM) has long been one of the most effective anomaly detection methods and is extensively adopted in both research and industrial applications. The biggest remaining issue for OC-SVM is its limited capability to operate on large, high-dimensional datasets due to optimization complexity. Those problems might be mitigated by dimensionality reduction techniques such as manifold learning or autoencoders. However, previous work often treats representation learning and anomaly prediction separately. In this paper, we propose the autoencoder-based one-class support vector machine (AE-1SVM), which brings OC-SVM into the deep learning context, with the aid of random Fourier features to approximate the radial basis kernel, by combining it with a representation learning architecture and jointly exploiting stochastic gradient descent to obtain end-to-end training. Interestingly, this also opens up the possible use of gradient-based attribution methods to explain the decision making for anomaly detection, which has long been challenging as a result of the implicit mappings between the input space and the kernel space. To the best of our knowledge, this is the first work to study the interpretability of deep learning in anomaly detection. We evaluate our method on a wide range of unsupervised anomaly detection tasks in which our end-to-end training architecture achieves performance significantly better than previous work using separate training.
    Comment: Accepted at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
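The random Fourier feature approximation of the radial basis kernel that AE-1SVM relies on is a standard technique. A minimal numpy sketch of it, not the authors' code and with an assumed `gamma` bandwidth, looks like this:

```python
import numpy as np

def random_fourier_features(X, n_features=5000, gamma=1.0, seed=0):
    """Explicit feature map z(x) with z(x) @ z(y) ~= exp(-gamma*||x-y||^2),
    the RBF kernel, so a linear OC-SVM on z(x) approximates a kernel one.
    For this kernel the frequencies are sampled as W ~ N(0, 2*gamma)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.array([[0.0, 0.0], [0.1, -0.2], [3.0, 3.0]])
Z = random_fourier_features(X, gamma=0.5)
approx = Z @ Z.T                                   # approximate kernel matrix
exact = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1))
```

Because the map is explicit, gradients flow through it, which is what makes the joint end-to-end training (and the gradient-based attribution mentioned above) possible.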

    Online change detection for energy-efficient mobile crowdsensing

    Get PDF
    Mobile crowdsensing is power hungry, since it requires continuous, simultaneous sensing, processing, and uploading of fused data from various sensor types, including motion and environment sensors. Realizing that the ability to pinpoint context change points enables energy-efficient mobile crowdsensing, we modify histogram-based techniques to detect changes efficiently; our modification has lower computational complexity and performs better than conventional techniques. To evaluate the proposed technique, we conducted experiments on real audio databases comprising 200 sound tracks. We also compare our change detection with the multivariate normal distribution and one-class support vector machine approaches. The results show that our proposed technique is more practical for mobile crowdsensing. For example, we show that it is possible to save 80% of resources compared to standard continuous sensing while keeping detection sensitivity above 95%. This work enables energy-efficient mobile crowdsensing applications that adapt to contexts.
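A histogram-based change detector in this spirit can be sketched as follows. This is a generic illustration with assumed window and bin parameters, not the paper's exact modification:

```python
import numpy as np

def histogram_change_scores(x, window=50, bins=16):
    """Score each time step by the L1 distance between value histograms of
    the windows just before and just after it; peaks mark likely context
    changes, at which heavier sensing/uploading could be triggered."""
    lo, hi = float(x.min()), float(x.max())
    scores = np.zeros(len(x))
    for t in range(window, len(x) - window):
        h1, _ = np.histogram(x[t - window:t], bins=bins,
                             range=(lo, hi), density=True)
        h2, _ = np.histogram(x[t:t + window], bins=bins,
                             range=(lo, hi), density=True)
        scores[t] = np.abs(h1 - h2).sum()
    return scores

# A stream whose distribution shifts halfway through peaks near the shift.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(4, 1, 300)])
change_point = int(np.argmax(histogram_change_scores(x)))
```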

    Detection of anomalous patterns in water consumption: an overview of approaches

    Get PDF
    Water distribution systems constantly aim at improving and efficiently distributing water to the city. Thus, understanding the nature of irregularities that may interrupt or exacerbate the service is at the core of their business model. The detection of technical and non-technical losses allows water companies to improve the sustainability and affordability of the service. Anomaly detection in water consumption is at present a challenging task. Manual inspection of data is tedious and requires a large workforce. Fortunately, the sector may benefit from automatized and intelligent workflows that reduce the amount of time required to identify abnormal water consumption. The aim of this research work is to develop a methodology to detect anomalies and irregular patterns of water consumption. We propose the use of algorithms of different natures that approach the problem of anomaly detection from different perspectives, ranging from searching for deviations from typical behavior to identifying anomalous pattern changes over prolonged periods of time. The experiments reveal that different approaches to the problem of anomaly detection provide complementary clues to contextualize household water consumption. In addition, all the information extracted from each approach can be used in conjunction to provide insights for decision-making. This research work is co-funded by the European Regional Development Fund (FEDER) under the FEDER Catalonia Operative Programme 2014–2020 as part of the R+D Project from RIS3CAT Utilities 4.0 Community with reference code COMRDI16-1-0057.
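One of the "deviation from typical behavior" approaches mentioned above can be illustrated with a simple robust z-score detector. This is a generic sketch, not one of the paper's actual algorithms:

```python
import numpy as np

def deviation_flags(consumption, threshold=3.5):
    """Flag readings whose robust z-score (based on median and MAD) exceeds
    `threshold`, i.e. readings that deviate from typical behavior."""
    x = np.asarray(consumption, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med)) or 1.0   # guard against MAD == 0
    z = 0.6745 * (x - med) / mad
    return np.abs(z) > threshold

# A sudden spike stands out against an otherwise stable household profile.
flags = deviation_flags([12, 11, 13, 12, 95, 11, 12])
```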

    Using patterns position distribution for software failure detection

    Get PDF
    Pattern-based software failure detection has been an important research topic in recent years. In this method, a set of patterns is extracted from program execution traces and represented as features, with their occurrence frequencies treated as the corresponding feature values. This conventional method is limited, however, because it ignores the patterns' position information, which is important for the classification of program traces: patterns occurring at different positions in a trace are likely to carry different meanings. In this paper, we present a novel approach that uses a pattern's position distribution as features to detect software failures. Comparative experiments on both artificial and real datasets show the effectiveness of this method.
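The idea of replacing a single frequency count with a position distribution can be sketched as follows. The binning scheme and function name are illustrative assumptions, not the paper's exact feature construction:

```python
def position_distribution_features(trace, patterns, n_bins=4):
    """For each pattern, build a histogram over the normalized positions at
    which it occurs in the trace, instead of a single occurrence count."""
    features = {}
    for p in patterns:
        bins = [0] * n_bins
        for i in range(len(trace) - len(p) + 1):
            if trace[i:i + len(p)] == p:
                bins[min(n_bins * i // len(trace), n_bins - 1)] += 1
        features[p] = bins
    return features

# "AB" occurs once early and once late; the position histogram keeps that
# distinction, while a plain frequency count (2) would discard it.
feats = position_distribution_features("ABxxxxxxAB", ["AB"])
```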

    Effect of inhomogeneities and source position on dose distribution of nucletron high dose rate Ir-192 brachytherapy source by Monte Carlo simulation

    Get PDF
    Background: The presence of low-density dry air and high-density cortical bone in the path of radiation, as well as the position of the source near or far from the patient's surface, affects exact dose delivery, as in breast brachytherapy. Aim: This study aims to find the dose difference in the presence of inhomogeneities such as cortical bone and dry air, as well as the dose difference due to source position in a water phantom, for the high dose rate (HDR) 192Ir Nucletron microSelectron v2 (mHDRv2) brachytherapy source, using Monte Carlo (MC) simulation with the EGSnrc code, so that the results could be used in a Treatment Planning System (TPS) for more precise brachytherapy treatment. Settings and Design: The settings and design were implemented in computer software. Methods and Materials: For this study, the source, a water phantom of volume 30 × 30 × 30 cm³, inhomogeneities each of volume 1 × 2 × 2 cm³ together with their positions, the water of the water phantom, and the source position were modeled using the three-dimensional MC EGSnrc code. Statistical Analysis Used: Mean and probability are used for results and discussion. Results: The relative dose difference calculated here is 5.5 to 6.5% higher and 4.5 to 5% lower in the presence of air and cortical bone, respectively, on the transverse axis of the source, which may be due to the difference in linear attenuation coefficients of the inhomogeneities. However, when the source was positioned at 1 cm from the surface of the water phantom, points between 1 to 2 cm and 3 to 8 cm from the source on its transverse axis were 2 to 3.5% and 4 to 16% underdosed, respectively, relative to the dose when the source was positioned at the mid-point of the water phantom. This may be due to the lack of backscatter material when the source is positioned very near the phantom surface, with the missing scatter component compounding the primary dose deficit for points near the source. These results were found to be in good agreement with literature data. Conclusion: The results can be used in TPS.

    YASA: yet another time series segmentation algorithm for anomaly detection in big data problems

    Get PDF
    Time series pattern analysis has recently attracted the attention of the research community for real-world applications. The petroleum industry is one application context where these problems arise, for instance in anomaly detection. Offshore petroleum platforms rely on heavy turbomachines for their extraction, pumping, and generation operations. These machines are frequently monitored intensively by hundreds of sensors each, which send measurements at high frequency to a concentration hub. Handling these data calls for a holistic approach, as sensor data is frequently noisy, unreliable, inconsistent with a priori problem axioms, and massive in volume. For anomaly detection problems in turbomachinery, it is essential to segment the available dataset in order to automatically discover the operational regime of the machine in the recent past. In this paper we propose a novel time series segmentation algorithm, adaptable to big data problems, that is capable of handling the high volume of data involved in these problem contexts. We describe our proposal and analyze its computational complexity. We also perform empirical studies comparing our algorithm with similar approaches when applied to benchmark problems and to a real-life application related to oil platform turbomachinery anomaly detection.
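A greedy sliding-window segmentation in the general family such algorithms belong to can be sketched as follows. This is a textbook-style illustration with an assumed residual threshold, not the YASA algorithm itself:

```python
import numpy as np

def segment_series(x, max_error=1.0):
    """Greedy sliding-window segmentation: grow the current segment while a
    least-squares line still fits it with max residual <= max_error, then
    start a new segment at the point where the fit breaks."""
    segments, start = [], 0
    for end in range(2, len(x) + 1):
        t = np.arange(start, end)
        coeffs = np.polyfit(t, x[start:end], 1)
        resid = np.abs(np.polyval(coeffs, t) - x[start:end]).max()
        if resid > max_error:
            segments.append((start, end - 1))
            start = end - 1
    segments.append((start, len(x)))
    return segments

# A flat regime followed by a steep ramp splits at the regime change,
# i.e. the two operational regimes are recovered automatically.
x = np.concatenate([np.zeros(20), np.arange(20) * 2.0])
segments = segment_series(x)
```

Each segment then corresponds to one candidate operational regime; anomaly detection can be run per regime rather than over the raw mixed stream.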

    Detection of Anomalous Traffic Patterns and Insight Analysis from Bus Trajectory Data

    Full text link
    © 2019, Springer Nature Switzerland AG. Detection of anomalous patterns from traffic data is closely related to the analysis of traffic accidents, fault detection, flow management, and new infrastructure planning. Existing methods for traffic anomaly detection are modelled on taxi trajectory data and have the shortcoming that such data may lose much information about the actual road traffic situation, as taxi drivers can select optimal routes for themselves to avoid traffic anomalies. We employ bus trajectory data, as it reflects real traffic conditions on the road, to detect city-wide anomalous traffic patterns and to provide a broader range of insights into these anomalies. Taking these considerations into account, we first propose a feature visualization method that maps 3-dimensional hidden features, extracted with a deep sparse autoencoder (DSAE), to the red-green-blue (RGB) color space. A color trajectory (CT) is produced by encoding a trajectory with RGB colors. Then, a novel algorithm is devised to detect spatio-temporal outliers from the spatial and temporal properties extracted from the CT. We also integrate the CT with a geographic information system (GIS) map to obtain insights for understanding traffic anomaly locations and, more importantly, the extent to which roads are affected by the corresponding anomalies. Our proposed method was tested on three real-world bus trajectory data sets, demonstrating high detection rates and low false alarm rates.
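The mapping of 3-dimensional hidden features to RGB colors can be illustrated with simple min-max scaling. This is a sketch of the general idea, not the paper's exact encoding:

```python
import numpy as np

def features_to_rgb(hidden):
    """Min-max scale each of the 3 hidden dimensions to [0, 255] so every
    trajectory point gets an RGB color; stringing the colors along the
    route yields a color trajectory (CT)."""
    h = np.asarray(hidden, dtype=float)
    lo, hi = h.min(axis=0), h.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)     # avoid division by zero
    return np.rint(255 * (h - lo) / span).astype(int)

rgb = features_to_rgb([[0.0, 0.5, 1.0],
                       [1.0, 0.5, 0.0],
                       [0.5, 0.0, 0.5]])
```

Similar hidden states then get similar colors, which is what makes anomalous stretches of a trajectory visually stand out on the GIS map.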

    From Sensor Readings to Predictions: On the Process of Developing Practical Soft Sensors.

    Get PDF
    Automatic data acquisition systems provide large amounts of streaming data generated by physical sensors. This data forms an input to computational models (soft sensors) routinely used for monitoring and control of industrial processes, traffic patterns, environment and natural hazards, and many more. The majority of these models assume that the data comes in a cleaned and pre-processed form, ready to be fed directly into a predictive model. In practice, to ensure appropriate data quality, most of the modelling effort concentrates on preparing raw sensor readings for use as model inputs. This study analyzes the process of data preparation for predictive models with streaming sensor data. We present the challenges of data preparation as a four-step process, identify the key challenges in each step, and provide recommendations for handling these issues. The discussion focuses on approaches that are less commonly used but, based on our experience, may contribute particularly well to solving practical soft sensor tasks. Our arguments are illustrated with a case study in the chemical production industry.
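A preparation process of this kind can be illustrated with a toy single-channel pipeline. The concrete steps below (range check, interpolation, smoothing, standardization) are assumptions for illustration, not the paper's prescribed four steps:

```python
import numpy as np

def prepare(readings, lo, hi):
    """Illustrative single-channel preparation pipeline:
    (1) drop physically impossible values, (2) fill the gaps by linear
    interpolation, (3) smooth noise with a moving average, (4) standardize
    so the cleaned series can be fed to a predictive model."""
    x = np.asarray(readings, dtype=float)
    x[(x < lo) | (x > hi)] = np.nan                       # 1. range check
    idx = np.arange(len(x))
    ok = ~np.isnan(x)
    x = np.interp(idx, idx[ok], x[ok])                    # 2. gap filling
    x = np.convolve(np.pad(x, 1, mode="edge"),
                    np.ones(3) / 3.0, mode="valid")       # 3. smoothing
    return (x - x.mean()) / (x.std() or 1.0)              # 4. standardize

# The out-of-range spike (999.0) is removed before it can skew model input.
z = prepare([20.1, 20.3, 999.0, 20.2, 20.4], lo=0.0, hi=100.0)
```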