MintHint: Automated Synthesis of Repair Hints
Being able to automatically repair programs is an extremely challenging task.
In this paper, we present MintHint, a novel technique for program repair that
is a departure from most of today's approaches. Instead of trying to fully
automate program repair, which is often an unachievable goal, MintHint performs
statistical correlation analysis to identify expressions that are likely to
occur in the repaired code and generates, using pattern-matching based
synthesis, repair hints from these expressions. Intuitively, these hints
suggest how to rectify a faulty statement and help developers find a complete,
actual repair. MintHint can address a variety of common faults, including
incorrect, spurious, and missing expressions.
We present a user study showing that developers' productivity can improve
manyfold with the use of repair hints generated by MintHint, compared to
having only traditional fault-localization information. We also apply MintHint
to several faults of a widely used Unix utility program to further assess the
effectiveness of the approach. Our results show that MintHint performs well
even in situations where (1) the repair space searched does not contain the
exact repair, and (2) the operational specification obtained from the test
cases for repair is incomplete or even imprecise.
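As a purely illustrative sketch of the correlation idea described above (not MintHint's actual implementation; the data layout and names are assumptions), candidate expressions can be ranked by how strongly their observed values correlate, across test runs, with the values the faulty expression should have produced:

```python
# Hypothetical sketch of the correlation step: given runtime values of
# candidate expressions across test runs, and the values the faulty
# expression should have produced, rank candidates by absolute correlation.
# Top-ranked expressions would then be turned into repair hints such as
# "the fix likely involves expression E".
import statistics

def rank_candidates(candidate_values, expected_values):
    """candidate_values: {expr_source: [observed value per test run]}
    expected_values: [value the faulty expression should yield per run]"""
    scores = {}
    for expr, values in candidate_values.items():
        try:
            scores[expr] = abs(statistics.correlation(values, expected_values))
        except statistics.StatisticsError:  # constant series carries no signal
            scores[expr] = 0.0
    return sorted(scores.items(), key=lambda kv: -kv[1])
```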
Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
One-class support vector machines (OC-SVM) have long been among the most
effective anomaly detection methods and are extensively adopted in both
research and industrial applications. The biggest remaining issue for OC-SVM
is its limited capability to operate on large, high-dimensional datasets due
to optimization complexity. These problems might be mitigated by
dimensionality reduction techniques such as manifold learning or autoencoders.
However, previous work often treats representation learning and anomaly
prediction separately. In this paper, we propose the autoencoder-based
one-class support vector machine (AE-1SVM), which brings OC-SVM, with the aid
of random Fourier features to approximate the radial basis kernel, into the
deep learning context by combining it with a representation learning
architecture and jointly exploiting stochastic gradient descent to obtain
end-to-end training. Interestingly, this also opens up the possible use of
gradient-based attribution methods to explain the decision making for anomaly
detection, which has long been challenging as a result of the implicit
mappings between the input space and the kernel space. To the best of our
knowledge, this is the first work to study the interpretability of deep
learning in anomaly detection. We evaluate our method on a wide range of
unsupervised anomaly detection tasks, in which our end-to-end training
architecture achieves performance significantly better than previous work
using separate training.
Comment: Accepted at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
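The random Fourier feature (RFF) approximation the abstract relies on is standard and easy to illustrate. The following sketch (not the authors' code; the dimensions and kernel width are illustrative) shows the explicit map whose inner products approximate the RBF kernel, which is what makes the OC-SVM objective trainable with SGD alongside an autoencoder:

```python
# RFF sketch: the explicit map z(x) = sqrt(2/D) * cos(W x + b) approximates
# the RBF kernel exp(-||x - y||^2 / (2 sigma^2)) when W has N(0, 1/sigma^2)
# entries and b is uniform on [0, 2*pi].
import numpy as np

rng = np.random.default_rng(0)
d, D, sigma = 8, 2048, 1.0          # input dim, feature count, kernel width
W = rng.normal(0.0, 1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def rff(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))
approx = rff(x) @ rff(y)            # inner product in the explicit feature space
print(f"exact RBF: {exact:.4f}  RFF approx: {approx:.4f}")
```

Because the feature map is explicit and differentiable, gradients flow through it, which is what permits both joint training with the autoencoder and the gradient-based attribution the abstract mentions.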
Online change detection for energy-efficient mobile crowdsensing
Mobile crowdsensing is power hungry, since it requires continuous and simultaneous sensing, processing, and uploading of fused data from various sensor types, including motion and environment sensors. Realizing that the ability to pinpoint context change points enables energy-efficient mobile crowdsensing, we modify histogram-based techniques to detect changes efficiently; our method has lower computational complexity and performs better than conventional techniques. To evaluate the proposed technique, we conducted experiments on real audio databases comprising 200 sound tracks. We also compare our change detection with multivariate normal distribution and one-class support vector machine baselines. The results show that our proposed technique is more practical for mobile crowdsensing. For example, we show that it is possible to save 80% of resources compared to standard continuous sensing while keeping detection sensitivity above 95%. This work enables energy-efficient mobile crowdsensing applications by adapting to contexts.
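As a minimal sketch of the histogram-based idea (not the paper's exact technique; the window size, bin count, and threshold are assumptions), a change can be flagged when the histogram of a recent window drifts too far from a reference window:

```python
# Sliding-window change detection on a 1-D feature stream (e.g., an audio
# descriptor): compare normalized histograms by L1 distance and flag a
# context change when the distance exceeds a threshold.
import numpy as np

def histogram(x, bins, lo, hi):
    h, _ = np.histogram(x, bins=bins, range=(lo, hi))
    return h / max(len(x), 1)          # normalize to a probability histogram

def detect_changes(stream, win=100, bins=16, thresh=0.5):
    """stream: 1-D NumPy array of feature values."""
    lo, hi = stream.min(), stream.max()
    ref = histogram(stream[:win], bins, lo, hi)
    changes = []
    for t in range(win, len(stream) - win, win):
        cur = histogram(stream[t:t + win], bins, lo, hi)
        if np.abs(ref - cur).sum() > thresh:   # L1 histogram distance
            changes.append(t)
            ref = cur                          # re-anchor after a change
    return changes
```

Only cheap counting and a vector subtraction are needed per window, which is the property that lets sensing be throttled between detected change points.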
Detection of anomalous patterns in water consumption: an overview of approaches
Water distribution systems constantly aim to improve and efficiently deliver water to the city. Thus, understanding the nature of irregularities that may interrupt or degrade the service is at the core of water companies' business model. Detecting technical and non-technical losses allows water companies to improve the sustainability and affordability of the service. Anomaly detection in water consumption is at present a challenging task. Manual inspection of data is tedious and requires a large workforce. Fortunately, the sector may benefit from automated and intelligent workflows that reduce the time required to identify abnormal water consumption. The aim of this research work is to develop a methodology to detect anomalies and irregular patterns of water consumption. We propose using algorithms of different natures that approach the problem of anomaly detection from different perspectives, ranging from searching for deviations from typical behavior to identifying anomalous pattern changes over prolonged periods of time. The experiments reveal that different approaches to the problem of anomaly detection provide complementary clues to contextualize household water consumption. In addition, the information extracted from each approach can be used in conjunction to provide insights for decision-making.
This research work is co-funded by the European Regional Development Fund (FEDER) under the FEDER Catalonia Operative Programme 2014–2020 as part of the R+D Project from RIS3CAT Utilities 4.0 Community with reference code COMRDI16-1-0057.
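As a hedged illustration of the first perspective mentioned above (deviation from typical behavior), and not the paper's actual algorithms, one might flag readings that deviate from a household's own hourly profile; the column names and threshold here are assumptions:

```python
# Robust z-score against each household's historical hourly profile:
# median/MAD are used instead of mean/std so that past anomalies do not
# inflate the baseline.
import pandas as pd

def flag_deviations(df, thresh=3.5):
    """df: columns ['household', 'hour', 'consumption'] with hourly readings."""
    g = df.groupby(['household', 'hour'])['consumption']
    med = g.transform('median')
    mad = g.transform(lambda s: (s - s.median()).abs().median()) + 1e-9
    robust_z = 0.6745 * (df['consumption'] - med) / mad
    return df.assign(anomaly=robust_z.abs() > thresh)
```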
Using patterns position distribution for software failure detection
Pattern-based software failure detection has been an important research topic in recent years. In this method, a set of patterns is extracted from program execution traces and represented as features, with occurrence frequencies as the corresponding feature values. This conventional method is limited, however, because it ignores pattern position information, which matters for classifying program traces: patterns occurring at different positions in a trace are likely to carry different meanings. In this paper, we present a novel approach that uses the position distribution of patterns as features to detect software failures. Comparative experiments on both artificial and real datasets show the effectiveness of this method.
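A minimal sketch of the position-distribution idea (illustrative only; the segment count and pattern representation are assumptions): instead of one global count per pattern, each trace is split into k segments and occurrences are counted per segment, so a classifier sees where a pattern occurs, not just how often:

```python
# Position-aware pattern features: count where each pattern starts,
# bucketed into k equal segments of the trace.
def position_features(trace, patterns, k=4):
    """trace: list of events; patterns: list of event subsequences (tuples)."""
    n = max(len(trace), 1)
    feats = []
    for pat in patterns:
        counts = [0] * k
        m = len(pat)
        for i in range(len(trace) - m + 1):
            if tuple(trace[i:i + m]) == pat:
                counts[min(i * k // n, k - 1)] += 1  # segment of the match start
        feats.extend(counts)
    return feats

# e.g. position_features(['a','b','a','b','c','a','b'], [('a','b')], k=2) -> [2, 1]
```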
Effect of inhomogeneities and source position on dose distribution of Nucletron high dose rate Ir-192 brachytherapy source by Monte Carlo simulation
Background: The presence of low-density dry air and high-density
cortical bone in the path of radiation, and the position of the source
near or far from the patient surface, affect exact dose delivery, as in
breast brachytherapy. Aim: This study aims to find the dose difference
in the presence of inhomogeneities such as cortical bone and dry air, as
well as the dose difference due to the position of the source in a water
phantom, for the high dose rate (HDR) Ir-192 Nucletron microSelectron v2
(mHDRv2) brachytherapy source using Monte Carlo (MC) simulation with the
EGSnrc code, so that the results can be used in a Treatment Planning
System (TPS) for more precise brachytherapy treatment. Settings and
Design: The study was designed and carried out entirely in computer
software. Methods and Materials: The source, a water phantom of volume
30 x 30 x 30 cm³, inhomogeneities of volume 1 x 2 x 2 cm³ each with
their positions, the water of the phantom, and the position of the
source were modeled using the three-dimensional MC EGSnrc code.
Statistical Analysis Used: Mean and probability are used for results and
discussion. Results: The relative dose difference calculated here is
5.5 to 6.5% higher in the presence of air and 4.5 to 5% lower in the
presence of cortical bone along the transverse axis of the source, which
may be due to the difference in linear attenuation coefficients of the
inhomogeneities. However, when the source was positioned at 1 cm from
the surface of the water phantom, points 1 to 2 cm and 3 to 8 cm from
the source along its transverse axis were underdosed by 2 to 3.5% and
4 to 16%, respectively, relative to the dose when the source was
positioned at the mid-point of the water phantom. This may be due to the
lack of backscatter material when the source is positioned very near the
phantom surface, with the missing scatter component compounding the
deficit relative to the primary dose at points near the source. These
results were found to be in good agreement with published data.
Conclusion: The results can be used in TPS.
YASA: yet another time series segmentation algorithm for anomaly detection in big data problems
Time series pattern analysis has recently attracted the attention of the research community for real-world applications. The petroleum industry is one application context where these problems arise, for instance in anomaly detection. Offshore petroleum platforms rely on heavy turbomachines for their extraction, pumping, and generation operations. These machines are frequently monitored intensively by hundreds of sensors each, which send measurements at high frequency to a concentration hub. Handling these data calls for a holistic approach, as sensor data is frequently noisy, unreliable, inconsistent with a priori problem axioms, and massive in volume. For anomaly detection problems in turbomachinery, it is essential to segment the available dataset in order to automatically discover the machine's operational regime in the recent past. In this paper we propose a novel time series segmentation algorithm that is adaptable to big data problems and capable of handling the high volume of data involved in such contexts. We describe our proposal and analyze its computational complexity. We also perform empirical studies comparing our algorithm with similar approaches on benchmark problems and on a real-life application related to oil platform turbomachinery anomaly detection.
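The paper defines its own algorithm (YASA); as a hedged illustration of the general sliding-window segmentation idea only, a segment can be grown while a linear fit explains it well and cut when the residual error exceeds a tolerance. The error bound and minimum length below are illustrative:

```python
# Generic sliding-window segmentation: extend the current segment point by
# point; when the RMS residual of a linear fit exceeds max_error, close the
# segment and start a new one at that index.
import numpy as np

def segment(series, max_error=1.0, min_len=5):
    """series: 1-D NumPy array. Returns segment boundary indices."""
    cuts, start = [0], 0
    for end in range(min_len, len(series)):
        x = np.arange(start, end + 1)
        y = series[start:end + 1]
        coef = np.polyfit(x, y, 1)                    # best-fit line for window
        resid = y - np.polyval(coef, x)
        if np.sqrt(np.mean(resid ** 2)) > max_error:  # fit no longer adequate
            cuts.append(end)
            start = end
    return cuts + [len(series)]
```

Each incoming point costs one small refit in this naive form; scaling such a scheme to the data volumes described is exactly the kind of concern the paper's complexity analysis addresses.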
Detection of Anomalous Traffic Patterns and Insight Analysis from Bus Trajectory Data
Detection of anomalous patterns from traffic data is closely related to the analysis of traffic accidents, fault detection, flow management, and new infrastructure planning. Existing methods for traffic anomaly detection are modelled on taxi trajectory data and have the shortcoming that such data may lose much information about the actual road traffic situation, as taxi drivers can select optimal routes for themselves to avoid traffic anomalies. We employ bus trajectory data, as it reflects real traffic conditions on the road, to detect city-wide anomalous traffic patterns and to provide a broader range of insights into these anomalies. With these considerations in mind, we first propose a feature visualization method that maps extracted 3-dimensional hidden features to the red-green-blue (RGB) color space with a deep sparse autoencoder (DSAE). A color trajectory (CT) is produced by encoding a trajectory with RGB colors. Then, a novel algorithm is devised to detect spatio-temporal outliers using spatial and temporal properties extracted from the CT. We also integrate the CT with a geographic information system (GIS) map to obtain insights for understanding the locations of traffic anomalies and, more importantly, the road sections influenced by the corresponding anomalies. Our proposed method was tested on three real-world bus trajectory data sets, demonstrating excellent performance with high detection rates and low false alarm rates.
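The feature-visualization step can be illustrated with a small sketch, assuming (as the text states) a trained DSAE that yields a 3-dimensional hidden vector per trajectory point; the min-max scaling choice here is an assumption, not necessarily the paper's exact encoding:

```python
# Map 3-D DSAE hidden features to RGB: min-max scale each hidden dimension
# over the trajectory and read the result as an 8-bit color per point.
import numpy as np

def to_color_trajectory(hidden):
    """hidden: (n_points, 3) array of DSAE hidden features for one trajectory."""
    lo, hi = hidden.min(axis=0), hidden.max(axis=0)
    rgb = (hidden - lo) / np.where(hi - lo == 0, 1, hi - lo)  # scale to [0, 1]
    return (rgb * 255).astype(np.uint8)                       # RGB per point
```

Rendering these colors along the route on a map makes regime shifts visible as color changes, which is what the subsequent outlier detection exploits.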
From Sensor Readings to Predictions: On the Process of Developing Practical Soft Sensors.
Automatic data acquisition systems provide large amounts of streaming data generated by physical sensors. This data forms the input to computational models (soft sensors) routinely used for monitoring and control of industrial processes, traffic patterns, environment and natural hazards, and more. The majority of these models assume that the data comes in a cleaned and pre-processed form, ready to be fed directly into a predictive model. In practice, to ensure appropriate data quality, most of the modelling effort concentrates on preparing raw sensor readings for use as model inputs. This study analyzes the process of data preparation for predictive models with streaming sensor data. We present data preparation as a four-step process, identify the key challenges in each step, and provide recommendations for handling them. The discussion focuses on approaches that are less commonly used but that, in our experience, contribute particularly well to solving practical soft sensor tasks. Our arguments are illustrated with a case study in the chemical production industry.
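The abstract does not enumerate the four steps, so the following is only a hypothetical sketch of what a streaming-sensor preparation chain of that shape might look like; the step choices, resampling rate, and thresholds are all assumptions, not the authors' recommendations:

```python
# Illustrative four-step preparation of raw sensor readings before they
# reach a soft-sensor model: align, fill, de-spike, scale.
import pandas as pd

def prepare(raw: pd.DataFrame) -> pd.DataFrame:
    """raw: DataFrame of sensor readings with a DatetimeIndex."""
    df = raw.resample('1min').mean()              # 1. align irregular timestamps
    df = df.interpolate(limit=5)                  # 2. fill short gaps only
    z = (df - df.rolling('1h').mean()) / df.rolling('1h').std()
    df = df.mask(z.abs() > 4)                     # 3. mask gross outliers
    return (df - df.mean()) / df.std()            # 4. scale for the model
```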