3,443 research outputs found
Automatic Generation of Text Descriptive Comments for Code Blocks
We propose a framework to automatically generate descriptive comments for
source code blocks. While this problem has been studied by many researchers
previously, their methods are mostly based on fixed template and achieves poor
results. Our framework does not rely on any template, but makes use of a new
recursive neural network called Code-RNN to extract features from the source
code and embed them into one vector. When this vector representation is input
to a new recurrent neural network (Code-GRU), the overall framework generates
text descriptions of the code with accuracy (Rouge-2 value) significantly
higher than other learning-based approaches such as sequence-to-sequence model.
The Code-RNN model can also be used in other scenario where the representation
of code is required.Comment: aaai 201
Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks
Multivariate time series forecasting is an important machine learning problem
across many domains, including predictions of solar plant energy output,
electricity consumption, and traffic jam situation. Temporal data arise in
these real-world applications often involves a mixture of long-term and
short-term patterns, for which traditional approaches such as Autoregressive
models and Gaussian Process may fail. In this paper, we proposed a novel deep
learning framework, namely Long- and Short-term Time-series network (LSTNet),
to address this open challenge. LSTNet uses the Convolution Neural Network
(CNN) and the Recurrent Neural Network (RNN) to extract short-term local
dependency patterns among variables and to discover long-term patterns for time
series trends. Furthermore, we leverage traditional autoregressive model to
tackle the scale insensitive problem of the neural network model. In our
evaluation on real-world data with complex mixtures of repetitive patterns,
LSTNet achieved significant performance improvements over that of several
state-of-the-art baseline methods. All the data and experiment codes are
available online.Comment: Accepted by SIGIR 201
A Comparison of Clustering Techniques for Malware Analysis
In this research, we apply clustering techniques to the malware detection problem. Our goal is to classify malware as part of a fully automated detection strategy. We compute clusters using the well-known �-means and EM clustering algorithms, with scores obtained from Hidden Markov Models (HMM). The previous work in this area consists of using HMM and �-means clustering technique to achieve the same. The current effort aims to extend it to use EM clustering technique for detection and also compare this technique with the �-means clustering
Discovering Patterns in Time-Varying Graphs: A Triclustering Approach
International audienceThis paper introduces a novel technique to track structures in time varying graphs. The method uses a maximum a posteriori approach for adjusting a three-dimensional co-clustering of the source vertices, the destination vertices and the time, to the data under study, in a way that does not require any hyper-parameter tuning. The three dimensions are simultaneously segmented in order to build clusters of source vertices, destination vertices and time segments where the edge distributions across clusters of vertices follow the same evolution over the time segments. The main novelty of this approach lies in that the time segments are directly inferred from the evolution of the edge distribution between the vertices, thus not requiring the user to make any a priori quantization. Experiments conducted on artificial data illustrate the good behavior of the technique, and a study of a real-life data set shows the potential of the proposed approach for exploratory data analysis
Representing Conversations for Scalable Overhearing
Open distributed multi-agent systems are gaining interest in the academic
community and in industry. In such open settings, agents are often coordinated
using standardized agent conversation protocols. The representation of such
protocols (for analysis, validation, monitoring, etc) is an important aspect of
multi-agent applications. Recently, Petri nets have been shown to be an
interesting approach to such representation, and radically different approaches
using Petri nets have been proposed. However, their relative strengths and
weaknesses have not been examined. Moreover, their scalability and suitability
for different tasks have not been addressed. This paper addresses both these
challenges. First, we analyze existing Petri net representations in terms of
their scalability and appropriateness for overhearing, an important task in
monitoring open multi-agent systems. Then, building on the insights gained, we
introduce a novel representation using Colored Petri nets that explicitly
represent legal joint conversation states and messages. This representation
approach offers significant improvements in scalability and is particularly
suitable for overhearing. Furthermore, we show that this new representation
offers a comprehensive coverage of all conversation features of FIPA
conversation standards. We also present a procedure for transforming AUML
conversation protocol diagrams (a standard human-readable representation), to
our Colored Petri net representation
- …