Code Completion with Neural Attention and Pointer Networks
Intelligent code completion has become an essential research task to
accelerate modern software development. To facilitate effective code completion
for dynamically-typed programming languages, we apply neural language models by
learning from large codebases, and develop a tailored attention mechanism for
code completion. However, standard neural language models even with attention
mechanism cannot correctly predict the out-of-vocabulary (OoV) words that
restrict the code completion performance. In this paper, inspired by the
prevalence of locally repeated terms in program source code, and the recently
proposed pointer copy mechanism, we propose a pointer mixture network for
better predicting OoV words in code completion. Based on the context, the
pointer mixture network learns to either generate a within-vocabulary word
through an RNN component, or regenerate an OoV word from local context through
a pointer component. Experiments on two benchmarked datasets demonstrate the
effectiveness of our attention mechanism and pointer mixture network on the
code completion task.
Comment: Accepted in IJCAI 201
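The mixing step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the scalar `gate`, and the choice to sum probability mass per token are all assumptions made for illustration, and the logits would come from a trained RNN and attention layer rather than being supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_mixture(vocab_logits, pointer_logits, gate, vocab, context_tokens):
    """Blend a vocabulary distribution with a copy distribution.

    gate (hypothetical name) is the probability of generating from the
    vocabulary; 1 - gate is the probability of copying from local context.
    """
    p_vocab = softmax(vocab_logits)
    p_ptr = softmax(pointer_logits)
    mixed = {}
    # Mass assigned by the generator to within-vocabulary words.
    for word, p in zip(vocab, p_vocab):
        mixed[word] = mixed.get(word, 0.0) + gate * p
    # Mass assigned by the pointer to tokens seen in the local context,
    # which may include out-of-vocabulary identifiers.
    for tok, p in zip(context_tokens, p_ptr):
        mixed[tok] = mixed.get(tok, 0.0) + (1.0 - gate) * p
    return mixed


vocab = ["x", "y", "<unk>"]
context_tokens = ["my_var", "x", "my_var"]  # "my_var" is out of vocabulary
mixed = pointer_mixture([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], 0.5,
                        vocab, context_tokens)
```

In this toy call the repeated context token `my_var` receives probability mass even though it is absent from the vocabulary, which is the behavior the abstract attributes to the pointer component.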
Mechanism Deduction from Noisy Chemical Reaction Networks
We introduce KiNetX, a fully automated meta-algorithm for the kinetic
analysis of complex chemical reaction networks derived from semi-accurate but
efficient electronic structure calculations. It is designed to (i) accelerate
the automated exploration of such networks, and (ii) cope with model-inherent
errors in electronic structure calculations on elementary reaction steps. We
developed and implemented KiNetX to possess three features. First, KiNetX
evaluates the kinetic relevance of every species in a (yet incomplete) reaction
network to confine the search for new elementary reaction steps only to those
species that are considered possibly relevant. Second, KiNetX identifies and
eliminates all kinetically irrelevant species and elementary reactions to
reduce a complex network graph to a comprehensible mechanism. Third, KiNetX
estimates the sensitivity of species concentrations toward changes in
individual rate constants (derived from relative free energies), which allows
us to systematically select the most efficient electronic structure model for
each elementary reaction given a predefined accuracy. The novelty of KiNetX
consists in the rigorous propagation of correlated free-energy uncertainty
through all steps of our kinetic analysis. To examine the performance of KiNetX,
we developed AutoNetGen. It semirandomly generates chemistry-mimicking reaction
networks by encoding chemical logic into their underlying graph structure.
AutoNetGen allows us to consider a vast number of distinct chemistry-like
scenarios and, hence, to assess the importance of rigorous uncertainty
propagation in a statistical context. Our results reveal that KiNetX reliably
supports the deduction of product ratios, dominant reaction pathways, and
possibly other network properties from semi-accurate electronic structure data.
Comment: 36 pages, 4 figures, 2 table
A Machine-Synesthetic Approach To DDoS Network Attack Detection
In the authors' opinion, anomaly detection systems, or ADS, seem to be the
most promising direction in attack detection, because these
systems can detect, among others, unknown (zero-day) attacks. To detect
anomalies, the authors propose to use machine synesthesia. In this case,
machine synesthesia is understood as an interface that allows using image
classification algorithms in the problem of detecting network anomalies, making
it possible to use non-specialized image detection methods that have recently
been widely and actively developed. In the proposed approach, network
traffic data is "projected" into an image. Experimental results show that the
proposed anomaly detection method detects attacks effectively. On a large
sample, the value of the complex efficiency indicator reaches 97%.
Comment: 12 pages, 2 figures, 5 tables. Accepted to the Intelligent Systems
Conference (IntelliSys) 201
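The abstract above does not say how traffic is "projected" into an image, so the following is only one possible sketch. It assumes each row is a time window, each column is a numeric traffic feature, and each feature is min-max scaled to a grayscale value in 0..255. The function name `traffic_to_image` and the example feature values are hypothetical.

```python
def traffic_to_image(windows):
    """Map per-window traffic features to a grayscale pixel grid.

    windows: list of equal-length lists of numbers, one list per time window
    (an assumed layout, not taken from the paper).
    Returns a list of rows with integer pixel values in 0..255.
    """
    n_features = len(windows[0])
    # Per-feature minimum and maximum, used for min-max scaling.
    lo = [min(w[j] for w in windows) for j in range(n_features)]
    hi = [max(w[j] for w in windows) for j in range(n_features)]
    image = []
    for w in windows:
        row = []
        for j, value in enumerate(w):
            span = hi[j] - lo[j]
            # A constant feature carries no contrast, so map it to 0.
            row.append(0 if span == 0 else round(255 * (value - lo[j]) / span))
        image.append(row)
    return image


# Hypothetical features per window: packets, bytes, distinct protocols.
windows = [
    [10, 200, 1],
    [20, 400, 1],
    [1000, 9000, 1],  # a burst that might indicate a flood
]
image = traffic_to_image(windows)
```

The resulting grid could then be passed to any general-purpose image classifier, which is the role the abstract assigns to the "machine synesthesia" interface.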
Strategies for Searching Video Content with Text Queries or Video Examples
The large number of user-generated videos uploaded to the Internet
every day has led to many commercial video search engines, which mainly rely on
text metadata for search. However, metadata is often lacking for user-generated
videos, so these videos are unsearchable by current search engines.
Therefore, content-based video retrieval (CBVR) tackles this metadata-scarcity
problem by directly analyzing the visual and audio streams of each video. CBVR
encompasses multiple research topics, including low-level feature design,
feature fusion, semantic detector training and video search/reranking. We
present novel strategies in these topics to enhance CBVR in both accuracy and
speed under different query inputs, including pure textual queries and query by
video examples. Our proposed strategies have been incorporated into our
submission for the TRECVID 2014 Multimedia Event Detection evaluation, where
our system outperformed other submissions in both text queries and video
example queries, thus demonstrating the effectiveness of our proposed
approaches.
Uncovering protein interaction in abstracts and text using a novel linear model and word proximity networks
We participated in three of the protein-protein interaction subtasks of the
Second BioCreative Challenge: classification of abstracts relevant for
protein-protein interaction (IAS), discovery of protein pairs (IPS) and text
passages characterizing protein interaction (ISS) in full text documents. We
approached the abstract classification task with a novel, lightweight linear
model inspired by spam-detection techniques, as well as an uncertainty-based
integration scheme. We also used a Support Vector Machine and the Singular
Value Decomposition on the same features for comparison purposes. Our approach
to the full text subtasks (protein pair and passage identification) includes a
feature expansion method based on word-proximity networks. Our approach to the
abstract classification task (IAS) was among the top submissions for this task
in terms of the measures of performance used in the challenge evaluation
(accuracy, F-score and AUC). We also report on a web-tool we produced using our
approach: the Protein Interaction Abstract Relevance Evaluator (PIARE). Our
approach to the full text tasks resulted in one of the highest recall rates as
well as mean reciprocal rank of correct passages. Our approach to abstract
classification shows that a simple linear model, using relatively few features,
is capable of generalizing and uncovering the conceptual nature of
protein-protein interaction from the bibliome. Since the novel approach is
based on a very lightweight linear model, it can be easily ported and applied
to similar problems. In full text problems, the expansion of word features with
word-proximity networks is shown to be useful, though the need for some
improvements is discussed.
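The feature expansion idea in the abstract above can be sketched as follows. The abstract does not define how word-proximity networks are built, so this sketch assumes a simple co-occurrence graph in which two words are linked each time they appear within a fixed window of each other. The names `build_proximity_network`, `expand_features`, `window`, and `min_weight` are hypothetical.

```python
from collections import defaultdict


def build_proximity_network(sentences, window=2):
    """Count how often two distinct words occur within `window` tokens."""
    weights = defaultdict(float)
    for tokens in sentences:
        for i, a in enumerate(tokens):
            # Look ahead at most `window` tokens from position i.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                b = tokens[j]
                if a != b:
                    weights[frozenset((a, b))] += 1.0
    return weights


def expand_features(seed_words, weights, min_weight=2.0):
    """Add direct neighbours of the seed words whose edge weight is high enough."""
    seeds = set(seed_words)
    expanded = set(seeds)
    for pair, weight in weights.items():
        if weight >= min_weight and pair & seeds:
            expanded |= pair
    return expanded


sentences = [
    ["protein", "binds", "receptor"],
    ["protein", "binds", "kinase"],
    ["kinase", "assay", "failed"],
]
network = build_proximity_network(sentences, window=2)
expanded = expand_features({"protein"}, network, min_weight=2.0)
```

Here the seed feature `protein` is expanded with `binds`, the only word that co-occurs with it at least twice; weaker links such as `receptor` are left out.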