36,407 research outputs found

    Code Completion with Neural Attention and Pointer Networks

    Full text link
    Intelligent code completion has become an essential research task to accelerate modern software development. To facilitate effective code completion for dynamically-typed programming languages, we apply neural language models by learning from large codebases, and develop a tailored attention mechanism for code completion. However, standard neural language models even with attention mechanism cannot correctly predict the out-of-vocabulary (OoV) words that restrict the code completion performance. In this paper, inspired by the prevalence of locally repeated terms in program source code, and the recently proposed pointer copy mechanism, we propose a pointer mixture network for better predicting OoV words in code completion. Based on the context, the pointer mixture network learns to either generate a within-vocabulary word through an RNN component, or regenerate an OoV word from local context through a pointer component. Experiments on two benchmarked datasets demonstrate the effectiveness of our attention mechanism and pointer mixture network on the code completion task.Comment: Accepted in IJCAI 201

    Mechanism Deduction from Noisy Chemical Reaction Networks

    Full text link
    We introduce KiNetX, a fully automated meta-algorithm for the kinetic analysis of complex chemical reaction networks derived from semi-accurate but efficient electronic structure calculations. It is designed to (i) accelerate the automated exploration of such networks, and (ii) cope with model-inherent errors in electronic structure calculations on elementary reaction steps. We developed and implemented KiNetX to possess three features. First, KiNetX evaluates the kinetic relevance of every species in a (yet incomplete) reaction network to confine the search for new elementary reaction steps only to those species that are considered possibly relevant. Second, KiNetX identifies and eliminates all kinetically irrelevant species and elementary reactions to reduce a complex network graph to a comprehensible mechanism. Third, KiNetX estimates the sensitivity of species concentrations toward changes in individual rate constants (derived from relative free energies), which allows us to systematically select the most efficient electronic structure model for each elementary reaction given a predefined accuracy. The novelty of KiNetX consists in the rigorous propagation of correlated free-energy uncertainty through all steps of our kinetic analyis. To examine the performance of KiNetX, we developed AutoNetGen. It semirandomly generates chemistry-mimicking reaction networks by encoding chemical logic into their underlying graph structure. AutoNetGen allows us to consider a vast number of distinct chemistry-like scenarios and, hence, to discuss assess the importance of rigorous uncertainty propagation in a statistical context. Our results reveal that KiNetX reliably supports the deduction of product ratios, dominant reaction pathways, and possibly other network properties from semi-accurate electronic structure data.Comment: 36 pages, 4 figures, 2 table

    A Machine-Synesthetic Approach To DDoS Network Attack Detection

    Full text link
    In the authors' opinion, anomaly detection systems, or ADS, seem to be the most perspective direction in the subject of attack detection, because these systems can detect, among others, the unknown (zero-day) attacks. To detect anomalies, the authors propose to use machine synesthesia. In this case, machine synesthesia is understood as an interface that allows using image classification algorithms in the problem of detecting network anomalies, making it possible to use non-specialized image detection methods that have recently been widely and actively developed. The proposed approach is that the network traffic data is "projected" into the image. It can be seen from the experimental results that the proposed method for detecting anomalies shows high results in the detection of attacks. On a large sample, the value of the complex efficiency indicator reaches 97%.Comment: 12 pages, 2 figures, 5 tables. Accepted to the Intelligent Systems Conference (IntelliSys) 201

    Strategies for Searching Video Content with Text Queries or Video Examples

    Full text link
    The large number of user-generated videos uploaded on to the Internet everyday has led to many commercial video search engines, which mainly rely on text metadata for search. However, metadata is often lacking for user-generated videos, thus these videos are unsearchable by current search engines. Therefore, content-based video retrieval (CBVR) tackles this metadata-scarcity problem by directly analyzing the visual and audio streams of each video. CBVR encompasses multiple research topics, including low-level feature design, feature fusion, semantic detector training and video search/reranking. We present novel strategies in these topics to enhance CBVR in both accuracy and speed under different query inputs, including pure textual queries and query by video examples. Our proposed strategies have been incorporated into our submission for the TRECVID 2014 Multimedia Event Detection evaluation, where our system outperformed other submissions in both text queries and video example queries, thus demonstrating the effectiveness of our proposed approaches

    Uncovering protein interaction in abstracts and text using a novel linear model and word proximity networks

    Get PDF
    We participated in three of the protein-protein interaction subtasks of the Second BioCreative Challenge: classification of abstracts relevant for protein-protein interaction (IAS), discovery of protein pairs (IPS) and text passages characterizing protein interaction (ISS) in full text documents. We approached the abstract classification task with a novel, lightweight linear model inspired by spam-detection techniques, as well as an uncertainty-based integration scheme. We also used a Support Vector Machine and the Singular Value Decomposition on the same features for comparison purposes. Our approach to the full text subtasks (protein pair and passage identification) includes a feature expansion method based on word-proximity networks. Our approach to the abstract classification task (IAS) was among the top submissions for this task in terms of the measures of performance used in the challenge evaluation (accuracy, F-score and AUC). We also report on a web-tool we produced using our approach: the Protein Interaction Abstract Relevance Evaluator (PIARE). Our approach to the full text tasks resulted in one of the highest recall rates as well as mean reciprocal rank of correct passages. Our approach to abstract classification shows that a simple linear model, using relatively few features, is capable of generalizing and uncovering the conceptual nature of protein-protein interaction from the bibliome. Since the novel approach is based on a very lightweight linear model, it can be easily ported and applied to similar problems. In full text problems, the expansion of word features with word-proximity networks is shown to be useful, though the need for some improvements is discussed
    corecore