773 research outputs found

Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks

Author: Bui Nghi D. Q.
Jiang Lingxiao
Yu Yijun
Publication date: 29/11/2017

Towards the vision of translating code that implements an algorithm from one programming language into another, this paper proposes an approach for automated program classification using bilateral tree-based convolutional neural networks (BiTBCNNs). It is layered on top of two tree-based convolutional neural networks (TBCNNs), each of which recognizes the algorithm of code written in an individual programming language. The combination layer of the networks recognizes the similarities and differences among code in different programming languages. The BiTBCNNs are trained using source code in different languages that is known to implement the same algorithms and/or functionalities. For a preliminary evaluation, we use 3591 Java and 3534 C++ code snippets from 6 algorithms that we crawled systematically from GitHub. We obtained over 90% accuracy in the cross-language binary classification task of telling whether any two given code snippets implement the same algorithm. Also, for the algorithm classification task, i.e., predicting which one of the six algorithm labels is implemented by an arbitrary C++ code snippet, we achieved over 80% precision.

arXiv.org e-Print Archive

Open Research Online (The Open University)

Recommended from our members

AutoFocus: Interpreting Attention-based Neural Networks by Code Perturbation

Author: Bui Nghi D. Q.
Jiang Lingxiao
Yu Yijun

Despite being adopted in software engineering tasks, deep neural networks are treated mostly as a black box due to the difficulty in interpreting how the networks infer the outputs from the inputs. To address this problem, we propose AutoFocus, an automated approach for rating and visualizing the importance of input elements based on their effects on the outputs of the networks. The approach is built on our hypotheses that (1) attention mechanisms incorporated into neural networks can generate discriminative scores for various input elements and (2) the discriminative scores reflect the effects of input elements on the outputs of the networks. This paper verifies the hypotheses by applying AutoFocus on the task of algorithm classification (i.e., given a program source code as input, determine the algorithm implemented by the program). AutoFocus identifies and perturbs code elements in a program systematically, and quantifies the effects of the perturbed elements on the network’s classification results. Based on evaluation on more than 1000 programs for 10 different sorting algorithms, we observe that the attention scores are highly correlated to the effects of the perturbed code elements. Such a correlation provides a strong basis for the uses of attention scores to interpret the relations between code elements and the algorithm classification results of a neural network, and we believe that visualizing code elements in an input program ranked according to their attention scores can facilitate faster program comprehension with reduced code

Open Research Online (The Open University)

Preparation and Foliar Application of Oligochitosan - Nanosilica on the Enhancement of Soybean Seed Yield

Author: Du B. D. (Bui)
Hien N. Q. (Nguyen)
Phu D. V. (Dang)
Tam H. V. (Hoang)
Tuan L. N. (Le)
Publication venue: Infogain Publication
Publication date: 01/02/2017

Oligochitosan with a weight-average molecular weight (Mw) of 5000 g/mol was prepared by gamma Co-60 radiation degradation of a 4% chitosan solution containing 0.5% H2O2 at 21 kGy. Nanosilica with a size of 10–30 nm was synthesized by calcination of acid-treated rice husk at 700 °C for 2 h. The mixture of 2% oligochitosan-2% nanosilica was prepared by dispersing nanosilica in oligochitosan solution. Oligochitosan, nanosilica and their mixture were characterized by gel permeation chromatography (GPC), transmission electron microscopy (TEM), X-ray diffraction (XRD), energy-dispersive X-ray spectroscopy (EDX), ultraviolet-visible spectroscopy (UV-Vis), and Fourier transform infrared spectroscopy (FT-IR). The effect of foliar application of oligochitosan and oligochitosan-nanosilica on soybean seed yield was studied in an experimental field. Results indicated that soybean seed yield increased by 10.5% and 17.0% for oligochitosan and oligochitosan-nanosilica, respectively, compared with the control. Radiation-degraded oligochitosan and its mixture with nanosilica can potentially be used for cultivation of soybean with enhanced seed yield.

Neliti

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

Author: Bui Nghi D. Q.
Gotmare Akhilesh Deepak
Hoi Steven C. H.
Le Hung
Li Junnan
Wang Yue
Publication date: 31/05/2023

Code intelligence plays a key role in transforming modern software engineering. Recently, deep learning-based models, especially Transformer-based large language models (LLMs), have demonstrated remarkable potential in tackling these tasks by leveraging massive open-source code data and programming language features. However, the development and deployment of such models often require expertise in both machine learning and software engineering, creating a barrier to model adoption. In this paper, we present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence. Following the principles of modular design and extensible framework, we design CodeTF with a unified interface to enable rapid access and development across different types of models, datasets and tasks. Our library supports a collection of pretrained Code LLM models and popular code benchmarks, including a standardized interface to train and serve code LLMs efficiently, and data features such as language-specific parsers and utility functions for extracting code attributes. In this paper, we describe the design principles, the architecture, key modules and components, and compare with other related library tools. Finally, we hope CodeTF is able to bridge the gap between machine learning/generative AI and software engineering, providing a comprehensive open-source solution for developers, researchers, and practitioners. Comment: Ongoing work - Draft Previe

arXiv.org e-Print Archive

Pathogenicity of an H5N1 avian influenza virus isolated in Vietnam in 2012 and reliability of conjunctival samples for diagnosis of infection

Author: Bui Anh N.
Bui Vuong N.
Dao Tung D.
Imai Kunitoshi
Inui Kenjiro
Nguyen Khong V.
Nguyen Lien T.
Nguyen Tham T.H.
Ogawa Haruko
Pham Nga T.
Runstadler Jonathan
Trinh Dai Q.
Publication venue: Elsevier BV
Publication date: 01/10/2013

The continued spread of highly pathogenic avian influenza virus (HPAIV) subtype H5N1 among poultry in Vietnam poses a potential threat to animals and public health. To evaluate the pathogenicity of a 2012 H5N1 HPAIV isolate and to assess the utility of conjunctival swabs for viral detection and isolation in surveillance, an experimental infection with HPAIV subtype H5N1 was carried out in domestic ducks. Ducks were infected with 10^{7.2} TCID_{50} of A/duck/Vietnam/QB1207/2012 (H5N1), which was isolated from a moribund domestic duck. In the infected ducks, clinical signs of disease, including neurological disorder, were observed. Ducks started to die at 3 days-post-infection (dpi), and the study mortality reached 67%. Viruses were recovered from oropharyngeal and conjunctival swabs until 7 dpi and from cloacal swabs until 4 dpi. In the ducks that died or were sacrificed on 3, 5, or 6 dpi, viruses were recovered from lung, brain, heart, pancreas and intestine, among which the highest virus titers were in the lung, brain or heart. Results of virus titration were confirmed by real-time RT-PCR. Genetic and phylogenetic analysis of the HA gene revealed that the isolate belongs to clade 2.3.2.1 similarly to the H5N1 viruses isolated in Vietnam in 2012. The present study demonstrated that this recent HPAI H5N1 virus of clade 2.3.2.1 could replicate efficiently in the systemic organs, including the brain, and cause severe disease with neurological symptoms in domestic ducks. Therefore, this HPAI H5N1 virus seems to retain the neurotrophic feature and has further developed properties of shedding virus from the oropharynx and conjunctiva in addition to the cloaca, potentially posing a higher risk of virus spread through cross-contact and/or environmental transmission. Continued surveillance and diagnostic programs using conjunctival swabs in the field would further verify the apparent reliability of conjunctival samples for the detection of AIV. Japan Society for the Promotion of Science (Grant-in-Aid for Bilateral Joint Projects); Heiwa Nakajima Foundation; National Institute of Allergy and Infectious Diseases (U.S.) (Contract HHSN2662007000010C

DSpace@MIT

Crossref

PubMed Central

Methane emission factors from Vietnamese rice production: Pooling data of 36 field sites for meta-analysis

Author: Asch F.
Bui T. P. L.
Dinh Q. H.
Mai V. T.
Sander B. O.
Vo T. B. T.
Vu D. Q.
Vu T. H.
Wassmann R.
Yen B. T.
Publication venue: MDPI
Publication date: 18/08/2020

Rice production is a significant source of greenhouse gas (GHG) emissions in the national budget of many Asian countries, but the extent of emissions varies strongly across agro-environmental zones. It is important to understand these differences in order to improve the national GHG inventory and effectively target mitigation options. This study presents a meta-analysis of CH4 database emission factors (EFs) from 36 field sites across the rice growing areas of Vietnam and covering 73 cropping seasons. The EFs were developed from field measurements using the closed chamber technique. The analysis for calculating baseline EFs in North, Central and South Vietnam in line with the Intergovernmental Panel on Climate Change (IPCC) Tier 2 methodology was specified for the three cropping seasons being early-(E), mid-(M) and late-year (L) seasons. Calculated average CH4 EFs are given in kg ha^{-1} d^{-1} and reflect the distinct seasons in North (E: 2.21; L: 3.89), Central (E: 2.84; M+L: 3.13) and South Vietnam (E: 1.72; M: 2.80; L: 3.58). Derived from the available data of the edapho-hydrological zones of the Mekong River Delta, season-based EFs are more useful than zone-based EFs. In totality, these average EFs indicate an enormous variability of GHG emissions in Vietnamese rice production and represent much higher values than the IPCC default. Seasonal EFs from Vietnam exceeded IPCC defaults given for Southeast Asia corresponding to 160% (E), 240% (M) and 290% (L) of the medium value, respectively.

KITopen

Class based Influence Functions for Error Detection

Author: Bui Nghi D. Q.
Dau Anh T. V.
Huu-Tien Dang
Nguyen Hieu Ngoc
Nguyen-Duc Thang
Thanh-Tung Hoang
Tran Quan Hung
Publication date: 02/05/2023

Influence functions (IFs) are a powerful tool for detecting anomalous examples in large-scale datasets. However, they are unstable when applied to deep networks. In this paper, we provide an explanation for the instability of IFs and develop a solution to this problem. We show that IFs are unreliable when the two data points belong to two different classes. Our solution leverages class information to improve the stability of IFs. Extensive experiments show that our modification significantly improves the performance and stability of IFs while incurring no additional computational cost. Comment: Thang Nguyen-Duc, Hoang Thanh-Tung, and Quan Hung Tran are co-first authors of this paper. 12 pages, 12 figures. Accepted to ACL 202

arXiv.org e-Print Archive

Direct frequency comb measurement of OD + CO → DOCO kinetics

Author: Aspelmeyer M.
Bjork B. J.
Bui T. Q.
Changala P. B.
Cole G. D.
Deutsch C.
Follman D.
Heckl O. H.
Heu P.
Okumura M.
Spaun B.
Ye J.
Publication venue: American Association for the Advancement of Science (AAAS)
Publication date: 25/08/2016

The kinetics of the hydroxyl radical (OH) + carbon monoxide (CO) reaction, which is fundamental to both atmospheric and combustion chemistry, are complex because of the formation of the hydrocarboxyl radical (HOCO) intermediate. Despite extensive studies of this reaction, HOCO has not been observed under thermal reaction conditions. Exploiting the sensitive, broadband, and high-resolution capabilities of time-resolved cavity-enhanced direct frequency comb spectroscopy, we observed deuteroxyl radical (OD) + CO reaction kinetics and detected stabilized trans-DOCO, the deuterated analog of trans-HOCO. By simultaneously measuring the time-dependent concentrations of the trans-DOCO and OD species, we observed unambiguous low-pressure termolecular dependence of the reaction rate coefficients for N_2 and CO bath gases. These results confirm the HOCO formation mechanism and quantify its yield.

arXiv.org e-Print Archive

Caltech Authors