158,521 research outputs found

    Improving the Robustness to Data Inconsistency between Training and Testing for Code Completion by Hierarchical Language Model

    In software engineering, applying language models to the token sequence of source code is the state-of-the-art approach to building a code recommendation system. The syntax tree of source code has a hierarchical structure, and ignoring the characteristics of that tree structure decreases model performance. Current LSTM models handle sequential data, and their performance decreases sharply when noisy, unseen data is distributed throughout the test data. Because code has free naming conventions, a model trained on one project commonly encounters many unknown words in another project. If many unseen words are set to UNK, as is done in natural language processing, the number of UNK tokens will be much greater than the combined count of the most frequent words. In an extreme case, simply predicting UNK everywhere may achieve very high prediction accuracy. Such a solution therefore cannot reflect the true performance of a model when it encounters noisy, unseen data. In this paper, we mark only a small number of rare words as UNK and report the prediction performance of models under in-project and cross-project evaluation. We propose a novel Hierarchical Language Model (HLM) that improves the robustness of the LSTM model to inconsistency between the training and testing data distributions. HLM takes the hierarchical structure of the code tree into consideration when predicting code: it uses a BiLSTM to generate embeddings for sub-trees according to their hierarchy and collects the embeddings of sub-trees in context to predict the next code. Experiments on in-project and cross-project data sets indicate that HLM handles the data inconsistency between training and testing better than the state-of-the-art LSTM model, achieving an average improvement of 11.2% in prediction accuracy.
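The abstract's point about UNK inflating accuracy can be illustrated with a minimal sketch. This is not the paper's code: the `<UNK>` token, the `min_count` threshold, and the toy token lists are all assumptions made for illustration. It shows that a trivial baseline that always predicts UNK scores exactly the fraction of UNK tokens in the test data, which grows when identifiers from another project are unseen.

```python
from collections import Counter

UNK = "<UNK>"  # hypothetical placeholder token, not taken from the paper


def build_vocab(train_tokens, min_count=2):
    """Keep tokens seen at least min_count times (assumed threshold)."""
    counts = Counter(train_tokens)
    return {tok for tok, c in counts.items() if c >= min_count}


def mark_unk(tokens, vocab):
    """Replace every out-of-vocabulary token with UNK."""
    return [tok if tok in vocab else UNK for tok in tokens]


def always_unk_accuracy(tokens):
    """Accuracy of a baseline that predicts UNK at every position."""
    if not tokens:
        return 0.0
    return sum(tok == UNK for tok in tokens) / len(tokens)


# Toy training project: "0" and "j" are rare and fall out of the vocabulary.
train = ["int", "i", "=", "0", ";", "int", "j", "=", "i", ";"]
vocab = build_vocab(train, min_count=2)

# In-project test data reuses known identifiers.
in_project = mark_unk(["int", "i", "=", "i", ";"], vocab)
# Cross-project test data uses identifiers the model never saw.
cross_project = mark_unk(["int", "total", "=", "count", ";"], vocab)

print(always_unk_accuracy(in_project))
print(always_unk_accuracy(cross_project))
```

In this toy setup the always-UNK baseline gets nothing right in-project but a substantial share right cross-project, which is the distortion the abstract describes when too many words are mapped to UNK.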

    A User Perception Model Concerning Safety and Security of Paratransit Services in Bandung, Indonesia

    Safety and security in public transportation, including Angkutan Kota (paratransit), are among the aspects that are commonly poor in Indonesia. The objective of this research is to describe user perception of safety and security in paratransit operation and to develop a model that predicts and explains future user choice if conditions improve. Users rated the conditions of safety and security as ranging from fair to dangerous. Despite recognizing these conditions, users still want to use paratransit because they have no alternative mode and paratransit is easy to find. The main reason given for safety problems was the driver's low awareness when operating the vehicle, while the main reasons for security problems were weak law enforcement and the limited number of policemen (security officers). Users stated that the stakeholders most responsible for safety and security were the operator (driver and owner) and the police. Each aspect has two models built with binomial logistic regression: one with and one without experience of accidents or criminal incidents. All models appear adequate, as shown by their statistical measures. Incorporating user experience improved model fit and the models' ability to describe traveler characteristics.

    Models in the Cloud: Exploring Next Generation Environmental Software Systems

    There is growing interest in applying the latest trends in computing and data science methods to improve environmental science. However, we found that the penetration of best practice from computing domains such as software engineering and cloud computing into everyday environmental science is poor. We take from this work a real need to re-evaluate the complexity of software tools and bring them to the right level of abstraction so that environmental scientists can leverage the latest developments in computing. In the Models in the Cloud project, we look at the role of model-driven engineering, software frameworks and cloud computing in achieving this abstraction. As a case study, we deployed a complex weather model to the cloud and developed a collaborative notebook interface for orchestrating the deployment and the analysis of results. We navigate relatively poor support for complex high-performance computing in the cloud to develop abstractions that hide the complexity of cloud deployment and model configuration. We found great potential in cloud computing to transform science by enabling models to leverage elastic, flexible computing infrastructure and by supporting new ways to deliver collaborative and open science.

    Proper generalized decomposition for parameterized Helmholtz problems in heterogeneous and unbounded domains: application to harbor agitation

    Solving the Helmholtz equation for a large number of input data in a heterogeneous medium and an unbounded domain still represents a challenge. This is due to the particular nature of the Helmholtz operator and the sensitivity of the solution to small variations in the data. Here, a reduced order model is used to determine the scattered solution everywhere in the domain for any incoming wave direction and frequency. Moreover, this is applied to a real engineering problem: water agitation inside real harbors for low to mid-high frequencies. The Proper Generalized Decomposition (PGD) model reduction approach is used to obtain a separable representation of the solution at any point and for any incoming wave direction and frequency. Its applicability to such a problem is discussed and demonstrated. More precisely, the contributions of the paper include the implementation of the PGD in a Perfectly Matched Layer framework to model the unbounded domain, and the separability of the operator, which is addressed using an efficient higher-order projection scheme. The performance of the PGD in this framework is then discussed and improved using the higher-order projection and a Petrov-Galerkin approach to construct the separated basis. Finally, the efficiency of the higher-order projection scheme is demonstrated and compared with the higher-order singular value decomposition.
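As a reading aid, the separable representation mentioned in the abstract can be written in the generic form that PGD methods use. The symbols below are assumptions for illustration and are not taken from the paper: $u$ is the scattered solution, $\mathbf{x}$ the spatial point, $\theta$ the incoming wave direction, $\omega$ the frequency, and $M$ the number of retained terms.

```latex
% Generic PGD separated representation (notation assumed, not the paper's)
u(\mathbf{x}, \theta, \omega) \;\approx\; \sum_{m=1}^{M} F_m(\mathbf{x})\, G_m(\theta)\, H_m(\omega)
```

Each term is a product of functions of one variable group only, so once the modes $F_m$, $G_m$, $H_m$ are computed, evaluating the solution for a new direction and frequency reduces to a sum of $M$ products.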

    TANGO: Transparent heterogeneous hardware Architecture deployment for eNergy Gain in Operation

    The paper is concerned with how software systems actually use Heterogeneous Parallel Architectures (HPAs), with the goal of optimizing power consumption on these resources. It argues the need for novel methods and tools to support software developers aiming to optimize the power consumption that results from designing, developing, deploying and running software on HPAs, while maintaining other quality aspects of the software at adequate and agreed levels. To do so, a reference architecture to support energy efficiency at application construction, deployment, and operation is discussed, as well as its implementation and evaluation plans. Comment: Part of the Program Transformation for Programmability in Heterogeneous Architectures (PROHA) workshop, Barcelona, Spain, 12th March 2016, 7 pages, LaTeX, 3 PNG figures