465 research outputs found

    Deep Learning Software Repositories

    Bridging the abstraction gap between artifacts and concepts is the essence of software engineering (SE) research problems. SE researchers regularly use machine learning to bridge this gap, but there are three fundamental issues with traditional applications of machine learning in SE research: they are too reliant on labeled data, they are too reliant on human intuition, and they are not capable of learning expressive yet efficient internal representations. Ultimately, SE research needs approaches that can automatically learn representations of massive, heterogeneous datasets in situ, apply the learned features to a particular task, and possibly transfer knowledge from task to task. Improvements in both computational power and the amount of memory in modern computer architectures have enabled new approaches to canonical machine learning tasks. Specifically, these architectural advances have enabled machines capable of learning deep, compositional representations of massive data depots. The rise of deep learning has ushered in tremendous advances in several fields. Given the complexity of software repositories, we presume deep learning has the potential to usher in new analytical frameworks and methodologies for SE research and the practical applications it reaches. This dissertation examines and enables deep learning algorithms in different SE contexts. We demonstrate that deep learners significantly outperform state-of-the-practice software language models at code suggestion on a Java corpus. Further, these deep learners for code suggestion automatically learn how to represent lexical elements. We use these representations to transmute source code into structures for detecting similar code fragments at different levels of granularity, without declaring features for how the source code is to be represented. Then we use our learning-based framework for encoding fragments to intelligently select and adapt statements in a codebase for automated program repair. In our work on code suggestion, code clone detection, and automated program repair, everything needed to represent lexical elements and code fragments is mined from the source code repository. Indeed, our work aims to move SE research from the art of feature engineering to the science of automated discovery.
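    The clone-detection idea above, encoding code fragments as learned vectors and comparing the vectors instead of hand-declared features, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy tokenizer, the randomly initialized embedding table, and mean pooling stand in for representations that the dissertation's models learn from the repository itself.

```python
# Minimal sketch of embedding-based code-fragment similarity, in the spirit of
# the representation-learning approach described above. The toy tokenizer, the
# randomly initialized embedding table, and mean pooling are illustrative
# assumptions, not the dissertation's actual model (which learns its
# representations from a source-code corpus).
import re
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 64
vocab: dict[str, int] = {}            # lexical element -> row in embedding table
embeddings = np.empty((0, EMBED_DIM))

def tokenize(code: str) -> list[str]:
    """Split source text into crude lexical elements (identifiers, literals, operators)."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)

def embed_token(token: str) -> np.ndarray:
    """Look up (or lazily create) a vector for a lexical element."""
    global embeddings
    if token not in vocab:
        vocab[token] = len(vocab)
        embeddings = np.vstack([embeddings, rng.normal(size=EMBED_DIM)])
    return embeddings[vocab[token]]

def encode_fragment(code: str) -> np.ndarray:
    """Encode a code fragment as the mean of its token vectors."""
    vectors = np.array([embed_token(t) for t in tokenize(code)])
    return vectors.mean(axis=0)

def similarity(a: str, b: str) -> float:
    """Cosine similarity between two fragment encodings."""
    va, vb = encode_fragment(a), encode_fragment(b)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

if __name__ == "__main__":
    frag1 = "int total = 0; for (int i = 0; i < n; i++) total += a[i];"
    frag2 = "int sum = 0; for (int j = 0; j < n; j++) sum += b[j];"
    # Fragments sharing most lexical elements produce nearby mean vectors.
    print(f"similarity = {similarity(frag1, frag2):.3f}")
```

    In practice the embeddings would come from a trained code language model rather than random initialization, and a threshold over the similarity score (or a learned classifier) would decide whether two fragments count as clones at a given granularity.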

    Dynamic Identification for Control of Large Space Structures

    This is a compilation of reports by a single author on a single subject. It consists of the following five journal articles: (1) A Parametric Study of the Ibrahim Time Domain Modal Identification Algorithm; (2) Large Modal Survey Testing Using the Ibrahim Time Domain Identification Technique; (3) Computation of Normal Modes from Identified Complex Modes; (4) Dynamic Modeling of Structures from Measured Complex Modes; and (5) Time Domain Quasi-Linear Identification of Nonlinear Dynamic Systems.

    Efficient Detectors for MIMO-OFDM Systems under Spatial Correlation Antenna Arrays

    This work analyzes the performance of implementable detectors for the multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) technique under specific and realistic operating conditions, including antenna correlation and array configuration. A time-domain channel model is used to evaluate system performance under realistic communication channel and system scenarios, including different channel correlations, modulation orders, and antenna array configurations. A range of MIMO-OFDM detectors is analyzed with the aim of achieving high performance combined with high system capacity and manageable computational complexity. Numerical Monte Carlo simulations (MCS) demonstrate the channel selectivity effect, while the impact of the number of antennas, the adoption of linear versus heuristic-based detection schemes, and the spatial correlation effect under linear and planar antenna arrays are analyzed in the MIMO-OFDM context.
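    As a concrete illustration of the linear detection schemes mentioned above, the sketch below applies zero-forcing and MMSE equalization per OFDM subcarrier. The 2x2 antenna setup, QPSK modulation, uncorrelated Rayleigh channel, and SNR value are illustrative assumptions; the paper's study additionally covers spatially correlated channels, larger linear and planar arrays, and heuristic detectors.

```python
# Minimal sketch of per-subcarrier linear MIMO-OFDM detection (zero-forcing and
# MMSE). The 2x2 setup, QPSK symbols, i.i.d. Rayleigh channel, and nominal SNR
# are illustrative assumptions, not the paper's simulation parameters.
import numpy as np

rng = np.random.default_rng(1)
N_TX, N_RX, N_SC = 2, 2, 64          # transmit/receive antennas, OFDM subcarriers
SNR_DB = 20
noise_var = 10 ** (-SNR_DB / 10)     # noise variance from a nominal per-symbol SNR

# Unit-energy QPSK symbols for each transmit antenna on each subcarrier.
qpsk = (rng.choice([-1, 1], (N_SC, N_TX)) + 1j * rng.choice([-1, 1], (N_SC, N_TX))) / np.sqrt(2)

# Frequency-flat per-subcarrier channel matrices H[k] (i.i.d. Rayleigh here).
H = (rng.normal(size=(N_SC, N_RX, N_TX)) + 1j * rng.normal(size=(N_SC, N_RX, N_TX))) / np.sqrt(2)

noise = np.sqrt(noise_var / 2) * (rng.normal(size=(N_SC, N_RX)) + 1j * rng.normal(size=(N_SC, N_RX)))
y = np.einsum("krt,kt->kr", H, qpsk) + noise   # received signal per subcarrier

def detect(Hk: np.ndarray, yk: np.ndarray, mmse: bool) -> np.ndarray:
    """Linear detection on one subcarrier: x_hat = (H^H H + a I)^-1 H^H y."""
    reg = noise_var * np.eye(N_TX) if mmse else 0.0 * np.eye(N_TX)
    return np.linalg.solve(Hk.conj().T @ Hk + reg, Hk.conj().T @ yk)

def hard_qpsk(x: np.ndarray) -> np.ndarray:
    """Map soft estimates back to the nearest QPSK constellation point."""
    return (np.sign(x.real) + 1j * np.sign(x.imag)) / np.sqrt(2)

for name, mmse in [("ZF", False), ("MMSE", True)]:
    est = np.array([hard_qpsk(detect(H[k], y[k], mmse)) for k in range(N_SC)])
    ser = np.mean(est != qpsk)
    print(f"{name}: symbol error rate = {ser:.3f}")
```

    Spatial correlation of the kind studied in the paper could be introduced by pre- and post-multiplying each H[k] by square roots of receive and transmit correlation matrices (the Kronecker model), which is one common way such effects are simulated.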

    Evaluation and Improvement of Machine Learning Algorithms in Drug Discovery

    Drug discovery plays a critical role in today’s society for treating and preventing illness, including potentially deadly viral diseases. In early drug discovery development, the main challenge is to find candidate molecules to be used as drugs to treat a disease. This also means assessing key properties that are wanted in the interaction between molecules and proteins. It is a very difficult problem because the molecular space is so big and complex. Drug discovery development is estimated to take around 12–15 years on average, and the costs of developing a single drug amount to $2.8 billion in the US.
    Modern drug discovery and drug development often start with finding candidate drug molecules (‘compounds’) that can bind to a target, usually a protein in our body. Since there are billions of possible molecules to test, this becomes an endless search for compounds that show promising bioactivity. The search method is called high-throughput screening (HTS), or virtual HTS (VHTS) in a virtual environment. The traditional approach to HTS has been to test every compound one by one. More recent approaches have used robotics and features extracted from the molecule, combining them with machine learning algorithms, in an effort to make the process more automated. Research has shown that this still leads to human errors and bias. So, how can we use machine learning algorithms to make this approach more cost-efficient and more robust to human errors?
    This project tried to address these issues and led to two scientific papers as a result. The first paper explores how common evaluation metrics used for classification can actually be unsuited to the task, leading to severe consequences when put into a real application. The argument is based on basic principles of Decision Theory, which is recognized in the field of machine learning but has not been put into much use. It makes a distinction between predicting the most probable class and predicting the most valuable class in terms of the “cost” or “gains” for the classes. In an algorithm for classifying a particular disease in a patient, the wrong classification could lead to a life or death situation. The principles also apply to drug discovery, where the cost of further developing and optimizing a "useless" drug could be huge. The goal of the classifier should therefore not be to guess the correct class but to choose the optimal class, and the metric must depend on the type of classification problem. Thus, we show that common metrics such as precision, balanced accuracy, F1-score, Area Under the Curve, Matthews Correlation Coefficient, and the Fowlkes-Mallows index are affected by this problem, and we propose an evaluation method grounded in the foundations of Decision Theory to address it. The metric presented, called utility, takes into account gains and losses for each correct or incorrect classification in the confusion matrix. For this to work effectively, the output of the machine learning algorithm needs to be a set of sensible probabilities for each class.
    This brings us to the second paper. Machine learning algorithms usually output a set of real numbers for the classes they try to predict, which, possibly after some transformation (for example the ‘softmax’ function), are meant to represent probabilities for the classes. However, the problem is that these numbers cannot be reliably interpreted as actual probabilities, in the sense of degrees of belief.
    In the paper, we propose the implementation of a probability transducer to transform the output of the algorithm into sensible probabilities. These are then used in conjunction with the utilities to choose the class with the maximal expected utility. The results show that the transducer gives better scores, in terms of the utilities, in all cases compared to the standard method used in machine learning. Master’s thesis in Software Development (Programutvikling), in collaboration with HVL.
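    To make the decision-theoretic idea concrete, the sketch below chooses the class with the maximal expected utility rather than the most probable class: a utility matrix assigns a gain or loss to each pair of true and chosen class, and calibrated class probabilities weight those outcomes. The utility values, class names, and example probabilities are illustrative assumptions, and the sketch does not reproduce the papers' probability-transducer construction; it only assumes calibrated probabilities are available.

```python
# Minimal sketch of utility-based classification and evaluation, illustrating
# the decision-theoretic idea above. The utility matrix, class names, and
# example probabilities are illustrative assumptions; the papers' probability
# transducer and actual utility values are not reproduced here.
import numpy as np

# utility[true_class, chosen_class]: gains on the diagonal, losses off it.
# Pursuing a "useless" compound (false positive) is penalized far more heavily
# than discarding an active one in this toy setting.
CLASSES = ["inactive", "active"]
utility = np.array([
    [ 1.0, -10.0],   # true inactive: correct reject vs. costly false positive
    [-2.0,   5.0],   # true active:   missed hit     vs. valuable true positive
])

def choose_class(probs: np.ndarray) -> int:
    """Pick the class with maximal expected utility, not the most probable one."""
    expected = probs @ utility          # expected[j] = sum_i p(i) * utility[i, j]
    return int(np.argmax(expected))

def mean_utility(probs: np.ndarray, true_labels: np.ndarray) -> float:
    """Average realized utility of the decisions over a labeled evaluation set."""
    choices = [choose_class(p) for p in probs]
    return float(np.mean([utility[t, c] for t, c in zip(true_labels, choices)]))

if __name__ == "__main__":
    # Calibrated probabilities p(inactive), p(active) for a few compounds; in
    # the papers these would come from a probability transducer applied to the
    # classifier's raw scores.
    probs = np.array([[0.40, 0.60],
                      [0.10, 0.90],
                      [0.90, 0.10]])
    true_labels = np.array([1, 1, 0])
    for p in probs:
        print(p, "->", CLASSES[choose_class(p)])
    print("mean utility:", mean_utility(probs, true_labels))
```

    With these toy numbers, the first compound is rejected even though "active" is the more probable class, because the penalty for pursuing an inactive compound outweighs the expected gain; this is exactly the gap between the most probable and the most valuable class described in the abstract.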

    Personalizing Interactions with Information Systems

    Personalization constitutes the mechanisms and technologies necessary to customize information access to the end-user. It can be defined as the automatic adjustment of information content, structure, and presentation tailored to the individual. In this chapter, we study personalization from the viewpoint of personalizing interaction. The survey covers mechanisms for information-finding on the web, advanced information retrieval systems, dialog-based applications, and mobile access paradigms. Specific emphasis is placed on studying how users interact with an information system and how the system can encourage and foster interaction. This helps bring out the role of the personalization system as a facilitator that reconciles the user’s mental model with the underlying information system’s organization. Three tiers of personalization systems are presented, paying careful attention to interaction considerations. These tiers show how progressive levels of sophistication in interaction can be achieved. The chapter also surveys systems support technologies and niche application domains.