Using a Machine Learning Approach to Implement and Evaluate Product Line Features
Bike-sharing systems are a means of smart transportation in urban environments, with a positive impact on urban mobility. In this paper we are interested in studying and modeling the behavior of features that permit the end user to access, via a web browser, the status of the bike-sharing system. In particular, we address features that make a prediction about the system state. We propose to use a machine learning approach to analyze usage patterns and learn computational models of such features from logs of system usage.
On the one hand, machine learning methodologies provide a powerful and general means to implement a wide range of predictive features. On the other hand, trained machine learning models come with a measure of predictive performance that can be used as a metric to assess the cost-performance trade-off of the feature. This provides a principled way to assess the runtime behavior of different components before putting them into operation.
Comment: In Proceedings WWV 2015, arXiv:1508.0338
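The abstract does not spell out a concrete model or log schema; the snippet below is only a minimal sketch of the proposed idea, assuming hypothetical usage logs with per-station observations and a binary target such as "station empty within 30 minutes". A standard classifier is trained from the logs, and its cross-validated score plays the role of the predictive-performance side of the cost-performance trade-off. The column names, file name, and model choice are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch only: the log schema and the choice of model are
# assumptions, not the paper's actual implementation.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical usage log: one row per station observation.
logs = pd.read_csv("bike_sharing_logs.csv")  # assumed columns below
X = logs[["station_id", "hour", "day_of_week", "bikes_available"]]
y = logs["empty_in_30_min"]  # predictive feature: will the station be empty soon?

model = RandomForestClassifier(n_estimators=100, random_state=0)

# The cross-validated score stands in for the measure of predictive
# performance used to assess the feature's cost-performance trade-off.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"Estimated predictive performance: {scores.mean():.3f}")
```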
ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Quick interaction between a human teacher and a learning machine presents
numerous benefits and challenges when working with web-scale data. The human
teacher guides the machine towards accomplishing the task of interest. The
learning machine leverages big data to find examples that maximize the training
value of its interaction with the teacher. When the teacher is restricted to
labeling examples selected by the machine, this problem is an instance of
active learning. When the teacher can provide additional information to the
machine (e.g., suggestions on what examples or predictive features should be
used) as the learning task progresses, then the problem becomes one of
interactive learning.
To accommodate the two-way communication channel needed for efficient
interactive learning, the teacher and the machine need an environment that
supports an interaction language. The machine can access, process, and
summarize more examples than the teacher can see in a lifetime. Based on the
machine's output, the teacher can revise the definition of the task or make it
more precise. Both the teacher and the machine continuously learn and benefit
from the interaction.
We have built a platform to (1) produce valuable and deployable models and
(2) support research on both the machine learning and user interface challenges
of the interactive learning problem. The platform relies on a dedicated,
low-latency, distributed, in-memory architecture that allows us to construct
web-scale learning machines with quick interaction speed. The purpose of this
paper is to describe this architecture and demonstrate how it supports our
research efforts. Preliminary results are presented as illustrations of the
architecture but are not the primary focus of the paper.
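The ICE platform itself is not reproduced here; the snippet below is a minimal sketch of the active-learning side of the interaction the abstract describes, in which the machine selects the pool example it is least certain about and the teacher labels it. The uncertainty-sampling strategy, the simulated teacher, and all names are assumptions made for illustration, not ICE's actual architecture.

```python
# Minimal active-learning loop sketch (uncertainty sampling). This illustrates
# the teacher/machine interaction described in the abstract, not the ICE
# platform's implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression

def teacher_label(example):
    # Stand-in for the human teacher; here a trivial simulated rule.
    # In an interactive setting the label would come from a UI instead.
    return int(example.sum() > 0)

def active_learning_loop(X_labeled, y_labeled, X_pool, rounds=10):
    model = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        model.fit(X_labeled, y_labeled)
        # The machine picks the pool example it is least certain about ...
        proba = model.predict_proba(X_pool)
        uncertainty = 1.0 - proba.max(axis=1)
        idx = int(np.argmax(uncertainty))
        # ... and the teacher labels it, closing the interaction loop.
        y_new = teacher_label(X_pool[idx])
        X_labeled = np.vstack([X_labeled, X_pool[idx]])
        y_labeled = np.append(y_labeled, y_new)
        X_pool = np.delete(X_pool, idx, axis=0)
    return model

# Example run on synthetic data:
rng = np.random.default_rng(0)
X0 = rng.normal(size=(20, 5))
y0 = (X0.sum(axis=1) > 0).astype(int)
pool = rng.normal(size=(200, 5))
trained = active_learning_loop(X0, y0, pool, rounds=5)
```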
An automated ETL for online datasets
While using online datasets for machine learning is commonplace today, the quality of these datasets affects the performance of prediction algorithms. One method for improving the semantics of new data sources is to map these sources to a common data model or ontology. While semantic and structural heterogeneities must still be resolved, this provides a well-established approach to producing clean datasets suitable for machine learning and analysis. However, when online data must be used in close to real time, a method for dynamic Extract-Transform-Load (ETL) of new source data is needed. In this work, we present a framework for integrating online and enterprise data sources, in close to real time, to provide datasets for machine learning and predictive algorithms. An exhaustive evaluation compares a human-built data transformation process with our system’s machine-generated ETL process, with very favourable results, illustrating the value and impact of an automated approach.
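The machine-generated ETL process itself is not shown in the abstract; as a rough sketch of the kind of Transform step being automated, the snippet below maps a hypothetical online source record onto an assumed common data model and resolves one structural heterogeneity (timestamp normalisation). The field names and the target model are illustrative assumptions, not taken from the paper.

```python
# Sketch of a single Transform step: mapping a hypothetical online source's
# fields onto an assumed common data model. Field names are illustrative.
from datetime import datetime, timezone

COMMON_MODEL_MAPPING = {
    # source field -> common model field
    "temp_celsius": "temperature_c",
    "obs_time": "observed_at",
    "sensor": "station_id",
}

def transform(record: dict) -> dict:
    """Rename source fields to the common model and normalise types."""
    out = {}
    for src_field, target_field in COMMON_MODEL_MAPPING.items():
        if src_field in record:
            out[target_field] = record[src_field]
    # One structural-heterogeneity fix: normalise timestamps to UTC ISO-8601.
    if "observed_at" in out:
        out["observed_at"] = (
            datetime.fromisoformat(out["observed_at"])
            .astimezone(timezone.utc)
            .isoformat()
        )
    return out

# Usage:
# transform({"temp_celsius": 21.5,
#            "obs_time": "2023-05-01T12:00:00+02:00",
#            "sensor": "S1"})
```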
Industry-scale application and evaluation of deep learning for drug target prediction
Artificial intelligence (AI) is undergoing a revolution thanks to the breakthroughs of machine learning algorithms in computer vision, speech recognition, natural language processing and generative modelling. Recent works on publicly available pharmaceutical data showed that AI methods are highly promising for drug target prediction. However, the quality of public data might differ from that of industry data due to different labs reporting measurements, different measurement techniques, fewer samples, and less diverse and specialized assays. As part of a European-funded project (ExCAPE) that brought together expertise from the pharmaceutical industry, machine learning, and high-performance computing, we investigated how well machine learning models obtained from public data can be transferred to internal pharmaceutical industry data. Our results show that machine learning models trained on public data can indeed maintain their predictive power to a large degree when applied to industry data. Moreover, we observed that models derived with deep learning outperformed comparable models trained with other machine learning algorithms when applied to internal pharmaceutical company datasets. To our knowledge, this is the first large-scale study evaluating the potential of machine learning, and especially deep learning, directly in industry-scale settings and investigating the transferability of publicly learned target prediction models to industrial bioactivity prediction pipelines.
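The study's models, descriptors, and datasets are not reproduced here; the sketch below only illustrates the general transferability protocol the abstract describes: train a model on a public bioactivity dataset for a target and measure how much predictive power it retains on a held-out internal (industry) dataset for the same target. The feature loader, file names, model choice, and AUC metric are all assumptions for this sketch, not the study's actual pipeline.

```python
# Illustrative transferability check: train on public bioactivity data,
# evaluate on internal industry data for the same target. The features,
# model, and metric are assumptions, not the study's pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

def load_features(path):
    """Hypothetical loader returning (compound_features, activity_labels)."""
    data = np.load(path)
    return data["X"], data["y"]

X_public, y_public = load_features("public_target_data.npz")        # e.g. public assays
X_internal, y_internal = load_features("internal_target_data.npz")  # industry assays

model = GradientBoostingClassifier()
model.fit(X_public, y_public)

# Predictive power retained on industry data is the quantity of interest.
auc_internal = roc_auc_score(y_internal, model.predict_proba(X_internal)[:, 1])
print(f"AUC on internal data for the publicly trained model: {auc_internal:.3f}")
```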
