Combining time-series and textual data for taxi demand prediction in event areas: a deep learning approach

Author: Markou Ioulia
Pereira Francisco
Rodrigues Filipe
Publication venue: 'Elsevier BV'
Publication date: 16/08/2018

Accurate time-series forecasting is vital for numerous areas of application such as transportation, energy, finance, economics, etc. However, while modern techniques are able to explore large sets of temporal data to build forecasting models, they typically neglect valuable information that is often available in the form of unstructured text. Although this data is in a radically different format, it often contains contextual explanations for many of the patterns that are observed in the temporal data. In this paper, we propose two deep learning architectures that leverage word embeddings, convolutional layers and attention mechanisms for combining text information with time-series data. We apply these approaches to the problem of taxi demand forecasting in event areas. Using publicly available taxi data from New York, we empirically show that by fusing these two complementary cross-modal sources of information, the proposed models are able to significantly reduce the error in the forecasts. Comment: 20 pages, 6 figure

arXiv.org e-Print Archive

Econometrics meets sentiment : an overview of methodology and applications

Author: Algaba Andres
Ardia David
Bluteau Keven
Borms Samuel
Boudt Kris
Publication venue: 'Wiley'
Publication date: 01/01/2020

The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software

Algorithm Runtime Prediction: Methods & Evaluation

Author: Hoos Holger H.
Hutter Frank
Leyton-Brown Kevin
Xu Lin
Publication date: 26/10/2013

Perhaps surprisingly, it is possible to predict how long an algorithm will take to run on a previously unseen input, using machine learning techniques to build a model of the algorithm's runtime as a function of problem-specific instance features. Such models have important applications to algorithm analysis, portfolio-based algorithm selection, and the automatic configuration of parameterized algorithms. Over the past decade, a wide variety of techniques have been studied for building such models. Here, we describe extensions and improvements of existing models, new families of models, and -- perhaps most importantly -- a much more thorough treatment of algorithm parameters as model inputs. We also comprehensively describe new and existing features for predicting algorithm runtime for propositional satisfiability (SAT), travelling salesperson (TSP) and mixed integer programming (MIP) problems. We evaluate these innovations through the largest empirical analysis of its kind, comparing to a wide range of runtime modelling techniques from the literature. Our experiments consider 11 algorithms and 35 instance distributions; they also span a very wide range of SAT, MIP, and TSP instances, with the least structured having been generated uniformly at random and the most structured having emerged from real industrial applications. Overall, we demonstrate that our new models yield substantially better runtime predictions than previous approaches in terms of their generalization to new problem instances, to new algorithms from a parameterized space, and to both simultaneously. Comment: 51 pages, 13 figures, 8 tables. Added references, feature cost, and experiments with subsets of features; reworded Sections 1&

arXiv.org e-Print Archive

Automated Treatment Planning in Radiation Therapy using Generative Adversarial Networks

Author: Babier Aaron
Chan Timothy C. Y.
Diamant Adam
Mahmood Rafid
McNiven Andrea
Publication date: 17/07/2018

Knowledge-based planning (KBP) is an automated approach to radiation therapy treatment planning that involves predicting desirable treatment plans before they are then corrected to deliverable ones. We propose a generative adversarial network (GAN) approach for predicting desirable 3D dose distributions that eschews the previous paradigms of site-specific feature engineering and predicting low-dimensional representations of the plan. Experiments on a dataset of oropharyngeal cancer patients show that our approach significantly outperforms previous methods on several clinical satisfaction criteria and similarity metrics. Comment: 15 pages. Accepted for publication in PMLR. Presented at Machine Learning for Health Car

arXiv.org e-Print Archive

Deep learning with convolutional neural networks for EEG decoding and visualization

Author: Ball Tonio
Burgard Wolfram
Eggensperger Katharina
Fiederer Lukas Dominique Josef
Glasstetter Martin
Hutter Frank
Schirrmeister Robin Tibor
Springenberg Jost Tobias
Tangermann Michael
Publication venue: 'Wiley'
Publication date: 08/06/2018

PLEASE READ AND CITE THE REVISED VERSION at Human Brain Mapping: http://onlinelibrary.wiley.com/doi/10.1002/hbm.23730/full Code available here: https://github.com/robintibor/braindecode Comment: A revised manuscript (with the new title) has been accepted at Human Brain Mapping, see http://onlinelibrary.wiley.com/doi/10.1002/hbm.23730/ful

arXiv.org e-Print Archive

Probabilistic Forecasts of Solar Irradiance by Stochastic Differential Equations

Author: Iversen Emil B.
Madsen Henrik
Morales Juan M.
Møller Jan K.
Publication date: 25/10/2013

Probabilistic forecasts of renewable energy production provide users with valuable information about the uncertainty associated with the expected generation. Current state-of-the-art forecasts for solar irradiance have focused on producing reliable point forecasts. The additional information included in probabilistic forecasts may be paramount for decision makers to efficiently make use of this uncertain and variable generation. In this paper, a stochastic differential equation (SDE) framework for modeling the uncertainty associated with the solar irradiance point forecast is proposed. This modeling approach allows for characterizing both the interdependence structure of prediction errors of short-term solar irradiance and their predictive distribution. A series of different SDE models are fitted to a training set and subsequently evaluated on a one-year test set. The final model proposed is defined on a bounded and time-varying state space with zero probability almost surely of events outside this space. Comment: 33 pages, 3 figure

arXiv.org e-Print Archive

A Holistic Framework for AI Systems in Industrial Applications

Author: Kaymakci Can
Sauer Alexander
Wenninger Simon
Publication venue: AIS Electronic Library (AISeL)
Publication date: 16/02/2021

AIS Electronic Library (AISeL)

Visual Semantic Embedding Model based on DeViSE for medical imaging

Author: Diogo Ludgero da Silva
Publication date: 22/02/2021

Master's dissertation in Informatics Engineering. During the last decades, artificial intelligence algorithms have evolved to the point that they can achieve some amazing results, such as identifying and navigating roads, identifying fraudulent transactions, personalizing crops to individual conditions, discovering new consumer trends, predicting personalized health outcomes, optimizing merchandising strategies, predicting maintenance, optimizing pricing and scheduling in real time, and diagnosing diseases, among many others. However, although they can do all of that, they need all the data to be correctly labelled. In other words, an algorithm cannot, for example, diagnose a disease such as a stroke if it does not know what a stroke is; if the algorithm has never been trained to identify strokes, a new algorithm has to be created or the current one has to be retrained. Similar issues arise in the other examples. This work focuses on this problem and tries to solve it by using a related high-dimensional vector space, called semantic space, where the knowledge from known classes can be transferred to unknown classes.

Nonnegative Restricted Boltzmann Machines for Parts-based Representations Discovery and Predictive Model Stabilization

Author: Nguyen Tu Dinh
Phung Dinh
Tran Truyen
Venkatesh Svetha
Publication date: 18/08/2017

The success of any machine learning system depends critically on effective representations of data. In many cases, it is desirable that a representation scheme uncovers the parts-based, additive nature of the data. Of current representation learning schemes, restricted Boltzmann machines (RBMs) have proved to be highly effective in unsupervised settings. However, when it comes to parts-based discovery, RBMs do not usually produce satisfactory results. We enhance this capacity of RBMs by introducing nonnegativity into the model weights, resulting in a variant called the nonnegative restricted Boltzmann machine (NRBM). The NRBM not only produces a controllable decomposition of data into interpretable parts but also offers a way to estimate the intrinsic nonlinear dimensionality of data, and helps to stabilize linear predictive models. We demonstrate the capacity of our model on applications such as handwritten digit recognition, face recognition, document classification and patient readmission prognosis. The decomposition quality on images is comparable with or better than that produced by nonnegative matrix factorization (NMF), and the thematic features uncovered from text are qualitatively interpretable in a similar manner to those of latent Dirichlet allocation (LDA). The stability performance of feature selection on medical data is better than RBM and competitive with NMF. The learned features, when used for classification, are more discriminative than those discovered by both NMF and LDA and comparable with those by RBM

arXiv.org e-Print Archive

The Survey of Data Mining Applications And Feature Scope

Author: Mishra Dr. Pragnyaban
Padhy Neelamadhab
Panigrahi Rasmita
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 24/11/2012

In this paper we focus on a variety of techniques, approaches and research areas that are helpful and regarded as important in the field of data mining technologies. Many multinational companies and large organizations operate in different places in different countries. Each place of operation may generate large volumes of data. Corporate decision makers require access to all such sources to take strategic decisions. The data warehouse delivers significant business value by improving the effectiveness of managerial decision-making. In an uncertain and highly competitive business environment, the value of strategic information systems such as these is easily recognized; however, in today's business environment, efficiency or speed is not the only key to competitiveness. Such huge amounts of data, available in the range of tera- to petabytes, have drastically changed the areas of science and engineering. To analyze, manage and make decisions on such huge amounts of data we need techniques called data mining, which are transforming many fields. This paper presents a number of applications of data mining and also focuses on the scope of data mining, which will be helpful in further research. Comment: International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.3, June 2012, 16 pages, 1 tabl

arXiv.org e-Print Archive