Towards a Semantic Search Engine for Scientific Articles
Because of the data deluge in scientific publishing, finding relevant
information is becoming increasingly difficult for researchers and readers.
Building an enhanced scientific search engine that takes semantic relations
into account poses a great challenge. As a starting point, semantic relations
between keywords from scientific articles could be extracted in order to
classify the articles. This might later help in browsing and searching for
content in a scientifically meaningful way. Indeed, by connecting keywords, the
context of an article can be extracted. This paper aims to provide ideas for
building such a smart search engine and describes the initial contributions
towards achieving this ambitious goal.
Transfer learning for time series classification
Transfer learning for deep neural networks is the process of first training a
base network on a source dataset, and then transferring the learned features
(the network's weights) to a second network to be trained on a target dataset.
This idea has been shown to improve deep neural networks' generalization
capabilities in many computer vision tasks such as image recognition and object
localization. Apart from these applications, deep Convolutional Neural Networks
(CNNs) have also recently gained popularity in the Time Series Classification
(TSC) community. However, unlike for image recognition problems, transfer
learning techniques have not yet been investigated thoroughly for the TSC task.
This is surprising as the accuracy of deep learning models for TSC could
potentially be improved if the model is fine-tuned from a pre-trained neural
network instead of training it from scratch. In this paper, we fill this gap by
investigating how to transfer deep CNNs for the TSC task. To evaluate the
potential of transfer learning, we performed extensive experiments using the
UCR archive which is the largest publicly available TSC benchmark containing 85
datasets. For each dataset in the archive, we pre-trained a model and then
fine-tuned it on the other datasets resulting in 7140 different deep neural
networks. These experiments revealed that transfer learning can improve or
degrade the model's predictions depending on the dataset used for transfer.
Therefore, in an effort to predict the best source dataset for a given target
dataset, we propose a new method relying on Dynamic Time Warping to measure
inter-datasets similarities. We describe how our method can guide the transfer
to choose the best source dataset leading to an improvement in accuracy on 71
out of 85 datasets.
Comment: Accepted at IEEE International Conference on Big Data 201
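The source-selection method above relies on DTW as an inter-dataset similarity measure. As a rough illustration of the idea (the function names and the toy "prototypes" below are ours, not the authors' code), a plain dynamic-programming DTW can rank candidate source datasets by distance to a target:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)  # cumulative alignment cost
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j],       # insertion
                                 cost[i, j - 1],       # deletion
                                 cost[i - 1, j - 1])   # match
    return np.sqrt(cost[n, m])

def rank_source_datasets(target_proto, source_protos):
    """Order candidate source datasets by DTW distance between their
    prototype series and the target's (closest first)."""
    dists = {name: dtw_distance(target_proto, proto)
             for name, proto in source_protos.items()}
    return sorted(dists, key=dists.get)

target = np.sin(np.linspace(0, 2 * np.pi, 50))
sources = {
    "similar": np.sin(np.linspace(0.1, 2 * np.pi, 60)),  # warped sine
    "dissimilar": np.linspace(-1.0, 1.0, 50),            # plain ramp
}
print(rank_source_datasets(target, sources))  # ['similar', 'dissimilar']
```

Because DTW warps the time axis before comparing values, the resampled and phase-shifted sine still ranks as the closest source, which is exactly the behavior the paper exploits when choosing where to transfer from.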
Adversarial Attacks on Deep Neural Networks for Time Series Classification
Time Series Classification (TSC) problems are encountered in many real-life
data mining tasks ranging from medicine and security to human activity
recognition and food safety. With the recent success of deep neural networks in
various domains such as computer vision and natural language processing,
researchers started adopting these techniques for solving time series data
mining problems. However, to the best of our knowledge, no previous work has
considered the vulnerability of deep learning models to adversarial time series
examples, which could potentially make them unreliable in situations where the
decision taken by the classifier is crucial such as in medicine and security.
For computer vision problems, such attacks have been shown to be very easy to
perform: adding an imperceptible amount of noise to the image is enough to
trick the network into wrongly classifying it. Following this line
of work, we propose to leverage existing adversarial attack mechanisms to add a
special noise to the input time series in order to decrease the network's
confidence when classifying instances at test time. Our results reveal that
current state-of-the-art deep learning time series classifiers are vulnerable
to adversarial attacks which can have major consequences in multiple domains
such as food safety and quality assurance.
Comment: Accepted at IJCNN 201
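The image-domain attack alluded to above is typically the Fast Gradient Sign Method (FGSM). A minimal numpy sketch of the same idea on a toy logistic "classifier" over a raw series (the model, weights, and epsilon are illustrative stand-ins, not the paper's deep networks):

```python
import numpy as np

def confidence(w, b, x):
    """Sigmoid confidence that series x belongs to class +1."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def fgsm_perturb(w, x, y, eps):
    """Fast Gradient Sign Method for a logistic model: step the input
    in the direction that increases the loss for true label y (+1/-1)."""
    # For the logistic loss, d(loss)/dx is proportional to -y * w,
    # so its sign is sign(-y * w).
    return x + eps * np.sign(-y * w)

x = np.sin(np.linspace(0, 4 * np.pi, 100))  # clean series, true label +1
w = 0.1 * x                                  # toy "trained" weights
b = 0.0

x_adv = fgsm_perturb(w, x, y=+1, eps=0.3)
print(confidence(w, b, x), "->", confidence(w, b, x_adv))
# the noise is bounded (|x_adv - x| <= 0.3) yet the confidence drops
```

Deep TSC classifiers replace the linear model with a network and the analytic gradient with backpropagation, but the mechanism — a small, norm-bounded step along the loss gradient's sign — is the same one the paper applies to time series.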
Deep learning for time series classification: a review
Time Series Classification (TSC) is an important and challenging problem in
data mining. With the increase of time series data availability, hundreds of
TSC algorithms have been proposed. Among these methods, only a few have
considered Deep Neural Networks (DNNs) to perform this task. This is surprising
as deep learning has seen very successful applications in recent years. DNNs
have indeed revolutionized the field of computer vision especially with the
advent of novel deeper architectures such as Residual and Convolutional Neural
Networks. Apart from images, sequential data such as text and audio can also be
processed with DNNs to reach state-of-the-art performance for document
classification and speech recognition. In this article, we study the current
state-of-the-art performance of deep learning algorithms for TSC by presenting
an empirical study of the most recent DNN architectures for TSC. We give an
overview of the most successful deep learning applications in various time
series domains under a unified taxonomy of DNNs for TSC. We also provide an
open source deep learning framework to the TSC community where we implemented
each of the compared approaches and evaluated them on a univariate TSC
benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By
training 8,730 deep learning models on 97 time series datasets, we propose the
most exhaustive study of DNNs for TSC to date.
Comment: Accepted at Data Mining and Knowledge Discovery
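The architectures surveyed here share the one-dimensional convolution as their basic building block. A minimal numpy sketch of a single convolution + global-average-pooling step, in the spirit of the Fully Convolutional Network style of TSC model (kernel values and sizes are illustrative):

```python
import numpy as np

def conv1d(x, kernels):
    """Valid 1-D convolution of series x with a bank of kernels.
    Returns one feature map per kernel, shape (len(x)-k+1, n_kernels)."""
    k = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(x, k)
    return windows @ kernels.T

def fcn_block(x, kernels):
    """Conv -> ReLU -> global average pooling: a fixed-length feature
    vector per series, whatever the series length."""
    feats = np.maximum(conv1d(x, kernels), 0.0)  # ReLU
    return feats.mean(axis=0)                    # global average pooling

series = np.sin(np.linspace(0, 6 * np.pi, 128))
kernels = np.array([[1.0, 0.0, -1.0],   # slope / edge detector
                    [1.0, 1.0, 1.0]])   # local smoother
print(fcn_block(series, kernels).shape)  # (2,): one feature per kernel
```

Global average pooling is what lets one trained network handle series of different lengths, a property the benchmarked architectures rely on across the heterogeneous UCR/UEA datasets.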
Deep constrained clustering applied to satellite image time series
The advent of satellite imagery is generating an unprecedented amount of remote sensing images. Current satellites now achieve frequent revisits and high mission availability, providing series of images of the Earth captured at different dates that can be seen as time series. Analyzing satellite image time series allows continuous, wide-range Earth observation, with applications in agricultural mapping, environmental disaster monitoring, etc. However, the lack of large quantities of labeled data generally prevents the easy application of supervised methods. Unsupervised methods, on the other hand, do not require expert knowledge but sometimes give poor results. In this context, constrained clustering, a class of semi-supervised learning algorithms, is an alternative that offers a good trade-off of supervision. In this paper, we explore the use of constraints with deep clustering approaches to process satellite image time series. Our experimental study relies on deep embedded clustering and the deep constrained framework using pairwise constraints (must-link and cannot-link). Experiments on a real dataset composed of 11 satellite images show promising results and open many perspectives for applying deep constrained clustering to satellite image time series.
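Must-link and cannot-link constraints of the kind used above are simple to state in code. A minimal sketch of checking a cluster assignment against pairwise constraints (the function and toy data are illustrative, not the authors' framework):

```python
def constraint_violations(labels, must_link, cannot_link):
    """Count how many pairwise constraints a cluster assignment breaks.
    labels: cluster id per sample; constraints: lists of index pairs."""
    ml = sum(labels[i] != labels[j] for i, j in must_link)   # should match
    cl = sum(labels[i] == labels[j] for i, j in cannot_link)  # should differ
    return ml + cl

# toy assignment of 6 satellite-pixel time series to 2 clusters
labels = [0, 0, 1, 1, 0, 1]
must_link = [(0, 1), (2, 3)]      # expert says: same land-cover class
cannot_link = [(0, 2), (4, 5)]    # expert says: different classes
print(constraint_violations(labels, must_link, cannot_link))  # 0
```

In a deep constrained framework the same pairwise checks are typically turned into a differentiable penalty added to the clustering loss, so the embedding is pushed to satisfy the expert's pairs.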
Evaluation of an integrated system for classification, assessment and comparison of services for long-term care in Europe: the eDESDE-LTC study
The harmonization of European health systems brings with it a need for tools to allow the standardized collection of information about medical care. A common coding system and standards for the description of services are needed to allow local data to be incorporated into evidence-informed policy, and to permit equity and mobility to be assessed. The aim of this project has been to design such a classification and a related tool for the coding of services for Long Term Care (DESDE-LTC), based on the European Service Mapping Schedule (ESMS). Methods: The development of DESDE-LTC followed an iterative process using nominal groups in 6 European countries. 54 researchers and stakeholders in health and social services contributed to this process. In order to classify services, we use the minimal organization unit or "Basic Stable Input of Care" (BSIC), coded by its principal function or "Main Type of Care" (MTC). The evaluation of the tool included an analysis of feasibility, consistency, ontology, inter-rater reliability, Boolean Factor Analysis, and a preliminary impact analysis (screening, scoping and appraisal). Results: DESDE-LTC includes an alpha-numerical coding system, a glossary and an assessment instrument for mapping and counting LTC. It shows high feasibility, consistency, inter-rater reliability and face, content and construct validity. DESDE-LTC is ontologically consistent. It is regarded by experts as useful and relevant for evidence-informed decision making. Conclusion: DESDE-LTC contributes to establishing a common terminology, taxonomy and coding of LTC services in a European context, and a standard procedure for data collection and international comparison
ShapeDBA: Generating Effective Time Series Prototypes using ShapeDTW Barycenter Averaging
Time series data can be found in almost every domain, ranging from the
medical field to manufacturing and wireless communication. Generating realistic
and useful exemplars and prototypes is a fundamental data analysis task. In
this paper, we investigate a novel approach to generating such exemplars and
prototypes for time series data, based on a new form of time series average:
the ShapeDTW Barycentric Average. Existing time series prototyping approaches,
such as DTW Barycenter Averaging (DBA) and SoftDBA, rely on the Dynamic Time
Warping (DTW) similarity measure and suffer from a common problem: they
generate out-of-distribution artifacts in their prototypes. This is mostly
caused by the DTW variant used, which captures absolute rather than
neighborhood similarities. Our proposed method, ShapeDBA, uses the ShapeDTW
variant of DTW, which overcomes this issue. We chose time series clustering, a
popular form of time series analysis, to evaluate the outcome of ShapeDBA
against the other prototyping approaches. Coupled with the k-means
clustering algorithm, and evaluated on a total of 123 datasets from the UCR
archive, our proposed averaging approach is able to achieve new
state-of-the-art results in terms of Adjusted Rand Index.
Comment: Published in AALTD workshop at ECML/PKDD 202
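DBA-style methods build a prototype by repeatedly aligning every series to a current estimate under DTW and averaging the aligned points. A compact numpy sketch of one such update step (a simplification of classic DBA, not the ShapeDBA method itself):

```python
import numpy as np

def dtw_path(a, b):
    """DTW alignment path between two 1-D series, found by backtracking
    the classic cumulative-cost matrix."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = (a[i - 1] - b[j - 1]) ** 2 + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    path, (i, j) = [], (n, m)
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        i, j = min([(i - 1, j), (i, j - 1), (i - 1, j - 1)],
                   key=lambda p: cost[p])
    return path[::-1]

def dba_update(prototype, series_list):
    """One DBA iteration: align each series to the prototype and replace
    every prototype point by the mean of the points aligned to it."""
    sums = np.zeros_like(prototype)
    counts = np.zeros(len(prototype))
    for s in series_list:
        for i, j in dtw_path(prototype, s):
            sums[i] += s[j]
            counts[i] += 1
    return sums / counts

series = [np.sin(np.linspace(0, 2 * np.pi, 40)) for _ in range(3)]
proto = dba_update(series[0].copy(), series)
print(np.allclose(proto, series[0]))  # True: identical inputs average to themselves
```

ShapeDBA keeps this averaging loop but replaces the pointwise DTW alignment with ShapeDTW's neighborhood-based alignment, which is what suppresses the out-of-distribution artifacts the abstract describes.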