Transformation Based Ensembles for Time Series Classification

Author: Bagnall A
Davis L
Hills J
Lines J
Publication date: 19/12/2013

Until recently, the vast majority of data mining time series classification (TSC) research has focused on alternative distance measures for 1-Nearest Neighbour (1-NN) classifiers based on either the raw data, or on compressions or smoothing of the raw data. Despite the extensive evidence in favour of 1-NN classifiers with Euclidean or Dynamic Time Warping distance, there has also been a flurry of recent research publications proposing classification algorithms for TSC. Generally, these classifiers describe different ways of incorporating summary measures in the time domain into more complex classifiers. Our hypothesis is that the easiest way to gain improvement on TSC problems is simply to transform into an alternative data space where the discriminatory features are more easily detected. To test our hypothesis, we perform a range of benchmarking experiments in the time domain, before evaluating nearest neighbour classifiers on data transformed into the power spectrum, the autocorrelation function, and the principal component space. We demonstrate that on some problems there is dramatic improvement in the accuracy of classifiers built on the transformed data over classifiers built in the time domain, but that there is also a wide variance in accuracy for a particular classifier built on different data transforms. To overcome this variability, we propose a simple transformation based ensemble, then demonstrate that it improves performance and reduces the variability of classifiers built in the time domain only. Our advice to a practitioner with a real world TSC problem is to try transforms before developing a complex classifier; it is the easiest way to get a potentially large increase in accuracy, and may provide further insights into the underlying relationships that characterise the problem.
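The idea in the abstract above can be illustrated with a minimal sketch: transform each series into another representation, run 1-NN there, and combine the per-transform predictions. The abstract does not say how the ensemble combines its members, so the unweighted majority vote below is an assumption, as are all function names. The principal component transform is omitted for brevity, and the power spectrum uses a naive DFT to stay within the standard library.

```python
import cmath
import math
from collections import Counter


def identity(series):
    """Time-domain representation: the raw series, unchanged."""
    return list(series)


def autocorrelation(series, max_lag=None):
    """Sample autocorrelation for lags 1..max_lag (default: all lags)."""
    n = len(series)
    max_lag = max_lag or n - 1
    mean = sum(series) / n
    centred = [x - mean for x in series]
    var = sum(c * c for c in centred)
    if var == 0:
        return [0.0] * max_lag  # a constant series has no correlation structure
    return [
        sum(centred[t] * centred[t + lag] for t in range(n - lag)) / var
        for lag in range(1, max_lag + 1)
    ]


def power_spectrum(series):
    """Squared DFT magnitude for frequencies 1..n//2 (naive O(n^2) DFT)."""
    n = len(series)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(series)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def nearest_neighbour(train_x, train_y, query):
    """1-NN with Euclidean distance."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def ensemble_predict(train_series, train_labels, query,
                     transforms=(identity, autocorrelation, power_spectrum)):
    """Majority vote over 1-NN classifiers, one per transform (assumed scheme)."""
    votes = []
    for transform in transforms:
        train_x = [transform(s) for s in train_series]
        votes.append(nearest_neighbour(train_x, train_labels, transform(query)))
    return Counter(votes).most_common(1)[0][0]
```

Calling ensemble_predict(train_series, train_labels, query) with equal-length series returns the label that most of the per-transform 1-NN classifiers agree on. A phase-shifted copy of a training series, for example, can be far from it in the time domain while remaining close in the power spectrum.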

University of East Anglia digital repository

Classification of time series by shapelet transformation

Author: Anthony Bagnall
C Cortes
C Hoare
C Shannon
C Stransky
D Vries De
Edgaras Baranauskas
H Ding
J Demšar
J Lines
James Mapp
Jason Lines
JJ Rodriguez
Jon Hills
L Breiman
L Ye
M Bober
M Hall
N Friedman
P Duarte-Neto
S Campana
W Kruskal
Y Jeong
Z Xing
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2014

Time-series classification (TSC) problems present a specific challenge for classification algorithms: how to measure similarity between series. A shapelet is a time-series subsequence that allows for TSC based on local, phase-independent similarity in shape. Shapelet-based classification uses the similarity between a shapelet and a series as a discriminatory feature. One benefit of the shapelet approach is that shapelets are comprehensible, and can offer insight into the problem domain. The original shapelet-based classifier embeds the shapelet-discovery algorithm in a decision tree, and uses information gain to assess the quality of candidates, finding a new shapelet at each node of the tree through an enumerative search. Subsequent research has focused mainly on techniques to speed up the search. We examine how best to use the shapelet primitive to construct classifiers. We propose a single-scan shapelet algorithm that finds the best k shapelets, which are used to produce a transformed dataset, where each of the k features represents the distance between a time series and a shapelet. The primary advantages over the embedded approach are that the transformed data can be used in conjunction with any classifier, and that there is no recursive search for shapelets. We demonstrate that the transformed data, in conjunction with more complex classifiers, gives greater accuracy than the embedded shapelet tree. We also evaluate three similarity measures that produce equivalent results to information gain in less time. Finally, we show that by conducting post-transform clustering of shapelets, we can enhance the interpretability of the transformed data. We conduct our experiments on 29 datasets: 17 from the UCR repository, and 12 we provide ourselves.
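A minimal sketch of the transform step described in this abstract, assuming the k shapelets have already been selected. The abstract does not define the series-to-shapelet distance, so the code assumes a common choice: the smallest Euclidean distance between the shapelet and any equal-length window of the series. Subsequence normalisation is left out, and the function names are illustrative only.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet is longer than the series")
    return min(
        math.dist(series[start:start + m], shapelet)
        for start in range(len(series) - m + 1)
    )


def shapelet_transform(dataset, shapelets):
    """Map each series to a k-dimensional vector of distances, one per shapelet."""
    return [
        [shapelet_distance(series, shapelet) for shapelet in shapelets]
        for series in dataset
    ]
```

The rows returned by shapelet_transform are ordinary feature vectors, so they can be passed to any classifier, which is the main advantage the abstract claims over embedding the search in a decision tree.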

Crossref

University of East Anglia digital repository

The Role of Person-Organization Fit in Organizational Selection Decisions

Author: Cable Daniel M.
Judge Timothy A.
Publication venue: DigitalCommons@ILR
Publication date: 01/05/1995

This paper presents and tests a theoretical model of person-organization fit and organizational selection decisions using data from 35 organizations making hiring decisions. Results suggested that (a) interviewers were able to assess applicants' values with above-chance levels of accuracy, (b) interviewers compare their perceptions of applicants' values with their organizations' values to assess person-organization fit, and (c) it is perceived values congruence and not actual values congruence between applicants and organizations that predicted interviewers' person-organization fit perceptions. Results also suggested that interviewers' person-organization fit assessments had the largest effect on their hiring recommendations even after controlling for competing applicant characteristics (e.g., demographics, human capital), and that interviewers' hiring recommendations had large and significant effects on organizations' hiring decisions (e.g., job offers).

DigitalCommons@ILR

eCommons@Cornell

A Shapelet Transform for Time Series Classification

Author: Bagnall A
Davis L
Hills J
Lines J
Publication date: 14/08/2012

University of East Anglia digital repository

An adaptive stigmergy-based system for evaluating technological indicator dynamics in the context of smart specialization

Author: Alfeo A. L.
Appio F. P.
Cimino M. G. C. A.
Lazzeri A.
Martini A.
Vaglini G.
Publication date: 02/01/2019

Regional innovation is more and more considered an important enabler of welfare. It is no coincidence that the European Commission has started looking at regional peculiarities and dynamics, in order to focus Research and Innovation Strategies for Smart Specialization towards effective investment policies. In this context, this work aims to support policy makers in the analysis of innovation-relevant trends. We exploit a European database of the regional patent application to determine the dynamics of a set of technological innovation indicators. For this purpose, we design and develop a software system for assessing unfolding trends in such indicators. In contrast with conventional knowledge-based design, our approach is biologically-inspired and based on self-organization of information. This means that a functional structure, called track, appears and stays spontaneous at runtime when local dynamism in data occurs. A further prototyping of tracks allows a better distinction of the critical phenomena during unfolding events, with a better assessment of the progressing levels. The proposed mechanism works if structural parameters are correctly tuned for the given historical context. Determining such correct parameters is not a simple task since different indicators may have different dynamics. For this purpose, we adopt an adaptation mechanism based on differential evolution. The study includes the problem statement and its characterization in the literature, as well as the proposed solving approach, experimental setting and results.

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

Genetic distance predicts trait differentiation at the subpopulation but not the individual level in eelgrass, Zostera marina.

Author: Corre V.
Harvey P. H.
Hughes A. R.
Koroleff F.
Platt T.
R Core Team
Williams S. L.
Publication venue: eScholarship, University of California
Publication date: 01/08/2018

Ecological studies often assume that genetically similar individuals will be more similar in phenotypic traits, such that genetic diversity can serve as a proxy for trait diversity. Here, we explicitly test the relationship between genetic relatedness and trait distance using 40 eelgrass (Zostera marina) genotypes from five sites within Bodega Harbor, CA. We measured traits related to nutrient uptake, morphology, biomass and growth, photosynthesis, and chemical deterrents for all genotypes. We used these trait measurements to calculate a multivariate pairwise trait distance for all possible genotype combinations. We then estimated pairwise relatedness from 11 microsatellite markers. We found significant trait variation among genotypes for nearly every measured trait; however, there was no evidence of a significant correlation between pairwise genetic relatedness and multivariate trait distance among individuals. However, at the subpopulation level (sites within a harbor), genetic (FST) and trait differentiation were positively correlated. Our work suggests that pairwise relatedness estimated from neutral marker loci is a poor proxy for trait differentiation between individual genotypes. It remains to be seen whether genome-wide measures of genetic differentiation or easily measured "master" traits (like body size) might provide good predictions of overall trait differentiation.

Crossref

eScholarship - University of California

The European consumer: United in diversity?.

Author: Croux Christophe
Dekimpe Marnik
Lemmens Aurélie

The ongoing unification which takes place on the European political scene, along with recent advances in consumer mobility and communication technology, raises the question whether the European Union can be treated as a single market to fully exploit the potential synergy effects from pan-European marketing strategies. Previous research, which mostly used domain-specific segmentation bases, has resulted in mixed conclusions. In this paper, a more general segmentation base is adopted, as we consider the homogeneity in the European countries' Consumer Confidence Indicators. Moreover, rather than analyzing more traditional static similarity measures, we adopt the concepts of dynamic correlation and cohesion between countries. The short-run fluctuations in consumer confidence are found to be largely country specific. However, a myopic focus on these fluctuations may inspire management to adopt multi-country strategies, foregoing the potential longer-run benefits from more standardized marketing strategies. Indeed, the Consumer Confidence Indicators become much more homogeneous as the planning horizon is extended. However, this homogeneity is found to remain inversely related to the cultural, economic and geographic distances among the various Member States. Hence, pan-regional rather than pan-European strategies are called for.

Keywords: Communication; Consumer confidence; Country; Dynamic correlation; Effects; European unification; European Union; Indicators; Management; Market; Marketing; Planning; Research; Similarity; Strategy; Technology

Research Papers in Economics

Conglomerate Industry Choice and Product Differentiation

Author: Gerard Hoberg
Gordon M. Phillips

We use text-based computational analysis of business descriptions from 10-Ks to examine in which industries conglomerates are most likely to operate and to understand conglomerate valuations. We find that conglomerates are more likely to operate in industry pairs that are closer together in the product space and in industry pairs that have profitable opportunities "between" them. Conglomerate firms have lower stock market valuations than matched single-segment firms when their products are easier to replicate with single-segment firms. Conglomerate firms have stock market premiums when they have higher product differentiation and produce in more profitable industries. These findings are consistent with successful conglomerate firms having higher product differentiation and lower cost entry into profitable markets when operating in strategically chosen industry pairs.

Research Papers in Economics