595 research outputs found
TSPO: an autoML approach to time series forecasting
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics.
Time series forecasting is an essential tool in many fields. In recent years, machine learning
has gained popularity as an appropriate tool for time series forecasting. When employing
machine learning algorithms, it is necessary to optimise a machine learning pipeline, which is a
tedious manual effort and requires time series analysis and machine learning expertise. AutoML
(automatic machine learning) is a sub-field of machine learning research that addresses this issue
by providing integrated systems that automatically find machine learning pipelines. However,
none of the available open-source tools is yet explicitly designed for time series forecasting.
The proposed system TSPO (Time Series Pipeline Optimisation) aims at providing an
autoML tool specifically designed to solve time series forecasting tasks to give non-experts the
capability to employ machine learning strategies for time series forecasting. The system utilises
a genetic algorithm to find an appropriate set of time series features, machine learning models
and a set of suitable hyper-parameters. The optimisation objective is defined as minimising the
obtained error, which is measured with a time series variant of k-fold cross-validation.
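The abstract does not say which time series variant of k-fold cross-validation TSPO uses. A minimal sketch, assuming an expanding-window scheme in which every test block lies strictly after its training data, could look as follows. The names `expanding_window_splits`, `naive_forecast` and `cv_error` are hypothetical, and the last-value forecaster is only a stand-in for a fitted pipeline.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def expanding_window_splits(
    n_samples: int, n_folds: int, horizon: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs that respect temporal order.

    Assumption: the training window grows by `horizon` points per fold,
    and the test block is the `horizon` points that follow it.
    """
    first_test_start = n_samples - n_folds * horizon
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * horizon
        train_idx = list(range(0, test_start))
        test_idx = list(range(test_start, test_start + horizon))
        yield train_idx, test_idx


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Placeholder model: repeat the last observed value."""
    return [history[-1]] * horizon


def cv_error(
    series: Sequence[float],
    n_folds: int,
    horizon: int,
    forecaster: Callable[[Sequence[float], int], List[float]] = naive_forecast,
) -> float:
    """Mean absolute error averaged over all folds."""
    fold_errors = []
    for train_idx, test_idx in expanding_window_splits(len(series), n_folds, horizon):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        predicted = forecaster(history, horizon)
        mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / horizon
        fold_errors.append(mae)
    return sum(fold_errors) / len(fold_errors)
```

An optimiser such as a genetic algorithm could use a score like `cv_error` as its fitness value, evaluating each candidate pipeline only on data that comes after the data it was trained on.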
TSPO outperformed the official machine learning benchmarks of the M4-Competition in 9
out of 12 randomly selected time series. TSPO captured the characteristics of all analysed time
series consistently better than the benchmarks.
The results indicate that TSPO is capable of producing robust and accurate forecasts without
any human input.
Time series forecasting is an important tool in many disciplines. In recent years, machine
learning has gained popularity as an appropriate tool for time series forecasting. When using
machine learning algorithms, it is necessary to optimise machine learning pipelines, which is a
tedious manual effort that requires expertise in the field. AutoML (automated machine learning)
is a subfield of machine learning that addresses this problem by providing integrated systems
that automatically find machine learning pipelines. However, none of the available open-source
tools is explicitly intended for time series forecasting.
The proposed system TSPO (Time Series Pipeline Optimisation) aims to provide a machine
learning tool designed specifically to solve time series forecasting problems, giving non-experts
the ability to use machine learning strategies for time series forecasting. The system uses a
genetic algorithm to find an appropriate set of time series pipelines, machine learning models
and a set of suitable hyperparameters. The optimisation objective is defined as minimising the
obtained error, measured with a variant of k-fold cross-validation applied to time series.
TSPO outperformed the official machine learning benchmarks of the M4 competition in 9 of
the 12 randomly selected time series. In addition, TSPO captured the characteristics of all
analysed time series better than the benchmarks. The results indicate that TSPO is capable of
producing robust and accurate forecasts without any human contribution.
The Devil Dwells in the Tails: A Quantile Regression Approach to Firm Growth
This paper explores the firm growth rate distribution in a Gibrat’s Law context. The aim is to provide an empirical exploration of the determinants of firm growth. The work is novel in two respects. First, rather than limiting the analysis to the conditional mean growth level, we investigate the complete shape of the distribution. Second, we show that the differences in the firm growth rate process between large and small firms are highly circumstantial: industry dynamics have a substantial influence on the relationship between firm size and firm growth. The data used includes more than 9000 Danish firms from manufacturing, services and construction. We provide robust evidence indicating that firm growth studies should be less obsessed with explaining means and instead look to other parts of the firm growth rate distribution.
Keywords: Firm growth; quantile regression; distribution shape
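To illustrate why a quantile approach can look beyond the conditional mean, the sketch below uses the standard pinball (check) loss that quantile regression minimises. It covers only the simplest case with no covariates, where the minimiser is a sample quantile. The paper's models condition on firm characteristics. The growth rates are made-up illustrative values, not data from the paper, and the function names are hypothetical.

```python
from typing import List, Sequence


def pinball_loss(residuals: Sequence[float], tau: float) -> float:
    """Average check loss: positive residuals weigh tau, negative ones 1 - tau."""
    total = sum(tau * r if r >= 0 else (tau - 1) * r for r in residuals)
    return total / len(residuals)


def best_constant(values: Sequence[float], tau: float) -> float:
    """Sample value that minimises the pinball loss (an empirical tau-quantile)."""
    return min(values, key=lambda c: pinball_loss([v - c for v in values], tau))


# Hypothetical growth rates with heavy tails (illustrative only).
growth: List[float] = [-0.30, -0.05, 0.00, 0.02, 0.04, 0.10, 0.60]

lower_tail = best_constant(growth, 0.1)   # fit aimed at shrinking firms
median = best_constant(growth, 0.5)       # fit aimed at the typical firm
upper_tail = best_constant(growth, 0.9)   # fit aimed at fast-growing firms
```

Changing `tau` moves the fit from the lower tail through the median to the upper tail, which is how the shape of the whole distribution can be examined rather than its centre alone.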
Markers of early disease and prognosis in COPD
COPD is a complex disease with multiple pathological components, which we unfortunately tend to ignore when spirometry is used as the only method to evaluate the disorder. Additional measures are needed to allow a more complete and clinically relevant assessment of COPD. The earliest potential risk factors for disease in COPD are variations in the genetic background. Genetic variations are present from conception and can determine lifelong changes in enzyme activities and protein concentrations. In contrast, measurements in blood, sputum, exhaled breath, broncho-alveolar lavage, and lung biopsies may vary substantially over time. This review explores potential markers of early disease and prognosis in COPD by examining genetic markers in the α1-antitrypsin, cystic fibrosis transmembrane conductance regulator (CFTR), and MBL-2 genes, and by examining the biochemical markers fibrinogen and C-reactive protein (CRP), which correlate with the degree of pulmonary inflammation during stable conditions of COPD. Chronic lung inflammation appears to contribute to the pathogenesis of COPD, and markers of this process have promising predictive value in COPD. To implement markers for COPD in clinical practice, besides those already established for the α1-antitrypsin gene, further research and validation studies are needed.