595 research outputs found

    TSPO: an autoML approach to time series forecasting

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Time series forecasting is an essential tool in many fields. In recent years, machine learning has gained popularity as an appropriate tool for time series forecasting. Employing machine learning algorithms requires optimising a machine learning pipeline, which is a tedious manual effort that demands expertise in both time series analysis and machine learning. AutoML (automatic machine learning) is a sub-field of machine learning research that addresses this issue by providing integrated systems that automatically find machine learning pipelines. However, none of the available open-source tools is yet explicitly designed for time series forecasting. The proposed system, TSPO (Time Series Pipeline Optimisation), aims to provide an autoML tool specifically designed for time series forecasting tasks, giving non-experts the capability to employ machine learning strategies for time series forecasting. The system uses a genetic algorithm to find an appropriate set of time series features, machine learning models and suitable hyper-parameters. The optimisation objective is to minimise the obtained error, which is measured with a time series variant of k-fold cross-validation. TSPO outperformed the official machine learning benchmarks of the M4-Competition in 9 out of 12 randomly selected time series. TSPO also captured the characteristics of all analysed time series consistently better than the benchmarks. The results indicate that TSPO is capable of producing robust and accurate forecasts without any human input.

    The Devil Dwells in the Tails: A Quantile Regression Approach to Firm Growth

    This paper explores the firm growth rate distribution in a Gibrat’s Law context. The aim is to provide an empirical exploration of the determinants of firm growth. The work is novel in two respects. First, rather than limiting the analysis to the conditional mean growth level, we investigate the complete shape of the distribution. Second, we show that the differences in the firm growth rate process between large and small firms are highly circumstantial: industry dynamics have a substantial influence on the relationship between firm size and firm growth. The data include more than 9000 Danish firms from manufacturing, services and construction. We provide robust evidence indicating that firm growth studies should be less obsessed with explaining means and instead look to other parts of the firm growth rate distribution. Keywords: firm growth; quantile regression; distribution shape

    Markers of early disease and prognosis in COPD

    COPD is a complex disease with multiple pathological components, which we unfortunately tend to ignore when spirometry is used as the only method to evaluate the disorder. Additional measures are needed to allow a more complete and clinically relevant assessment of COPD. The earliest potential risk factors of disease in COPD are variations in the genetic background. Genetic variations are present from conception and can determine lifelong changes in enzyme activities and protein concentrations. In contrast, measurements in blood, sputum, exhaled breath, broncho-alveolar lavage, and lung biopsies may vary substantially over time. This review explores potential markers of early disease and prognosis in COPD by examining genetic markers in the α1-antitrypsin, cystic fibrosis transmembrane conductance regulator (CFTR), and MBL-2 genes, and by examining the biochemical markers fibrinogen and C-reactive protein (CRP), which correlate with degree of pulmonary inflammation during stable conditions of COPD. Chronic lung inflammation appears to contribute to the pathogenesis of COPD, and markers of this process have promising predictive value in COPD. To implement markers for COPD in clinical practice, besides those already established for the α1-antitrypsin gene, further research and validation studies are needed.

    Generalized sampling in Julia

    Denmark
