    Principal Component Analysis Visualizations in State Discovery by Animating Exploration Results

    Visualization is a key element of data exploration. In this paper we emphasize adding dynamic features by constructing exploration animations. We use Principal Component Analysis (PCA) for dimensionality reduction and the k-means clustering algorithm for defining states. For predicting state transitions, we use a Hidden Markov Model (HMM). The analyzed physical data comes from self-healing autonomous data centers. Our research methodology is to animate state transitions for data exploration in a modern computerized environment. We use the Jupyter tool and the Python 3 programming language in our experimental realization. As results, we obtain PCA animations for exploration purposes. Our approach is based on state discovery, where it is possible to find some physical interpretations for the defined states and state transitions. State structure and behaviour depend strongly on the analyzed data.
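    As a concrete illustration, the following minimal Python sketch (not the authors' code) strings the same pieces together: PCA for a 2-D view, k-means for states, an HMM fitted on the PCA scores as a stand-in for the paper's transition model, and a Matplotlib animation of the exploration trajectory. The data is synthetic and hmmlearn is an assumed dependency.

        import numpy as np
        import matplotlib.pyplot as plt
        from matplotlib.animation import FuncAnimation
        from sklearn.decomposition import PCA
        from sklearn.cluster import KMeans
        from hmmlearn.hmm import GaussianHMM  # assumed dependency: pip install hmmlearn

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 10)).cumsum(axis=0)   # stand-in sensor time series

        scores = PCA(n_components=2).fit_transform(X)   # 2-D view of the data
        states = KMeans(n_clusters=4, n_init=10).fit_predict(scores)

        hmm = GaussianHMM(n_components=4, n_iter=50).fit(scores)
        print("estimated state-transition matrix:\n", hmm.transmat_.round(2))

        fig, ax = plt.subplots()
        ax.scatter(scores[:, 0], scores[:, 1], c=states, s=10, alpha=0.3)
        dot, = ax.plot([], [], "ko", ms=8)

        def step(i):                                    # move a marker along the series
            dot.set_data([scores[i, 0]], [scores[i, 1]])
            return dot,

        anim = FuncAnimation(fig, step, frames=len(scores), interval=30, blit=True)
        plt.show()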

    Towards hardware-driven design of low-energy algorithms for data analysis

    In the era of "big" data, data analysis algorithms need to be efficient. Traditionally, researchers would tackle this problem by considering "small" data algorithms and investigating how to make them computationally more efficient for big data applications. The main means to achieve computational efficiency would be to revise the necessity and order of subroutines, or to approximate calculations. This paper presents the viewpoint that, in order to cope with the challenges of the growing digital universe, research needs to take a combined view of data analysis algorithm design and hardware design, and it discusses a potential research direction for such an integrated approach. Analyzing how data mining algorithms operate at the level of elementary operations can help to design more specialized and dedicated hardware that, for instance, would be more energy efficient. In turn, understanding hardware design can help to develop more effective algorithms.
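    To make the "elementary operations level" concrete, here is a toy Python illustration (my example, not from the paper) that counts the arithmetic performed by one naive k-means assignment step; a profile like this is the kind of input that could guide specialized, energy-efficient hardware.

        def assignment_step_op_counts(n_points: int, n_clusters: int, n_dims: int) -> dict:
            """Operation counts for one naive k-means assignment step
            using squared Euclidean distances."""
            # per point-centroid pair: n_dims subtractions, n_dims
            # multiplications, and n_dims - 1 additions; then each point
            # needs n_clusters - 1 comparisons to pick its nearest centroid
            pairs = n_points * n_clusters
            return {
                "subtractions":    pairs * n_dims,
                "multiplications": pairs * n_dims,
                "additions":       pairs * (n_dims - 1),
                "comparisons":     n_points * (n_clusters - 1),
            }

        print(assignment_step_op_counts(n_points=1_000_000, n_clusters=10, n_dims=50))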

    Machine Learning Methods for Neonatal Mortality and Morbidity Classification

    Preterm birth is the leading cause of mortality in children under the age of five. In particular, low birth weight and low gestational age are associated with an increased risk of mortality. Preterm birth also increases the risks of several complications, which can increase the risk of death or cause long-term morbidities with both individual and societal impacts. In this work, we use machine learning for prediction of neonatal mortality as well as the neonatal morbidities of bronchopulmonary dysplasia, necrotizing enterocolitis, and retinopathy of prematurity, among very low birth weight infants. Our predictors include time series data and clinical variables collected at the neonatal intensive care unit of Children's Hospital, Helsinki University Hospital. We examine nine different classifiers and present our main results in AUROC, similar to our previous studies, and in F1-score, which we propose for classifier selection in this study. We also investigate how the predictive performance of the classifiers evolves as the length of the time series is increased, and examine the relative importance of different features using the random forest classifier, which we found to generally perform best across all tasks. Our systematic study also involves different data preprocessing methods that can be used to improve classifier sensitivities. Our best classifier AUROC is 0.922 in the prediction of mortality, 0.899 in the prediction of bronchopulmonary dysplasia, 0.806 in the prediction of necrotizing enterocolitis, and 0.846 in the prediction of retinopathy of prematurity. Our best classifier F1-score is 0.493 in the prediction of mortality, 0.704 in the prediction of bronchopulmonary dysplasia, 0.215 in the prediction of necrotizing enterocolitis, and 0.368 in the prediction of retinopathy of prematurity.
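    The evaluation pattern described above can be sketched in a few lines of Python (my illustration on synthetic data, not the NICU dataset): compare classifiers on both AUROC and F1, since a high AUROC can coexist with a poor F1 on imbalanced outcomes such as mortality.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import f1_score, roc_auc_score
        from sklearn.model_selection import train_test_split

        # synthetic stand-in with ~7% positives, mimicking a rare outcome
        X, y = make_classification(n_samples=2000, n_features=30,
                                   weights=[0.93], random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        for clf in (RandomForestClassifier(random_state=0),
                    LogisticRegression(max_iter=1000)):
            proba = clf.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
            print(type(clf).__name__,
                  "AUROC=%.3f" % roc_auc_score(y_te, proba),
                  "F1=%.3f" % f1_score(y_te, proba >= 0.5))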

    Early oxygen levels contribute to brain injury in extremely preterm infants

    Background: Extremely low gestational age newborns (ELGANs) are at risk of neurodevelopmental impairments that may originate in early NICU care. We hypothesized that early oxygen saturations (SpO2), arterial pO2 levels, and supplemental oxygen (FiO2) would associate with later neuroanatomic changes. Methods: SpO2, arterial blood gases, and FiO2 from 73 ELGANs (GA 26.4 ± 1.2; BW 867 ± 179 g) during the first 3 postnatal days were correlated with later white matter (WM) injury (MRI, n = 69), secondary cortical somatosensory processing in magnetoencephalography (MEG-SII, n = 39), Hempel neurological examination (n = 66), and developmental quotients of Griffiths Mental Developmental Scales (GMDS, n = 58). Results: The ELGANs with later WM abnormalities exhibited lower SpO2 and pO2 levels, and a higher FiO2 need, during the first 3 days than those with normal WM. They also had higher pCO2 values. The infants with abnormal MEG-SII showed the opposite findings, i.e., higher SpO2 and pO2 levels and a lower FiO2 need, than those with better outcomes. Severe WM changes and abnormal MEG-SII correlated with adverse neurodevelopment. Conclusions: Low oxygen levels and a high FiO2 need during NICU care associate with WM abnormalities, whereas higher oxygen levels correlate with abnormal MEG-SII. The results may indicate that certain brain structures are more vulnerable to hypoxia and others to hyperoxia, emphasizing the role of strict saturation targets. Impact: This study indicates that both abnormally low and abnormally high oxygen levels during early NICU care are harmful for later neurodevelopmental outcomes in preterm neonates. Specific brain structures seem to be vulnerable to low, and others to high, oxygen levels. The findings may have clinical implications, as oxygen is one of the most common therapies given in NICUs. The results emphasize the role of strict saturation targets during the early postnatal period in preterm infants.
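    The group contrasts reported above can be illustrated with a small Python sketch (synthetic numbers, not the study data); a Mann-Whitney U test is one common choice for small, non-normal clinical samples, though the paper's exact statistics may differ.

        import numpy as np
        from scipy.stats import mannwhitneyu

        rng = np.random.default_rng(1)
        spo2_abnormal_wm = rng.normal(91, 3, size=25)   # mean first-3-day SpO2 (%), invented
        spo2_normal_wm   = rng.normal(94, 3, size=44)

        # one-sided test: is SpO2 lower in the group with later WM abnormalities?
        stat, p = mannwhitneyu(spo2_abnormal_wm, spo2_normal_wm, alternative="less")
        print(f"U = {stat:.0f}, one-sided p = {p:.4f}")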

    Newtonian boreal forest ecology: The Scots pine ecosystem as an example

    Isaac Newton's approach to developing theories in his book Principia Mathematica proceeds in four steps. First, he defines various concepts; second, he formulates axioms utilising the concepts; third, he mathematically analyses the behaviour of the system defined by the concepts and axioms, obtaining predictions; and fourth, he tests the predictions with measurements. In this study, we formulated our theory of boreal forest ecosystems, called NewtonForest, following the four steps introduced by Newton. The forest ecosystem is a complicated entity, and hence we needed altogether 27 concepts to describe the material and energy flows in the metabolism of trees, ground vegetation and microbes in the soil, and to describe the regularities in tree structure. Thirty-four axioms described the most important features in the behaviour of the forest ecosystem. We utilised numerical simulations in the analysis of the behaviour of the system, resulting in clear predictions that could be tested with field data. We collected retrospective time series of diameters and heights as test material from six stands in southern Finland and five stands in Estonia. The numerical simulations succeeded in predicting the measured diameters and heights, providing clear corroboration of our theory.
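    The predict-then-test loop of step four can be sketched in Python. The real NewtonForest model rests on 27 concepts and 34 axioms; in this toy version a single logistic growth rule stands in for the simulation, and the "measured" series is invented for illustration.

        import numpy as np

        def simulate_diameter(d0_cm, years, r=0.08, d_max_cm=40.0):
            """Logistic stand-in for simulated annual diameter growth."""
            d = [d0_cm]
            for _ in range(years):
                d.append(d[-1] + r * d[-1] * (1 - d[-1] / d_max_cm))
            return np.array(d)

        measured = np.array([8.0, 8.6, 9.3, 9.9, 10.6, 11.2])   # hypothetical series
        predicted = simulate_diameter(measured[0], years=len(measured) - 1)

        # step four: compare prediction against the retrospective measurements
        rmse = np.sqrt(np.mean((predicted - measured) ** 2))
        print(f"RMSE between simulated and measured diameters: {rmse:.2f} cm")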

    Composite Surrogate for Likelihood-Free Bayesian Optimisation in High-Dimensional Settings of Activity-Based Transportation Models

    Activity-based transportation models simulate demand and supply as a complex system, and therefore a large set of parameters needs to be adjusted. One such model is the Preday activity-based model, which requires adjusting a large set of parameters when calibrated for new urban environments. Hence, the calibration process is time-demanding, and, due to costly simulations, various optimisation methods with dimensionality reduction and stochastic approximation are adopted. This study adopts the Bayesian Optimisation for Likelihood-free Inference (BOLFI) method for calibrating the Preday activity-based model to a new urban area. Unlike the traditional variant of the method, which uses a Gaussian Process as a surrogate model for approximating the likelihood function through modelling discrepancy, we apply a composite surrogate model that combines a Random Forest for modelling the discrepancy and a Gaussian Mixture Model for estimating its density. The results show that the proposed extension improves general applicability to high-dimensional settings without losing the efficiency of Bayesian Optimisation in steering new samples towards the global optimum.
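    The composite-surrogate idea can be sketched as follows (my illustration, not the authors' implementation): a Random Forest maps parameter vectors to simulated discrepancy, and a Gaussian Mixture Model estimates the density of the low-discrepancy parameter region. The quadratic "simulator" below is a stand-in for a costly transportation simulation.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(0)
        theta = rng.uniform(-3, 3, size=(400, 8))               # 8-D parameter draws
        discrepancy = ((theta - 1.0) ** 2).sum(axis=1)          # toy simulator output

        rf = RandomForestRegressor(n_estimators=200, random_state=0)
        rf.fit(theta, discrepancy)                              # surrogate: theta -> discrepancy

        keep = theta[discrepancy < np.quantile(discrepancy, 0.2)]   # low-discrepancy set
        gmm = GaussianMixture(n_components=3, random_state=0).fit(keep)

        candidate = rng.uniform(-3, 3, size=(1, 8))
        print("predicted discrepancy:", rf.predict(candidate)[0])
        print("log-density under GMM:", gmm.score_samples(candidate)[0])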

    Finding Profiles of Forest Nutrition by Clustering of the Self-Organizing Map

    Understanding the nutritional states and profiles of tree species is important for monitoring the well-being of forests. Data from foliar surveys are available, but there is still a need to better understand the underlying nutritional mechanisms in trees. In this paper, the nutrient concentrations of pine and spruce needles in Finland between 1987 and 2000 are analyzed to build nutrition profiles. The profiles are built from the data by clustering of the Self-Organizing Map. The VS algorithm divides the data into base clusters using region growing and forms a hierarchy from the base clusters. The hierarchy tree is pruned, and the final clusters are selected from the pruned tree. We were able to divide the measurements into six groups. In each group the growth of the needles and the amounts of the nutrients were different; thus, different groups represented different kinds of growing conditions. With the help of the domain expert, using the results of the clustering method, it was possible to construct a temporal model that characterizes the development of the forests of Finland.
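    The two-stage pattern above can be roughly sketched in Python: train a Self-Organizing Map, then cluster its codebook vectors. The paper's VS algorithm (region growing plus hierarchy pruning) is replaced here by plain agglomerative clustering, the nutrient data is simulated, and minisom is an assumed dependency.

        import numpy as np
        from minisom import MiniSom                     # assumed: pip install minisom
        from sklearn.cluster import AgglomerativeClustering

        rng = np.random.default_rng(0)
        X = rng.normal(size=(800, 12))                  # stand-in: 12 nutrient concentrations

        som = MiniSom(10, 10, 12, sigma=1.5, learning_rate=0.5, random_seed=0)
        som.train_random(X, 5000)                       # stage 1: fit the 10x10 map

        codebook = som.get_weights().reshape(-1, 12)    # 100 prototype vectors
        labels = AgglomerativeClustering(n_clusters=6).fit_predict(codebook)  # stage 2

        # each needle measurement inherits the cluster of its best-matching unit
        bmus = np.array([som.winner(x) for x in X])
        profiles = labels[bmus[:, 0] * 10 + bmus[:, 1]]
        print("samples per nutrition profile:", np.bincount(profiles))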