Handling of Missing Values in Static and Dynamic Data Sets
This thesis makes three contributions. First, it presents a comparative study of traditional and modern classification methods, highlighting the differences in their performance. Second, it presents an algorithm that enhances the prediction of values used for data imputation with nonlinear models. Third, it presents a novel model-selection algorithm that enhances prediction performance in the presence of missing data. The thesis includes an overview of nonlinear model selection with complete data and provides summary descriptions of the Box-Tidwell and fractional polynomial methods for model selection. In particular, it focuses on the fractional polynomial method for nonlinear modelling in cases of missing data. An analysis example is presented to illustrate the performance of this method.
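The fractional polynomial approach mentioned in the abstract can be illustrated with a minimal sketch. The code below implements the standard first-degree (FP1) procedure over the conventional power set; it is a generic complete-data FP1 selection, not the thesis's missing-data algorithm, and all data in the usage example are simulated.

```python
import numpy as np

# Conventional FP1 power set; by convention, power 0 denotes log(x).
FP_POWERS = (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0)

def fp_transform(x, p):
    # Fractional polynomial transform of a positive covariate.
    return np.log(x) if p == 0.0 else x ** p

def select_fp1(x, y):
    """Fit y ~ b0 + b1 * x^p for each candidate power and return the
    (power, coefficients) pair with the lowest residual sum of squares."""
    best = None
    for p in FP_POWERS:
        X = np.column_stack([np.ones_like(x), fp_transform(x, p)])
        coef, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(rss[0]) if rss.size else float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, p, coef)
    return best[1], best[2]
```

For data generated with a square-root relationship, the procedure recovers p = 0.5 and the true slope.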
DropIn: Making Reservoir Computing Neural Networks Robust to Missing Inputs by Dropout
The paper presents a novel, principled approach to train recurrent neural
networks from the Reservoir Computing family that are robust to missing parts of
the input features at prediction time. By building on the ensembling properties
of Dropout regularization, we propose a methodology, named DropIn, which
efficiently trains a neural model as a committee machine of subnetworks, each
capable of predicting with a subset of the original input features. We discuss
the application of the DropIn methodology in the context of Reservoir Computing
models and targeting applications characterized by input sources that are
unreliable or prone to be disconnected, such as in pervasive wireless sensor
networks and ambient intelligence. We provide an experimental assessment using
real-world data from such application domains, showing how the DropIn
methodology makes it possible to maintain predictive performance comparable to
that of a model without missing features, even when 20%-50% of the inputs are
not available.
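The core DropIn idea, training on inputs with randomly zeroed features so the model learns to tolerate missing channels, can be sketched outside the Reservoir Computing setting. The sketch below applies input masking to a plain ridge-regression readout on toy redundant-sensor data; all names, sizes, and rates are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: four partly redundant sensor channels driving one target
# (an illustrative stand-in for unreliable wireless-sensor inputs).
n = 1000
latent = rng.normal(size=(n, 2))
X = np.column_stack([latent[:, 0], latent[:, 0],    # two copies of factor 1
                     latent[:, 1], latent[:, 1]])   # two copies of factor 2
X = X + 0.05 * rng.normal(size=X.shape)
y = latent[:, 0] + latent[:, 1]

def fit_ridge(X, y, lam=1e-2):
    # Closed-form ridge-regression readout.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# DropIn-style training: replicate the training set with random subsets
# of input features zeroed out, so the fitted readout must be able to
# predict from incomplete inputs.
drop_prob = 0.3
copies = [X * (rng.random(X.shape) > drop_prob) for _ in range(10)]
w_dropin = fit_ridge(np.vstack(copies), np.tile(y, 10))
w_plain = fit_ridge(X, y)
```

At prediction time a disconnected sensor is fed as zero; because masking forces the readout to up-weight each redundant copy, the DropIn-trained weights degrade more gracefully when one copy of a factor goes missing.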
Detecting early signs of depressive and manic episodes in patients with bipolar disorder using the signature-based model
Recurrent major mood episodes and subsyndromal mood instability cause
substantial disability in patients with bipolar disorder. Early identification
of mood episodes enabling timely mood stabilisation is an important clinical
goal. Recent technological advances allow the prospective reporting of mood in
real time, enabling more accurate and efficient data capture. The complex
nature of these data streams, combined with the challenge of deriving meaning
from missing data, poses a significant analytic challenge. The signature
method is derived from stochastic analysis and can capture important
properties of complex ordered time-series data. We explore whether the onset
of episodes of mania and depression can be identified using self-reported
mood data.
Comment: 12 pages, 3 tables, 10 figures
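The signature method referred to above builds iterated-integral features of a path. Below is a minimal sketch of the first two signature levels for a piecewise-linear path; this is the generic construction from rough-path theory, not the authors' full mood-prediction model.

```python
import numpy as np

def signature_level2(path):
    """First two levels of the path signature of a piecewise-linear
    path (rows = time points, columns = channels). Level 1 collects the
    total increments; level 2 collects the iterated integrals, whose
    antisymmetric part is the Levy area."""
    path = np.asarray(path, dtype=float)
    inc = np.diff(path, axis=0)          # increment of each linear segment
    s1 = inc.sum(axis=0)
    d = path.shape[1]
    s2 = np.zeros((d, d))
    run = np.zeros(d)                    # displacement before the segment
    for dk in inc:
        # Exact iterated integral contribution of one linear segment.
        s2 += np.outer(run, dk) + 0.5 * np.outer(dk, dk)
        run += dk
    return s1, s2
```

One reason the signature suits irregular self-reported data is that it is defined for irregularly sampled paths: a missing observation simply merges the two adjacent linear segments into one.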
Accuracy and responses of genomic selection on key traits in apple breeding
The application of genomic selection in fruit tree crops is expected to enhance breeding efficiency by increasing prediction accuracy, increasing selection intensity and decreasing generation interval. The objectives of this study were to assess the accuracy of prediction and selection response in commercial apple breeding programmes for key traits. The training population comprised 977 individuals derived from 20 pedigreed full-sib families. Historic phenotypic data were available on 10 traits related to productivity and fruit external appearance and genotypic data for 7829 SNPs obtained with an Illumina 20K SNP array. From these data, a genome-wide prediction model was built and subsequently used to calculate genomic breeding values of five application full-sib families. The application families had genotypes at 364 SNPs from a dedicated 512 SNP array, and these genotypic data were extended to the high-density level by imputation. These five families were phenotyped for 1 year and their phenotypes were compared to the predicted breeding values. Accuracy of genomic prediction across the 10 traits reached a maximum value of 0.5 and had a median value of 0.19. The accuracies were strongly affected by the phenotypic distribution and heritability of traits. In the largest family, significant selection response was observed for traits with high heritability and symmetric phenotypic distribution. Traits that showed non-significant response often had reduced and skewed phenotypic variation or low heritability. Among the five application families the accuracies were uncorrelated with the degree of relatedness to the training population.
The results underline the potential of genomic prediction to accelerate breeding progress in outbred fruit tree crops, which still need to overcome long generation intervals and extensive phenotyping costs.
Muranty, H.; Troggio, M.; Sadok, I. B.; Mehdi, A. R.; Auwerkerken, A.; Banchi, E.; Velasco, R.; Stevanato, P.; van de Weg, W. E.; Di Guardo, M.; Kumar, S.; Laurens, F.; Bink, M. C. A. M.
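A genome-wide prediction model of the kind described can be sketched as marker-based ridge regression (rrBLUP-style) on simulated genotypes; the population sizes, SNP count, and heritability below are illustrative assumptions, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated training population: 400 genotyped and phenotyped
# individuals and 200 biallelic SNPs coded 0/1/2 (a small stand-in
# for the study's 977 individuals x 7829 SNPs).
n_train, n_test, n_snp = 400, 100, 200
geno = rng.integers(0, 3, size=(n_train + n_test, n_snp)).astype(float)
effects = rng.normal(scale=1.0 / np.sqrt(n_snp), size=n_snp)
tbv = geno @ effects                                 # true breeding values
pheno = tbv + rng.normal(scale=tbv.std(), size=tbv.shape)  # h^2 ~ 0.5

X_tr, X_te = geno[:n_train], geno[n_train:]
y_tr = pheno[:n_train]

# Centre markers and fit ridge regression; shrinkage across all SNPs
# plays the role of the genome-wide prediction model.
mu = X_tr.mean(axis=0)
Xc = X_tr - mu
lam = float(n_snp)                                   # simple default shrinkage
beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_snp),
                       Xc.T @ (y_tr - y_tr.mean()))

# Genomic estimated breeding values for the "application" individuals;
# prediction accuracy is the correlation with true breeding values.
gebv = (X_te - mu) @ beta
accuracy = np.corrcoef(gebv, tbv[n_train:])[0, 1]
```

In real programmes the true breeding values are unknown, so accuracy is estimated from observed phenotypes and heritability, which is one reason reported accuracies such as the 0.19 median above are trait-dependent.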
Using Big Data to Enhance the Bosch Production Line Performance: A Kaggle Challenge
This paper describes our approach to the Bosch production line performance
challenge run by Kaggle.com. Maximizing the production yield is at the heart of
the manufacturing industry. At the Bosch assembly line, data is recorded for
products as they progress through each stage. Data science methods are applied
to this huge data repository, consisting of records of tests and measurements made
for each component along the assembly line to predict internal failures. We
found that it is possible to train a model that predicts which parts are most
likely to fail. Thus a smarter failure-detection system can be built, and
parts tagged as likely to fail can be salvaged to decrease operating costs and
increase profit margins.
Comment: IEEE Big Data 2016 Conference
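The Bosch challenge was scored with the Matthews correlation coefficient (MCC), which stays informative under the extreme class imbalance of rare internal failures. Below is a minimal sketch of MCC and of choosing a decision threshold by MCC maximisation, a common step with heavily imbalanced labels; it is a generic recipe, not the authors' pipeline.

```python
import numpy as np

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def best_threshold(y_true, scores):
    """Scan candidate cut-offs on a model's failure scores and keep the
    one that maximises MCC on held-out data."""
    cands = np.unique(scores)
    return max(cands, key=lambda t: mcc(y_true, (scores >= t).astype(int)))
```

Unlike accuracy, MCC is near zero for a classifier that simply predicts "no failure" for every part, which is why a tuned threshold matters on data where failures are a tiny fraction of records.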
Invisible Z-Boson Decays at e+e- Colliders
The measurement of the invisible Z-boson decay width at e+e- colliders can be
done "indirectly", by subtracting the Z-boson visible partial widths from the
Z-boson total width, or "directly", from the process e+e- -> \gamma \nu
\bar{\nu}. Both procedures are sensitive to different types of new physics and
provide information about the couplings of the neutrinos to the Z-boson. At
present, measurements at LEP and CHARM II are capable of constraining the
left-handed Z\nu\nu-coupling, 0.45 <~ g_L <~ 0.5, while the right-handed one is
only mildly bounded, |g_R| <= 0.2. We show that measurements at a future e+e-
linear collider at different center-of-mass energies, \sqrt{s} = MZ and
\sqrt{s} ~ 170 GeV, would translate into a markedly more precise measurement
of the Z\nu\nu-couplings. A statistically significant deviation from Standard
Model predictions will point toward different new physics mechanisms, depending
on whether the discrepancy appears in the direct or the indirect measurement of
the invisible Z-width. We discuss some scenarios which illustrate the ability
of different invisible Z-boson decay measurements to constrain new physics
beyond the Standard Model.
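The "indirect" determination described above amounts to simple arithmetic on the partial widths. A back-of-the-envelope sketch with approximate PDG-style values, quoted from memory for illustration only:

```python
# Indirect determination of the invisible Z width: subtract the visible
# partial widths from the total width. All values in MeV and only
# approximate (illustrative, not current fit results).
gamma_total = 2495.2       # total Z width
gamma_hadronic = 1744.4    # hadronic partial width
gamma_lepton = 83.98       # per charged-lepton flavour (e, mu, tau)

gamma_invisible = gamma_total - gamma_hadronic - 3 * gamma_lepton

# Dividing by the Standard Model partial width per neutrino species
# gives the effective number of light neutrinos, close to 3.
gamma_nu_sm = 167.2
n_nu = gamma_invisible / gamma_nu_sm
```

New physics in the visible widths and new physics in e+e- -> gamma + invisible enter these two determinations differently, which is why the abstract stresses comparing the direct and indirect measurements.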