Developing Children's Oral Health Assessment Toolkits Using Machine Learning Algorithm.
Objectives: Evaluating children's oral health status and treatment needs is challenging. We aim to build oral health assessment toolkits to predict the Children's Oral Health Status Index (COHSI) score and referral for treatment needs (RFTN). The parent and child toolkits consist of short-form survey items (12 for children and 8 for parents), with and without children's demographic information (7 questions), to predict the child's oral health status and need for treatment. Methods: Data were collected from 12 dental practices in Los Angeles County from 2015 to 2016. We predicted the COHSI score and RFTN from random bootstrap samples with manually introduced Gaussian noise, using machine learning algorithms such as Extreme Gradient Boosting and Naive Bayes (in R). The toolkits predict the probability of treatment need and the COHSI score with its percentile (ranking). The performance of the toolkits was evaluated internally and externally by root mean square error (RMSE), correlation, sensitivity, and specificity. Results: The toolkits were developed from survey responses of 545 families with children aged 2 to 17 y. The sensitivity and specificity for predicting RFTN were 93% and 49%, respectively, on the external data. The correlation between predicted and clinically determined COHSI was 0.88 (0.91 for its percentile). The RMSE of the COHSI toolkit was 4.2 (1.3 for the percentile). Conclusions: Survey responses from children and their parents/guardians are predictive of clinical outcomes. The toolkits can be used by oral health programs at baseline among school populations, and to quantify differences between pre- and post-dental-care program implementation.
The toolkits' predicted oral health scores can be used to stratify samples in oral health research. Knowledge transfer statement: This study creates oral health toolkits that combine self- and proxy-reported short forms with children's demographic characteristics to predict children's oral health and treatment needs using machine learning algorithms. The toolkits can be used by oral health programs at baseline among school populations to quantify differences between pre- and post-dental-care program implementation. The toolkits can also be used to stratify samples according to treatment needs and oral health status.
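The bootstrap-with-noise augmentation described in the Methods can be sketched as follows. This is a minimal illustration on synthetic stand-in data, not the study's actual pipeline (which was built in R); the array shapes, noise level, and function name are assumptions.

```python
import numpy as np

def noisy_bootstrap(X, y, n_samples, noise_sd=0.1, seed=0):
    """Draw a bootstrap resample of (X, y) and perturb the numeric
    features with additive Gaussian noise, a simple augmentation
    for small survey datasets."""
    rng = np.random.default_rng(seed)
    # Sample row indices with replacement (the bootstrap step).
    idx = rng.integers(0, len(X), size=n_samples)
    # Add manually introduced Gaussian noise to the resampled features.
    X_boot = X[idx] + rng.normal(0.0, noise_sd, size=(n_samples, X.shape[1]))
    return X_boot, y[idx]

# Toy stand-in for survey responses: 20 families, 5 numeric items.
X = np.arange(100, dtype=float).reshape(20, 5)
y = np.repeat([0, 1], 10)  # hypothetical RFTN labels
X_aug, y_aug = noisy_bootstrap(X, y, n_samples=200)
```

The augmented sample could then be fed to any classifier, such as a gradient-boosting model, in place of the raw survey responses.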
Building thermal load prediction through shallow machine learning and deep learning
Building thermal load prediction informs the optimization of cooling plants and thermal energy storage. Physics-based prediction models of building thermal load are constrained by model and input complexity. In this study, we developed 12 data-driven models (7 shallow learning, 2 deep learning, and 3 heuristic methods) to predict building thermal load and compared shallow machine learning with deep learning. The 12 prediction models were compared against the measured cooling demand. It was found that XGBoost (Extreme Gradient Boosting) and LSTM (Long Short-Term Memory) provided the most accurate load prediction in the shallow and deep learning categories, respectively, and both outperformed the best baseline model, which uses the previous day's data for prediction. Then, we discussed how the prediction horizon and input uncertainty influence load prediction accuracy. Major conclusions are twofold: first, LSTM performs well in short-term prediction (1 h ahead) but not in long-term prediction (24 h ahead), because the sequential information becomes less relevant, and accordingly less useful, when the prediction horizon is long. Second, the presence of weather forecast uncertainty deteriorates XGBoost's accuracy and favors LSTM, because the sequential information makes the model more robust to input uncertainty. Training the model with uncertain rather than accurate weather data could enhance the model's robustness. Our findings have two implications for practice. First, LSTM is recommended for short-term load prediction, given that weather forecast uncertainty is unavoidable. Second, XGBoost is recommended for long-term prediction, and the model should be trained in the presence of input uncertainty.
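The previous-day baseline that XGBoost and LSTM are measured against can be sketched as follows. This is a minimal illustration on synthetic hourly load; the daily period, signal shape, and RMSE metric are assumptions for the sketch.

```python
import numpy as np

def previous_day_baseline(load, steps_per_day=24):
    """Predict each hour's load as the load at the same hour on the
    previous day. Predictions align with load[steps_per_day:]."""
    return load[:-steps_per_day]

def rmse(pred, actual):
    """Root mean square error between predictions and measurements."""
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

# Two days of synthetic hourly cooling load with a perfect daily cycle.
hours = np.arange(48)
load = 100.0 + 20.0 * np.sin(2 * np.pi * hours / 24)

pred = previous_day_baseline(load)
err = rmse(pred, load[24:])  # near zero for a purely periodic signal
```

On real cooling demand the daily cycle is only approximate, so this baseline's error is nonzero and gives the floor the learned models must beat.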
Deep learning for supervised classification
One of the most recent areas in machine learning research is deep learning. Deep learning algorithms have been applied successfully to computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics. The key idea of deep learning is to combine the best techniques from machine learning to build powerful general-purpose learning algorithms. It is a mistake to identify deep neural networks with deep learning algorithms: other approaches are possible, and in this paper we illustrate a generalization of Stacking which has very competitive performance. In particular, we show an application of this approach to a real classification problem, where a three-stage Stacking has proved to be very effective.
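A two-stage stacking of the kind generalized in this paper can be sketched with scikit-learn, assuming its availability. The base learners, fold count, and synthetic dataset below are illustrative choices, not the paper's three-stage configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 1: base learners produce out-of-fold probability predictions,
# which become the feature matrix for the next stage.
bases = [DecisionTreeClassifier(max_depth=3, random_state=0),
         LogisticRegression(max_iter=1000)]
meta_train = np.column_stack([
    cross_val_predict(m, X_tr, y_tr, cv=5, method="predict_proba")[:, 1]
    for m in bases])

# Stage 2: a meta-learner is fit on the base predictions.
meta = LogisticRegression().fit(meta_train, y_tr)

# At test time, base learners are refit on the full training set.
meta_test = np.column_stack([
    m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for m in bases])
acc = meta.score(meta_test, y_te)
```

A deeper stacking simply repeats stage 1 on the meta-features, feeding each level's out-of-fold predictions to the next.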
Multi-Person Brain Activity Recognition via Comprehensive EEG Signal Analysis
Electroencephalography (EEG) based brain activity recognition is a fundamental field of study for a number of significant applications, such as intention prediction, appliance control, and neurological disease diagnosis in smart home and smart healthcare domains. Existing techniques mostly focus on binary brain activity recognition for a single person, which limits their deployment in wider and more complex practical scenarios; multi-person and multi-class brain activity recognition has therefore gained popularity recently. Another challenge faced by brain activity recognition is low recognition accuracy, due to massive noise and the low signal-to-noise ratio of EEG signals. Moreover, feature engineering in EEG processing is time-consuming and relies heavily on expert experience. In this paper, we attempt to address these challenges by proposing an approach with better EEG interpretation ability, based on raw EEG signal analysis, for multi-person and multi-class brain activity recognition. Specifically, we analyze inter-class and inter-person EEG signal characteristics in order to capture the discrepancies among inter-class EEG data. We then adopt an autoencoder layer to automatically refine the raw EEG signals by eliminating various artifacts. We evaluate our approach on both a public and a local EEG dataset and conduct extensive experiments to explore the effect of several factors (such as normalization methods, training data size, and autoencoder hidden-neuron size) on the recognition results. The experimental results show that our approach achieves high accuracy compared to competitive state-of-the-art methods, indicating its potential for promoting future research on multi-person EEG recognition.
Comment: 10 page
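The autoencoder refinement step can be sketched as a one-hidden-layer linear autoencoder trained by gradient descent to reconstruct noisy signals through a narrow bottleneck. The architecture, synthetic "EEG" data, and learning rate are assumptions for illustration, not the paper's network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "EEG": a clean low-dimensional signal plus heavy artifact noise.
t = np.linspace(0.0, 1.0, 64)
clean = np.stack([np.sin(2 * np.pi * f * t) for f in (4, 8, 12)] * 20)
noisy = clean + 0.5 * rng.standard_normal(clean.shape)

# Linear autoencoder: encode 64 samples down to 3 units, decode back.
n, d, h = noisy.shape[0], noisy.shape[1], 3
W1 = 0.01 * rng.standard_normal((d, h))  # encoder weights
W2 = 0.01 * rng.standard_normal((h, d))  # decoder weights
lr = 0.5
losses = []
for _ in range(300):
    Z = noisy @ W1           # encode
    recon = Z @ W2           # decode
    err = recon - noisy
    losses.append(float(np.mean(err ** 2)))
    # Gradients of the mean squared reconstruction error.
    gW2 = Z.T @ err * (2.0 / (n * d))
    gW1 = noisy.T @ (err @ W2.T) * (2.0 / (n * d))
    W1 -= lr * gW1
    W2 -= lr * gW2
```

Because the bottleneck is far narrower than the input, the reconstruction keeps the dominant structure and discards much of the artifact noise; a practical model would add nonlinearities and train on real recordings.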
Axiomatic Interpretability for Multiclass Additive Models
Generalized additive models (GAMs) are favored in many regression and binary classification problems because they are able to fit complex, nonlinear functions while still remaining interpretable. In the first part of this paper, we generalize a state-of-the-art GAM learning algorithm based on boosted trees to the multiclass setting, and show that this multiclass algorithm outperforms existing GAM learning algorithms and sometimes matches the performance of full-complexity models such as gradient boosted trees.
In the second part, we turn our attention to the interpretability of GAMs in the multiclass setting. Surprisingly, the natural interpretability of GAMs breaks down when there are more than two classes, and naive interpretation of multiclass GAMs can lead to false conclusions. Inspired by binary GAMs, we identify two axioms that any additive model must satisfy in order not to be visually misleading. We then develop a technique called Additive Post-Processing for Interpretability (API) that provably transforms a pre-trained additive model to satisfy the interpretability axioms without sacrificing accuracy. The technique works not just on models trained with our learning algorithm, but on any multiclass additive model, including multiclass linear and logistic regression. We demonstrate the effectiveness of API on a 12-class infant mortality dataset.
Comment: KDD 201
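The identifiability issue behind these axioms can be demonstrated directly: in a multiclass softmax model, adding the same offset to every class's score leaves the predicted probabilities unchanged, so individual per-class shape functions are not unique and their plots can mislead. A minimal numpy sketch (the scores and offset are illustrative numbers only):

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over the last axis."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Additive per-class scores for one input, 3 classes.
scores = np.array([1.0, -0.5, 2.0])
# Shift every class by the same constant, e.g. moving mass between
# a feature's shape function and the intercept in each class.
shifted = scores + 3.7

p1, p2 = softmax(scores), softmax(shifted)  # identical probabilities
```

Because infinitely many shape-function decompositions yield the same predictions, a post-processing step must pick a canonical, non-misleading representative among them, which is what an API-style transformation does.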