COMPARING THE EFFECTIVENESS OF DIFFERENT BOOSTING ALGORITHMS FOR GROUND WATER QUALITY IN TELANGANA REGION
This culminating experience research project explores the parameters needed to predict water quality levels under different climatic conditions, pre- and post-monsoon, from 2018 to 2020 in Telangana State, India. An earlier study analysed water quality in the Telangana region using linear regression with a water quality index. In this study we replicate that analysis using a stacked model and machine learning algorithms: Light Gradient Boosting Machine, Random Forest and Artificial Neural Network. The research questions are: (Q1) What are the sources of the significant parameters that impact groundwater quality in a location? (Q2) Will the use of the stacked model analysis approach produce different results when applied to the Telangana dataset? (Q3) How does the size and nature of a dataset impact the effectiveness of ensemble techniques, such as stacking, for addressing class imbalance in groundwater quality prediction models? The findings are: (Q1) Sodium and magnesium values were used to calculate the Sodium Adsorption Ratio (SAR) for the groundwater samples. Based on electrical conductivity (EC) and SAR values, salinity hazard values were calculated and converted into classes from Low C1 to Very High C4 (EC > 2250). The sodium hazard classes are Low S1 (SAR < 10), Medium S2 (SAR 10-18), High S3 (SAR 18-26) and Very High S4 (SAR > 26). Across the 2018, 2019 and 2020 water quality datasets, the ranges increased for sodium (5.07 to 748), calcium (1.2 to 640.0), magnesium (4.86 to 457.02) and electrical conductivity (102 to 9499). (Q2)
Stacked models achieved the best accuracy among the classifiers tested: Random Forest and LightGBM each reached 97% individually, and passing their two predicted probability values through an ANN raised model accuracy to 98%. The models predict water quality from data collected in different regions and climatic conditions, based on the suitability of water salinity and sodium content. (Q3) To manage imbalanced data and improve prediction accuracy, model performance was assessed with the classification reports of Random Forest, LGBM and ANN, where the F1 scores vary by class. Class Marginal: RF 0.63, LGBM 0.67, ANN 0.76. Class Poor: RF 0.95, LGBM 0.95, ANN 0.96. Class Very Poor: RF 0.77, LGBM 0.77, ANN 0.86. For classes Excellent and Good the F1 score of all three models is 1, and for Permissible all three models reached 0.99. The conclusions are: (Q1) This research provides helpful information for understanding and handling the potential risks of salinity and sodium in the study region by classifying the hazard levels into four classes each (C1 to C4 and S1 to S4) based on electrical conductivity (EC) and SAR values. (Q2) Stacked models employing different classifiers proved highly effective in predicting water quality. Passing the predicted probability values through the Artificial Neural Network (ANN) improved accuracy further, to 98%. (Q3) The stacked model technique, which combines Random Forest, LightGBM and ANN, seems to be an effective means of dealing with imbalanced data and enhancing prediction accuracy.
The significant improvement in F1 scores for a few classes, especially when using the ANN, demonstrates how effectively this ensemble approach handles challenging classification problems. Areas for future research arising from this study include training and testing the model on a larger dataset and tuning hyperparameters for further improvement.
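The sodium hazard classes quoted in the abstract can be illustrated with a short sketch. It assumes the standard definition SAR = Na / sqrt((Ca + Mg) / 2) with all concentrations in meq/L; the abstract does not state the formula or units it used, so this is an assumption. The function names are hypothetical, and the handling of the boundary values 10, 18 and 26 is one possible reading of the ranges given above.

```python
import math


def sodium_adsorption_ratio(na_meq, ca_meq, mg_meq):
    """Standard SAR definition; all inputs assumed to be in meq/L."""
    divalent = (ca_meq + mg_meq) / 2.0
    if divalent <= 0:
        raise ValueError("Ca + Mg must be positive")
    return na_meq / math.sqrt(divalent)


def sodium_hazard_class(sar):
    """Map a SAR value to the S1..S4 classes listed in the abstract.

    Boundary handling is an assumption: 10 and 18 fall in S2, 26 in S3.
    """
    if sar < 10:
        return "S1"  # Low
    if sar <= 18:
        return "S2"  # Medium
    if sar <= 26:
        return "S3"  # High
    return "S4"      # Very High


# Hypothetical sample values, not taken from the Telangana dataset.
sar = sodium_adsorption_ratio(na_meq=10.0, ca_meq=4.0, mg_meq=4.0)
print(round(sar, 2), sodium_hazard_class(sar))
```

The salinity classes (C1 to C4) could be mapped from EC in the same way once the class thresholds are known.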
Nonparametric estimation of the dynamic range of music signals
The dynamic range is an important parameter which measures the spread of
sound power, and for music signals it is a measure of recording quality. There
are various descriptive measures of sound power, none of which has strong
statistical foundations. We start from a nonparametric model for sound waves
where an additive stochastic term captures transient energy. This
component is recovered by a simple rate-optimal kernel estimator that requires
a single data-driven tuning. The distribution of its variance is approximated
by a consistent random subsampling method that is able to cope with the massive
size of the typical dataset. Based on the latter, we propose a statistic, and
an estimation method that is able to represent the dynamic range concept
consistently. The behavior of the statistic is assessed based on a large
numerical experiment where we simulate dynamic compression on a selection of
real music signals. Application of the method to real data also shows how the
proposed method can predict experts' subjective opinions about the hi-fi quality
of a recording.
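The two ingredients named in this abstract, a kernel estimate of the smooth component and random subsampling of the residual variance, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the Epanechnikov kernel, the hand-picked bandwidth (the paper describes a data-driven tuning that is not reproduced here), the block length, the synthetic test signal and all function names are hypothetical choices, and the dynamic range statistic itself is not computed.

```python
import math
import random


def kernel_smooth(y, bandwidth):
    """Nadaraya-Watson smoother on an equispaced grid (Epanechnikov weights)."""
    n = len(y)
    h = int(bandwidth)
    smooth = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - h), min(n, i + h + 1)):
            u = (j - i) / (h + 1)       # scaled distance, strictly inside (-1, 1)
            w = 0.75 * (1.0 - u * u)    # Epanechnikov kernel weight, always > 0 here
            num += w * y[j]
            den += w
        smooth.append(num / den)
    return smooth


def subsample_variances(residuals, block_len, n_blocks, seed=0):
    """Variance of the residuals in randomly placed contiguous blocks."""
    rng = random.Random(seed)
    variances = []
    for _ in range(n_blocks):
        start = rng.randrange(0, len(residuals) - block_len + 1)
        block = residuals[start:start + block_len]
        mean = sum(block) / block_len
        variances.append(sum((r - mean) ** 2 for r in block) / block_len)
    return variances


# Synthetic stand-in for a sound wave: slow sine plus Gaussian noise.
noise = random.Random(1)
signal = [math.sin(2 * math.pi * t / 200) + noise.gauss(0, 0.1) for t in range(2000)]
smooth = kernel_smooth(signal, bandwidth=10)
residuals = [y - s for y, s in zip(signal, smooth)]
variances = subsample_variances(residuals, block_len=100, n_blocks=50)
print(min(variances), max(variances))
```

Drawing only a fixed number of short blocks keeps the cost independent of the full signal length, which is the practical appeal of subsampling for very long recordings.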
Landmark detection in 2D bioimages for geometric morphometrics: a multi-resolution tree-based approach
The detection of anatomical landmarks in bioimages is a necessary but tedious step for geometric morphometrics studies in many research domains. We propose variants of a multi-resolution tree-based approach to speed up the detection of landmarks in bioimages. We extensively evaluate our method variants on three different datasets (cephalometric, zebrafish, and drosophila images). We identify the key method parameters (notably the multi-resolution) and report results with respect to human ground truths and existing methods. Our method achieves recognition performance competitive with existing approaches while being generic and fast. The algorithms are integrated in the open-source Cytomine software and we provide parameter configuration guidelines so that they can be easily exploited by end-users. Finally, datasets are readily available through a Cytomine server to foster future research.
Autoencoders for strategic decision support
In the majority of executive domains, a notion of normality is involved in
most strategic decisions. However, few data-driven tools that support strategic
decision-making are available. We introduce and extend the use of autoencoders
to provide strategically relevant granular feedback. A first experiment
indicates that experts are inconsistent in their decision making, highlighting
the need for strategic decision support. Furthermore, using two large
industry-provided human resources datasets, the proposed solution is evaluated
in terms of ranking accuracy, synergy with human experts, and dimension-level
feedback. This three-point scheme is validated using (a) synthetic data, (b)
the perspective of data quality, (c) blind expert validation, and (d)
transparent expert evaluation. Our study confirms several principal weaknesses
of human decision-making and stresses the importance of synergy between a model
and humans. Moreover, unsupervised learning and in particular the autoencoder
are shown to be valuable tools for strategic decision-making.