23 research outputs found
Predicting Educational Relevance For an Efficient Classification of Talent
This research work uses a machine learning approach to build a predictive model that quantifies how well students and job seekers fit the courses and jobs they plan to pursue, respectively. Some existing research uses GPA for academic prediction, or applies personality prediction and computing in social domains for various industrial goals. This research work, on the other hand, advances the state of the art by correlating and blending personality features with academic attributes to identify and classify individuals' relevant talent for academic and real-world success with improved predictive modeling. The work incorporates three algorithms, based on supervised learning, stochastic probability distributions, and classification rules, to quantify a talent's relevance and then predict good-fit students and good-fit candidates. This work opens many opportunities for future research, such as genomics data mining to profile individuals for various areas.
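As a minimal sketch of the blending idea described above, the snippet below trains a single supervised classifier on a mix of an academic attribute and personality features. The feature names, synthetic data, and label rule are all hypothetical illustrations, not the paper's actual model or data.

```python
# Hedged sketch: blending academic and personality features in one classifier.
# All features and the "good fit" rule are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
gpa = rng.uniform(2.0, 4.0, n)                 # academic attribute
conscientiousness = rng.uniform(0.0, 1.0, n)   # personality feature
openness = rng.uniform(0.0, 1.0, n)            # personality feature

# Hypothetical "good fit" label driven by a blend of both feature groups.
fit = ((0.5 * (gpa - 2.0) / 2.0 + 0.3 * conscientiousness + 0.2 * openness)
       > 0.55).astype(int)

X = np.column_stack([gpa, conscientiousness, openness])
X_tr, X_te, y_tr, y_te = train_test_split(X, fit, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"hold-out accuracy: {clf.score(X_te, y_te):.2f}")
```

The point of the sketch is only that both feature groups enter one supervised model, so the learned weights express the correlation between personality and academic attributes.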
We Are What We Generate - Understanding Ourselves Through Our Data
We tend to exhibit ourselves through the data we share about ourselves, including likes, friendships, follows, dislikes, pictures, audio, videos, causes, blogs, and sites. Such data about us have already been used by big data companies to create customized ads and marketing tactics. However, because such data are unstructured and noisy, their utilization and the associated research are still at an early stage. In this paper, we elaborate on the idea of understanding individuals through the lens of the data they produce, in the context of our main research work on the Predicting Educational Relevance For an Efficient Classification of Talent (PERFECT) algorithm engine. We illustrate several research problems relevant to such data and identify one research problem as the ground for this paper. We present a subset of our framework, including the algorithm and mathematical constructs, for the problem we identify. We conclude that such analytics and cognitive research can help improve education, healthcare, the job economy, crime control, and more; thus, with the work in this paper, we coin the phrase "we are what we generate." We suggest future work and opportunities in relevant directions.
Enhanced Machine Learning Engine Engineering Using Innovative Blending, Tuning, and Feature Optimization
Motivated by an investigation of Ensemble Machine Learning (ML) techniques, this thesis contributes to addressing performance, consistency, and integrity issues for ML models, such as overfitting, underfitting, predictive errors, the accuracy paradox, and poor generalization. Ensemble ML methods have shown promising outcomes when a single algorithm fails to approximate the true prediction function. Using meta-learning, a super learner is engineered by combining weak learners. Generally, several methods in Supervised Learning (SL) are evaluated to find the best fit to the underlying data and predictive analytics (i.e., the relevance of the "No Free Lunch" theorem). This thesis addresses three main challenges: i) determining the optimum blend of algorithms/methods for enhanced SL ensemble models, ii) engineering the selection and grouping of features that aggregate to the highest possible predictive and non-redundant value in the training data set, and iii) addressing performance integrity issues such as the accuracy paradox. To this end, an enhanced Machine Learning Engine Engineering (eMLEE) is uniquely constructed with built-in parallel processing and specially designed novel constructs for error and gain functions that optimally score the classifier elements for an improved training experience and validation procedures. eMLEE, based on stochastic thinking, is built on: i) one centralized unit, the Logical Table unit (LT); ii) two explicit units, enhanced Algorithm Blend and Tuning (eABT) and enhanced Feature Engineering and Selection (eFES); and iii) two implicit constructs, enhanced Weighted Performance Metric (eWPM) and enhanced Cross Validation and Split (eCVS). Hence, it proposes an enhancement to the internals of SL ensemble approaches.
Motivated by nature-inspired metaheuristic algorithms (such as GA, PSO, and ACO), the feedback mechanisms are improved by introducing a specialized function, Learning from the Mistakes (LFM), to mimic the human learning experience. LFM has shown significant improvement in refining predictive accuracy on the testing data by using the computational processing of wrong predictions to increase the weighting scores of the weak classifiers and features. LFM further ensures that the training layer experiences the maximum number of mistakes (i.e., errors) for optimum tuning. With this design in the engine, stochastic modeling/thinking is implicitly implemented. Motivated by the OOP paradigm in high-level programming, eMLEE provides an interface infrastructure using LT objects so that the main units (i.e., Unit A and Unit B) can use the functions on demand during the classifier learning process. This approach also supports use of the eMLEE API in real-world predictive modeling to further customize the classifier learning process and the trade-offs among tuning elements, subject to the data type and the end model in view. Motivated by higher-dimensional processing and analysis (i.e., 3D) for improved analytics and learning mechanics, eMLEE incorporates 3D modeling of fitness metrics, such as x for overfit, y for underfit, and z for optimum fit, and then creates logical cubes using LT handles to locate the optimum space during the ensemble process. This approach ensures fine tuning of the ensemble learning process with an improved accuracy metric. To support the build and implementation of the proposed scheme, mathematical models (i.e., definitions, lemmas, rules, and procedures), the governing algorithms' definitions (with pseudo-code), and necessary illustrations (to assist in elaborating the concepts) are provided. Diverse data sets are used to improve the generalization of the engine and to tune the underlying constructs during the development-testing phases.
To show the practicality and stability of the proposed scheme, several results are presented with a comprehensive analysis of the outcomes for the metrics (i.e., via integrity, corroboration, and quantification) of the engine. Two approaches are followed to corroborate the engine: i) testing the inner layers (i.e., internal constructs) of the engine (i.e., Unit-A, Unit-B, and C-Unit) to stabilize and test the fundamentals, and ii) testing the outer layer (i.e., the engine as a black box) against standard measurement metrics for real-world endorsement. Comparisons with various existing techniques in the state of the art are also reported. From the extensive literature review, research undertaken, investigative approach, engine construction and tuning, validation approach, experimental study, and results visualization, eMLEE is found to outperform the existing techniques most of the time in terms of classifier learning, generalization, metrics trade-off, optimum fitness, feature engineering, and validation.
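The super-learner idea the thesis builds on, combining weak base learners through a meta-learner, can be sketched in a few lines with scikit-learn's stacking ensemble. The specific estimators and data below are illustrative stand-ins, not the thesis's actual eMLEE blend or tuning scheme.

```python
# Hedged sketch of meta-learning over weak learners ("super learner"):
# a logistic-regression meta-learner combines a shallow tree and naive Bayes.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(),  # meta-learner over base outputs
)
stack.fit(X_tr, y_tr)
print(f"stacked hold-out accuracy: {stack.score(X_te, y_te):.2f}")
```

eMLEE goes well beyond this off-the-shelf pattern (LT bookkeeping, LFM feedback, 3D fitness cubes), but the sketch shows the baseline ensemble mechanism it enhances.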
Utilizing Big Data Analytics to Improve Education
Analytics can be defined as the process of determining, assessing, and interpreting meaning from volumes of data. It is commonly divided into three categories: descriptive, predictive, and prescriptive. Predictive analysis can serve many segments of society because it can reveal hidden relationships that may not be apparent with descriptive modeling. Advances in analytics play an important role in higher education planning, answering several questions, such as: which students will enroll in a particular course; which courses are trending or becoming obsolete; what the level of student satisfaction is in the current education system; how effective the online study environment is; how to design a better curriculum; and the likelihood that students will transfer, drop out, or fail to complete a course. Data analytics not only helps in analyzing the above points but can also support predictive modeling for faculty, administrators, and student groups who are looking for genuine results about university rankings, on which they base their decisions. Using the dataset "Academic Ranking of World Universities, 2003-2014", we studied and analyzed how a university's management and faculty could adapt to changes to improve their education and thereby their university's ranking in the coming years. Microsoft SQL Server Data Mining Add-ins Excel 2008 was employed as the software mining tool for predicting the ranking trends. This research paper concentrates on predictive analysis of university rankings using forecasting based on data mining techniques.
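The kind of trend forecasting described above can be sketched with a simple least-squares line over a university's historical rank. The rank series below is invented for illustration (lower rank = better); the paper itself used a dedicated data mining tool rather than this hand-rolled fit.

```python
# Hedged sketch: fit a linear trend to a hypothetical 2003-2014 rank series
# and extrapolate one year ahead. Data are invented for illustration.
import numpy as np

years = np.arange(2003, 2015)
ranks = np.array([120, 115, 112, 108, 107, 103, 101, 98, 95, 93, 90, 88])

slope, intercept = np.polyfit(years, ranks, deg=1)  # least-squares line
forecast_2015 = slope * 2015 + intercept
print(f"trend: {slope:.2f} rank positions per year; "
      f"2015 forecast: {forecast_2015:.0f}")
```

A negative slope indicates an improving (falling) rank, and the extrapolated value gives the kind of forward-looking estimate the abstract describes.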
Proposing Logical Table Constructs for Enhanced Machine Learning Process
Machine learning (ML) has shown enormous potential in various domains with wide variations in the underlying data types. Because of the miscellany of data sets and features, ML classifiers often suffer from challenges such as feature misclassification, unfit algorithms, low accuracy, overfitting, underfitting, extreme bias, and high predictive errors. Through the lens of related studies and the latest progress in the field, this paper presents a novel scheme to construct a logical table (LT) unit with two internal sub-modules for algorithm blending and feature engineering. The LT unit works in the deepest layer of an enhanced ML engine engineering (eMLEE) process. eMLEE consists of several low-level modules that enhance the ML classifier's progression. A unique engineering approach is adopted in eMLEE to blend various algorithms, enhance the feature engineering, construct a weighted performance metric, and augment the validation process. The LT is an in-memory logical component that governs the progress of eMLEE, regulates the model metrics, improves parallelism, and keeps track of each module of eMLEE as the classifier learns. Optimum fitness of the model is obtained with a parallel "check, validate, insert, delete, and update" mechanism in 3-D logical space via structured schemas in the LT. The LT unit is developed in Python, C#, and R libraries and tested using miscellaneous data sets. Results are created using GraphPad Prism, SigmaPlot, Plotly, and MS Excel software. To support the build and implementation of the proposed scheme, complete mathematical models, along with the algorithms and necessary illustrations, are provided in this paper. To show the practicality of the proposed scheme, several simulation results are presented with a comprehensive analysis of the outcomes for the metrics of the model that the LT regulates, with improved outcomes.
https://doi.org/10.1109/ACCESS.2018.286604
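At a high level, an in-memory component that records per-module metrics and supports the "check, validate, insert, delete, and update" operations named above can be sketched as a small Python class. This mirrors only the described role of the LT, not its actual schema, 3-D logical space, or parallelism; the module names and threshold are invented.

```python
# Schematic sketch of an in-memory "logical table": per-module metric rows
# with insert/update/check/delete operations. Names and values are invented.
class LogicalTable:
    def __init__(self):
        self.rows = {}  # module name -> dict of tracked metrics

    def insert(self, module, **metrics):
        self.rows[module] = dict(metrics)

    def update(self, module, **metrics):
        self.rows.setdefault(module, {}).update(metrics)

    def check(self, module, metric, threshold):
        """Validate that a tracked metric meets a minimum threshold."""
        return self.rows.get(module, {}).get(metric, 0.0) >= threshold

    def delete(self, module):
        self.rows.pop(module, None)


lt = LogicalTable()
lt.insert("eABT", accuracy=0.78)
lt.update("eABT", accuracy=0.84)           # classifier improved after tuning
print(lt.check("eABT", "accuracy", 0.80))  # True
```

The real LT additionally regulates the engine's progress as the classifier learns; this sketch just makes the bookkeeping interface concrete.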
Proposing Enhanced Feature Engineering and a Selection Model for Machine Learning Processes
Machine Learning (ML) requires a certain number of features (i.e., attributes) to train the model. One of the main challenges is to determine the right number and type of such features from the given dataset's attributes. It is not uncommon for an ML process to use all of a dataset's available features without computing the predictive value of each. Such an approach makes the process vulnerable to overfitting, predictive errors, bias, and poor generalization. Each feature in the dataset has a unique predictive value, a redundant value, or an irrelevant value. The key to better accuracy and fitting in ML is to identify the optimum set (i.e., grouping) of the right features with the finest matching of each feature's value. This paper proposes a novel approach to enhance the Feature Engineering and Selection (eFES) optimization process in ML. eFES is built on a unique scheme that regulates error bounds and parallelizes the addition and removal of a feature during training. eFES also introduces local gain (LG) and global gain (GG) functions, using 3D visualization techniques, to assist the feature grouping function (FGF). FGF scores and optimizes the participating features so that the ML process can evolve into deciding which features to accept or reject for improved generalization of the model. To support the proposed model, this paper presents mathematical models, illustrations, algorithms, and experimental results. Miscellaneous datasets are used to validate the model building process in the Python, C#, and R languages. Results show the promising state of eFES compared with the traditional feature selection process.
http://dx.doi.org/10.3390/app804064
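The accept/reject idea behind feature grouping can be illustrated with a conventional scoring baseline: estimate each feature's predictive value against the label and keep only the informative ones. This is a stand-in for the traditional selection process that eFES is compared against, not the eFES algorithm itself; the data and cutoff are synthetic.

```python
# Hedged sketch: score features by estimated mutual information with the
# label, then accept those above a cutoff. Data and cutoff are invented.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 300
informative = rng.normal(size=n)          # carries unique predictive value
noise = rng.normal(size=n)                # irrelevant feature
y = (informative > 0).astype(int)
X = np.column_stack([informative, noise])

scores = mutual_info_classif(X, y, random_state=0)
accepted = [i for i, s in enumerate(scores) if s > 0.05]
print("feature scores:", np.round(scores, 3), "accepted:", accepted)
```

eFES refines this picture with its LG/GG functions and parallel add/remove during training, whereas the baseline above scores features once, up front.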
Conductive textiles for signal sensing and technical applications
Conductive textiles have found notable applications as electrodes and sensors capable of detecting biosignals such as the electrocardiogram (ECG), electrogastrogram (EGG), electroencephalogram (EEG), and electromyogram (EMG); other applications include electromagnetic shielding, supercapacitors, and soft robotics. Several classes of materials impart conductivity, including polymers, metals, and non-metals. The most significant materials are polypyrrole (PPy), polyaniline (PANI), poly(3,4-ethylenedioxythiophene) (PEDOT), carbon, and metallic nanoparticles. The processes for making conductive textiles include various deposition methods, polymerization, coating, and printing. Parameters such as conductivity and electromagnetic shielding are prerequisites that set the benchmark for the performance of conductive textile materials. This review paper focuses on the raw materials used for conductive textiles, the various approaches that impart conductivity, the fabrication of conductive materials, the testing methods for electrical parameters, and key technical applications, challenges, and future potential.
Recommended from our members
Global burden of 288 causes of death and life expectancy decomposition in 204 countries and territories and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
BACKGROUND Regular, detailed reporting on population health by underlying cause of death is fundamental for public health decision making. Cause-specific estimates of mortality and the subsequent effects on life expectancy worldwide are valuable metrics to gauge progress in reducing mortality rates. These estimates are particularly important following large-scale mortality spikes, such as the COVID-19 pandemic. When systematically analysed, mortality rates and life expectancy allow comparisons of the consequences of causes of death globally and over time, providing a nuanced understanding of the effect of these causes on global populations. METHODS The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 cause-of-death analysis estimated mortality and years of life lost (YLLs) from 288 causes of death by age-sex-location-year in 204 countries and territories and 811 subnational locations for each year from 1990 until 2021. The analysis used 56 604 data sources, including data from vital registration and verbal autopsy as well as surveys, censuses, surveillance systems, and cancer registries, among others. As with previous GBD rounds, cause-specific death rates for most causes were estimated using the Cause of Death Ensemble model, a modelling tool developed for GBD to assess the out-of-sample predictive validity of different statistical models and covariate permutations and combine those results to produce cause-specific mortality estimates, with alternative strategies adapted to model causes with insufficient data, substantial changes in reporting over the study period, or unusual epidemiology. YLLs were computed as the product of the number of deaths for each cause-age-sex-location-year and the standard life expectancy at each age. As part of the modelling process, uncertainty intervals (UIs) were generated using the 2·5th and 97·5th percentiles from a 1000-draw distribution for each metric.
We decomposed life expectancy by cause of death, location, and year to show cause-specific effects on life expectancy from 1990 to 2021. We also used the coefficient of variation and the fraction of population affected by 90% of deaths to highlight concentrations of mortality. Findings are reported in counts and age-standardised rates. Methodological improvements for cause-of-death estimates in GBD 2021 include the expansion of the under-5-years age group into four new age groups, enhanced methods to account for stochastic variation in sparse data, and the inclusion of COVID-19 and other pandemic-related mortality, which comprises excess mortality associated with the pandemic, excluding COVID-19, lower respiratory infections, measles, malaria, and pertussis. For this analysis, 199 new country-years of vital registration cause-of-death data, 5 country-years of surveillance data, 21 country-years of verbal autopsy data, and 94 country-years of other data types were added to those used in previous GBD rounds. FINDINGS The leading causes of age-standardised deaths globally were the same in 2019 as they were in 1990; in descending order, these were ischaemic heart disease, stroke, chronic obstructive pulmonary disease, and lower respiratory infections. In 2021, however, COVID-19 replaced stroke as the second-leading age-standardised cause of death, with 94·0 deaths (95% UI 89·2-100·0) per 100 000 population. The COVID-19 pandemic shifted the rankings of the leading five causes, lowering stroke to the third-leading and chronic obstructive pulmonary disease to the fourth-leading position. In 2021, the highest age-standardised death rates from COVID-19 occurred in sub-Saharan Africa (271·0 deaths [250·1-290·7] per 100 000 population) and Latin America and the Caribbean (195·4 deaths [182·1-211·4] per 100 000 population).
The lowest age-standardised death rates from COVID-19 were in the high-income super-region (48·1 deaths [47·4-48·8] per 100 000 population) and southeast Asia, east Asia, and Oceania (23·2 deaths [16·3-37·2] per 100 000 population). Globally, life expectancy steadily improved between 1990 and 2019 for 18 of the 22 investigated causes. Decomposition of global and regional life expectancy showed the positive effect that reductions in deaths from enteric infections, lower respiratory infections, stroke, and neonatal deaths, among others, have had on survival over the study period. However, a net reduction of 1·6 years occurred in global life expectancy between 2019 and 2021, primarily due to increased death rates from COVID-19 and other pandemic-related mortality. Life expectancy was highly variable between super-regions over the study period, with southeast Asia, east Asia, and Oceania gaining 8·3 years (6·7-9·9) overall, while having the smallest reduction in life expectancy due to COVID-19 (0·4 years). The largest reduction in life expectancy due to COVID-19 occurred in Latin America and the Caribbean (3·6 years). Additionally, 53 of the 288 causes of death were highly concentrated in locations with less than 50% of the global population as of 2021, and these causes of death have become progressively more concentrated since 1990, when only 44 causes showed this pattern. The concentration phenomenon is discussed heuristically with respect to enteric and lower respiratory infections, malaria, HIV/AIDS, neonatal disorders, tuberculosis, and measles. INTERPRETATION Long-standing gains in life expectancy and reductions in many of the leading causes of death have been disrupted by the COVID-19 pandemic, the adverse effects of which were spread unevenly among populations. Despite the pandemic, there has been continued progress in combatting several notable causes of death, leading to improved global life expectancy over the study period.
Each of the seven GBD super-regions showed an overall improvement between 1990 and 2021, obscuring the negative effect in the years of the pandemic. Additionally, our findings regarding regional variation in the causes of death driving increases in life expectancy hold clear policy utility. Analyses of shifting mortality trends reveal that several causes, once widespread globally, are now increasingly concentrated geographically. These changes in mortality concentration, alongside further investigation of changing risks, interventions, and relevant policy, present an important opportunity to deepen our understanding of mortality-reduction strategies. Examining patterns in mortality concentration might reveal areas where successful public health interventions have been implemented. Translating these successes to locations where certain causes of death remain entrenched can inform policies that work to improve life expectancy for people everywhere. FUNDING Bill & Melinda Gates Foundation
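Two computations named in the methods above are simple enough to sketch directly: YLLs as deaths multiplied by the standard life expectancy at the age of death, and a 95% uncertainty interval taken as the 2·5th and 97·5th percentiles of a 1000-draw distribution. The numbers below are invented for illustration, not GBD estimates.

```python
# Hedged sketch of the YLL and UI computations described in the methods.
# All input values are invented for illustration.
import numpy as np

deaths_by_age = np.array([100.0, 250.0, 400.0])     # deaths per age group
std_life_expectancy = np.array([60.0, 35.0, 12.0])  # remaining years at age
ylls = deaths_by_age * std_life_expectancy          # YLLs per age group
print(f"total YLLs: {ylls.sum():.0f}")

rng = np.random.default_rng(0)
draws = rng.normal(loc=94.0, scale=2.8, size=1000)  # 1000 draws of a metric
lo, hi = np.percentile(draws, [2.5, 97.5])          # 95% uncertainty interval
print(f"95% UI: {lo:.1f}-{hi:.1f}")
```

The draw-based interval is the mechanism behind figures like "94·0 deaths (95% UI 89·2-100·0)" in the findings, though the actual GBD draws come from the full modelling pipeline rather than a normal distribution.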
On Orthogonal Partial b-Metric Spaces with an Application
In this paper, we initiate the concept of orthogonal partial b-metric spaces. We establish the existence of a unique fixed point for some orthogonal contractive-type mappings. Some useful examples are given, and an application is also provided in support of the obtained results.
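For readers unfamiliar with the two ingredients being combined, the following recalls the standard definitions as commonly stated in the fixed-point literature; the paper's own axioms for the merged notion may differ in detail.

```latex
% Orthogonal set (Eshaghi Gordji et al.): a pair $(X,\perp)$ where
% $\perp \subseteq X \times X$ and there exists $x_0 \in X$ with
%   (\forall y \in X,\; y \perp x_0) \quad \text{or} \quad
%   (\forall y \in X,\; x_0 \perp y).
%
% Partial b-metric (Shukla): $p\colon X \times X \to [0,\infty)$ with
% coefficient $s \ge 1$ such that for all $x,y,z \in X$:
%   (p_1)\; x = y \iff p(x,x) = p(x,y) = p(y,y), \\
%   (p_2)\; p(x,x) \le p(x,y), \\
%   (p_3)\; p(x,y) = p(y,x), \\
%   (p_4)\; p(x,y) \le s\,[\,p(x,z) + p(z,y)\,] - p(z,z).
```

An orthogonal partial b-metric space then equips an orthogonal set with such a function $p$, and the contractive conditions need only hold along orthogonally related pairs.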