Implementation of Discretisation and Correlation-based Feature Selection to Optimize Support Vector Machine in Diagnosis of Chronic Kidney Disease
This study aims to improve the accuracy of a classification algorithm for diagnosing chronic kidney disease. Among data mining models for classification, the Support Vector Machine (SVM) algorithm is widely used by researchers worldwide. The data is the chronic kidney disease dataset from the UCI Machine Learning Repository, which has 25 attributes: 11 numeric and 14 nominal. Discretization is applied to the continuous attributes, and Correlation-based Feature Selection (CFS) is used to remove irrelevant and redundant attributes. Evaluated with 10-fold cross-validation, the SVM alone reaches an accuracy of 99.25%; after applying discretization and CFS, accuracy rises to 99.75%, an increase of 0.5 percentage points. The proposed method is therefore feasible as a method for diagnosing chronic kidney disease.
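The feature-selection step described in this abstract can be illustrated with a minimal sketch of the CFS merit score and a greedy forward search. This is not the authors' implementation: the abstract gives no code, the toy data below is invented, and using absolute Pearson correlation as the correlation measure is an assumption (CFS implementations often use symmetrical uncertainty on discretized attributes instead). The function names `pearson`, `cfs_merit`, and `cfs_forward` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def cfs_merit(columns, target, subset):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff) for a list of feature indices.

    r_cf: mean |correlation| between each feature and the class.
    r_ff: mean |correlation| between pairs of features in the subset.
    """
    k = len(subset)
    r_cf = sum(abs(pearson(columns[i], target)) for i in subset) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(columns, target):
    """Greedy forward search: add the feature that raises merit most; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        score, feat = max((cfs_merit(columns, target, selected + [f]), f) for f in remaining)
        if score <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best

# Invented toy data (assumption): one informative feature, one redundant, one irrelevant.
target = [0, 0, 0, 1, 1, 1, 0, 1]
columns = [
    [0.1, 0.2, 0.1, 0.9, 0.8, 1.0, 0.2, 0.9],  # tracks the class closely
    [0.3, 0.1, 0.4, 0.7, 0.9, 0.6, 0.5, 0.8],  # noisier, correlated with feature 0
    [1, 0, 0, 1, 0, 1, 1, 0],                  # uncorrelated with the class
]
selected, best = cfs_forward(columns, target)
print(selected, round(best, 4))
```

On this toy data the search keeps only the first feature: adding the redundant or the irrelevant one lowers the merit, which is the behaviour CFS is designed to reward. The selected columns would then be passed to the classifier (an SVM evaluated with 10-fold cross-validation in the study).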
Too much of a good thing: How novelty biases and vocabulary influence known and novel referent selection in 18-month-old children and associative learning models
Identifying the referent of novel words is a complex process that young children do with relative ease. When given multiple objects along with a novel word, children select the most novel item, sometimes retaining the word-referent link. Prior work is inconsistent, however, on the role of object novelty. Two experiments examine 18-month-old children's performance on referent selection and retention with novel and known words. The results reveal a pervasive novelty bias on referent selection with both known and novel names and, across individual children, a negative correlation between attention to novelty and retention of new word-referent links. A computational model examines possible sources of the bias, suggesting novelty supports in-the-moment behavior but not retention. Together, results suggest that when lexical knowledge is weak, attention to novelty drives behavior, but alone does not sustain learning. Importantly, the results demonstrate that word learning may be driven, in part, by low-level perceptual processes.
Interplay between personality traits and learning strategies: the missing link
Students with varying personality traits are likely to employ diverse learning and study strategies. However, this relationship has never been explored in the medical education context. This study's aim was to explore the relationship between learning strategies and personality traits among medical students. This study was a cross-sectional study, and a quantitative approach was employed using two self-administered questionnaires: one to assess the personality traits from the Five-Factor Model (Conscientiousness, Neuroticism, Extraversion, Openness, and Agreeableness), and the other to assess 10 learning strategies (Anxiety, Attitude, Concentration, Information Processing, Motivation, Selecting Main Ideas, Self-Testing, Test Strategies, Time Management, and Using Academic Resources). A stratified random sampling technique was used to recruit medical students at Alfaisal University in the preclinical and clinical years (N = 309). Pearson correlation coefficient was used to measure the relationship between variables, and linear regression was used to evaluate how personality traits predicted learning strategy selection. Personality traits predicted the selection of learning strategies, especially Conscientiousness and Neuroticism. Conscientiousness showed a positive correlation with seven learning strategies and was the most important predictor of learning strategies students employ. Neuroticism correlations and predictions were negative. The other three traits showed weaker correlations. These correlations were between Extraversion and Using Academic Resources (r = 0.27), Information Processing (r = 0.23), and Attitude (r = 0.19); Openness and Information Processing (r = 0.29); and Agreeableness and Attitude (r = 0.29). All personality domains influence at least one learning strategy, especially Conscientiousness and Neuroticism. This study helps build a foundation for individualized coaching and mentorship in medical education.
NEW & NOTEWORTHY This study aspires to build a foundation for individualized coaching and mentorship in medical education through utilizing personality traits to empower academic success. We demonstrate that all personality domains influence students' selection of at least one learning strategy, especially Conscientiousness and Neuroticism.
New Statistical Transfer Learning Models for Health Care Applications
Transfer learning is a sub-field of statistical modeling and machine learning. It refers to methods that integrate the knowledge of other domains (called source domains) and the data of the target domain in a mathematically rigorous and intelligent way, to develop a better model for the target domain than a model using the data of the target domain alone. While transfer learning is a promising approach in various application domains, my dissertation research focuses on the particular application in health care, including telemonitoring of Parkinson's Disease (PD) and radiomics for glioblastoma.
The first topic is a Mixed Effects Transfer Learning (METL) model that can flexibly incorporate mixed effects and a general-form covariance matrix to better account for similarity and heterogeneity across subjects. I further develop computationally efficient procedures to handle unknown parameters and large covariance structures. Domain relations, such as domain similarity and domain covariance structure, are automatically quantified in the estimation steps. I demonstrate METL in an application of smartphone-based telemonitoring of PD.
The second topic focuses on an MRI-based transfer learning algorithm for non-invasive surgical guidance of glioblastoma patients. Limited biopsy samples per patient create a challenge to build a patient-specific model for glioblastoma. A transfer learning framework helps to leverage other patients' knowledge for building a better predictive model. When modeling a target patient, not every patient's information is helpful. Deciding the subset of other patients from which to transfer information to the modeling of the target patient is an important task to build an accurate predictive model. I define the subset of "transferrable" patients as those who have a positive rCBV-cell density correlation, because a positive correlation is confirmed by imaging theory and the respective literature.
The last topic is a Privacy-Preserving Positive Transfer Learning (P3TL) model. Although negative transfer has been recognized as an important issue by the transfer learning research community, there is a lack of theoretical studies in evaluating the risk of negative transfer for a transfer learning method and identifying what causes the negative transfer. My work addresses this issue. Driven by the theoretical insights, I extend Bayesian Parameter Transfer (BPT) to a new method, i.e., P3TL. The unique features of P3TL include intelligent selection of patients to transfer in order to avoid negative transfer and maintain patient privacy. These features make P3TL an excellent model for telemonitoring of PD using an At-Home Testing Device.
Dissertation/Thesis. Doctoral Dissertation, Industrial Engineering 201
Sparsity in Machine Learning: An Information Selecting Perspective
Today we are living in a world awash with data. Large volumes of data are acquired, analyzed and applied to tasks through machine learning algorithms in nearly every area of science, business, and industry. For example, medical scientists analyze the gene expression data from a single specimen to learn the underlying causes of disease (e.g. cancer) and choose the best treatment; retailers can know more about customers' shopping habits from retail data to adjust their business strategies to better appeal to customers; suppliers can enhance supply chain success through supply chain systems built on knowledge sharing. However, it is also reasonable to doubt whether all the genes make contributions to a disease; whether all the data obtained from existing customers can be applied to a new customer; whether all shared knowledge in the supply network is useful to a specific supply scenario. Therefore, it is crucial to sort through the massive information provided by data and keep what we really need. This process is referred to as information selection, which keeps the information that helps improve the performance of corresponding machine learning tasks and discards information that is useless or even harmful to task performance. Sparse learning is a powerful tool to achieve information selection. In this thesis, we apply sparse learning to two major areas in machine learning: feature selection and transfer learning.
Feature selection is a dimensionality reduction technique that selects a subset of representative features. Recently, feature selection combined with sparse learning has attracted significant attention due to its outstanding performance compared with traditional feature selection methods that ignore correlation between features. However, such methods are restricted by design to linear data transformations, a potential drawback given that the underlying correlation structures of data are often non-linear. To leverage a more sophisticated embedding than the linear model assumed by sparse learning, we propose an autoencoder-based unsupervised feature selection approach that uses a single-layer autoencoder in a joint framework of feature selection and manifold learning. Additionally, we include spectral graph analysis on the projected data in the learning process to preserve local data geometry from the original data space to the low-dimensional feature space.
Transfer learning describes a set of methods that aim at transferring knowledge from related domains to alleviate the problems caused by limited or no labeled training data in machine learning tasks. Many transfer learning techniques have been proposed to deal with different application scenarios. However, due to the differences in data distribution, feature space, label space, etc., between source domain and target domain, it is necessary to select and transfer only relevant information from the source domain to improve the performance of the target learner. Otherwise, the target learner can be negatively impacted by weakly related knowledge from the source domain, which is referred to as negative transfer. In this thesis, we focus on two transfer learning scenarios for which limited labeled training data are available in the target domain. In the first scenario, no label information is available in the source data. In the second scenario, large amounts of labeled source data are available, but there is no overlap between the source and target label spaces. The transfer learning technique corresponding to the former case is called self-taught learning, while that for the latter case is called few-shot learning. We apply self-taught learning to visual, textual, and audio data. We also apply few-shot learning to wearable-sensor-based human activity data. For both cases, we propose a metric for the relevance between a target sample/class and a source sample/class, and then extract information from the related samples/classes for knowledge transfer, performing information selection so that negative transfer caused by weakly related source information can be alleviated. Experimental results show that transfer learning can provide better performance with information selection.
Nonparametric Feature Extraction from Dendrograms
We propose feature extraction from dendrograms in a nonparametric way. The
Minimax distance measures correspond to building a dendrogram with the single
linkage criterion and defining specific forms of a level function and a
distance function over it. Therefore, we extend this method to arbitrary
dendrograms. We develop a generalized framework wherein different distance
measures can be inferred from different types of dendrograms, level functions
and distance functions. Via an appropriate embedding, we compute a vector-based
representation of the inferred distances, in order to enable many numerical
machine learning algorithms to employ such distances. Then, to address the
model selection problem, we study the aggregation of different dendrogram-based
distances respectively in solution space and in representation space in the
spirit of deep representations. In the first approach, for example for the
clustering problem, we build a graph with positive and negative edge weights
according to the consistency of the clustering labels of different objects
among different solutions, in the context of ensemble methods. Then, we use an
efficient variant of correlation clustering to produce the final clusters. In
the second approach, we investigate the sequential combination of different
distances and features in the spirit of multi-layered
architectures to obtain the final features. Finally, we demonstrate the
effectiveness of our approach via several numerical studies.
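The correspondence between Minimax distances and single-linkage dendrograms stated in this abstract can be sketched as follows. The Minimax distance between two points is the smallest achievable value of the largest edge on any path between them, which equals the height at which single linkage first merges their clusters. The code below is an illustrative sketch under that definition, not the authors' framework; the function name `minimax_distances` and the toy points are assumptions.

```python
from itertools import combinations

def minimax_distances(dist):
    """All-pairs Minimax distances from a symmetric distance matrix (list of lists).

    Processes edges in increasing order, as in Kruskal's algorithm. When an edge
    of weight w joins two components, w is the single-linkage merge height and
    therefore the Minimax distance for every cross-component pair.
    """
    n = len(dist)
    members = {i: [i] for i in range(n)}  # component root -> points it contains
    root = list(range(n))                 # point -> root of its component
    out = [[0.0] * n for _ in range(n)]
    edges = sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2))
    for w, i, j in edges:
        ri, rj = root[i], root[j]
        if ri == rj:
            continue  # already merged at a lower height
        for a in members[ri]:
            for b in members[rj]:
                out[a][b] = out[b][a] = w
        for b in members[rj]:
            root[b] = ri
        members[ri].extend(members.pop(rj))
    return out

# Toy example (assumption): four points on a line.
points = [0, 1, 2, 10]
dist = [[abs(p - q) for q in points] for p in points]
mm = minimax_distances(dist)
for row in mm:
    print(row)
```

Here the direct distance between the first and third points is 2, but their Minimax distance is 1, because the path through the middle point never uses an edge longer than 1. Replacing the single-linkage merge rule with another linkage or level function is the kind of generalization the abstract describes.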