The k-means algorithm: A comprehensive survey and performance evaluation
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. The k-means clustering algorithm is considered one of the most powerful and popular data mining algorithms in the research community. However, despite its popularity, the algorithm has certain limitations, including problems associated with random initialization of the centroids, which leads to unexpected convergence. Additionally, such a clustering algorithm requires the number of clusters to be defined beforehand, which is responsible for different cluster shapes and outlier effects. A fundamental problem of the k-means algorithm is its inability to handle various data types. This paper provides a structured and synoptic overview of research conducted on the k-means algorithm to overcome such shortcomings. Variants of the k-means algorithm, including their recent developments, are discussed, where their effectiveness is investigated based on the experimental analysis of a variety of datasets. The detailed experimental analysis along with a thorough comparison among different k-means clustering algorithms differentiates our work compared to other existing survey papers. Furthermore, it outlines a clear and thorough understanding of the k-means algorithm along with its different research directions.
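The two limitations named in this abstract (random centroid initialization and a cluster count fixed in advance) can be illustrated with a minimal sketch of plain Lloyd-style k-means. This is not code from the survey: the function names `kmeans` and `squared_distance`, the toy `points`, and the choice of seeding centroids by sampling data points are all assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed, max_iter=100):
    """Plain Lloyd-style k-means (illustrative sketch, not the survey's code).

    k must be chosen by the caller, and the initial centroids are k data
    points drawn at random, so the result depends on the seed.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Inertia: total squared distance from each point to its closest centroid.
    inertia = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, inertia


# Hypothetical toy data: three small, well-separated groups.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (0.0, 10.0), (1.0, 10.0), (0.0, 11.0),
]

# Same data and same k, different seeds: compare the final inertia per run.
results = {seed: kmeans(points, k=3, seed=seed)[1] for seed in range(10)}
for seed, inertia in results.items():
    print(seed, round(inertia, 3))
```

Comparing the printed inertia values across seeds shows whether different initializations settle on different solutions for the same data, which is why implementations commonly rerun the algorithm several times and keep the lowest-inertia result.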
Speech-Based Blood Pressure Estimation with Enhanced Optimization and Incremental Clustering
Blood Pressure (BP) estimation plays a pivotal role in diagnosing various
health conditions, highlighting the need for innovative approaches to overcome
conventional measurement challenges. Leveraging machine learning and speech
signals, this study investigates accurate BP estimation with a focus on
preprocessing, feature extraction, and real-time applications. An advanced
clustering-based strategy, incorporating the k-means algorithm and the proposed
Fact-Finding Instructor optimization algorithm, is introduced to enhance
accuracy. The combined outcome of these clustering techniques enables robust BP
estimation. Moreover, extending beyond these insights, this study delves into
the dynamic realm of contemporary digital content consumption. Platforms like
YouTube have emerged as influential spaces, presenting an array of videos that
evoke diverse emotions. From heartwarming and amusing content to intense
narratives, YouTube captures a spectrum of human experiences, influencing
information access and emotional engagement. Within this context, this research
investigates the interplay between YouTube videos and physiological responses,
particularly Blood Pressure (BP) levels. By integrating advanced BP estimation
techniques with the emotional dimensions of YouTube videos, this study enriches
our understanding of how modern media environments intersect with health
implications.
Comment: 29 pages, 2 tables, 9 figures
Game analytics - maximizing the value of player data
During the years of the Information Age, technological advances in computers,
satellites, data transfer, optics, and digital storage have led to the collection of an
immense mass of data on everything from business to astronomy, counting on the
power of digital computing to sort through the amalgam of information and generate meaning from the data. Initially, in the 1970s and 1980s,
data were stored on disparate structures and very rapidly became overwhelming. The
initial chaos led to the creation of structured databases and database management
systems to assist with the management of large corpuses of data, and notably, the
effective and efficient retrieval of information from databases. The rise of the database management system increased the already rapid pace of information
gathering.
Peer-reviewed
Des-q: a quantum algorithm to construct and efficiently retrain decision trees for regression and binary classification
Decision trees are widely used in machine learning due to their simplicity in
construction and interpretability. However, as data sizes grow, traditional
methods for constructing and retraining decision trees become increasingly
slow, scaling polynomially with the number of training examples. In this work,
we introduce a novel quantum algorithm, named Des-q, for constructing and
retraining decision trees in regression and binary classification tasks.
Assuming the data stream produces small increments of new training examples, we
demonstrate that our Des-q algorithm significantly reduces the time required
for tree retraining, achieving a poly-logarithmic time complexity in the number
of training examples, even accounting for the time needed to load the new
examples into quantum-accessible memory. Our approach involves building a
decision tree algorithm to perform k-piecewise linear tree splits at each
internal node. These splits simultaneously generate multiple hyperplanes,
dividing the feature space into k distinct regions. To determine the k suitable
anchor points for these splits, we develop an efficient quantum-supervised
clustering method, building upon the q-means algorithm of Kerenidis et al.
Des-q first efficiently estimates each feature weight using a novel quantum
technique to estimate the Pearson correlation. Subsequently, we employ weighted
distance estimation to cluster the training examples in k disjoint regions and
then proceed to expand the tree using the same procedure. We benchmark the
performance of the simulated version of our algorithm against the
state-of-the-art classical decision tree for regression and binary
classification on multiple data sets with numerical features. Further, we
showcase that the proposed algorithm exhibits similar performance to the
state-of-the-art decision tree while significantly speeding up the periodic
tree retraining.
Comment: 48 pages, 4 figures, 4 tables
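The two classical ideas this abstract builds on (weighting each feature by its Pearson correlation with the label, then assigning examples to the nearest of k anchor points under a weighted distance) can be sketched in ordinary Python. This is a purely classical illustration, not the quantum Des-q algorithm. Using the absolute correlation as the weight is an assumption, and the function names, the toy data, and the fixed `anchors` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant sequence carries no correlation signal.
    return cov / math.sqrt(var_x * var_y)


def feature_weights(X, y):
    """One weight per feature: |Pearson correlation| with the label (assumed choice)."""
    columns = list(zip(*X))
    return [abs(pearson(column, y)) for column in columns]


def weighted_sq_distance(p, q, weights):
    """Squared Euclidean distance with a per-feature weight."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights))


def assign_regions(X, anchors, weights):
    """Index of the nearest anchor for every example, under the weighted distance."""
    return [
        min(range(len(anchors)), key=lambda j: weighted_sq_distance(x, anchors[j], weights))
        for x in X
    ]


# Hypothetical toy data: feature 0 tracks the label, feature 1 does not.
X = [(0.0, 5.0), (1.0, 0.0), (2.0, 0.0), (3.0, 5.0)]
y = [0.0, 1.0, 2.0, 3.0]
anchors = [(0.5, 5.0), (2.5, 0.0)]  # Hypothetical anchor points, k = 2.

weights = feature_weights(X, y)
weighted_regions = assign_regions(X, anchors, weights)
unweighted_regions = assign_regions(X, anchors, [1.0, 1.0])
print(weights, weighted_regions, unweighted_regions)
```

In this toy case the uninformative feature receives zero weight, so the weighted assignment splits the examples along the label-relevant feature, while the unweighted assignment is pulled toward the irrelevant one.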
NEXT LEVEL: A COURSE RECOMMENDER SYSTEM BASED ON CAREER INTERESTS
Skills-based hiring is a talent management approach that empowers employers to align recruitment around business results, rather than around credentials and titles. It starts with employers identifying the particular skills required for a role, and then screening and evaluating candidates’ competencies against those requirements. With the recent rise in employers adopting skills-based hiring practices, it has become integral for students to take courses that improve their marketability and support their long-term career success. A 2017 survey of over 32,000 students at 43 randomly selected institutions found that only 34% of students believe they will graduate with the skills and knowledge required to be successful in the job market. Furthermore, the study found that while 96% of chief academic officers believe that their institutions are very or somewhat effective at preparing students for the workforce, only 11% of business leaders strongly agree [11]. An implication of the misalignment is that college graduates lack the skills that companies need and value. Fortunately, the rise of skills-based hiring provides an opportunity for universities and students to establish and follow clearer classroom-to-career pathways. To this end, this paper presents a course recommender system that aims to improve students’ career readiness by suggesting relevant skills and courses based on their unique career interests.