Improvements to k-means clustering
Working with huge amounts of data and learning from them by extracting useful information is one of the prime challenges of the Internet era. Machine learning algorithms provide an automatic and easy way to accomplish such tasks. These algorithms are classified into supervised, unsupervised, and semi-supervised algorithms. Some of the most used algorithms belong to the class of unsupervised learning, as training becomes a challenge for many practical applications.
Machine learning algorithms for which the classes of the input are unknown are called unsupervised algorithms. The k-means algorithm is one of the simplest and most widely used algorithms for clustering, an unsupervised learning procedure. The k-means algorithm works by grouping similar data based on some measure. The k in k-means denotes the number of such groups. This study starts from the standard k-means algorithm and goes through some of the algorithmic improvements suggested in the machine learning and data mining literature. The traditional k-means algorithm is an iterative refinement algorithm with an assignment step and an update step. The distances from each data point to each of the cluster centroids are calculated, and the centroids are iteratively refined. The computational complexity of the k-means algorithm mainly arises from the distance calculation, the so-called nearest-neighbor query.
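The assignment and update steps described above can be sketched in a few lines of Python. This is only a minimal illustration of the standard iteration, not the implementation used in the study; the function name `kmeans`, its parameters, and the choice of Euclidean distance are assumptions made for the example.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Standard k-means sketch: points and centroids are lists of tuples.

    The initial centroids are supplied by the caller (an assumption of
    this sketch; the study's initialisation is not specified here).
    """
    centroids = list(centroids)
    k = len(centroids)
    assignment = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        # This is the nearest-neighbor query that dominates the cost.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the centroids are stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignment
```

For example, `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], [(0, 0), (10, 10)])` separates the three points near the origin from the three points near (10, 10).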
Certain formulations of the k-means problem are NP-hard. For a dataset with dimension d and n values, the computational complexity is O(nkd) for a single iteration. Some k-means clustering procedures take several iterations and still fail to produce an acceptable clustering error. Several authors have studied different ways of reducing the complexity of the k-means algorithm. This study focuses mainly on the algorithmic improvements suggested in the related works of Hamerly and Elkan. These improvements were chosen because they are generic in nature and can be applied to many available datasets. The improved algorithms due to Elkan and Hamerly are applied to various available datasets, and their computational performance is presented.
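As background not stated in the abstract itself: the accelerations of Elkan and Hamerly rest on the triangle inequality. If the distance between two centroids c and c' is at least twice the distance from a point x to c, then c' cannot be closer to x than c, so the distance from x to c' need not be computed. The sketch below applies only this one test inside a single assignment step. It is a simplified illustration, not the full algorithm of either author, both of which also carry distance bounds across iterations; the function name `assign_with_pruning` is hypothetical.

```python
import math

def assign_with_pruning(points, centroids):
    """One assignment step using a triangle-inequality pruning test.

    Returns the label of each point and the number of point-to-centroid
    distance computations that the test made unnecessary.
    """
    k = len(centroids)
    # Centroid-to-centroid distances: k*k values, computed once per step.
    cc = [[math.dist(a, b) for b in centroids] for a in centroids]
    labels, skipped = [], 0
    for p in points:
        best, best_d = 0, math.dist(p, centroids[0])
        for j in range(1, k):
            # If d(c_best, c_j) >= 2 * d(p, c_best), then by the triangle
            # inequality d(p, c_j) >= d(p, c_best), so c_j cannot win.
            if cc[best][j] >= 2 * best_d:
                skipped += 1
                continue
            d = math.dist(p, centroids[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped
```

The labels are the same as those of a brute-force nearest-centroid search (up to ties); only the number of distance evaluations changes.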
Bridge between worlds: relating position and disposition in the mathematical field
Using ethnographic observations and interview-based research, I document the
production of research mathematics in four European research institutes,
interviewing 45 mathematicians from three areas of pure mathematics: topology,
algebraic geometry and differential geometry. I use Bourdieu's notions of habitus,
field and practice to explore how mathematicians come to perceive and interact
with abstract mathematical spaces and constructions. Perception of mathematical
reality, I explain, depends upon enculturation within a mathematical discipline. This
process of socialisation involves positioning an individual within a field of
production. Within a field mathematicians acquire certain structured sets of
dispositions which constitute habitus, and these habitus then provide both
perspectives and perceptual lenses through which to construe mathematical
objects and spaces.
I describe how mathematical perception is built up through interactions
within three domains of experience: physical spaces, conceptual spaces and
discourse spaces. These domains share analogous structuring schemas, which are
related through Lakoff and Johnson's notions of metaphorical mappings and image
schemas. Such schemas are mobilised during problem solving and proof
construction, in order to guide mathematicians' intuitions; and are utilised during
communicative acts, in order to create common ground and common reference
frames. However, different structuring principles are utilised according to the
contexts in which the act of knowledge production or communication takes place.
The degree of formality, privacy or competitiveness of environments affects the
presentation of mathematicians' selves and ideas. Goffman's concepts of interaction
frame, front-stage and backstage are therefore used to explain how certain
positions in the field shape dispositions, and lead to the realisation of different
structuring schemas or scripts.
I use Sewell's qualifications of Bourdieu's theories to explore the multiplicity
of schemas present within mathematicians' habitus, and detail how they are given
expression through craftwork and bricolage. I argue that mathematicians'
perceptions of mathematical phenomena are dependent upon their positions and
relations. I develop the notion of social space, providing definitions of such spaces
and how they are generated, how positions are determined, and how individuals
reposition within space through acquisition of capital.