135 research outputs found
Clustering as an example of optimizing arbitrarily chosen objective functions
This paper reflects on the common practice of solving various types of learning problems by optimizing arbitrarily chosen criteria in the hope that they correlate well with the criterion actually used to assess the results. The issue is investigated using clustering as an example. A unified view of clustering as an optimization problem is first proposed, stemming from the belief that typical design choices in clustering, such as the number of clusters or the similarity measure, can be, and often are, suboptimal, also from the point of view of the clustering quality measures later used for algorithm comparison and ranking. To illustrate the point, we propose a generalized clustering framework and provide a proof of concept using standard benchmark datasets and two popular clustering methods for comparison.
La Termogènesi als calorímetres per conducció: les transformacions sòlid-sòlid i les barreges líquides
This paper presents several methods for obtaining transfer functions associated with power dissipation in real phenomena, together with a few examples of the approximate thermogenesis obtained in these cases.
On the one hand, calorimeters with extremely good dynamic characteristics (θn ∼ 3 Hz) are well suited to the study of transient phenomena such as structural transformations in solids. We first present results concerning the martensitic transformation β → γ' of a Cu-Zn-Al alloy. They show the jerky character of the transformation, a significant energy release, and an excellent correlation with the acoustic emission generated during the transformation. This analysis gives a qualitative estimate of the possibilities of calorimetry within the field of Differential Enthalpic Analysis.
On the other hand, we present an analysis of excess enthalpies in liquid mixtures, which is of particular interest at low concentrations. Steady injection systems make it possible to reach solute molar fractions xs ≳ 0.01. We describe how a correct transfer function of the calorimetric system is obtained; applying effective deconvolution algorithms then makes it possible to work at concentrations as low as xs ≳ 0.001.
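The deconvolution step mentioned in this abstract can be illustrated with a deliberately simplified sketch. It assumes a transfer function with a single time constant and unit sensitivity (tau * ds/dt + s = W), which is an assumption made here for illustration only; the abstract does not give the transfer functions actually identified, and a real calorimeter would normally need several time constants. All names and values (`tau`, `dt`, the pulse shape) are hypothetical.

```python
import math


def simulate_first_order(power, tau, dt):
    """Output of a one-pole system tau*ds/dt + s = W for a piecewise-constant input.

    power[k] is the dissipated power held constant over the k-th sampling interval.
    Returns len(power) + 1 samples of the measured signal, starting from rest.
    """
    a = math.exp(-dt / tau)  # exact decay factor over one sampling interval
    signal = [0.0]
    for w in power:
        signal.append(a * signal[-1] + (1.0 - a) * w)
    return signal


def inverse_filter(signal, tau, dt):
    """Recover the piecewise-constant input by inverting the one-pole recursion."""
    a = math.exp(-dt / tau)
    return [(signal[k + 1] - a * signal[k]) / (1.0 - a)
            for k in range(len(signal) - 1)]


# Hypothetical input: a short rectangular heat pulse (arbitrary units).
power = [0.0] * 5 + [1.0] * 3 + [0.0] * 12
signal = simulate_first_order(power, tau=2.0, dt=0.1)
recovered = inverse_filter(signal, tau=2.0, dt=0.1)
```

In this noise-free setting the inverse filter returns the input pulse, while the simulated signal itself is a smeared, much lower version of it. With measured data, noise amplification limits how far such an inversion can be pushed.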
Separation of poliovirus and poliovirus RNA on Sephadex G 200
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/41675/1/705_2005_Article_BF01241426.pd
Warped K-Means: An algorithm to cluster sequentially-distributed data
Many devices generate large amounts of data that follow some sort of sequentiality, e.g., motion sensors, e-pens, eye trackers, etc., and often these data need to be compressed for classification, storage, and/or retrieval tasks. Traditional clustering algorithms can be used for this purpose, but unfortunately they do not cope with the sequential information implicitly embedded in such data. Thus, we revisit the well-known K-means algorithm and provide a general method to properly cluster sequentially-distributed data. We present Warped K-Means (WKM), a multi-purpose partitional clustering procedure that minimizes the sum of squared error criterion, while imposing a hard sequentiality constraint in the classification step. We illustrate the properties of WKM in three applications, one being the segmentation and classification of human activity. WKM outperformed five state-of-the-art clustering techniques in simplifying data trajectories, achieving a recognition accuracy of nearly 97%, which is an improvement of around 66% over its peers. Moreover, this improvement came with a reduction in computational cost of more than one order of magnitude.
This work has been partially supported by the Casmacat (FP7-ICT-2011-7, Project 287576), tranScriptorium (FP7-ICT-2011-9, Project 600707), STraDA (MINECO, TIN2012-37475-0O2-01), and ALMPR (GVA, Prometeo/20091014) projects.
Leiva Torres, LA.; Vidal, E. (2013). Warped K-Means: An algorithm to cluster sequentially-distributed data. Information Sciences. 237:196-210. https://doi.org/10.1016/j.ins.2013.02.042
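The combination described in this abstract, a sum-of-squared-error objective with a hard sequentiality constraint, can be illustrated with a small sketch. The code below is not WKM: it is a plain dynamic-programming baseline, swapped in here for illustration, that finds the contiguous split of a one-dimensional sequence into k segments with minimum total squared error. The function name `segment_sse` and the sample data are hypothetical.

```python
def segment_sse(points, k):
    """Split a 1-D sequence into k contiguous segments with minimum total SSE.

    Returns (bounds, sse), where bounds is a list of (start, end) index pairs
    (end exclusive) covering the sequence in order.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # Prefix sums let us compute the SSE of any segment in O(1).
    s1, s2 = [0.0], [0.0]
    for x in points:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        """Sum of squared deviations from the mean for points[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimum SSE for the first j points split into c segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = best[c - 1][i] + cost(i, j)
                if cand < best[c][j]:
                    best[c][j] = cand
                    cut[c][j] = i

    # Walk the stored cut points backwards to recover the segment boundaries.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return bounds, best[k][n]


# Hypothetical sequential samples with three visible regimes.
samples = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.0, 9.1]
bounds, sse = segment_sse(samples, 3)
```

Because every cluster is a contiguous run, the result can never interleave early and late samples, which is the property ordinary K-means lacks. This baseline costs O(k n²) time, so it says nothing about the computational savings reported for WKM.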
Partitioning clustering algorithms for protein sequence data sets
Background: Genome-sequencing projects are currently producing an enormous number of new sequences, causing protein sequence databases to grow rapidly. The unsupervised classification of these data into functional groups or families, known as clustering, has become one of the principal research objectives in structural and functional genomics. Computer programs that automatically and accurately classify sequences into families have become a necessity. A significant number of methods have addressed the clustering of protein sequences, and most of them can be categorized into three major groups: hierarchical, graph-based, and partitioning methods. Among the various sequence clustering methods in the literature, hierarchical and graph-based approaches have been widely used. Although partitioning clustering techniques are used extensively in other fields, few applications have been found in the field of protein sequence clustering. It has not been fully demonstrated whether partitioning methods can be applied to protein sequence data and whether they can be efficient compared with the published clustering methods.
Methods: We developed four partitioning clustering approaches using the Smith-Waterman local-alignment algorithm to determine pairwise similarities of sequences. Four different sets of protein sequences were used as evaluation data sets for the proposed methods.
Results: We show that these methods outperform several other published clustering methods in terms of correctly predicting a classifier and especially in terms of the correctness of the provided prediction. The software is available to academic users from the authors upon request.
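The pairwise similarity step named in the Methods can be sketched as follows. This is a minimal Smith-Waterman local-alignment score with a flat match/mismatch score and a linear gap penalty. Those scoring values are assumptions chosen for illustration; the abstract does not state the substitution matrix or gap model the authors used, and protein work would normally use a substitution matrix such as BLOSUM. The example sequences are made up.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score between sequences a and b.

    Uses a flat match/mismatch score and a linear gap penalty (illustrative
    assumptions). Only two rows of the scoring matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def similarity_matrix(seqs):
    """Symmetric matrix of pairwise local-alignment scores."""
    n = len(seqs)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sim[i][j] = sim[j][i] = smith_waterman_score(seqs[i], seqs[j])
    return sim


# Hypothetical toy sequences, not real proteins.
seqs = ["XXHELLOYY", "ZZHELLOWW", "AAAA"]
sim = similarity_matrix(seqs)
```

A matrix like `sim` is the kind of input a partitioning method working on similarities rather than coordinates would consume. The four approaches developed in the paper are not described in the abstract and are not reproduced here.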
Addressing preference heterogeneity in public health policy by combining Cluster Analysis and Multi-Criteria Decision Analysis: Proof of Method.
The use of subgroups based on biological-clinical and socio-demographic variables to deal with population heterogeneity is well established in public policy. The use of subgroups based on preferences is rare, except when religion-based, and controversial. If it were decided to treat subgroup preferences as valid determinants of public policy, a transparent analytical procedure would be needed. In this proof-of-method study we show how public preferences could be incorporated into policy decisions in a way that respects both the multi-criterial nature of those decisions and the heterogeneity of the population in relation to the importance assigned to relevant criteria. It involves combining Cluster Analysis (CA), to generate the subgroup sets of preferences, with Multi-Criteria Decision Analysis (MCDA), to provide the policy framework into which the clustered preferences are entered. We employ three techniques of CA to demonstrate not only that different techniques produce different clusters, but also that choosing among techniques (as well as developing the MCDA structure) is an important task to be undertaken in implementing the approach outlined in any specific policy context. Data for the illustrative, not substantive, application are from a Randomized Controlled Trial of online decision aids for Australian men aged 40-69 years considering Prostate-specific Antigen testing for prostate cancer. We show that such analyses can provide policy-makers with insights into the criterion-specific needs of different subgroups. Implementing CA and MCDA in combination to assist in the development of policies on important health and community issues such as drug coverage, reimbursement, and screening programs poses major challenges (conceptual, methodological, ethical-political, and practical), but most are exposed by the techniques, not created by them.
Multiple Deprivation, Severity and Latent Sub-Groups: Advantages of Factor Mixture Modelling for Analysing Material Deprivation
Material deprivation takes different forms and manifestations. Two individuals with the same deprivation score (i.e. number of deprivations), for instance, are likely to be unable to afford or access entirely or partially different sets of goods and services: one individual may fail to purchase clothes and consumer durables, while another may lack access to healthcare and be deprived of adequate housing. As such, the number of possible patterns or combinations of multiple deprivation becomes increasingly complex as the number of indicators grows. Given this difficulty, there is interest in poverty research in understanding multiple deprivation, as this analysis might lead to the identification of meaningful population sub-groups that could be the subjects of specific policies. This article applies a factor mixture model (FMM) to a real dataset and discusses its conceptual and empirical advantages and disadvantages with respect to other methods that have been used in poverty research. The exercise suggests that FMM is based on more sensible assumptions (i.e. deprivations covary within each class), provides valuable information with which to understand multiple deprivation, and is useful for understanding the severity of deprivation and the additive properties of deprivation indicators.