1,362 research outputs found
Fuzzy Distance Measure Based Affinity Propagation Clustering
Affinity Propagation (AP) is an effective algorithm that finds exemplars by repeatedly exchanging real-valued messages between pairs of data points. AP uses the similarity between data points to calculate the messages. Hence, the construction of the similarity is essential in the AP algorithm. A common choice for similarity is the negative Euclidean distance. However, because of its simplicity, Euclidean distance cannot capture the real structure of the data. Furthermore, Euclidean distance is sensitive to noise and outliers, which can degrade the performance of AP. Researchers have therefore used different similarity measures to analyse the performance of AP. Nonetheless, there is still room to enhance the performance of AP clustering. A clustering method called fuzzy-based Affinity Propagation (F-AP) is proposed, which is based on a fuzzy similarity measure. Experiments performed on UCI data show the efficiency of the proposed F-AP, and the results show a promising improvement over AP.
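As a rough illustration of the baseline this abstract describes, the sketch below runs plain Affinity Propagation on a negative Euclidean distance similarity using only the Python standard library. It is not the proposed F-AP: the abstract does not specify the fuzzy similarity measure, so none is attempted here. The function names, the damping factor, the iteration count, the median preference, and the toy points are all assumptions made for the example.

```python
import math
from statistics import median


def negative_euclidean(points):
    """Similarity s(i, k) = -||x_i - x_k|| (the common baseline choice)."""
    n = len(points)
    return [[-math.dist(points[i], points[k]) for k in range(n)] for i in range(n)]


def affinity_propagation(S, damping=0.5, iterations=200):
    """Plain AP message passing; needs at least two points.

    S[k][k] holds the 'preference' of point k to become an exemplar.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if A[k][k] + R[k][k] > 0]
    if not exemplars:  # no exemplar emerged (e.g. messages did not settle)
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Toy data (made up for illustration): two loose groups in the plane.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
S = negative_euclidean(points)
# Assumed preference: the median off-diagonal similarity.
pref = median(S[i][k] for i in range(len(S)) for k in range(len(S)) if i != k)
for k in range(len(S)):
    S[k][k] = pref
exemplars, labels = affinity_propagation(S)
print("exemplars:", exemplars, "labels:", labels)
```

Swapping in a different similarity only requires replacing `negative_euclidean`; the message-passing loop itself never looks at the raw points.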
Clustering Algorithms: Their Application to Gene Expression Data
Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a tremendous opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the number of genes involved increase the challenge of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges and is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. Clustering of gene expression data has proven useful in revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. Another benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure.
Data Stream Mining: an Evolutionary Approach
This work presents a data stream clustering algorithm called ESCALIER. The algorithm is an extension of the evolutionary clustering algorithm ECSAGO (Evolutionary Clustering with Self Adaptive Genetic Operators). ESCALIER uses the evolutionary process proposed by ECSAGO to find clusters in data streams, which are defined using the sliding window technique. To maintain and forget the clusters detected as the data evolve, ESCALIER includes a memory mechanism inspired by artificial immune network theory. To test the effectiveness of the algorithm, experiments were carried out using synthetic data simulating a data stream environment and a real dataset.
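The sliding window technique mentioned in this abstract can be sketched generically as follows. This only shows how a stream might be cut into overlapping windows; it is not ESCALIER itself, and the function name `sliding_windows` and the fixed window size are assumptions made for the example.

```python
from collections import deque


def sliding_windows(stream, size):
    """Yield the most recent `size` items each time a new item arrives."""
    window = deque(maxlen=size)  # the oldest item is dropped automatically
    for item in stream:
        window.append(item)
        if len(window) == size:
            yield list(window)  # snapshot a clustering step could consume


# Hypothetical stream of five readings, window of three.
for w in sliding_windows(range(5), 3):
    print(w)
# prints [0, 1, 2], then [1, 2, 3], then [2, 3, 4]
```

A stream clustering method would run its update on each snapshot, so older points stop influencing the clusters once they leave the window.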
Immunological synapse: a mathematical model of the bond formation process: mathematical modelling of the immature synapse
The cell:cell bond between an immune cell and an antigen presenting cell is a necessary event in the activation of the adaptive immune response. At the juncture between the cells, cell surface molecules on the opposing cells form non-covalent bonds and a distinct patterning is observed that is termed the immunological synapse. An important binding molecule in the synapse is the T-cell receptor (TCR), which is responsible for antigen recognition through its binding with a major-histocompatibility complex with bound peptide (pMHC). This bond leads to intracellular signalling events that culminate in the activation of the T-cell, and ultimately leads to the expression of the immune effector function. The temporal analysis of the TCR bonds during the formation of the immunological synapse presents a problem to biologists, due to the spatio-temporal scales (nanometers and picoseconds) that compare with experimental uncertainty limits. In this study, a linear stochastic model, derived from a nonlinear model of the synapse, is used to analyse the temporal dynamics of the bond attachments for the TCR. Mathematical analysis and numerical methods are employed to analyse the qualitative dynamics of the nonequilibrium membrane dynamics, with the specific aim of calculating the average persistence time for the TCR:pMHC bond. A single-threshold method, which has been previously used to successfully calculate the TCR:pMHC contact path sizes in the synapse, is applied to produce results for the average contact times of the TCR:pMHC bonds. This method is extended through the development of a two-threshold method, which produces results suggesting the average time persistence for the TCR:pMHC bond is on the order of 2-4 seconds, values that agree with experimental evidence for TCR signalling.
The study reveals two distinct scaling regimes in the time persistent survival probability density profile of these bonds, one dominated by thermal fluctuations and the other associated with the TCR signalling. Analysis of the thermal fluctuation regime reveals a minimal contribution to the average time persistence calculation, which has an important biological implication when comparing the probabilistic models to experimental evidence. In cases where only a few statistics can be gathered from experimental conditions, the results are unlikely to match the probabilistic predictions. The results also identify a rescaling relationship between the thermal noise and the bond length, suggesting that a recalibration of the experimental conditions to adhere to this scaling relationship will enable biologists to identify the start of the signalling regime for previously unobserved receptor:ligand bonds. Also, the regime associated with TCR signalling exhibits a universal decay rate for the persistence probability that is independent of the bond length.
Quantitative Immunology for Physicists
The adaptive immune system is a dynamical, self-organized multiscale system
that protects vertebrates from both pathogens and internal irregularities, such
as tumours. For these reasons it fascinates physicists, yet the multitude of
different cells, molecules and sub-systems is often also petrifying. Despite
this complexity, as experiments on different scales of the adaptive immune
system become more quantitative, many physicists have made both theoretical and
experimental contributions that help predict the behaviour of ensembles of
cells and molecules that participate in an immune response. Here we review some
recent contributions with an emphasis on quantitative questions and
methodologies. We also provide a more general methods section that presents
some of the wide array of theoretical tools used in the field.
Applications of Molecular Dynamics simulations for biomolecular systems and improvements to density-based clustering in the analysis
Molecular Dynamics simulations provide a powerful tool to study biomolecular systems with atomistic detail. The key to better understand the function and behaviour of these molecules can often be found in their structural variability. Simulations can help to expose this information that is otherwise experimentally hard or impossible to attain. This work covers two application examples for which a sampling and a characterisation of the conformational ensemble could reveal the structural basis to answer a topical research question. For the fungal toxin phalloidin, a small bicyclic peptide, observed product ratios in different cyclisation reactions could be rationalised by assessing the conformational pre-organisation of precursor fragments. For the C-type lectin receptor langerin, conformational changes induced by different side-chain protonations could deliver an explanation
of the pH-dependency in the protein's calcium-binding. The investigations were accompanied by the continued development of a density-based clustering protocol into a respective software package, which is generally well applicable for the use case of extracting conformational states from Molecular Dynamics data