
    Infants' mu suppression during the observation of real and mimicked goal-directed actions

    Since their discovery in the early 1990s, mirror neurons have been proposed to be related to many social-communicative abilities, such as imitation. However, research into the early manifestations of the putative neural mirroring system and its role in early social development is still inconclusive. In the current EEG study, mu suppression, generally thought to reflect activity in neural mirroring systems, was investigated in 18- to 30-month-olds during the observation of object manipulations as well as mimicked actions. EEG power data recorded from frontal, central, and parietal electrodes were analysed. As predicted, based on previous research, mu wave suppression was found over central electrodes during action observation and execution. In addition, a similar suppression was found during the observation of intransitive, mimicked hand movements. To a lesser extent, the results also showed mu suppression at parietal electrode sites across all three conditions. Mu wave suppression during the observation of hand movements and during the execution of actions was significantly correlated with quality of imitation, but not with age or language level.
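    Mu suppression of this kind is typically quantified as a log ratio of band-limited EEG power between an experimental condition and a baseline. As an illustration only, a minimal sketch in Python; the 6-9 Hz band (a common choice for the infant mu rhythm), the periodogram estimator, and the synthetic signals are assumptions, not the study's actual pipeline:

```python
import numpy as np

def band_power(x, fs, lo, hi):
    """Mean periodogram power of signal x within the [lo, hi] Hz band."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    band = (freqs >= lo) & (freqs <= hi)
    return psd[band].mean()

def mu_suppression_index(baseline, condition, fs, band=(6.0, 9.0)):
    """Log power ratio; values below zero indicate mu suppression."""
    return float(np.log(band_power(condition, fs, *band) /
                        band_power(baseline, fs, *band)))

# Synthetic demo: an 8 Hz rhythm that attenuates during action observation.
rng = np.random.default_rng(0)
fs = 250
t = np.arange(0, 4, 1 / fs)
baseline = 5.0 * np.sin(2 * np.pi * 8 * t) + rng.normal(0, 1, t.size)
observing = 1.0 * np.sin(2 * np.pi * 8 * t) + rng.normal(0, 1, t.size)
print(mu_suppression_index(baseline, observing, fs))  # negative => suppression
```

    Correlating such per-participant indices with behavioural scores (e.g. imitation quality) is then an ordinary correlation analysis.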

    Partially Blind Domain Adaptation for Age Prediction from DNA Methylation Data

    In recent years, huge resources of biological and medical data have become available for research. These data offer great opportunities for machine learning applications in health care, e.g. for precision medicine, but are also challenging to analyze. Typical challenges include a large number of possibly correlated features and heterogeneity in the data. One flourishing field of biological research in which this is relevant is epigenetics. Here, especially large amounts of DNA methylation data have emerged. This epigenetic mark has been used to predict a donor's 'epigenetic age', and increased epigenetic aging has been linked to lifestyle and disease history. In this paper we propose an adaptive model which performs feature selection for each test sample individually, based on the distribution of the input data. The method can be seen as partially blind domain adaptation. We apply the model to the problem of age prediction based on DNA methylation data from a variety of tissues and compare it to a standard model which does not take heterogeneity into account. The standard approach performs particularly poorly on one tissue type, on which our new adaptive approach shows substantial improvement even though no samples of that tissue were part of the training data.
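    The per-sample idea can be sketched as follows: for each test sample, drop features whose values fall outside the training distribution, then refit on the surviving features. Everything below (the z-score cut-off, the ridge regressor, the toy data) is an illustrative assumption, not the authors' model:

```python
import numpy as np

def predict_adaptive(X_train, y_train, x_test, z_max=3.0, alpha=1.0):
    """Per-sample feature selection: keep only features whose test value lies
    within z_max training standard deviations, then ridge-fit on those."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0) + 1e-12
    keep = np.abs((x_test - mu) / sd) <= z_max
    Xk = X_train[:, keep]
    # Closed-form ridge regression on the selected features only.
    w = np.linalg.solve(Xk.T @ Xk + alpha * np.eye(Xk.shape[1]),
                        Xk.T @ y_train)
    return float(x_test[keep] @ w)

# Demo: feature 4 is irrelevant in training but wildly shifted in the test
# sample (a crude stand-in for an unseen tissue type); it gets dropped.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) + rng.normal(0, 0.1, 200)
x_new = np.array([0.5, -0.2, 1.0, 0.0, 50.0])
# True value from the kept features: 0.5*2.0 + (-0.2)*(-1.0) + 1.0*0.5 = 1.7
print(predict_adaptive(X, y, x_new))
```

    Because the selection happens at prediction time, the method adapts to each sample without ever seeing labelled data from its domain, which is what makes the adaptation "partially blind".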

    HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis

    Comprehensive benchmarking of clustering algorithms is rendered difficult by two key factors: (i) the elusiveness of a unique mathematical definition of this unsupervised learning approach and (ii) dependencies between the generating models or clustering criteria adopted by some clustering algorithms and indices for internal cluster validation. Consequently, there is no consensus regarding the best practice for rigorous benchmarking, and whether this is possible at all outside the context of a given application. Here, we argue that synthetic datasets must continue to play an important role in the evaluation of clustering algorithms, but that this necessitates constructing benchmarks that appropriately cover the diverse set of properties that impact clustering algorithm performance. Through our framework, HAWKS, we demonstrate the important role evolutionary algorithms play in supporting flexible generation of such benchmarks, allowing simple modification and extension. We illustrate two possible uses of our framework: (i) the evolution of benchmark data consistent with a set of hand-derived properties and (ii) the generation of datasets that tease out performance differences between a given pair of algorithms. Our work has implications for the design of clustering benchmarks that sufficiently challenge a broad range of algorithms, and for furthering insight into the strengths and weaknesses of specific approaches.
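    A toy version of the evolutionary idea, assuming a drastically simplified genome (a single separation parameter between two fixed Gaussian clouds) and a silhouette-based fitness; HAWKS itself evolves far richer cluster descriptions:

```python
import numpy as np

def silhouette_two(A, B):
    """Mean silhouette width for exactly two clusters of 2-D points."""
    scores = []
    for P, Q in ((A, B), (B, A)):
        n = len(P)
        # a: mean intra-cluster distance; b: mean distance to the other cluster.
        a = np.linalg.norm(P[:, None] - P[None, :], axis=2).sum(1) / (n - 1)
        b = np.linalg.norm(P[:, None] - Q[None, :], axis=2).mean(1)
        scores.append((b - a) / np.maximum(a, b))
    return float(np.concatenate(scores).mean())

def evolve_separation(target=0.6, pop=16, gens=50, seed=0):
    """Tiny (mu+lambda) evolutionary loop over one gene: the horizontal
    separation d between two copies of a fixed Gaussian cloud. Fitness is
    the distance of the resulting silhouette from a target value."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=(40, 2))
    def fitness(d):
        return abs(silhouette_two(base, base + np.array([d, 0.0])) - target)
    ds = rng.uniform(0.5, 10.0, pop)
    for _ in range(gens):
        kids = np.clip(ds + rng.normal(0.0, 0.3, pop), 0.01, None)
        both = np.concatenate([ds, kids])
        ds = both[np.argsort([fitness(d) for d in both])][:pop]  # elitist select
    return float(ds[0]), float(fitness(ds[0]))

d_best, err = evolve_separation()
print(d_best, err)
```

    Replacing the single-parameter genome with full cluster parameterisations, and the target-index fitness with a difference in two algorithms' scores, recovers the two uses described above.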

    An optimized TOPS+ comparison method for enhanced TOPS models

    This article has been made available through the Brunel Open Access Publishing Fund. Background: Although methods based on highly abstract descriptions of protein structures, such as VAST and TOPS, can perform very fast protein structure comparison, the results can lack a high degree of biological significance. Previously we have discussed the basic mechanisms of our novel method for structure comparison based on our TOPS+ model (Topological descriptions of Protein Structures Enhanced with Ligand Information). In this paper we show how these results can be significantly improved using parameter optimization; we refer to the resulting optimised method as the advanced TOPS+ comparison method (advTOPS+). Results: We have developed a TOPS+ string model as an improvement to the TOPS [1-3] graph model by considering loops as secondary structure elements (SSEs) in addition to helices and strands, representing ligands as first-class objects, describing interactions between SSEs, and between SSEs and ligands, by incoming and outgoing arcs, and annotating SSEs with the interaction direction and type. Benchmarking results of an all-against-all pairwise comparison using a large dataset of 2,620 non-redundant structures from the PDB40 dataset [4] demonstrate the biological significance, in terms of SCOP classification at the superfamily level, of our TOPS+ comparison method. Conclusions: Our advanced TOPS+ comparison shows better performance on the PDB40 dataset [4] compared to our basic TOPS+ method, giving 90 percent accuracy for the SCOP alpha+beta class; a 6 percent increase in accuracy compared to the TOPS and basic TOPS+ methods. It also outperforms the TOPS, basic TOPS+ and SSAP comparison methods on the Chew-Kedem dataset [5], achieving 98 percent accuracy. Software availability: The TOPS+ comparison server is available at http://balabio.dcs.gla.ac.uk/mallika/WebTOPS/.
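    One way to make the string-model idea concrete is to compare SSE strings with a standard edit distance. The three-letter alphabet and unit edit costs below are illustrative simplifications that ignore TOPS+'s ligand arcs and interaction annotations:

```python
def sse_edit_distance(s1, s2):
    """Classic Levenshtein DP over SSE strings (H=helix, E=strand, L=loop)."""
    m, n = len(s1), len(s2)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match / substitution
    return d[m][n]

def sse_similarity(s1, s2):
    """Normalised similarity in [0, 1]."""
    return 1.0 - sse_edit_distance(s1, s2) / max(len(s1), len(s2))

print(sse_similarity("HELHEL", "HELEHL"))
```

    In the real method, substitution costs would additionally reflect whether two elements share ligand interactions and arc directions, which is where the biological signal comes from.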

    Defining an informativeness metric for clustering gene expression data

    Motivation: Unsupervised ‘cluster’ analysis is an invaluable tool for exploratory microarray data analysis, as it organizes the data into groups of genes or samples in which the elements share common patterns. Once the data are clustered, finding the optimal number of informative subgroups within a dataset is a problem that, while important for understanding the underlying phenotypes, has no robust, widely accepted solution.
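    A crude stand-in for such a metric is to scan candidate numbers of clusters and look for the sharpest bend in the within-cluster error curve. Everything below (the farthest-first initialisation, the second-difference elbow rule, the synthetic data) is an illustrative assumption, not the paper's metric:

```python
import numpy as np

def kmeans_sse(X, k, iters=50):
    """k-means SSE with deterministic farthest-first initialisation."""
    C = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(1) for c in C], axis=0)
        C.append(X[int(np.argmax(d))])
    C = np.array(C)
    for _ in range(iters):  # Lloyd iterations
        labels = np.argmin(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        C = np.array([X[labels == j].mean(0) if np.any(labels == j) else C[j]
                      for j in range(k)])
    return float(((X - C[labels]) ** 2).sum())

def elbow_k(X, k_max=8):
    """Choose k at the sharpest bend (largest second difference) of the
    SSE-versus-k curve."""
    sses = [kmeans_sse(X, k) for k in range(1, k_max + 1)]
    bend = np.diff(np.diff(sses))   # second differences; first entry is k=2
    return int(np.argmax(bend)) + 2

# Three well-separated Gaussian blobs: the bend should sit at k=3.
rng = np.random.default_rng(3)
X = np.concatenate([rng.normal(c, 0.3, (60, 2)) for c in ((0, 0), (4, 0), (0, 4))])
print(elbow_k(X))
```

    A principled informativeness metric would replace the ad hoc elbow rule with a score that can be compared across datasets and cluster counts.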

    A modified hyperplane clustering algorithm allows for efficient and accurate clustering of extremely large datasets

    Motivation: As the number of publicly available microarray experiments increases, the ability to analyze extremely large datasets across multiple experiments becomes critical. There is a requirement to develop algorithms which are fast and can cluster extremely large datasets without affecting the cluster quality. Clustering is an unsupervised exploratory technique applied to microarray data to find similar data structures or expression patterns. Because of the high input/output costs involved and large distance matrices calculated, most of the agglomerative clustering algorithms fail on large datasets (30,000+ genes / 200+ arrays). In this article, we propose a new two-stage algorithm which partitions the high-dimensional space associated with microarray data using hyperplanes. The first stage is based on the Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) algorithm, with the second stage being a conventional k-means clustering technique. This algorithm has been implemented in a software tool (HPCluster) designed to cluster gene expression data. We compared the clustering results using the two-stage hyperplane algorithm with the conventional k-means algorithm from other available programs. Because the first stage traverses the data in a single scan, the performance and speed increase substantially. The data reduction accomplished in the first stage of the algorithm reduces the memory requirements, allowing us to cluster 44,460 genes without failure, and significantly decreases the time to complete when compared with popular k-means programs. The software was written in C# (.NET 1.1).
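    The two-stage structure can be sketched as a single-scan, BIRCH-like reduction followed by ordinary k-means on the representatives. The leader-style first pass below is a simplification of BIRCH's CF-tree (and HPCluster itself is written in C#); all parameters and data are illustrative:

```python
import numpy as np

def leader_scan(X, radius):
    """Stage 1: single scan over the data. Each point merges into the nearest
    representative within `radius` (updated to the running mean), or becomes
    a new representative. This is a crude stand-in for BIRCH's CF-tree."""
    reps, sums, counts = [], [], []
    for x in X:
        if reps:
            d = np.linalg.norm(np.asarray(reps) - x, axis=1)
            j = int(np.argmin(d))
            if d[j] <= radius:
                sums[j] = sums[j] + x
                counts[j] += 1
                reps[j] = sums[j] / counts[j]
                continue
        reps.append(x.astype(float))
        sums.append(x.astype(float))
        counts.append(1)
    return np.asarray(reps)

def kmeans_centers(X, k, iters=50):
    """Stage 2: plain k-means (farthest-first init) on the small reduced set."""
    C = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(1) for c in C], axis=0)
        C.append(X[int(np.argmax(d))])
    C = np.asarray(C)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        C = np.asarray([X[labels == j].mean(0) if np.any(labels == j) else C[j]
                        for j in range(k)])
    return C

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(c, 0.4, (2000, 2)) for c in ((0, 0), (6, 0), (0, 6))])

reps = leader_scan(X, radius=1.0)   # big reduction in one pass
C = kmeans_centers(reps, k=3)
labels = np.argmin(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
print(len(reps), np.bincount(labels))
```

    The memory saving comes from stage 2 operating only on the handful of representatives rather than the full distance structure of the raw data.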

    Discovering multi–level structures in bio-molecular data through the Bernstein inequality

    Background: The unsupervised discovery of structures (i.e. clusterings) underlying data is a central issue in several branches of bioinformatics. Methods based on the concept of stability have been recently proposed to assess the reliability of a clustering procedure and to estimate the "optimal" number of clusters in bio-molecular data. A major problem with stability-based methods is the detection of multi-level structures (e.g. hierarchical functional classes of genes), and the assessment of their statistical significance. In this context, a chi-square-based statistical test of hypothesis has been proposed; however, to assure the correctness of this technique some assumptions about the distribution of the data are needed. Results: To assess the statistical significance and to discover multi-level structures in bio-molecular data, a new method based on Bernstein's inequality is proposed. This approach makes no assumptions about the distribution of the data, thus assuring a reliable application to a large range of bioinformatics problems. Results with synthetic and DNA microarray data show the effectiveness of the proposed method. Conclusions: The Bernstein test, due to its loose assumptions, is more sensitive than the chi-square test to the detection of multiple structures simultaneously present in the data. Nevertheless, it is less selective, i.e. subject to more false positives; by adding independence assumptions, a more selective variant of the Bernstein inequality-based test is also presented. The proposed methods can be applied to discover multiple structures and to assess their significance in different types of bio-molecular data.
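    Applied to, say, paired similarity differences between two candidate clusterings across resamplings, such a test reduces to evaluating the Bernstein tail bound at the observed mean. The one-sided bound, plug-in variance, and toy data below are illustrative simplifications, not the paper's test:

```python
import math

def bernstein_p_upper(diffs, bound=1.0):
    """Distribution-free upper bound on P(mean >= observed mean) for i.i.d.
    variables bounded by `bound`, via Bernstein's inequality:
    P(mean - mu >= eps) <= exp(-n eps^2 / (2 sigma^2 + (2/3) bound eps))."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / n  # plug-in variance estimate
    eps = abs(mean)
    if eps == 0.0:
        return 1.0
    return min(1.0, math.exp(-n * eps ** 2 / (2 * var + 2 * bound * eps / 3)))

# A clear positive shift versus pure noise (hypothetical stability scores).
signal = [0.05, 0.075, 0.1, 0.125, 0.15] * 6   # mean 0.1, n = 30
noise = [0.01, -0.01] * 15                     # mean 0,  n = 30
print(bernstein_p_upper(signal), bernstein_p_upper(noise))
```

    Because only boundedness and a variance estimate enter the bound, no parametric distributional assumption is needed, which is the advantage over the chi-square test noted above.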

    Tentacle probe sandwich assay in porous polymer monolith improves specificity, sensitivity and kinetics

    Nucleic acid sandwich assays improve low-density array analysis through the addition of a capture probe and a specific label, increasing specificity and sensitivity. Here, we employ photo-initiated porous polymer monolith (PPM) as a high-surface-area substrate for sandwich assay analysis. PPMs are shown to enhance extraction efficiency by 20-fold from 2 μl of sample. We further compare the performance of labeled linear probes, quantum dot labeled probes, molecular beacons (MBs) and tentacle probes (TPs). Each probe technology was compared and contrasted with traditional hybridization methods using labeled sample. All probes demonstrated similar sensitivity and greater specificity than traditional hybridization techniques. MBs and TPs were able to bypass a wash step due to their ‘on–off’ signaling mechanism. TPs demonstrated reaction kinetics 37.6 times faster than MBs, resulting in the fastest assay time of 5 min. Our data further indicate that TPs had the most sensitive detection limit (<1 nM) as well as the highest specificity (>1 × 10^4 improvement) among all tested probes in these experiments. By matching the enhanced extraction efficiencies of PPM with the selectivity of TPs, we have created a format for improved sandwich assays.