
    Integrating Specialized Classifiers Based on Continuous Time Markov Chain

    Specialized classifiers, namely those dedicated to a subset of classes, are often adopted in real-world recognition systems. However, integrating such classifiers is nontrivial. Existing methods, e.g. weighted average, usually implicitly assume that all constituents of an ensemble cover the same set of classes. Such methods can produce misleading predictions when used to combine specialized classifiers. This work explores a novel approach. Instead of combining predictions from individual classifiers directly, it first decomposes the predictions into sets of pairwise preferences, treats them as transition channels between classes, constructs a continuous-time Markov chain from them, and uses the equilibrium distribution of this chain as the final prediction. This allows us to form a coherent picture over all specialized predictions. On large public datasets, the proposed method obtains a considerable improvement over mainstream ensemble methods, especially when the classifier coverage is highly unbalanced. Comment: Published at IJCAI-17, typo fixed
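
    The pipeline described in this abstract can be illustrated with a minimal sketch. It is not the paper's exact construction: the rule used here for turning a prediction into transition rates (rate from class i to class j grows by p_j / (p_i + p_j) for every classifier covering both) is an assumption chosen for illustration, and the names `build_rates` and `stationary_distribution` are hypothetical. The equilibrium distribution is found by uniformization followed by power iteration.

```python
def build_rates(predictions, n_classes):
    """Accumulate pairwise preferences into CTMC transition rates.

    Each prediction is a dict {class_index: probability} over the subset
    of classes that one specialized classifier covers.
    ASSUMPTION: the rate i -> j is p_j / (p_i + p_j); the paper may differ.
    """
    rates = [[0.0] * n_classes for _ in range(n_classes)]
    for pred in predictions:
        for i in pred:
            for j in pred:
                if i != j and pred[i] + pred[j] > 0:
                    rates[i][j] += pred[j] / (pred[i] + pred[j])
    return rates


def stationary_distribution(rates, n_iter=10_000, tol=1e-12):
    """Equilibrium distribution of an irreducible CTMC via uniformization."""
    n = len(rates)
    exit_rates = [sum(rates[i][j] for j in range(n) if j != i) for i in range(n)]
    lam = 1.1 * max(exit_rates) + 1e-12  # uniformization constant > max exit rate
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [0.0] * n
        for i in range(n):
            # Probability mass that stays in state i during one uniformized step.
            new[i] += pi[i] * (1.0 - exit_rates[i] / lam)
            for j in range(n):
                if j != i:
                    new[j] += pi[i] * rates[i][j] / lam
        converged = max(abs(a - b) for a, b in zip(new, pi)) < tol
        pi = new
        if converged:
            break
    total = sum(pi)
    return [p / total for p in pi]


# Two specialized classifiers with different class coverage (made-up numbers).
predictions = [
    {0: 0.9, 1: 0.1},  # classifier A only knows classes 0 and 1
    {1: 0.8, 2: 0.2},  # classifier B only knows classes 1 and 2
]
pi = stationary_distribution(build_rates(predictions, n_classes=3))
print(pi)
```

    In this toy case the two classifiers share only class 1, yet the chain still yields a single distribution over all three classes, which is the property the abstract emphasizes.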

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

    Positive Definite Kernels in Machine Learning

    This survey is an introduction to positive definite kernels and the set of methods they have inspired in the machine learning literature, namely kernel methods. We first discuss some properties of positive definite kernels as well as reproducing kernel Hilbert spaces, the natural extension of the set of functions $\{k(x,\cdot), x\in\mathcal{X}\}$ associated with a kernel $k$ defined on a space $\mathcal{X}$. We discuss at length the construction of kernel functions that take advantage of well-known statistical models. We provide an overview of numerous data-analysis methods which take advantage of reproducing kernel Hilbert spaces and discuss the idea of combining several kernels to improve the performance on certain tasks. We also provide a short cookbook of different kernels which are particularly useful for certain data-types such as images, graphs or speech segments. Comment: draft. corrected a typo in figure
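
    As a small illustration of the defining property mentioned in this abstract, the sketch below builds the Gram matrix of a Gaussian kernel on a few points and checks positive definiteness by attempting a Cholesky factorization. It also checks that the sum of two kernels passes the same test, one simple way of combining kernels. The sample points, bandwidths, and function names are assumptions made for the example and are not taken from the survey.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def gram_matrix(points, kernel):
    """Matrix of kernel evaluations K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(x, y) for y in points] for x in points]


def is_positive_definite(matrix, jitter=1e-10):
    """Return True if Cholesky factorization of (matrix + jitter * I) succeeds."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                d = matrix[i][i] + jitter - s
                if d <= 0.0:
                    return False  # a non-positive pivot means not positive definite
                chol[i][j] = math.sqrt(d)
            else:
                chol[i][j] = (matrix[i][j] - s) / chol[j][j]
    return True


# Illustrative data: four distinct points in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]

gram = gram_matrix(points, gaussian_kernel)
gram_is_pd = is_positive_definite(gram)


def summed_kernel(x, y):
    """Sum of two Gaussian kernels with different bandwidths."""
    return gaussian_kernel(x, y, 0.5) + gaussian_kernel(x, y, 2.0)


sum_is_pd = is_positive_definite(gram_matrix(points, summed_kernel))

# A symmetric matrix that is NOT positive definite, for contrast.
not_pd = is_positive_definite([[1.0, 2.0], [2.0, 1.0]])
print(gram_is_pd, sum_is_pd, not_pd)
```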

    Transformations Based on Continuous Piecewise-Affine Velocity Fields


    A Review of Inference Algorithms for Hybrid Bayesian Networks

    Hybrid Bayesian networks have received increasing attention in recent years. They differ from standard Bayesian networks in that they can host discrete and continuous variables simultaneously, which extends the applicability of the Bayesian network framework in general. However, this extra feature also comes at a cost: inference in these types of models is computationally more challenging, and the underlying models and updating procedures may not even support closed-form solutions. In this paper we provide an overview of the main trends and principled approaches for performing inference in hybrid Bayesian networks. The methods covered in the paper are organized and discussed according to their methodological basis. We consider how the methods have been extended and adapted to also include (hybrid) dynamic Bayesian networks, and we end with an overview of established software systems supporting inference in these types of models.

    Neurobiological markers for remission and persistence of childhood attention-deficit/hyperactivity disorder

    Attention-deficit/hyperactivity disorder (ADHD) is one of the most prevalent neurodevelopmental disorders in children. Symptoms of childhood ADHD persist into adulthood in around 65% of patients, which elevates the risk for a number of adverse outcomes, resulting in substantial individual and societal burden. A neurodevelopmental double dissociation model is proposed based on existing studies, in which the early onset of childhood ADHD is suggested to be associated with dysfunctional subcortical structures that remain static throughout the lifetime, while diminution of symptoms over development could be linked to optimal development of the prefrontal cortex. Existing studies assess only basic measures, including regional brain activation and connectivity, which have limited capacity to characterize the functional brain as a high-performance parallel information processing system; the field lacks systems-level investigations of the structural and functional patterns that significantly contribute to symptom remission and persistence in adults with childhood ADHD. Furthermore, traditional statistical methods estimate group differences only within one voxel or region of interest (ROI) at a time, without the capacity to explore how ROIs interact in linear and/or non-linear ways, as they quickly become overburdened when attempting to combine predictors and their interactions from high-dimensional imaging data sets. This dissertation is the first study to apply ensemble learning techniques (ELTs) to multimodal neuroimaging features from a sample of adults with childhood ADHD and controls, who have been clinically followed up since childhood. A total of 36 adult probands who were diagnosed with ADHD combined-type during childhood and 36 matched normal controls (NCs) are included in this dissertation research. 
The 36 adult probands are further split into 18 remitters (ADHD-R) and 18 persisters (ADHD-P) based on their adult symptoms according to DSM-IV ADHD criteria. Cued attention task-based fMRI, structural MRI, and diffusion tensor imaging data from each individual are analyzed. The high-dimensional neuroimaging features are calculated for each subject, including pair-wise regional connectivity and global/nodal topological properties of the functional brain network for the cue-evoked attention process, regional cortical thickness and surface area, subcortical volume, and volume and fractional anisotropy of major white matter fiber tracts. In addition, all the currently available optimization strategies for ensemble learning techniques (i.e., voting, bagging, boosting and stacking techniques) are tested in a pool of semi-final classification results generated by seven basic classifiers: K-Nearest Neighbors, support vector machine (SVM), logistic regression, Naïve Bayes, linear discriminant analysis, random forest, and multilayer perceptron. As hypothesized, results indicate that the features of nodal efficiency in right inferior frontal gyrus, right middle frontal (MFG)-inferior parietal (IPL) functional connectivity, and right amygdala volume significantly contributed to accurate discrimination between ADHD probands and controls; higher nodal efficiency of right MFG greatly contributed to inattentive and hyperactive/impulsive symptom remission, while higher right MFG-IPL functional connectivity was strongly linked to symptom persistence in adults with childhood ADHD. Comparison of the ELTs indicates that the bagging-based ELT with an SVM base model achieves the best results, with the most significant improvement in the area under the receiver operating characteristic curve (0.89 for ADHD probands vs. NCs, and 0.9 for ADHD-P vs. ADHD-R). 
The outcomes of this dissertation research have considerable value for the development of novel interventions that target mechanisms associated with recovery.

    Progress in Speech Recognition for Romanian Language
