9 research outputs found

    How is a data-driven approach better than random choice in label space division for multi-label classification?

    We propose using five data-driven community detection approaches from social networks to partition the label space for the task of multi-label classification as an alternative to random partitioning into equal subsets as performed by RAkELd: modularity-maximizing fastgreedy and leading eigenvector, infomap, walktrap and label propagation algorithms. We construct a label co-occurrence graph (both weighted and unweighted versions) based on training data and perform community detection to partition the label set. We include Binary Relevance and Label Powerset classification methods for comparison. We use Gini index-based Decision Trees as the base classifier. We compare educated approaches to label space divisions against random baselines on 12 benchmark data sets over five evaluation measures. We show that in almost all cases seven educated guess approaches are more likely to outperform RAkELd than otherwise in all measures except Hamming Loss. We show that fastgreedy and walktrap community detection methods on weighted label co-occurrence graphs are 85-92% more likely to yield better F1 scores than random partitioning. Infomap on the unweighted label co-occurrence graphs is on average better than random partitioning 90% of the time in terms of Subset Accuracy and 89% of the time in terms of Jaccard similarity. Weighted fastgreedy is better on average than RAkELd when it comes to Hamming Loss.
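
    The label co-occurrence graph described in this abstract could be built roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `label_cooccurrence_graph`, the edge representation as a dictionary, and the toy training labels are all assumptions made for illustration. An edge joins two labels that appear together on at least one training sample; in the weighted version the edge weight is the number of such samples.

```python
from collections import Counter
from itertools import combinations


def label_cooccurrence_graph(label_sets, weighted=True):
    """Return edges as {(label_a, label_b): weight} with label_a < label_b.

    Hypothetical helper: the weight counts the training samples on which
    both labels occur; the unweighted version sets every weight to 1.
    """
    counts = Counter()
    for labels in label_sets:
        # Every unordered pair of labels on one sample co-occurs once.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    if weighted:
        return dict(counts)
    return {edge: 1 for edge in counts}


# Toy training labels (invented for this example).
train_labels = [
    {"beach", "sea"},
    {"beach", "sea", "sunset"},
    {"city", "night"},
    {"city", "night", "sunset"},
]

weighted_edges = label_cooccurrence_graph(train_labels, weighted=True)
unweighted_edges = label_cooccurrence_graph(train_labels, weighted=False)
print(weighted_edges)
```

    Either edge dictionary could then be passed to a community detection routine of the reader's choice to obtain the label partition; that step is left out here because it depends on the graph library used.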

    How is a data-driven approach better than random choice in label space division for multi-label classification?

    We propose using five data-driven community detection approaches from social networks to partition the label space in the task of multi-label classification as an alternative to random partitioning into equal subsets as performed by RAkELd. We evaluate modularity-maximizing using fast greedy and leading eigenvector approximations, infomap, walktrap and label propagation algorithms. For this purpose, we propose to construct a label co-occurrence graph (both weighted and unweighted versions) based on training data and perform community detection to partition the label set. Then, each partition constitutes a label space for separate multi-label classification sub-problems. As a result, we obtain an ensemble of multi-label classifiers that jointly covers the whole label space. Based on the binary relevance and label powerset classification methods, we compare community detection methods to label space divisions against random baselines on 12 benchmark datasets over five evaluation measures. We discover that data-driven approaches are more efficient and more likely to outperform RAkELd than binary relevance or label powerset are, in every evaluated measure. For all measures, apart from Hamming loss, data-driven approaches are significantly better than RAkELd (α=0.05), and at least one data-driven approach is more likely to outperform RAkELd than a priori methods in the case of RAkELd’s best performance. This is the largest RAkELd evaluation published to date, with 250 samplings per value for 10 values of the RAkELd parameter k on 12 datasets.
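
    The step from a label partition to an ensemble of sub-problems can be sketched as below. This is an illustration under stated assumptions, not the paper's code: the partition is written by hand where a community detection result would normally go, the helper names are hypothetical, and no classifier is trained. Each sub-problem keeps only the labels of its own subset, the label powerset transformation maps every distinct label combination to one class id, and the ensemble output is the union of the per-subset predictions.

```python
def split_label_space(label_sets, partition):
    """Restrict every sample's labels to each subset of the partition."""
    return [
        [frozenset(labels & subset) for labels in label_sets]
        for subset in partition
    ]


def label_powerset_classes(sub_label_sets):
    """Map each distinct label combination to a class id, in order seen."""
    class_ids = {}
    y = []
    for combo in sub_label_sets:
        if combo not in class_ids:
            class_ids[combo] = len(class_ids)
        y.append(class_ids[combo])
    return y, class_ids


def merge_predictions(per_subset_predictions):
    """Union the label sets predicted by the sub-classifiers for one sample."""
    merged = set()
    for predicted in per_subset_predictions:
        merged |= predicted
    return merged


# Toy training labels and a hand-written partition (both invented here;
# the partition stands in for the output of a community detection method).
train_labels = [
    {"beach", "sea"},
    {"beach", "sea", "sunset"},
    {"city", "night"},
    {"city", "night", "sunset"},
]
partition = [{"beach", "sea", "sunset"}, {"city", "night"}]

sub_problems = split_label_space(train_labels, partition)
y_a, classes_a = label_powerset_classes(sub_problems[0])
y_b, classes_b = label_powerset_classes(sub_problems[1])

# Pretend each sub-classifier returned these label sets for a new sample.
merged = merge_predictions([{"sunset"}, {"city", "night"}])
print(y_a, y_b, sorted(merged))
```

    In a full pipeline, `y_a` and `y_b` would be the targets for one multi-class base classifier per subset, and `merge_predictions` would combine their decoded outputs so that the ensemble covers the whole label space.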

    Single-cell sequencing of the human midbrain reveals glial activation and a neuronal state specific to Parkinson's disease

    Parkinson's disease (PD) etiology is associated with genetic and environmental factors that lead to a loss of dopaminergic neurons. However, the functional interpretation of PD-associated risk variants and how other midbrain cells contribute to this neurodegenerative process are poorly understood. Here, we profiled >41,000 single-nuclei transcriptomes of postmortem midbrain tissue from 6 idiopathic PD (IPD) patients and 5 matched controls. We show that PD-risk variants are associated with glia- and neuron-specific gene expression patterns. Furthermore, microglia and astrocytes presented IPD-specific cell proliferation and dysregulation of genes related to unfolded protein response and cytokine signalling. IPD-microglia revealed a specific pro-inflammatory trajectory. Finally, we discovered a neuronal cell cluster exclusively present in IPD midbrains characterized by CADPS2 overexpression and a high proportion of cycling cells. We conclude that elevated CADPS2 expression is specific to dysfunctional dopaminergic neurons, which have lost their dopaminergic identity and unsuccessfully attempt to re-enter the cell cycle.

    Single-cell sequencing of human midbrain reveals glial activation and a Parkinson-specific neuronal state.

    Idiopathic Parkinson's disease is characterized by a progressive loss of dopaminergic neurons, but the exact disease etiology remains largely unknown. To date, Parkinson's disease research has mainly focused on nigral dopaminergic neurons, although recent studies suggest disease-related changes also in non-neuronal cells and in midbrain regions beyond the substantia nigra. While there is some evidence for glial involvement in Parkinson's disease, the molecular mechanisms remain poorly understood. The aim of this study was to characterize the contribution of all cell types of the midbrain to Parkinson's disease pathology by single-nuclei RNA sequencing and to assess the cell type-specific risk for Parkinson's disease employing the latest genome-wide association study. We profiled >41 000 single-nuclei transcriptomes of postmortem midbrain from six idiopathic Parkinson's disease patients and five age-/sex-matched controls. To validate our findings in a spatial context, we utilized immunolabeling of the same tissues. Moreover, we analyzed Parkinson's disease-associated risk enrichment in genes with cell type-specific expression patterns. We discovered a neuronal cell cluster characterized by CADPS2 overexpression and low TH levels, which was exclusively present in IPD midbrains. Validation analyses in laser-microdissected neurons suggest that this cluster represents dysfunctional dopaminergic neurons. With regard to glial cells, we observed an increase in nigral microglia in Parkinson's disease patients. Moreover, nigral idiopathic Parkinson's disease microglia were more amoeboid, indicating an activated state. We also discovered a reduction in idiopathic Parkinson's disease oligodendrocyte numbers with the remaining cells being characterized by a stress-induced upregulation of S100B. Parkinson's disease risk variants were associated with glia- and neuron-specific gene expression patterns in idiopathic Parkinson's disease cases. Furthermore, astrocytes and microglia presented idiopathic Parkinson's disease-specific cell proliferation and dysregulation of genes related to unfolded protein response and cytokine signaling. While reactive patient astrocytes showed CD44 overexpression, idiopathic Parkinson's disease-microglia revealed a pro-inflammatory trajectory characterized by elevated levels of IL1B, GPNMB, and HSP90AA1. Taken together, we generated the first single-nuclei RNA sequencing dataset from the idiopathic Parkinson's disease midbrain, which highlights a disease-specific neuronal cell cluster as well as 'pan-glial' activation as a central mechanism in the pathology of the movement disorder. This finding warrants further research into inflammatory signaling and immunomodulatory treatments in Parkinson's disease.