9 research outputs found
How is a data-driven approach better than random choice in label space division for multi-label classification?
We propose using five data-driven community detection approaches from social
networks to partition the label space for the task of multi-label
classification as an alternative to random partitioning into equal subsets as
performed by RAkELd: modularity-maximizing fastgreedy and leading eigenvector,
infomap, walktrap and label propagation algorithms. We construct a label
co-occurrence graph (both weighted and unweighted versions) based on training
data and perform community detection to partition the label set. We include
Binary Relevance and Label Powerset classification methods for comparison. We
use gini-index based Decision Trees as the base classifier. We compare educated
approaches to label space divisions against random baselines on 12 benchmark
data sets over five evaluation measures. We show that in almost all cases seven
educated guess approaches are more likely to outperform RAkELd than otherwise
in all measures except Hamming Loss. We show that fastgreedy and walktrap
community detection methods on weighted label co-occurrence graphs are 85-92%
more likely to yield better F1 scores than random partitioning. Infomap on the
unweighted label co-occurrence graphs is on average better than random
partitioning 90% of the time in terms of Subset Accuracy and 89% of the time
in terms of Jaccard similarity. Weighted fastgreedy is better on average than
RAkELd when it comes to Hamming Loss.
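Three of the evaluation measures named in this abstract can be illustrated with a short sketch. The abstract does not define them, so the code below assumes the common example-based definitions: Hamming Loss as the fraction of wrongly predicted label slots, Subset Accuracy as the fraction of samples whose predicted label set matches exactly, and Jaccard similarity as the mean intersection-over-union of true and predicted label sets. The function names, the toy data and the convention of scoring an empty union as 1.0 are assumptions made for illustration.

```python
def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of (sample, label) slots predicted wrongly (assumed definition)."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


def subset_accuracy(true_sets, pred_sets):
    """Fraction of samples whose predicted label set is exactly right."""
    exact = sum(t == p for t, p in zip(true_sets, pred_sets))
    return exact / len(true_sets)


def jaccard_similarity(true_sets, pred_sets):
    """Mean intersection-over-union of true and predicted label sets."""
    total = 0.0
    for t, p in zip(true_sets, pred_sets):
        union = t | p
        # Assumed convention: two empty label sets count as a perfect match.
        total += len(t & p) / len(union) if union else 1.0
    return total / len(true_sets)


# Toy data (hypothetical): three samples over four labels, 0..3.
true_sets = [{0, 1}, {2}, {1, 3}]
pred_sets = [{0, 1}, {2, 3}, {1}]

print(hamming_loss(true_sets, pred_sets, n_labels=4))
print(subset_accuracy(true_sets, pred_sets))
print(jaccard_similarity(true_sets, pred_sets))
```

Hamming Loss is a loss, so lower is better, whereas Subset Accuracy and Jaccard similarity are scores for which higher is better.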
How is a data-driven approach better than random choice in label space division for multi-label classification?
We propose using five data-driven community detection approaches from social networks to partition the label space in the task of multi-label classification as an alternative to random partitioning into equal subsets as performed by RAkELd. We evaluate modularity maximization using fast greedy and leading eigenvector approximations, infomap, walktrap and label propagation algorithms. For this purpose, we propose to construct a label co-occurrence graph (both weighted and unweighted versions) based on training data and perform community detection to partition the label set. Then, each partition constitutes a label space for separate multi-label classification sub-problems. As a result, we obtain an ensemble of multi-label classifiers that jointly covers the whole label space. Based on the binary relevance and label powerset classification methods, we compare community detection approaches to label space division against random baselines on 12 benchmark datasets over five evaluation measures. We discover that data-driven approaches are more efficient and more likely to outperform RAkELd than binary relevance or label powerset is, in every evaluated measure. For all measures, apart from Hamming loss, data-driven approaches are significantly better than RAkELd (α = 0.05), and at least one data-driven approach is more likely to outperform RAkELd than a priori methods in the case of RAkELd's best performance. This is the largest RAkELd evaluation published to date, with 250 samplings per value for 10 values of the RAkELd parameter k on 12 datasets.
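The graph construction and partitioning steps described in this abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names and toy label sets are hypothetical, edge weights are assumed to count the training samples in which two labels appear together, and a simplified deterministic variant of label propagation (fixed node order, ties broken by the smallest community id) stands in for the community detection algorithms evaluated in the paper.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(label_sets, weighted=True):
    """Build a label co-occurrence graph as a dict of neighbour->weight dicts."""
    graph = {}
    for labels in label_sets:
        for label in labels:
            graph.setdefault(label, {})  # keep labels that never co-occur
        for a, b in combinations(sorted(labels), 2):
            # Assumed weighting: number of samples containing both labels.
            w = graph[a].get(b, 0) + 1 if weighted else 1
            graph[a][b] = w
            graph[b][a] = w
    return graph


def propagate_labels(graph, max_passes=100):
    """Simplified, deterministic label propagation over the graph."""
    community = {node: node for node in graph}
    for _ in range(max_passes):
        changed = False
        for node in sorted(graph):
            scores = defaultdict(int)
            for neighbour, w in graph[node].items():
                scores[community[neighbour]] += w
            if not scores:
                continue  # isolated label stays in its own community
            # Highest total edge weight wins; ties go to the smallest id.
            best = min(scores, key=lambda c: (-scores[c], c))
            if best != community[node]:
                community[node] = best
                changed = True
        if not changed:
            break
    groups = defaultdict(list)
    for node, c in community.items():
        groups[c].append(node)
    return sorted(sorted(group) for group in groups.values())


# Toy training label sets (hypothetical): labels are integers 0..4.
train_labels = [{0, 1}, {0, 1}, {0, 2}, {1, 2}, {3, 4}, {3, 4}, {2, 3}]

graph = cooccurrence_graph(train_labels, weighted=True)
partition = propagate_labels(graph)
print(partition)
```

Each group in `partition` would then define the label space of one sub-problem, for example by training a separate label powerset or binary relevance classifier per group and combining their predictions to cover the whole label space.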
Single-cell sequencing of the human midbrain reveals glial activation and a neuronal state specific to Parkinson's disease
Parkinson's disease (PD) etiology is associated with genetic and environmental factors that lead to a loss of dopaminergic neurons. However, the functional interpretation of PD-associated risk variants and how other midbrain cells contribute to this neurodegenerative process are poorly understood. Here, we profiled >41,000 single-nuclei transcriptomes of postmortem midbrain tissue from 6 idiopathic PD (IPD) patients and 5 matched controls. We show that PD-risk variants are associated with glia- and neuron-specific gene expression patterns. Furthermore, microglia and astrocytes presented IPD-specific cell proliferation and dysregulation of genes related to unfolded protein response and cytokine signalling. IPD-microglia revealed a specific pro-inflammatory trajectory. Finally, we discovered a neuronal cell cluster exclusively present in IPD midbrains characterized by CADPS2 overexpression and a high proportion of cycling cells. We conclude that elevated CADPS2 expression is specific to dysfunctional dopaminergic neurons, which have lost their dopaminergic identity and unsuccessfully attempt to re-enter the cell cycle.
Single-cell sequencing of human midbrain reveals glial activation and a Parkinson-specific neuronal state.
Idiopathic Parkinson's disease is characterized by a progressive loss of dopaminergic neurons, but the exact disease etiology remains largely unknown. To date, Parkinson's disease research has mainly focused on nigral dopaminergic neurons, although recent studies suggest disease-related changes also in non-neuronal cells and in midbrain regions beyond the substantia nigra. While there is some evidence for glial involvement in Parkinson's disease, the molecular mechanisms remain poorly understood. The aim of this study was to characterize the contribution of all cell types of the midbrain to Parkinson's disease pathology by single-nuclei RNA sequencing and to assess the cell type-specific risk for Parkinson's disease employing the latest genome-wide association study. We profiled >41 000 single-nuclei transcriptomes of postmortem midbrain from six idiopathic Parkinson's disease patients and five age-/sex-matched controls. To validate our findings in a spatial context, we utilized immunolabeling of the same tissues. Moreover, we analyzed Parkinson's disease-associated risk enrichment in genes with cell type-specific expression patterns. We discovered a neuronal cell cluster characterized by CADPS2 overexpression and low TH levels, which was exclusively present in idiopathic Parkinson's disease midbrains. Validation analyses in laser-microdissected neurons suggest that this cluster represents dysfunctional dopaminergic neurons. With regard to glial cells, we observed an increase in nigral microglia in Parkinson's disease patients. Moreover, nigral idiopathic Parkinson's disease microglia were more amoeboid, indicating an activated state. We also discovered a reduction in idiopathic Parkinson's disease oligodendrocyte numbers with the remaining cells being characterized by a stress-induced upregulation of S100B. Parkinson's disease risk variants were associated with glia- and neuron-specific gene expression patterns in idiopathic Parkinson's disease cases. Furthermore, astrocytes and microglia presented idiopathic Parkinson's disease-specific cell proliferation and dysregulation of genes related to unfolded protein response and cytokine signaling. While reactive patient astrocytes showed CD44 overexpression, idiopathic Parkinson's disease-microglia revealed a pro-inflammatory trajectory characterized by elevated levels of IL1B, GPNMB, and HSP90AA1. Taken together, we generated the first single-nuclei RNA sequencing dataset from the idiopathic Parkinson's disease midbrain, which highlights a disease-specific neuronal cell cluster as well as 'pan-glial' activation as a central mechanism in the pathology of the movement disorder. This finding warrants further research into inflammatory signaling and immunomodulatory treatments in Parkinson's disease.