125 research outputs found

Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification

Author: Ceci Michelangelo
Džeroski Sašo
Kocev Dragi
Levatić Jurica
Publication date: 19/07/2022

Semi-supervised learning (SSL) is a common approach to learning predictive models using not only labeled examples, but also unlabeled examples. While SSL for the simple tasks of classification and regression has received a lot of attention from the research community, it has not been properly investigated for complex prediction tasks with structurally dependent variables. This is the case for multi-label classification and hierarchical multi-label classification tasks, which may require additional information, possibly coming from the underlying distribution in the descriptive space provided by unlabeled examples, to better address the challenging task of predicting multiple class labels simultaneously. In this paper, we investigate this aspect and propose a (hierarchical) multi-label classification method based on semi-supervised learning of predictive clustering trees. We also extend the method towards ensemble learning and propose a method based on the random forest approach. Extensive experimental evaluation conducted on 23 datasets shows significant advantages of the proposed method and its extension with respect to their supervised counterparts. Moreover, the method preserves interpretability and reduces the time complexity of classical tree-based models.

arXiv.org e-Print Archive

Disentangling diatom species complexes: does morphometry suffice?

Author: Blanco Lanza Saúl
Borrego Ramos María
Olenici Adriana
Publication venue: PeerJ
Publication date: 22/03/2024

Accurate taxonomic resolution in light microscopy analyses of microalgae is essential to achieve high quality, comparable results in both floristic analyses and biomonitoring studies. A number of closely related diatom taxa have been detected to date co-occurring within benthic diatom assemblages, sharing many morphological, morphometrical and ecological characteristics. In this contribution, we analysed the hypothesis that, where a large sample size (number of individuals) is available, common morphometrical parameters (valve length, width and stria density) are sufficient to achieve a correct identification to the species level. We focused on some common diatom taxa belonging to the genus Gomphonema. More than 400 valves and frustules were photographed in valve view and measured using Fiji software. Several statistical tools (mixture and discriminant analysis, k-means clustering, classification trees, etc.) were explored to test whether mere morphometry, independently of other valve features, leads to correct identifications, when compared to identifications made by experts. In view of the results obtained, morphometry-based determination in diatom taxonomy is discouraged.

This work was supported by the Spanish Government under the Aqualitas-retos project (grant number CTM2014-51907-C2-2-R-MINECO

Leon University (Spain)

Random forests with random projections of the output space for high dimensional multi-label classification

Author: D. Achlioptas
D. Kocev
E.J. Candes
F. Pedregosa
G. Madjarov
G. Tsoumakas
J. Read
J.L. Faulon
L. Breiman
P. Geurts
W.B. Johnson
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

We adapt the idea of random projections applied to the output space, so as to enhance tree-based ensemble methods in the context of multi-label classification. We show how learning time complexity can be reduced without affecting computational complexity and accuracy of predictions. We also show that random output space projections may be used in order to reach different bias-variance tradeoffs, over a broad panel of benchmark problems, and that this may lead to improved accuracy while significantly reducing the computational burden of the learning stage.
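As a rough illustration of what projecting the output space means, the sketch below compresses a binary label matrix with a Gaussian random matrix. It is a minimal example under stated assumptions, not the authors' implementation: the function names, the toy label matrix and the choice of a Gaussian projection scaled by 1/sqrt(n_components) are all hypothetical, and the step where trees are grown on the projected outputs is omitted.

```python
import random


def gaussian_projection_matrix(n_labels, n_components, seed=0):
    """Build an n_components x n_labels matrix with N(0, 1/n_components) entries."""
    rng = random.Random(seed)
    scale = (1.0 / n_components) ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(n_labels)]
            for _ in range(n_components)]


def project_labels(Y, P):
    """Map each label vector in Y (n_samples x n_labels) to n_components values."""
    return [[sum(p * y for p, y in zip(row, labels)) for row in P]
            for labels in Y]


# Toy data (assumed): 3 samples, 6 binary labels, compressed to 3 outputs.
Y = [
    [1, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]
P = gaussian_projection_matrix(n_labels=6, n_components=3)
Z = project_labels(Y, P)
```

A tree learner would then score candidate splits on the rows of `Z` instead of the rows of `Y`, which is where the savings in the learning stage would come from when the number of labels is large.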

arXiv.org e-Print Archive

Crossref

Open Repository and Bibliography - Liège

Hierarchical Classification Using Evolutionary Strategy

Author: Borges Helyane Bronoski
Matos Simone Nasser
Nievola Julio Cesar
Publication venue: American Academic Scientific Research Journal for Engineering, Technology, and Sciences
Publication date: 15/05/2020

Hierarchical classification is a problem with applications in many areas, such as protein function prediction, where the data are hierarchically structured. Algorithms able to induce hierarchical classification models are therefore needed. This paper presents experiments using a hierarchical classification algorithm called Hierarchical Classification using Evolutionary Strategy (HC-ES). It was tested on eight G-Protein-Coupled Receptor (GPCR) and Enzyme Commission Codes (EC) datasets. The results are compared with those of other hierarchical classifiers using the distance and hF-Measure.
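The abstract does not define the hF-Measure. A minimal sketch, assuming the commonly used ancestor-augmented definition (each true and predicted class is extended with all of its ancestors before precision and recall are computed), could look as follows. The child-to-parent map and the EC-like class codes are made up for illustration.

```python
def with_ancestors(label, parent):
    """Return the label together with all of its ancestors (child -> parent map)."""
    nodes = set()
    while label is not None:
        nodes.add(label)
        label = parent.get(label)
    return nodes


def hierarchical_f_measure(true_labels, predicted_labels, parent):
    """Micro-averaged hierarchical F-measure over single-label predictions."""
    overlap = n_pred = n_true = 0
    for t, p in zip(true_labels, predicted_labels):
        t_set = with_ancestors(t, parent)
        p_set = with_ancestors(p, parent)
        overlap += len(t_set & p_set)
        n_pred += len(p_set)
        n_true += len(t_set)
    if overlap == 0:
        return 0.0
    h_precision = overlap / n_pred
    h_recall = overlap / n_true
    return 2 * h_precision * h_recall / (h_precision + h_recall)


# Hypothetical hierarchy: top-level classes have no parent entry.
parent = {"1.1": "1", "1.1.1": "1.1", "1.1.2": "1.1", "2.1": "2"}
score = hierarchical_f_measure(["1.1.1", "2.1"], ["1.1.2", "2.1"], parent)
```

Under this definition a prediction in the right branch but the wrong leaf still earns partial credit, which a flat F-measure would not give.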

American Scientific Research Journal for Engineering, Technology, and Sciences (ASRJETS)

Performance Improvement in Multi-class Classification via Automated Hierarchy Generation and Exploitation through Extended LCPN Schemes

Author: Alagoz Celal
Publication date: 31/10/2023

Hierarchical classification (HC) plays a pivotal role in multi-class classification tasks, where objects are organized into a hierarchical structure. This study explores the performance of HC through a comprehensive analysis that encompasses both hierarchy generation and hierarchy exploitation. This analysis is particularly relevant in scenarios where a predefined hierarchy structure is not readily accessible. Notably, two novel hierarchy exploitation schemes, LCPN+ and LCPN+F, which extend the capabilities of LCPN and combine the strengths of global and local classification, have been introduced and evaluated alongside existing methods. The findings reveal the consistent superiority of LCPN+F, which outperforms other schemes across various datasets and scenarios. Moreover, this research emphasizes not only effectiveness but also efficiency, as LCPN+ and LCPN+F maintain runtime performance comparable to Flat Classification (FC). Additionally, this study underscores the importance of selecting the right hierarchy exploitation scheme to maximize classification performance. This work extends our understanding of HC and establishes a benchmark for future research, fostering advancements in multi-class classification methodologies
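For readers unfamiliar with LCPN (local classifier per parent node), the sketch below shows the basic top-down prediction loop only. It is an illustrative assumption, not the paper's code: the hierarchy, the feature names and the rule-based local classifiers are hypothetical stand-ins for trained models, and the LCPN+ and LCPN+F extensions are not shown because the abstract does not describe them in detail.

```python
def lcpn_predict(x, children, local_classifiers, root="root"):
    """Descend from the root, letting each parent node's classifier pick a child."""
    node = root
    path = []
    while node in children:  # nodes without children are leaves
        chosen = local_classifiers[node](x)
        if chosen not in children[node]:
            raise ValueError(f"{chosen!r} is not a child of {node!r}")
        node = chosen
        path.append(node)
    return path


# Hypothetical two-level hierarchy.
children = {
    "root": ["animal", "vehicle"],
    "animal": ["cat", "dog"],
    "vehicle": ["car", "bike"],
}

# Stand-ins for trained local classifiers, one per parent node.
local_classifiers = {
    "root": lambda x: "animal" if x["alive"] else "vehicle",
    "animal": lambda x: "dog" if x["barks"] else "cat",
    "vehicle": lambda x: "car" if x["wheels"] == 4 else "bike",
}

path = lcpn_predict({"alive": True, "barks": True, "wheels": 0},
                    children, local_classifiers)
```

Because each decision only considers the children of the current node, an error made near the root cannot be corrected further down, which is one motivation for schemes that combine local and global information.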

arXiv.org e-Print Archive

Diatom identification including life cycle stages through morphological and texture descriptors

Author: Carlos Sánchez
Gabriel Cristóbal
Gloria Bueno
Publication venue: PeerJ
Publication date: 01/04/2019

Diatoms are unicellular algae present almost wherever there is water. Diatom identification has many applications in different fields of study, such as ecology and forensic science. In environmental studies, algae can be used as a natural water quality indicator. The diatom life cycle consists of the set of stages that successive generations of each species pass through, from the initial to the senescent cells. Life cycle modeling is a complex process since, in general, the distribution of the parameter vectors that represent the variations occurring in this process is non-linear and of high dimensionality. In this paper, we propose to characterize the diatom life cycle by the main features that change during the algae life cycle, mainly the contour shape and the texture. Elliptical Fourier Descriptors (EFD) are used to describe the diatom contour, while phase congruency and Gabor filters describe the inner ornamentation of the algae. The proposed method has been tested with a small algae dataset (eight different classes and more than 50 samples per type) using supervised and non-supervised classification techniques, obtaining accuracies of up to 99% and 98%, respectively.

Directory of Open Access Journals

Digital.CSIC

TLMCM Network for Medical Image Hierarchical Multi-Label Classification

Author: Luo Siyan
Ouyang Wenbin
Wu Meng
Wu Qiyu
Publication date: 11/11/2023

Medical Image Hierarchical Multi-Label Classification (MI-HMC) is of paramount importance in modern healthcare, presenting two significant challenges: data imbalance and hierarchy constraint. Existing solutions involve complex model architecture design or domain-specific preprocessing, demanding considerable expertise or effort in implementation. To address these limitations, this paper proposes Transfer Learning with Maximum Constraint Module (TLMCM) network for the MI-HMC task. The TLMCM network offers a novel approach to overcome the aforementioned challenges, outperforming existing methods based on the Area Under the Average Precision and Recall Curve ($AU\overline{(PRC)}$) metric. In addition, this research proposes two novel accuracy metrics, EMR and HammingAccuracy, which have not been extensively explored in the context of the MI-HMC task. Experimental results demonstrate that the TLMCM network achieves high multi-label prediction accuracy (80% to 90%) for MI-HMC tasks, making it a valuable contribution to healthcare domain applications.
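The abstract names EMR and HammingAccuracy without defining them. The sketch below assumes the usual multi-label readings, exact match ratio (share of samples whose whole label vector is correct) and Hamming accuracy (share of individual label positions that are correct); the paper's exact formulations may differ, and the toy label matrices are invented.

```python
def exact_match_ratio(Y_true, Y_pred):
    """Fraction of samples whose predicted label vector matches exactly."""
    matches = sum(1 for t, p in zip(Y_true, Y_pred) if list(t) == list(p))
    return matches / len(Y_true)


def hamming_accuracy(Y_true, Y_pred):
    """Fraction of individual label positions predicted correctly."""
    correct = sum(ti == pi
                  for t, p in zip(Y_true, Y_pred)
                  for ti, pi in zip(t, p))
    total = sum(len(t) for t in Y_true)
    return correct / total


# Toy example (assumed): 2 samples, 3 labels each.
Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 0, 1], [0, 0, 0]]

emr = exact_match_ratio(Y_true, Y_pred)
ham = hamming_accuracy(Y_true, Y_pred)
```

The exact match ratio is the stricter of the two, since a single wrong label makes the whole sample count as incorrect.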

arXiv.org e-Print Archive

Coherent Hierarchical Multi-Label Classification Networks

Author: Giunchiglia Eleonora
Lukasiewicz Thomas
Publication date: 01/01/2020

Hierarchical multi-label classification (HMC) is a challenging classification task extending standard multi-label classification problems by imposing a hierarchy constraint on the classes. In this paper, we propose C-HMCNN(h), a novel approach for HMC problems, which, given a network h for the underlying multi-label classification problem, exploits the hierarchy information in order to produce predictions coherent with the constraint and improve performance. We conduct an extensive experimental analysis showing the superior performance of C-HMCNN(h) when compared to state-of-the-art models.

Comment: Neural Information Processing Systems 202
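One simple way to make raw scores respect a hierarchy constraint is to raise each class score to the maximum over its own score and those of its descendants, so that a parent never scores lower than a child. The sketch below illustrates that idea only; it is an assumption about what a coherence step can look like, not the C-HMCNN(h) architecture or training procedure, and the scores and hierarchy are invented.

```python
def enforce_hierarchy(scores, children):
    """Return scores where each class is at least as high as all its descendants."""
    coherent = {}

    def resolve(node):
        # Memoised post-order pass: a node's value is the max over its subtree.
        if node not in coherent:
            sub = [resolve(child) for child in children.get(node, [])]
            coherent[node] = max([scores[node]] + sub)
        return coherent[node]

    for node in scores:
        resolve(node)
    return coherent


# Hypothetical raw outputs of a network h: child B outscores its parent A.
raw_scores = {"A": 0.3, "B": 0.7, "C": 0.2}
children = {"A": ["B", "C"]}

coherent_scores = enforce_hierarchy(raw_scores, children)
```

After this step, thresholding the scores at any value yields a label set in which every predicted class also has its parent predicted.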

arXiv.org e-Print Archive

Oxford University Research Archive

Clustering with Decision Trees: Divisive and Agglomerative Approach

Author: Castin Lauriane
Publication date: 29/08/2017

Repository of the University of Namur