22 research outputs found
Evolutionary approaches for feature selection in biological data
Data mining techniques have been widely used in many areas such as business, science, engineering and medicine. These techniques allow vast amounts of data to be explored in order to extract useful information. One focus in the health area is finding interesting biomarkers in biomedical data. Mass throughput data generated from microarrays and mass spectrometry of biological samples are high dimensional and small in sample size; examples include DNA microarray datasets with up to 500,000 genes and mass spectrometry data with 300,000 m/z values. While the availability of such datasets can aid the development of techniques/drugs to improve the diagnosis and treatment of diseases, a major challenge lies in analysing them to extract useful and meaningful information. The aims of this project are: 1) to investigate and develop feature selection algorithms that incorporate various evolutionary strategies, 2) to use the developed algorithms to find the “most relevant” biomarkers contained in biological datasets, and 3) to evaluate the goodness of extracted feature subsets for relevance (examined in terms of existing biomedical domain knowledge and the classification accuracy obtained using different classifiers). The project aims to generate good predictive models for classifying diseased samples from control samples.
Beyond Traditional Approaches: Multi-Task Network for Breast Ultrasound Diagnosis
Breast ultrasound plays a vital role in cancer diagnosis as a non-invasive, cost-effective approach. In recent years, with the development of deep learning, many CNN-based approaches have been widely researched for both tumor localization and cancer classification tasks. Even though previous single-task models achieved strong performance on both tasks, these methods have limitations in inference time, GPU requirements, and the need for separate fine-tuning of each model. In this study, we aim to redesign and build an end-to-end multi-task architecture that conducts both segmentation and classification. With our proposed approach, we achieved outstanding performance and time efficiency, with 79.8% and 86.4% using the DeepLabV3+ architecture on the segmentation task.
Comment: 7 pages, 3 figures
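The shared-backbone idea behind such a multi-task design can be illustrated with a minimal NumPy sketch. The layer sizes, random weights, and head shapes below are illustrative assumptions only, not the paper's architecture; the point is that one shared encoder feeds both a per-pixel segmentation head and a classification head:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy "encoder": one shared dense layer mapping a flattened 8x8 image
# to a 16-dimensional feature vector used by BOTH task heads.
W_enc = rng.normal(size=(64, 16))

# Segmentation head: project shared features back to one logit per pixel.
W_seg = rng.normal(size=(16, 64))

# Classification head: project shared features to 2 class logits
# (e.g. benign vs. malignant).
W_cls = rng.normal(size=(16, 2))

def multi_task_forward(image):
    """Run one image through the shared encoder and both heads."""
    features = relu(image.reshape(-1) @ W_enc)      # shared representation
    seg_logits = (features @ W_seg).reshape(8, 8)   # per-pixel mask logits
    cls_logits = features @ W_cls                   # class logits
    return seg_logits, cls_logits

seg, cls = multi_task_forward(rng.normal(size=(8, 8)))
print(seg.shape, cls.shape)
```

Because the encoder is shared, one forward pass serves both tasks, which is the source of the inference-time and memory savings the abstract describes.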
Effectiveness and safety of in vitro maturation of oocytes versus in vitro fertilisation in women with high antral follicle count : Study protocol for a randomised controlled trial
This work was supported by Ferring grant number 000323 and sponsored by My Duc Hospital. Peer reviewed. Publisher PDF.
Incorporating Genetic Algorithm into Rough Feature Selection for High Dimensional Biomedical Data
In this paper, a hybrid approach incorporating a genetic algorithm and rough set theory into feature selection is proposed for searching for an optimal subset of features. The approach utilizes K-means clustering for partitioning attribute values, the rough set-based approach for reducing redundant data, and the genetic algorithm for searching for the best subset of features. A set of six attributes was obtained as the best subset using the proposed algorithm on the colon cancer dataset. Classification was carried out using this set of six attributes with 23 classifiers from the WEKA (Waikato Environment for Knowledge Analysis) software to examine their significance in classifying unseen test data. In addition, the set of six genes found by the proposed approach was also examined for relevance to known biomarkers in the colon cancer domain.
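The genetic-algorithm half of such a hybrid can be sketched as a GA over binary feature masks. This is a toy stand-in, not the paper's method: the fitness below uses a simple class-separation score with a per-feature penalty in place of the rough-set reduct and classifier evaluation, and the dataset is synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic dataset: 40 samples x 10 features; only features 0 and 1
# carry class signal (class 1 is shifted on those two features).
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 10))
X[y == 1, 0] += 3.0
X[y == 1, 1] += 3.0

def fitness(mask):
    """Class-separation score of the selected features, minus a small
    penalty per feature (a stand-in for the rough-set/classifier step)."""
    if mask.sum() == 0:
        return -np.inf
    sel = X[:, mask.astype(bool)]
    separation = np.abs(sel[y == 0].mean(0) - sel[y == 1].mean(0)).sum()
    return separation - 0.5 * mask.sum()

def ga_select(n_feat=10, pop_size=30, generations=40, p_mut=0.1):
    """GA over binary masks: tournament selection, one-point crossover,
    bit-flip mutation, with the best individual carried over (elitism)."""
    pop = rng.integers(0, 2, size=(pop_size, n_feat))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        next_pop = [pop[scores.argmax()].copy()]          # elitism
        while len(next_pop) < pop_size:
            i, j = rng.integers(0, pop_size, 2)           # tournament 1
            a = pop[i] if scores[i] >= scores[j] else pop[j]
            i, j = rng.integers(0, pop_size, 2)           # tournament 2
            b = pop[i] if scores[i] >= scores[j] else pop[j]
            cut = rng.integers(1, n_feat)                 # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_feat) < p_mut             # bit-flip mutation
            child[flip] ^= 1
            next_pop.append(child)
        pop = np.array(next_pop)
    scores = np.array([fitness(ind) for ind in pop])
    return pop[scores.argmax()]

best = ga_select()
print(np.flatnonzero(best))  # the signal features should be selected
```

The per-feature penalty plays the same role as the rough-set reduction step: it pushes the search toward small subsets that still discriminate between classes.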
NSC-GA: Search for optimal shrinkage thresholds for nearest shrunken centroid
In this paper, a hybrid approach incorporating the Nearest Shrunken Centroid (NSC) and a Genetic Algorithm (GA) is proposed to automatically search for an optimal range of shrinkage threshold values for the NSC, to improve feature selection and classification accuracy for high dimensional data. The selection of a threshold value is crucial, as it is the key factor in the NSC for finding significant relative differences between the overall centroid and the class centroids. However, selecting this threshold value via 'trial and error' in empirical approaches can be time-consuming and imprecise. In the proposed NSC-GA approach, shrinkage threshold values for the NSC are encoded as genes in chromosomes that are evaluated using a fitness measure obtained from the NSC classifier. The proposed approach automatically searches for the optimal threshold for the NSC by utilizing the GA. The approach was evaluated using a number of datasets: Alzheimer's disease, colon cancer and leukemia. Experimental results indicated that the proposed approach finds the optimal range of shrinkage thresholds for each dataset, subsequently leading to higher classification accuracy with fewer features when compared to previous studies.
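The threshold-search idea can be sketched as a real-coded GA over a scalar shrinkage threshold. This is a simplified illustration on synthetic data, not the paper's NSC-GA: the shrinkage below soft-thresholds each class centroid's offset from the overall centroid directly (a stand-in for the full NSC shrinkage formula with standard-error scaling), and the fitness is training accuracy minus a small penalty per surviving feature:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic two-class data: 30 samples x 20 features, 3 informative features.
y = np.repeat([0, 1], 15)
X = rng.normal(size=(30, 20))
X[y == 1, :3] += 2.0

overall = X.mean(0)

def shrunken_centroids(delta):
    """Soft-threshold each class centroid's offset from the overall centroid
    by delta; features shrunk to zero in every class stop contributing."""
    cents = []
    for k in (0, 1):
        d = X[y == k].mean(0) - overall
        d = np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)
        cents.append(overall + d)
    return np.array(cents)

def fitness(delta):
    """Training accuracy of the shrunken-centroid classifier, lightly
    penalised by the number of surviving (non-shrunk) features."""
    cents = shrunken_centroids(delta)
    pred = np.argmin(((X[:, None, :] - cents[None]) ** 2).sum(-1), axis=1)
    surviving = (cents[0] != cents[1]).sum()
    return (pred == y).mean() - 0.005 * surviving

def nsc_ga(pop_size=20, generations=30):
    """GA over scalar thresholds: tournament selection plus Gaussian
    mutation, with the best threshold carried over each generation."""
    pop = rng.uniform(0.0, 3.0, pop_size)   # thresholds encoded as genes
    for _ in range(generations):
        scores = np.array([fitness(d) for d in pop])
        children = [pop[scores.argmax()]]   # elitism
        while len(children) < pop_size:
            i, j = rng.integers(0, pop_size, 2)
            parent = pop[i] if scores[i] >= scores[j] else pop[j]
            children.append(np.clip(parent + rng.normal(0, 0.3), 0.0, 3.0))
        pop = np.array(children)
    scores = np.array([fitness(d) for d in pop])
    return pop[scores.argmax()]

best_delta = nsc_ga()
cents = shrunken_centroids(best_delta)
print(best_delta, int((cents[0] != cents[1]).sum()))
```

A good threshold zeroes out the noise features (whose centroid offsets are small) while keeping the informative ones, which is exactly the trade-off the GA fitness rewards in place of manual trial and error.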
Live birth rates with a freeze-only strategy versus fresh embryo transfer: secondary analysis of a randomized clinical trial
This study was funded by My Duc Hospital, Ho Chi Minh City, Vietnam. Peer reviewed. Publisher PDF.